soumilshah1995 commented on issue #9764:
URL: https://github.com/apache/hudi/issues/9764#issuecomment-1730362628
I'm running into a different issue after setting these environment variables:
```
import os
os.environ['AWS_ACCESS_KEY_ID'] = "XXXX"
os.environ['AWS_ACCESS_KEY'] = "XX"
os.environ['AWS_SECRET_ACCESS_KEY'] = "XX"
os.environ['AWS_SECRET_KEY'] = "XX"
```
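As a side note, the S3A credential provider chain reads the standard variable names (`AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY`); a hypothetical helper like the one below (not from the original post) can fail fast on typos before the environment is created:

```python
import os

def missing_aws_credentials(environ=os.environ):
    """Return the standard AWS credential variables that are unset.

    Hypothetical helper: the S3A provider chain reads AWS_ACCESS_KEY_ID /
    AWS_SECRET_ACCESS_KEY; the *_KEY variants above are not part of it.
    """
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    return [name for name in required if not environ.get(name)]
```

Calling this before `TableEnvironment.create` and raising on a non-empty result makes a missing or misspelled credential obvious.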
# Code
```
from pyflink.table import EnvironmentSettings, TableEnvironment
import os

# Create a batch TableEnvironment
env_settings = EnvironmentSettings.in_batch_mode()
table_env = TableEnvironment.create(env_settings)

# Get the current working directory
CURRENT_DIR = os.getcwd()

# Define the list of JAR file names to add
jar_files = [
    "hudi-flink-bundle_2.12-0.10.1.jar",
    "flink-s3-fs-hadoop-1.16.1.jar",
    # "hudi-flink1.16-bundle-0.13.0.jar",
    "flink-sql-connector-kinesis-1.16.1.jar"
]

# Build the list of JAR URLs; CURRENT_DIR is already absolute (starts with
# '/'), so prefix with 'file://' to avoid malformed 'file:////...' URLs
jar_urls = [f"file://{CURRENT_DIR}/{jar_file}" for jar_file in jar_files]

table_env.get_config().get_configuration().set_string(
    "pipeline.jars",
    ";".join(jar_urls)
)
#hudi_output_path = 'file://' + os.path.join(os.getcwd(), 'output')
hudi_output_path = 's3a://datateam-sandbox-qa-demo/tmp/'
hudi_sink = f"""
CREATE TABLE t1(
uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
name VARCHAR(10),
`partition` VARCHAR(20)
)
PARTITIONED BY (`partition`)
WITH (
'connector' = 'hudi',
'path' = '{hudi_output_path}',
'table.type' = 'MERGE_ON_READ'
);
"""
table_env.execute_sql(hudi_sink)
insert_into_hudi_sink_query = """
INSERT INTO t1 VALUES
    ('id1', 'Danny', 'par1'),
    ('id2', 'Stephen', 'par1');
"""

# execute_sql submits the INSERT job asynchronously; in a short-lived
# script, block until the job finishes so the files are actually written
table_env.execute_sql(insert_into_hudi_sink_query).wait()
```
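One detail worth double-checking in the setup above: Flink expects `pipeline.jars` to be a semicolon-separated list of URLs, and since `os.getcwd()` returns an absolute path already beginning with `/`, the prefix only needs two slashes. A minimal sketch of the value being built (the directory is a stand-in, not from the original post):

```python
# Stand-in for os.getcwd(); an absolute path already starts with '/'
current_dir = "/home/user/jars"
jar_files = ["hudi-flink-bundle_2.12-0.10.1.jar", "flink-s3-fs-hadoop-1.16.1.jar"]

# 'file://' + absolute path yields well-formed 'file:///home/...' URLs;
# multiple jar URLs are joined with ';'
pipeline_jars = ";".join(f"file://{current_dir}/{jar}" for jar in jar_files)
print(pipeline_jars)
# file:///home/user/jars/hudi-flink-bundle_2.12-0.10.1.jar;file:///home/user/jars/flink-s3-fs-hadoop-1.16.1.jar
```

With the original `file:///` prefix plus an absolute path, the URLs come out as `file:////...`, which some URL parsers reject.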
I see the folder created on S3, but I don't see any parquet files. Any idea why?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]