ZeMirella commented on issue #3699: URL: https://github.com/apache/hudi/issues/3699#issuecomment-925219008
Hi, thanks for your reply.

**Which line of code from HoodieSparkUtils was ran here?**

The job hangs before it even starts: it hangs when it begins listing files and tries to read the S3 files. The hung task shown in the Spark history server is this one:

<img width="958" alt="Captura de Tela 2021-09-22 às 15 50 25" src="https://user-images.githubusercontent.com/75490501/134405074-b8cde70b-d81d-4299-b4a6-05cceb538386.png">

**What Hudi actions are you trying to perform?**

This job is supposed to join some tables and save the output to S3. The line where it hangs is a create-table operation. Here is the code:

```python
hudi_options = {
    'hoodie.table.name': self.table_name,
    'hoodie.datasource.write.recordkey.field': self.primary_key,
    'hoodie.datasource.write.table.name': self.table_name,
    'hoodie.datasource.write.operation': 'bulk_insert',
    'hoodie.bulkinsert.shuffle.parallelism': self.bulk_insert_shuffle_parallelism,
    # Hive sync settings
    'hoodie.datasource.hive_sync.enable': self.hive_sync_enabled,
    'hoodie.datasource.hive_sync.database': self.hive_database_name,
    'hoodie.datasource.hive_sync.jdbcurl': f'jdbc:hive2://{self.hive_jdbc_url}:10000',
    'hoodie.datasource.hive_sync.table': self.table_name,
    'hoodie.datasource.hive_sync.partition_extractor_class': 'org.apache.hudi.hive.NonPartitionedExtractor',
    'hoodie.datasource.hive_sync.support_timestamp': 'true',
    # The table is non-partitioned
    'hoodie.datasource.write.keygenerator.class': 'org.apache.hudi.keygen.NonpartitionedKeyGenerator',
    'hoodie.datasource.write.row.writer.enable': 'false',
    # Parquet sizes, in bytes
    'hoodie.parquet.small.file.limit': 536870912,
    'hoodie.parquet.max.file.size': 1073741824,
    'hoodie.parquet.block.size': 536870912
}

spark_df.write.format("hudi").options(**hudi_options).mode("overwrite").save(self.table_path)
```

**What is the total input data size are you reading?**

1.6 TB

**How many executors were actually created during the run?**

37

<img width="1745" alt="image" src="https://user-images.githubusercontent.com/75490501/134403621-c4ca12e1-93fa-405a-910a-595013062343.png">
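
As a side note, the three `hoodie.parquet.*` values in the options above are raw byte counts, which are easy to misread. A minimal sketch that writes them in binary units follows; the names `MIB`, `GIB`, `parquet_sizes`, and `human_readable` are illustrative and not part of Hudi.

```python
# Binary size units (illustrative constants, not part of Hudi).
MIB = 1024 ** 2
GIB = 1024 ** 3

# The same byte counts as in the options above, written in readable units.
parquet_sizes = {
    "hoodie.parquet.small.file.limit": 512 * MIB,  # 536870912
    "hoodie.parquet.max.file.size": 1 * GIB,       # 1073741824
    "hoodie.parquet.block.size": 512 * MIB,        # 536870912
}


def human_readable(num_bytes: int) -> str:
    """Format a byte count as whole GiB or MiB when it divides evenly."""
    for unit, factor in (("GiB", GIB), ("MiB", MIB)):
        if num_bytes % factor == 0:
            return f"{num_bytes // factor} {unit}"
    return f"{num_bytes} B"


if __name__ == "__main__":
    for key, value in parquet_sizes.items():
        print(f"{key} = {value} ({human_readable(value)})")
```

The dictionary can be merged into the write options with `hudi_options.update(parquet_sizes)`, which leaves the values passed to the writer unchanged.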
