MadhurContlo commented on issue #9962:
URL: https://github.com/apache/hudi/issues/9962#issuecomment-1847052355

   Hi Team. I have a somewhat similar issue. I read data from S3 and write it to my Hudi tables, close to 500 tables in each job. When the tables were still empty, the same job ran smoothly: all the tables were written and the job terminated after completion, taking almost 2 hours.
   
   After some days, the job started running indefinitely. I checked the DynamoDB table, and it was empty. I also spot-checked some tables, and the data had in fact been written to the Hudi tables. I had to terminate the EMR cluster manually. For the past 6 days I have been running the job and terminating it by hand, because it never terminates on its own.
   
   My Configuration:
   
   EMR 6.14.0
   Hadoop 3.3.3, Hive 3.1.3, Spark 3.4.1
   
   JARs used:
   ```
   s3://pyspark-pipeline/jars/hudi-hive-sync-bundle-0.14.0.jar,
   s3://pyspark-pipeline/jars/hudi-spark3.4-bundle_2.12-0.14.0.jar,
   s3://pyspark-pipeline/jars/hudi-aws-bundle-0.14.0.jar,
   s3://pyspark-pipeline/jars/spark-avro_2.12-3.4.0.jar,
   s3://pyspark-pipeline/jars/httpclient-4.5.14.jar,
   s3://pyspark-pipeline/jars/httpcore-4.4.15.jar,
   s3://pyspark-pipeline/jars/spark-sql-kafka-0-10_2.12-3.4.0.jar,
   s3://pyspark-pipeline/jars/kafka-clients-2.8.1.jar,
   s3://pyspark-pipeline/jars/spark-token-provider-kafka-0-10_2.12-3.4.0.jar,
   s3://pyspark-pipeline/jars/commons-pool2-2.11.1.jar,
   s3://pyspark-pipeline/jars/json-20231013.jar
   ```
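   For context, my lock-related writer options look roughly like the sketch below. The table name, region, and partition key are placeholders, not my exact values:
   ```python
   # Hedged sketch of the Hudi OCC + DynamoDB lock provider options used per
   # table write. The dynamodb table/region/key values are placeholders.
   hudi_lock_options = {
       "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
       "hoodie.cleaner.policy.failed.writes": "LAZY",
       "hoodie.write.lock.provider":
           "org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider",
       "hoodie.write.lock.dynamodb.table": "hudi-locks",         # placeholder
       "hoodie.write.lock.dynamodb.partition_key": "tablename",  # placeholder
       "hoodie.write.lock.dynamodb.region": "us-east-1",         # placeholder
   }
   ```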
   
   The following log, which I found in the YARN timeline server, is repeated close to **1000 times**:
   ```
   23/12/08 06:18:34 INFO AmazonDynamoDBLockClient: Heartbeat thread recieved interrupt, exiting run() (possibly exiting thread)
   java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method) ~[?:1.8.0_392]
        at com.amazonaws.services.dynamodbv2.AmazonDynamoDBLockClient.run(AmazonDynamoDBLockClient.java:1248) ~[hudi-aws-bundle-0.14.0.jar:0.14.0]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_392]
   ```
   The logs are also incomplete: entries for many tables that were actually written to the Hudi database are missing, so I can say with certainty that I do not have the complete logs.
   
   Any suggestions to solve this will be very helpful. Thank you.
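   One thing I plan to try, in case a lingering non-daemon thread (such as the lock heartbeat) is what keeps the driver JVM alive: stopping the session explicitly in a `finally` block. This is only a sketch; `build_session` and `write_one_table` stand in for my actual code and are not from my job:
   ```python
   def run_job(build_session, write_one_table, tables):
       """Write each table, then stop the session even if a write fails.

       Hedged sketch: build_session() stands in for creating my
       SparkSession, and write_one_table() for my actual Hudi write call.
       """
       spark = build_session()
       try:
           for table in tables:
               write_one_table(spark, table)
       finally:
           # Always stop the session so any non-daemon threads (e.g. the
           # DynamoDB lock heartbeat) do not keep the driver JVM running
           # after the writes finish.
           spark.stop()
   ```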

