n3nash edited a comment on issue #2946:
URL: https://github.com/apache/hudi/issues/2946#issuecomment-841985843


   @AkshayChan From the message it seems pretty clear that some of the nodes are running out of disk space:
   
   ```
   Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 77.0 failed 4 times, most recent failure: Lost task 0.3 in stage 77.0 (TID 22943, 172.34.88.19, executor 32): com.esotericsoftware.kryo.KryoException: java.io.IOException: No space left on device
   ```
   
   I would recommend checking the EMR instances you are provisioning and logging into the boxes while the job is running to see when they run out of space. To give you an idea of how this can happen: whenever Hudi performs an upsert, it shuffles some data around. A Spark shuffle has 2 phases, map and reduce. The map phase spills data to local disk and uses the KryoSerializer to do so. That is where you are running into this exception.
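   One way to confirm this in practice is to watch disk usage on the executor nodes while the upsert runs, and, if the spill volume is too small, point Spark's local (spill) directories at larger disks. A rough sketch, with illustrative paths and mount points that are not taken from this issue:

   ```shell
   # On each executor node, watch the volumes Spark/YARN spill to while the job runs.
   # On EMR the NodeManager local dirs typically live under /mnt*/yarn (illustrative path).
   watch -n 30 'df -h /mnt*'

   # If the spill volume is undersized, redirect Spark's shuffle spill to larger disks.
   # Note: on YARN (which EMR uses), yarn.nodemanager.local-dirs takes precedence
   # over spark.local.dir, so set the equivalent YARN property cluster-side if needed.
   spark-submit \
     --conf spark.local.dir=/mnt/spark,/mnt1/spark \
     --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
     your-hudi-job.jar
   ```

   Alternatively, provisioning instance types with more local storage (or larger EBS volumes) avoids the problem without any config changes.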
   
   Not much I can do here. Let me know if you need anything. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]