Jami Malikzade created SPARK-24370:
--------------------------------------

             Summary: spark checkpoint creates many 0 byte empty files(partitions) in checkpoint directory
                 Key: SPARK-24370
                 URL: https://issues.apache.org/jira/browse/SPARK-24370
             Project: Spark
          Issue Type: Bug
          Components: Spark Shell
    Affects Versions: 2.1.1
            Reporter: Jami Malikzade

We are currently facing an issue: when we call checkpoint on a DataFrame, it creates partition files in the checkpoint directory, but some of them are empty (0 bytes). As a result, we get exceptions when reading the DataFrame back.

Do you have any idea how to avoid this? The checkpoint creates 200 partitions, and some of them are empty. I used repartition(1) before checkpoint, but that is not a good workaround. Is there any way to populate all partitions with data, or to avoid the empty files?

A snapshot is attached.

!image-2018-05-23-21-10-43-673.png!
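
For reference, the repartition-before-checkpoint workaround described in the report could be sketched in PySpark roughly as follows. This is only a minimal sketch, not a confirmed fix for this issue: the helper name `checkpoint_with_partitions` and its `num_partitions` parameter are hypothetical, and it assumes the checkpoint directory has already been set through `SparkContext.setCheckpointDir`. Choosing a small partition count instead of 1 keeps some parallelism, but it does not strictly guarantee that every partition file is non-empty.

```python
def checkpoint_with_partitions(df, num_partitions):
    """Repartition `df` and checkpoint the result (hypothetical helper).

    `df` is expected to be a pyspark.sql.DataFrame; the helper relies only
    on its `repartition` and `checkpoint` methods.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    # Redistribute the rows over a chosen number of partitions, so the
    # checkpoint writes that many files instead of the 200 seen in the report.
    compacted = df.repartition(num_partitions)
    # eager=True materializes the checkpoint immediately.
    return compacted.checkpoint(eager=True)
```

Assumed usage: after calling `spark.sparkContext.setCheckpointDir(...)` with a directory of your choice, call `checkpoint_with_partitions(df, 8)`. The value 8 is a placeholder and should be chosen according to the size of the data.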