Jami Malikzade created SPARK-24370:
--------------------------------------

             Summary: spark checkpoint creates many 0 byte empty files (partitions) in checkpoint directory
                 Key: SPARK-24370
                 URL: https://issues.apache.org/jira/browse/SPARK-24370
             Project: Spark
          Issue Type: Bug
          Components: Spark Shell
    Affects Versions: 2.1.1
            Reporter: Jami Malikzade


We are currently facing an issue: when we call checkpoint on a DataFrame, it
creates partition files in the checkpoint dir, but some of them are empty, so
we get exceptions when reading the DataFrame back.

Do you have any idea how to avoid it?

It creates 200 partitions, and some of them are empty. I used repartition(1)
before checkpoint, but that is not a good workaround. Is there any way to
populate all partitions with data, or to avoid the empty files?
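
For context, 200 matches the default value of spark.sql.shuffle.partitions, so
one possible explanation (an assumption, since the job itself is not shown
here) is that the DataFrame comes out of a shuffle with fewer distinct keys
than partitions. The plain-Python sketch below only models that effect. It does
not call Spark, the helper names are hypothetical, and the modulo is a stand-in
for Spark's real hash partitioner.

```python
from collections import Counter


def hash_partition_sizes(keys, num_partitions):
    """Rows per partition when each row is routed by hash(key) % num_partitions.

    Simplified stand-in for a hash partitioner (not Spark's actual hashing).
    """
    counts = Counter(hash(key) % num_partitions for key in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def round_robin_sizes(num_rows, num_partitions):
    """Rows per partition when rows are dealt out evenly, one after another."""
    base, extra = divmod(num_rows, num_partitions)
    return [base + (1 if p < extra else 0) for p in range(num_partitions)]


def empty_count(sizes):
    """Number of partitions that would be written as empty files."""
    return sum(1 for size in sizes if size == 0)


if __name__ == "__main__":
    # Hypothetical data: 1000 rows, but only 12 distinct grouping keys.
    keys = [i % 12 for i in range(1000)]

    hashed = hash_partition_sizes(keys, 200)
    print("hash into 200 partitions, empty:", empty_count(hashed))

    even = round_robin_sizes(len(keys), 8)
    print("even spread into 8 partitions, empty:", empty_count(even))
```

With 12 distinct keys, at most 12 of the 200 partitions can receive rows, while
an even spread over a small number of partitions leaves none empty. In Spark
the analogous knobs would be a lower spark.sql.shuffle.partitions or a
repartition(n) with a modest n before checkpoint, rather than repartition(1).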

Pasted snapshot.

!image-2018-05-23-21-10-43-673.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)