[jira] [Resolved] (SPARK-21435) Empty files should be skipped while write to file
[ https://issues.apache.org/jira/browse/SPARK-21435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-21435.
---------------------------------
    Resolution: Fixed
    Fix Version/s: 2.3.0

> Empty files should be skipped while write to file
> -------------------------------------------------
>
> Key: SPARK-21435
> URL: https://issues.apache.org/jira/browse/SPARK-21435
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Li Yuanjian
> Assignee: Li Yuanjian
> Priority: Minor
> Fix For: 2.3.0
>
>
> Consider this scenario: the source table has many partitions and data files,
> but after the query filter only a small amount of data is written to the
> destination dir.
> In this case the destination dir or table will contain many empty files, or
> files that hold only schema metadata (Parquet format). I know we can use
> coalesce, but skipping the empty files may be better in this case.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-21435) Empty files should be skipped while write to file
[ https://issues.apache.org/jira/browse/SPARK-21435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-21435.
----------------------------------
    Resolution: Duplicate

I am resolving this as a duplicate of SPARK-20065. I believe the fix should be essentially in the same place, and the two issues describe similar problems. Please reopen this if the fix should be separate or if I misunderstood.

> Empty files should be skipped while write to file
> -------------------------------------------------
>
> Key: SPARK-21435
> URL: https://issues.apache.org/jira/browse/SPARK-21435
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Li Yuanjian
> Priority: Minor
>
> Consider this scenario: the source table has many partitions and data files,
> but after the query filter only a small amount of data is written to the
> destination dir.
> In this case the destination dir or table will contain many empty files, or
> files that hold only schema metadata (Parquet format). I know we can use
> coalesce, but skipping the empty files may be better in this case.
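
For illustration only, the behavior requested in the issue description can be sketched in plain Python. This is not Spark code and not the actual fix: `write_partitions`, its `skip_empty` flag, and the `part-NNNNN.txt` naming are hypothetical names chosen for this sketch. It models a writer that receives one list of rows per partition and creates no output file for a partition that has no rows left after filtering.

```python
import os
import tempfile


def write_partitions(partitions, out_dir, skip_empty=True):
    """Write each partition to its own part file and return the paths written.

    Hypothetical helper: with skip_empty=True, a partition with no rows
    produces no file at all, which is the behavior the issue asks for.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for index, rows in enumerate(partitions):
        rows = list(rows)
        if skip_empty and not rows:
            continue  # nothing survived the filter, so create no file
        path = os.path.join(out_dir, f"part-{index:05d}.txt")
        with open(path, "w", encoding="utf-8") as handle:
            for row in rows:
                handle.write(f"{row}\n")
        written.append(path)
    return written


if __name__ == "__main__":
    # Five partitions, three of them empty after a selective filter.
    partitions = [[], ["a", "b"], [], [], ["c"]]
    with tempfile.TemporaryDirectory() as tmp:
        paths = write_partitions(partitions, tmp)
        print([os.path.basename(p) for p in paths])
        # prints ['part-00001.txt', 'part-00004.txt']
```

With `skip_empty=False` the same call would create five files, three of them empty, which corresponds to the situation the reporter describes. Coalescing, the workaround mentioned in the description, instead reduces the number of partitions before the write.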