[jira] [Resolved] (SPARK-21435) Empty files should be skipped while write to file

2017-07-19 Thread Wenchen Fan (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-21435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-21435.
-
   Resolution: Fixed
Fix Version/s: 2.3.0

> Empty files should be skipped while write to file
> -
>
> Key: SPARK-21435
> URL: https://issues.apache.org/jira/browse/SPARK-21435
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Li Yuanjian
> Assignee: Li Yuanjian
> Priority: Minor
> Fix For: 2.3.0
>
>
> Consider this scenario: the source table has many partitions and data files, 
> but after the query filter is applied only a small amount of data is written 
> to the destination dir.
> In this case the destination dir or table ends up with many empty files, or 
> files that contain only schema metadata (Parquet format). I know we can use 
> coalesce, but skipping the empty files may be a better option in this case.
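
The behaviour requested above can be illustrated outside Spark with a minimal sketch. It is not Spark's implementation or the fix that was merged; `write_partitions`, the `part-NNNNN.csv` naming, and the list-of-lists stand-in for partitions are all assumptions made for illustration. The sketch peeks at the first row of each partition and only opens an output file when there is something to write.

```python
import csv
import os
import tempfile


def write_partitions(partitions, out_dir, skip_empty=True):
    """Write each partition to its own file, optionally skipping empty ones.

    `partitions` is an iterable of row iterables (a hypothetical stand-in
    for per-task output). Returns the names of the files that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for index, rows in enumerate(partitions):
        rows = iter(rows)
        # Peek at the first row so a file is only created when data exists.
        first = next(rows, None)
        if first is None and skip_empty:
            continue
        name = f"part-{index:05d}.csv"
        with open(os.path.join(out_dir, name), "w", newline="") as handle:
            writer = csv.writer(handle)
            if first is not None:
                writer.writerow(first)
            writer.writerows(rows)
        written.append(name)
    return written


def demo():
    # Four source partitions; the filter leaves data in only one of them.
    source = [[(1, "a"), (2, "b")], [(3, "c")], [(40, "d")], [(5, "e")]]
    filtered = [[row for row in part if row[0] > 10] for part in source]
    with tempfile.TemporaryDirectory() as tmp:
        kept = write_partitions(filtered, os.path.join(tmp, "skip"))
        everything = write_partitions(
            filtered, os.path.join(tmp, "all"), skip_empty=False
        )
    return kept, everything
```

With `skip_empty=False` the sketch writes one file per partition, which mirrors the many-empty-files situation in the report; with the default it writes only the partition that still holds rows. Coalescing before the write, as the reporter mentions, reduces the file count by merging partitions instead of skipping them.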



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-21435) Empty files should be skipped while write to file

2017-07-17 Thread Hyukjin Kwon (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-21435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-21435.
--
Resolution: Duplicate

I am resolving this as a duplicate of SPARK-20065. I believe the fix should be 
in essentially the same place, and the two issues describe similar problems. 
Please reopen this if the fix should be separate or if I misunderstood.

> Empty files should be skipped while write to file
> -
>
> Key: SPARK-21435
> URL: https://issues.apache.org/jira/browse/SPARK-21435
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Li Yuanjian
> Priority: Minor
>
> Consider this scenario: the source table has many partitions and data files, 
> but after the query filter is applied only a small amount of data is written 
> to the destination dir.
> In this case the destination dir or table ends up with many empty files, or 
> files that contain only schema metadata (Parquet format). I know we can use 
> coalesce, but skipping the empty files may be a better option in this case.


