[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/23052 It is pretty common for us to write an empty dataframe to parquet and later read it back in. The same goes for writing to csv with a header and reading it back in (with type inference disabled, we assume all strings). Would this break those behaviors? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/23052 Merged to master
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99407/ Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99407 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99407/testReport)** for PR 23052 at commit [`083d411`](https://github.com/apache/spark/commit/083d411ec1822986dbb82fbe1896a6c0d846c7d8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99407 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99407/testReport)** for PR 23052 at commit [`083d411`](https://github.com/apache/spark/commit/083d411ec1822986dbb82fbe1896a6c0d846c7d8).
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99380/ Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99380 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99380/testReport)** for PR 23052 at commit [`586ab31`](https://github.com/apache/spark/commit/586ab316ed2b9bce07a879dc89766dc854807c21). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99380 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99380/testReport)** for PR 23052 at commit [`586ab31`](https://github.com/apache/spark/commit/586ab316ed2b9bce07a879dc89766dc854807c21).
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/23052 jenkins, retest this, please
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test FAILed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99372/ Test FAILed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99372 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99372/testReport)** for PR 23052 at commit [`586ab31`](https://github.com/apache/spark/commit/586ab316ed2b9bce07a879dc89766dc854807c21). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99372/testReport)** for PR 23052 at commit [`586ab31`](https://github.com/apache/spark/commit/586ab316ed2b9bce07a879dc89766dc854807c21).
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/23052 Actually, it needs changes similar to those in https://github.com/apache/spark/pull/23130
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/23052 > seems like a real failure I am looking at it. It seems the test is not deterministic.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 seems like a real failure
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99361/ Test FAILed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test FAILed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99361/testReport)** for PR 23052 at commit [`76e1466`](https://github.com/apache/spark/commit/76e1466a39aa2a40d999791bb9d3b09628921e85). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99361 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99361/testReport)** for PR 23052 at commit [`76e1466`](https://github.com/apache/spark/commit/76e1466a39aa2a40d999791bb9d3b09628921e85).
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5438/ Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 retest this please
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test FAILed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99354/ Test FAILed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99354 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99354/testReport)** for PR 23052 at commit [`76e1466`](https://github.com/apache/spark/commit/76e1466a39aa2a40d999791bb9d3b09628921e85). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5432/ Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99354 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99354/testReport)** for PR 23052 at commit [`76e1466`](https://github.com/apache/spark/commit/76e1466a39aa2a40d999791bb9d3b09628921e85).
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 retest this please
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99221/ Test FAILed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test FAILed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99221 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99221/testReport)** for PR 23052 at commit [`76e1466`](https://github.com/apache/spark/commit/76e1466a39aa2a40d999791bb9d3b09628921e85). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` class RuleSummary(` * `class QueryPlanningTracker ` * `class QueryExecution(`
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99221 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99221/testReport)** for PR 23052 at commit [`76e1466`](https://github.com/apache/spark/commit/76e1466a39aa2a40d999791bb9d3b09628921e85).
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 There are two more things to deal with. First, the comment at https://github.com/apache/spark/pull/23052#issuecomment-440687200 will still be valid; at least it should be double-checked, because dataframes originating from emptyRDD never write anything. Second, among text-based datasources CSV is a special case because it can write out headers. The thing is, the header is currently written when the first row is written, which is what the https://github.com/apache/spark/pull/13252 PR targeted before. I closed it because there was no interest, but we should fix this.
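The point about the CSV header being written together with the first row can be illustrated with a small standalone sketch. This is not Spark's writer: `LazyHeaderWriter`, `EagerHeaderWriter`, and `write_partition` are hypothetical names, and plain Python's `csv` module stands in for the data source. It only shows why a partition with zero rows ends up with no header under the lazy approach.

```python
import csv
import io


class LazyHeaderWriter:
    """Hypothetical model: the header is emitted only with the first row."""

    def __init__(self, out, columns):
        self._writer = csv.writer(out, lineterminator="\n")
        self._columns = columns
        self._header_written = False

    def write_row(self, row):
        if not self._header_written:
            self._writer.writerow(self._columns)
            self._header_written = True
        self._writer.writerow(row)


class EagerHeaderWriter:
    """Hypothetical model: the header is emitted when the writer is opened."""

    def __init__(self, out, columns):
        self._writer = csv.writer(out, lineterminator="\n")
        self._writer.writerow(columns)

    def write_row(self, row):
        self._writer.writerow(row)


def write_partition(writer_cls, columns, rows):
    """Write one partition into an in-memory buffer and return its text."""
    out = io.StringIO()
    writer = writer_cls(out, columns)
    for row in rows:
        writer.write_row(row)
    return out.getvalue()
```

With an empty `rows` list the lazy variant produces an empty string, while the eager variant still produces the header line, which is the difference a reader relying on the header would notice.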
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 First of all, sometimes we do need to write "empty" files, so that we can infer the schema of a parquet directory. An empty parquet file is not really empty, as it has a header/footer. https://github.com/apache/spark/pull/20525 guarantees we always write out at least one empty file. One important thing is that when we write out an empty dataframe to a file and read it back, it should still be an empty dataframe. I'd suggest we skip empty files in text-based data sources, and later on send a followup PR to not write empty text files, as a perf improvement.
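The suggestion to skip empty files when reading text-based data can be sketched as follows. This is a simplified stand-in rather than Spark code: `read_csv_dir` and `demo` are hypothetical helpers, the part-file names are made up, and all values are read as strings.

```python
import csv
import os
import tempfile


def read_csv_dir(directory, columns):
    """Read every non-empty CSV file in `directory` as rows of strings.

    Zero-byte files are skipped, so a directory that holds only empty
    files reads back as an empty result instead of failing.
    """
    rows = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            continue  # skip sub-directories and empty part files
        with open(path, newline="") as f:
            for record in csv.reader(f):
                rows.append(dict(zip(columns, record)))
    return rows


def demo():
    """Write one empty and one non-empty part file, then read them back."""
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "part-00000.csv"), "w").close()
        with open(os.path.join(d, "part-00001.csv"), "w", newline="") as f:
            csv.writer(f).writerow(["1", "x"])
        return read_csv_dir(d, ["id", "value"])
```

Skipping on the read side first means that directories produced by older writers (which may contain zero-byte files) keep working once the writer stops creating them.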
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 cc @cloud-fan as well
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 Also, Parquet does not always write empty files. It does not write them when dataframes are created from emptyRDD (the case pointed out in the PR link I gave). We should match this behaviour as well.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 @MaxGekk I didn't mean to block this PR. Since we're going ahead for 3.0, it would be good to match and fix the behaviours across data sources. For instance, CSV should still be able to read the header. Shall we clarify each data source's behaviour?
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/23052 I have read the tickets you pointed out but haven't found anything that could potentially block the changes. One of the corner cases is saving an empty dataframe. In this case, no files would be written, but this is ok for text-based datasources because, in any case, we cannot fully restore the schema from empty files (unlike parquet files, where we can). So a schema must be provided by the user.
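The reasoning above, that a text-based read has nothing to infer a schema from when no files were written, can be sketched with a small helper. `resolve_schema` is a hypothetical function for illustration only and does not mirror any Spark API; it assumes the caller has already collected the first line of each non-empty input file.

```python
def resolve_schema(header_lines, user_schema=None):
    """Pick column names for a text-based read (hypothetical helper).

    `header_lines` holds the first line of each non-empty input file.
    A user-provided schema always wins; otherwise the first header line
    is split on commas. With no files and no schema there is nothing to
    infer from, so the caller has to supply the schema.
    """
    if user_schema is not None:
        return list(user_schema)
    if not header_lines:
        raise ValueError("no input files to infer a schema from; provide one")
    return header_lines[0].split(",")
```

For example, `resolve_schema([], ["a", "b"])` returns the user's columns even though no file exists, whereas `resolve_schema([])` raises because an empty output directory carries no schema information.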
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99107/ Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99107 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99107/testReport)** for PR 23052 at commit [`9501d01`](https://github.com/apache/spark/commit/9501d01c6b0239b0da1e6ac1e99e6d5081e65258). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #99107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99107/testReport)** for PR 23052 at commit [`9501d01`](https://github.com/apache/spark/commit/9501d01c6b0239b0da1e6ac1e99e6d5081e65258).
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 I think now is a good time to match the behaviours.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 Another related attempt: https://github.com/apache/spark/pull/13252
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 FYI, one attempt to add some tests for reading/writing empty dataframes was here: https://github.com/apache/spark/pull/13253
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 Which should be ... this https://github.com/apache/spark/pull/12855
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/23052 > Similar changes were proposed in Parquet few years ago (by me) and reverted. What was the main reason for reverting it? If possible, could you give me a link to your PR?
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 @MaxGekk, actually this is a fairly important behaviour change. It basically means we're unable to read the empty files back. Similar changes were proposed in Parquet a few years ago (by me) and reverted. We had better investigate and match the behaviours across datasources first. IIRC, ORC does not create files (unless that has changed since I checked long ago) but Parquet does.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Merged build finished. Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98887/ Test PASSed.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #98887 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98887/testReport)** for PR 23052 at commit [`6f3cb18`](https://github.com/apache/spark/commit/6f3cb18d5a863f6aded763bdeb5395f6622876ff). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23052 **[Test build #98887 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98887/testReport)** for PR 23052 at commit [`6f3cb18`](https://github.com/apache/spark/commit/6f3cb18d5a863f6aded763bdeb5395f6622876ff).
[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23052 Can one of the admins verify this patch?