[GitHub] spark issue #14322: [SPARK-16689] [SQL] FileSourceStrategy: Pruning Partitio...

2016-07-24 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14322 Yeah, I did not see a noticeable performance difference in my local tests. Judging by the whole-stage codegen output, fewer instructions are generated. Thus, I think it helps a

[GitHub] spark issue #14322: [SPARK-16689] [SQL] FileSourceStrategy: Pruning Partitio...

2016-07-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14322 How much benefit can we get by not scanning partition columns? It seems that we just parse the directory string to get the partition values, so no IO is needed. --- If your project is set up for it,
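
For illustration only, here is a minimal sketch of the point in the comment above: partition values can be recovered from the directory string alone, without opening any file. The function name `parse_partition_values` and the example path are hypothetical; only the column names `p1`, `p2`, `p3` appear in this thread. This is not Spark's actual implementation.

```python
def parse_partition_values(path):
    """Collect key=value directory segments from a file path.

    Pure string handling: the file itself is never opened or read.
    """
    values = {}
    for segment in path.split("/"):
        key, sep, raw = segment.partition("=")
        # Keep only segments of the form key=value; skip plain directory
        # and file names, which contain no "=".
        if sep and key:
            values[key] = raw
    return values


# Hypothetical path; the partition values below are made up for the example.
example = parse_partition_values("/data/table/p1=1/p2=a/p3=x/part-00000.json")
print(example)
```

Because the work is a handful of string splits per file path, the saving from pruning these columns would be expected in the generated code rather than in IO, which is consistent with the observation elsewhere in this thread.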

[GitHub] spark issue #14322: [SPARK-16689] [SQL] FileSourceStrategy: Pruning Partitio...

2016-07-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14322 cc @marmbrus @cloud-fan @liancheng After checking the history, most of this code was written by you. Thanks!

[GitHub] spark issue #14322: [SPARK-16689] [SQL] FileSourceStrategy: Pruning Partitio...

2016-07-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14322 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62743/

[GitHub] spark issue #14322: [SPARK-16689] [SQL] FileSourceStrategy: Pruning Partitio...

2016-07-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14322 Merged build finished. Test PASSed.

[GitHub] spark issue #14322: [SPARK-16689] [SQL] FileSourceStrategy: Pruning Partitio...

2016-07-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14322 **[Test build #62743 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62743/consoleFull)** for PR 14322 at commit

[GitHub] spark issue #14322: [SPARK-16689] [SQL] FileSourceStrategy: Pruning Partitio...

2016-07-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14322 **[Test build #62743 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62743/consoleFull)** for PR 14322 at commit

[GitHub] spark issue #14322: [SPARK-16689] [SQL] FileSourceStrategy: Pruning Partitio...

2016-07-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14322 **After the PR changes**, the whole-stage codegen output looks like this: ```JAVA == Subtree 1 / 1 == *Scan json [value#37L] Format: JSON, InputPaths:

[GitHub] spark issue #14322: [SPARK-16689] [SQL] FileSourceStrategy: Pruning Partitio...

2016-07-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14322 **Before the PR changes**, the whole-stage codegen output looks like this: ```JAVA == Subtree 1 / 1 == *Project [value#37L] +- *Scan json [value#37L,p1#39,p2#40,p3#41] Format: JSON,