[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/22614
  
The PR description and title may need to change accordingly. Can you update 
them?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22614
  
Merged build finished. Test PASSed.





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22614
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97014/





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22614
  
**[Test build #97014 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97014/testReport)**
 for PR 22614 at commit 
[`f42bbec`](https://github.com/apache/spark/commit/f42bbec8d7ba23cca77f2bf83230ad2e2ceafeb9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22614
  
**[Test build #97014 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97014/testReport)**
 for PR 22614 at commit 
[`f42bbec`](https://github.com/apache/spark/commit/f42bbec8d7ba23cca77f2bf83230ad2e2ceafeb9).





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22614
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96999/





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22614
  
Merged build finished. Test PASSed.





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22614
  
**[Test build #96999 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96999/testReport)**
 for PR 22614 at commit 
[`cb0577b`](https://github.com/apache/spark/commit/cb0577bef7058401b354f7695c4319d73c7156e6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread kmanamcheri
Github user kmanamcheri commented on the issue:

https://github.com/apache/spark/pull/22614
  
@gatorsmile I have added the config option and an additional test.

Here's the new behavior:
- Setting spark.sql.metastorePartitionPruningFallback to 'false' will 
ALWAYS throw an exception if partition pushdown fails (that is, if Hive throws 
an exception). This is suggested for queries where you want to fail fast and 
you know that you have a large number of partitions.
- Setting spark.sql.metastorePartitionPruningFallback to 'true' (the 
default) will ALWAYS catch the exception from Hive and retry by fetching all 
partitions. However, to be helpful to users, Spark will read the directSql 
config value from Hive and log messages that explain what steps to take next.

@dongjoon-hyun @mallman @vanzin If these look good, can we move this forward 
to merge? Thanks a lot for all the comments and discussions.
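
The fail-fast versus fallback behavior described in this comment can be 
sketched roughly as follows. This is an illustrative Python model, not the 
Scala code in the PR: `MetastoreFilterError`, `get_partitions`, and the 
callables and flags passed to it are hypothetical names, and the log text is 
a placeholder rather than the message Spark actually emits.

```python
import logging

logger = logging.getLogger("partition_pruning_sketch")


class MetastoreFilterError(Exception):
    """Hypothetical stand-in for the exception Hive throws when pushdown fails."""


def get_partitions(fetch_by_filter, fetch_all, fallback_enabled, direct_sql_enabled):
    """Try filter pushdown first; on failure either fail fast or fetch everything.

    fetch_by_filter / fetch_all are zero-argument callables standing in for
    the metastore calls. fallback_enabled models the new fallback setting and
    direct_sql_enabled models the directSql value read from Hive.
    """
    try:
        return fetch_by_filter()
    except MetastoreFilterError as err:
        if not fallback_enabled:
            # Fallback disabled: surface the failure instead of listing
            # every partition of a potentially very large table.
            raise RuntimeError("partition filter pushdown failed") from err
        # Fallback enabled: report the directSql value so the user has a
        # starting point for diagnosis, then retry without the filter.
        logger.warning(
            "Filter pushdown failed (Hive directSql enabled: %s); "
            "falling back to fetching all partitions.",
            direct_sql_enabled,
        )
        return fetch_all()
```

With the fallback flag set to false the caller sees an error immediately; 
with it set to true the caller gets the full partition list plus a warning in 
the log.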





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22614
  
**[Test build #96999 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96999/testReport)**
 for PR 22614 at commit 
[`cb0577b`](https://github.com/apache/spark/commit/cb0577bef7058401b354f7695c4319d73c7156e6).





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22614
  
Merged build finished. Test PASSed.





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22614
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96899/





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22614
  
**[Test build #96899 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96899/testReport)**
 for PR 22614 at commit 
[`2ad9cf4`](https://github.com/apache/spark/commit/2ad9cf4923c19127efbdeef3fe6e3cd3e3f728ff).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22614
  
**[Test build #96899 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96899/testReport)**
 for PR 22614 at commit 
[`2ad9cf4`](https://github.com/apache/spark/commit/2ad9cf4923c19127efbdeef3fe6e3cd3e3f728ff).





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22614
  
Yes. Let us add a conf for controlling the fallback. Please also add test 
cases to verify it. Thanks!





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-03 Thread kmanamcheri
Github user kmanamcheri commented on the issue:

https://github.com/apache/spark/pull/22614
  
> Let us add a conf to control it? Failing fast is better than hanging. If 
users want to get all partitions, they can change the conf by themselves.

@gatorsmile We already have a config option, 
"spark.sql.hive.metastorePartitionPruning". If that is set to false, we will 
never push partition filters down to HMS. I will add 
"spark.sql.hive.metastorePartitionPruningFallback", which, in addition to the 
previous one, controls the fallback behavior. Irrespective of the Hive direct 
SQL setting, if we enable the pruning fallback, we will catch the exception 
and fall back to fetching all partitions. Does this sound like a reasonable 
compromise, @mallman?
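
How the two settings named in this comment combine can be summarized with a 
small decision function. This is only a sketch in Python under the semantics 
stated above; the function name, the `pushdown_succeeds` flag, and the string 
labels are hypothetical and do not come from Spark.

```python
from itertools import product


def partition_fetch_strategy(pruning_enabled, fallback_enabled, pushdown_succeeds):
    """Model the outcome of a partition lookup under the two settings.

    pruning_enabled   models spark.sql.hive.metastorePartitionPruning
    fallback_enabled  models the proposed ...metastorePartitionPruningFallback
    pushdown_succeeds models whether HMS accepts the pushed-down filter
    """
    if not pruning_enabled:
        # Pruning off: filters are never pushed down to HMS.
        return "fetch_all"
    if pushdown_succeeds:
        return "pushdown"
    # Pushdown was attempted and failed; the fallback setting decides.
    return "fetch_all_after_failure" if fallback_enabled else "raise"


if __name__ == "__main__":
    # Print every combination of the three flags with its outcome.
    for flags in product([True, False], repeat=3):
        print(flags, "->", partition_fetch_strategy(*flags))
```

Note that in this model the fallback flag only matters when pruning is 
enabled and the pushdown fails, which matches the description of it acting 
"in addition to" the existing option.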





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22614
  
Let us add a conf to control it? Failing fast is better than hanging. If 
users want to get all partitions, they can change the conf by themselves. 





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-02 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/22614
  
Looks ok to me based on the discussion in the bug. I will leave this open to 
see if others have any comments.





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22614
  
Merged build finished. Test PASSed.





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22614
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96868/





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22614
  
**[Test build #96868 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96868/testReport)**
 for PR 22614 at commit 
[`dddffca`](https://github.com/apache/spark/commit/dddffcae8824e72d614fd6202e7fc562c490098b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22614
  
**[Test build #96868 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96868/testReport)**
 for PR 22614 at commit 
[`dddffca`](https://github.com/apache/spark/commit/dddffcae8824e72d614fd6202e7fc562c490098b).





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-02 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/22614
  
ok to test





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-02 Thread kmanamcheri
Github user kmanamcheri commented on the issue:

https://github.com/apache/spark/pull/22614
  
@mallman @cloud-fan @ericl @rezasafi 





[GitHub] spark issue #22614: [SPARK-25561][SQL] HiveClient.getPartitionsByFilter shou...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22614
  
Can one of the admins verify this patch?

