[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170756485 Sure, let me close it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread gatorsmile
Github user gatorsmile closed the pull request at: https://github.com/apache/spark/pull/10689

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170755933 @gatorsmile I think we'd need more proper design for limits. Let's close this as later.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170470751 **[Test build #49117 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49117/consoleFull)** for PR 10689 at commit

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread hvanhovell
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170473951 @gatorsmile the fix looks good. @rxin / @marmbrus / @gatorsmile I am not sure if we should support this at all. Using a limit in SELECT's connected by a

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170507628 **[Test build #49117 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49117/consoleFull)** for PR 10689 at commit

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170507968 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170507969 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170469510 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170691919 Yeah! I just read the implementation of `Limit`. As you said, the current one is not highly efficient, especially when the number of limits is not small.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170619409 Given two tables `tbl_a` and `tbl_b`, `tbl_a` has **billions** of rows but `tbl_b` has **thousands** of rows. `tbl_a` has one column `col_frkey_tbl_a` whose values

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread hvanhovell
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170597130 @gatorsmile I do see the performance benefits of `limit` while processing. The reservation I am having is reasoning about non-toplevel `limit` statements. A

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170589671 @hvanhovell Let me share my two cents: - We have another PR to push down `Limit` through `Union ALL`. However, it is impossible to push `Limit` through `Union
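The comment above is cut off by the archive, so the following is only an illustrative sketch, not code from the PR. It assumes the truncated sentence contrasts `Union ALL` with a deduplicating union, and it models tables as plain Python lists with a fixed row order (real SQL gives no ordering guarantee for a `LIMIT` without `ORDER BY`). The helper names `limit`, `union_all`, and `union_distinct` are hypothetical and are not Spark APIs.

```python
from itertools import islice


def limit(rows, n):
    """Model LIMIT n: keep at most the first n rows."""
    return list(islice(rows, n))


def union_all(left, right):
    """Model UNION ALL: concatenate both inputs, keeping duplicates."""
    return list(left) + list(right)


def union_distinct(left, right):
    """Model a deduplicating UNION, preserving first-seen order."""
    seen = set()
    out = []
    for row in union_all(left, right):
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out


# Hypothetical sample data, chosen so that duplicates appear early.
a = [1, 1, 1, 2, 3]
b = [1, 1, 4]
n = 2

# UNION ALL: also limiting each child still leaves enough rows for the outer limit.
plain_all = limit(union_all(a, b), n)
pushed_all = limit(union_all(limit(a, n), limit(b, n)), n)

# Deduplicating UNION: limiting the children first can drop rows that
# deduplication would otherwise have let through.
plain_distinct = limit(union_distinct(a, b), n)
pushed_distinct = limit(union_distinct(limit(a, n), limit(b, n)), n)

print(plain_all, pushed_all)
print(plain_distinct, pushed_distinct)
```

In this model the `UNION ALL` results agree, while the deduplicating variant returns fewer rows once the limit is applied to the children first, which is one way to see why such a rewrite would not be safe in general.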

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170634356 **[Test build #49158 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49158/consoleFull)** for PR 10689 at commit

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170634811 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170634799 **[Test build #49158 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49158/consoleFull)** for PR 10689 at commit

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170634808 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170638417 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170638416 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170638728 **[Test build #49160 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49160/consoleFull)** for PR 10689 at commit

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170670978 **[Test build #49160 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49160/consoleFull)** for PR 10689 at commit

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170671485 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170671482 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-11 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170673019 That example seems kind of artificial to me. Additionally, large non-terminal limits are not planned very well today, so I think users are going to be surprised.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/10689#discussion_r49290681 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystQlSuite.scala --- @@ -49,4 +49,11 @@ class CatalystQlSuite extends PlanTest

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170422086 @hvanhovell @rxin Could you take a look? Thank you!

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10689#discussion_r49288754 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystQlSuite.scala --- @@ -49,4 +49,11 @@ class CatalystQlSuite extends PlanTest {

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170421465 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170421463 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170421391 **[Test build #49080 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49080/consoleFull)** for PR 10689 at commit

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10689#discussion_r49292062 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystQlSuite.scala --- @@ -49,4 +50,16 @@ class CatalystQlSuite extends PlanTest {

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/10689#discussion_r49292180 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystQlSuite.scala --- @@ -49,4 +50,16 @@ class CatalystQlSuite extends PlanTest

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170442349 **[Test build #49094 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49094/consoleFull)** for PR 10689 at commit

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170449790 **[Test build #49094 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49094/consoleFull)** for PR 10689 at commit

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170449852 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170449853 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/10689 [SPARK-12745] [SQL] Hive Parser: Limit is not supported inside Set Operation The current SQLContext allows the following query, which is copied from a test case in SQLQuerySuite: ```

[GitHub] spark pull request: [SPARK-12745] [SQL] Hive Parser: Limit is not ...

2016-01-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10689#issuecomment-170413432 **[Test build #49080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49080/consoleFull)** for PR 10689 at commit