[GitHub] [spark] SparkQA commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore
SparkQA commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore URL: https://github.com/apache/spark/pull/27232#issuecomment-576562510 **[Test build #117165 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117165/testReport)** for PR 27232 at commit [`d8eefe9`](https://github.com/apache/spark/commit/d8eefe9e2b0e5f5c829223f9141a8790889ee60e). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
SparkQA commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576562535 **[Test build #117164 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117164/testReport)** for PR 27260 at commit [`1dacc22`](https://github.com/apache/spark/commit/1dacc22f86a6e75575d521b22528bb74793006bb).
[GitHub] [spark] SparkQA commented on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip
SparkQA commented on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip URL: https://github.com/apache/spark/pull/27304#issuecomment-576562522 **[Test build #117163 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117163/testReport)** for PR 27304 at commit [`75d597c`](https://github.com/apache/spark/commit/75d597c113127d32e83f4f3b0164ecd0d52aa3c2).
[GitHub] [spark] AmplabJenkins commented on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip
AmplabJenkins commented on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip URL: https://github.com/apache/spark/pull/27304#issuecomment-576563018 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore
AmplabJenkins commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore URL: https://github.com/apache/spark/pull/27232#issuecomment-576563125 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21929/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip
AmplabJenkins removed a comment on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip URL: https://github.com/apache/spark/pull/27304#issuecomment-576563018 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip
AmplabJenkins removed a comment on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip URL: https://github.com/apache/spark/pull/27304#issuecomment-576563029 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21927/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576563141 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21928/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip
AmplabJenkins commented on issue #27304: [SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since we decide not to follow ANSI and no round trip URL: https://github.com/apache/spark/pull/27304#issuecomment-576563029 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21927/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore
AmplabJenkins commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore URL: https://github.com/apache/spark/pull/27232#issuecomment-576563115 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576563129 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore
AmplabJenkins removed a comment on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore URL: https://github.com/apache/spark/pull/27232#issuecomment-576563115 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576563129 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore
AmplabJenkins removed a comment on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore URL: https://github.com/apache/spark/pull/27232#issuecomment-576563125 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21929/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576563141 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21928/ Test PASSed.
[GitHub] [spark] beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#discussion_r368855224
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##
@@ -118,7 +118,66 @@ import org.apache.spark.sql.types.IntegerType
  *     LocalTableScan [...]
  * }}}
  *
- * The rule does the following things here:
+ * Third example: single distinct aggregate function with filter clauses (in sql):
+ * {{{
+ *   SELECT
+ *     COUNT(DISTINCT cat1) FILTER (WHERE id > 1) as cat1_cnt1,
+ *     COUNT(DISTINCT cat1) as cat1_cnt2,
+ *     SUM(value) AS total
+ *   FROM
+ *     data
+ *   GROUP BY
+ *     key
+ * }}}
+ *
+ * This translates to the following (pseudo) logical plan:
+ * {{{
+ * Aggregate(
+ *    key = ['key]
+ *    functions = [COUNT(DISTINCT 'cat1) with FILTER('id > 1),
+ *                 COUNT(DISTINCT 'cat1),
+ *                 sum('value)]
+ *    output = ['key, 'cat1_cnt1, 'cat1_cnt2, 'total])
+ *   LocalTableScan [...]
+ * }}}
+ *
+ * This rule rewrites this logical plan to the following (pseudo) logical plan:
+ * {{{
+ * Aggregate(
+ *    key = ['key]
+ *    functions = [count(if (('gid = 1)) '_gen_distinct_1 else null),
+ *                 count(if (('gid = 2)) '_gen_distinct_2 else null),
+ *                 first(if (('gid = 0)) 'total else null) ignore nulls]
+ *    output = ['key, 'cat1_cnt, 'cat1_cnt2, 'total])
+ *   Aggregate(
+ *      key = ['key, '_gen_distinct_1, '_gen_distinct_2, 'gid]
+ *      functions = [sum('value)]
+ *      output = ['key, '_gen_distinct_1, '_gen_distinct_2, 'gid, 'total])
+ *     Expand(
+ *        projections = [('key, null, null, 0, 'value),
+ *                       ('key, '_gen_distinct_1, null, 1, null),
+ *                       ('key, null, '_gen_distinct_2, 2, null)]
+ *        output = ['key, '_gen_distinct_1, '_gen_distinct_2, 'gid, 'value])
+ *       Expand(
+ *          projections = [('key, if ('id > 1) 'cat1 else null, 'cat1, cast('value as bigint))]
+ *          output = ['key, '_gen_distinct_1, '_gen_distinct_2, 'value])
+ *         LocalTableScan [...]
+ * }}}
+ *
+ * The rule serves two purposes:
+ * 1. Expand distinct aggregates which exists filter clause.
+ * 2. Rewrite when aggregate exists at least two distinct aggregates.
+ *
+ * The first child rule does the following things here:
+ * 1. Guaranteed to compute filter clause locally.
+ * 2. The attributes referenced by different distinct aggregate expressions are likely to overlap,
+ *    and if no additional processing is performed, data loss will occur. To prevent this, we
+ *    generate new attributes and replace the original ones.
+ * 3. If we apply the first child rule to distinct aggregate expressions which exists filter
+ *    clause, the aggregate after expand may have at least two distinct aggregates, so we need to
+ *    apply the second child rule too.

Review comment:
OK.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
dongjoon-hyun commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle` URL: https://github.com/apache/spark/pull/27242#discussion_r368855344
## File path: dev/scalastyle ##
@@ -17,18 +17,10 @@
 # limitations under the License.
 #
-SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive"}
+SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"
-# NOTE: echo "q" is needed because SBT prompts the user for input on encountering a build file
-# with failure (either resolution or compilation); the "q" makes SBT quit.
-ERRORS=$(echo -e "q\n" \
-    | build/sbt \
-        ${SPARK_PROFILES} \
-        -Pdocker-integration-tests \
-        -Pkubernetes-integration-tests \
-        scalastyle test:scalastyle \
-    | awk '{if($1~/error/)print}' \
-)
+SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive -Pdocker-integration-tests -Pkubernetes-integration-tests"}
+ERRORS=$($SCRIPT_DIR/../build/mvn $SPARK_PROFILES scalastyle:check | grep "^error file")

Review comment:
I'm +1 for any suggestion if you make a PR and pass all systems (including GitHub Actions). :)

BTW, did you try checking out a commit older than this one, @HyukjinKwon? If it works in your environment even in that case, it sounds weird to me because of the following.

As you can see on the mailing list, this was reported by @tgravescs and I confirmed the situation. Then, I made this PR at that time. Also, the GitHub Actions log still shows the whole situation (the **failures** before this commit and the **successes** after this commit).

https://user-images.githubusercontent.com/9700541/72785186-7b43ad00-3bdf-11ea-9ca4-084b7a4b8a58.png

For now, I don't have any clue about the difference (code side? Maven server side? your environment?). In any case, you can move forward while keeping our system **green**. We had better discuss this on your PR.
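The new version of `dev/scalastyle` quoted above relies on three small shell idioms: a default-valued positional parameter, resolving the script's own directory, and filtering tool output with `grep "^error file"`. The following is a minimal sketch of those idioms in isolation, not the actual script. `fake_checker`, its output lines, and the `-Pdefault-*` profile names are hypothetical placeholders standing in for the real `build/mvn ... scalastyle:check` invocation and its output.

```shell
#!/usr/bin/env bash
# Sketch only: fake_checker and its output are hypothetical stand-ins,
# not real Maven or scalastyle output.

# Default-valued positional parameter: use $1 if given, otherwise the default.
PROFILES=${1:-"-Pdefault-a -Pdefault-b"}

# Resolve the directory that contains this script, independent of the caller's cwd.
SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"

# Hypothetical checker that prints one line per finding.
fake_checker() {
  printf '%s\n' \
    "info: checking with profiles: $*" \
    "error file=/hypothetical/Foo.scala message=placeholder" \
    "warning file=/hypothetical/Bar.scala message=placeholder"
}

# $PROFILES is deliberately unquoted so that each profile becomes its own argument.
# grep keeps only the lines that start with "error file".
ERRORS=$(fake_checker $PROFILES | grep "^error file")

if test -n "$ERRORS"; then
  echo "style check failed:"
  echo "$ERRORS"
else
  echo "style check passed"
fi
```

Because the pattern is anchored with `^`, only lines that begin with `error file` are kept, so a warning line that merely mentions a file is not counted as an error.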
[GitHub] [spark] SparkQA commented on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference
SparkQA commented on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference URL: https://github.com/apache/spark/pull/27288#issuecomment-576564088 **[Test build #117156 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117156/testReport)** for PR 27288 at commit [`d1fb7bd`](https://github.com/apache/spark/commit/d1fb7bd41c43196b23f51db0f7954df167685022).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
SparkQA commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576564091 **[Test build #117162 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117162/testReport)** for PR 27260 at commit [`0c1650c`](https://github.com/apache/spark/commit/0c1650cc459f78425ea186b102510922ae69d092).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
AmplabJenkins commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576564166 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117161/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
SparkQA commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-576564093 **[Test build #117154 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117154/testReport)** for PR 27019 at commit [`ba48342`](https://github.com/apache/spark/commit/ba48342489bc42aae07674bc7f09cd193726f93f).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md
SparkQA commented on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md URL: https://github.com/apache/spark/pull/27301#issuecomment-576564090 **[Test build #117155 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117155/testReport)** for PR 27301 at commit [`aaaf6bd`](https://github.com/apache/spark/commit/aaaf6bd894423db39db21ea454ff865e599ecb2f).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference
AmplabJenkins commented on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference URL: https://github.com/apache/spark/pull/27288#issuecomment-576564261 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576564136 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117162/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
AmplabJenkins commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564269 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117157/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
SparkQA commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576564092 **[Test build #117161 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117161/testReport)** for PR 27237 at commit [`0511b8f`](https://github.com/apache/spark/commit/0511b8fc69deaaf8d52575f570ed576756b0f8aa).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class
SparkQA commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class URL: https://github.com/apache/spark/pull/27299#issuecomment-576564087 **[Test build #117145 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117145/testReport)** for PR 27299 at commit [`275f8e6`](https://github.com/apache/spark/commit/275f8e6c477fc6d75be61664fca574008d19ea72).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
AmplabJenkins commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564253 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
SparkQA commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564089 **[Test build #117157 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117157/testReport)** for PR 27300 at commit [`f05f883`](https://github.com/apache/spark/commit/f05f8839818da79f5b6a2d67093361f35400a703). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference
AmplabJenkins commented on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference URL: https://github.com/apache/spark/pull/27288#issuecomment-576564276 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117156/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
AmplabJenkins commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576564159 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576564124 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
SparkQA commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564084 **[Test build #117151 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117151/testReport)** for PR 27300 at commit [`a3958db`](https://github.com/apache/spark/commit/a3958dbc69229e4b4f0af789d192b624b2acd964). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-576564375 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-576564386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117154/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
SparkQA removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-576522731 **[Test build #117154 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117154/testReport)** for PR 27019 at commit [`ba48342`](https://github.com/apache/spark/commit/ba48342489bc42aae07674bc7f09cd193726f93f).
[GitHub] [spark] AmplabJenkins commented on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md
AmplabJenkins commented on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md URL: https://github.com/apache/spark/pull/27301#issuecomment-576564408 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117155/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
SparkQA removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576508955 **[Test build #117151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117151/testReport)** for PR 27300 at commit [`a3958db`](https://github.com/apache/spark/commit/a3958dbc69229e4b4f0af789d192b624b2acd964).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576564124 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
SparkQA removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576538216 **[Test build #117157 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117157/testReport)** for PR 27300 at commit [`f05f883`](https://github.com/apache/spark/commit/f05f8839818da79f5b6a2d67093361f35400a703).
[GitHub] [spark] SparkQA removed a comment on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference
SparkQA removed a comment on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference URL: https://github.com/apache/spark/pull/27288#issuecomment-576536447 **[Test build #117156 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117156/testReport)** for PR 27288 at commit [`d1fb7bd`](https://github.com/apache/spark/commit/d1fb7bd41c43196b23f51db0f7954df167685022).
[GitHub] [spark] AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-576564375 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class
AmplabJenkins removed a comment on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class URL: https://github.com/apache/spark/pull/27299#issuecomment-576564390 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
AmplabJenkins removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576564159 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md
AmplabJenkins removed a comment on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md URL: https://github.com/apache/spark/pull/27301#issuecomment-576564398 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
AmplabJenkins removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564253 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference
AmplabJenkins removed a comment on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference URL: https://github.com/apache/spark/pull/27288#issuecomment-576564261 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class
AmplabJenkins commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class URL: https://github.com/apache/spark/pull/27299#issuecomment-576564390 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md
AmplabJenkins commented on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md URL: https://github.com/apache/spark/pull/27301#issuecomment-576564398 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md
SparkQA removed a comment on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md URL: https://github.com/apache/spark/pull/27301#issuecomment-576536452 **[Test build #117155 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117155/testReport)** for PR 27301 at commit [`aaaf6bd`](https://github.com/apache/spark/commit/aaaf6bd894423db39db21ea454ff865e599ecb2f).
[GitHub] [spark] SparkQA removed a comment on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class
SparkQA removed a comment on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class URL: https://github.com/apache/spark/pull/27299#issuecomment-576496394 **[Test build #117145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117145/testReport)** for PR 27299 at commit [`275f8e6`](https://github.com/apache/spark/commit/275f8e6c477fc6d75be61664fca574008d19ea72).
[GitHub] [spark] AmplabJenkins commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class
AmplabJenkins commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class URL: https://github.com/apache/spark/pull/27299#issuecomment-576564399 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117145/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
SparkQA removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576559875 **[Test build #117162 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117162/testReport)** for PR 27260 at commit [`0c1650c`](https://github.com/apache/spark/commit/0c1650cc459f78425ea186b102510922ae69d092).
[GitHub] [spark] SparkQA removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
SparkQA removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576548356 **[Test build #117161 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117161/testReport)** for PR 27237 at commit [`0511b8f`](https://github.com/apache/spark/commit/0511b8fc69deaaf8d52575f570ed576756b0f8aa).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
AmplabJenkins removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564593 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
AmplabJenkins removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564269 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117157/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference
AmplabJenkins removed a comment on issue #27288: [SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference URL: https://github.com/apache/spark/pull/27288#issuecomment-576564276 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117156/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
AmplabJenkins commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564598 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117151/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-576564386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117154/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
AmplabJenkins removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576564166 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117161/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class
AmplabJenkins removed a comment on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class URL: https://github.com/apache/spark/pull/27299#issuecomment-576564399 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117145/ Test FAILed.
[GitHub] [spark] beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#discussion_r368856077
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##
@@ -148,24 +207,106 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
 val distinctAggs = exprs.flatMap { _.collect { case ae: AggregateExpression if ae.isDistinct => ae }}
-// We need at least two distinct aggregates for this rule because aggregation
-// strategy can handle a single distinct group.
+// This rule serves two purposes:
+// One is to rewrite when there exists at least two distinct aggregates. We need at least
+// two distinct aggregates for this rule because aggregation strategy can handle a single
+// distinct group.
+// Another is to expand distinct aggregates which exists filter clause so that we can
+// evaluate the filter locally.
Review comment: Thanks.
[GitHub] [spark] AmplabJenkins commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
AmplabJenkins commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564593 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-576564136 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117162/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md
AmplabJenkins removed a comment on issue #27301: [MINOR][DOCS] Fix Jenkins build image and link in README.md URL: https://github.com/apache/spark/pull/27301#issuecomment-576564408 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117155/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
AmplabJenkins removed a comment on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576564598 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117151/ Test FAILed.
[GitHub] [spark] fuwhu commented on a change in pull request #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions
fuwhu commented on a change in pull request #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions URL: https://github.com/apache/spark/pull/26805#discussion_r368856556 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/PruneHiveTablePartitionsSuite.scala ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.hive.execution + +import org.apache.spark.sql.QueryTest +import org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.catalyst.rules.RuleExecutor +import org.apache.spark.sql.hive.test.TestHiveSingleton +import org.apache.spark.sql.test.SQLTestUtils + +class PruneHiveTablePartitionsSuite extends QueryTest with SQLTestUtils with TestHiveSingleton { + + object Optimize extends RuleExecutor[LogicalPlan] { +val batches = + Batch("PruneHiveTablePartitions", Once, +EliminateSubqueryAliases, new PruneHiveTablePartitions(spark)) :: Nil + } + + test("SPARK-15616 statistics pruned after going throuhg PruneHiveTablePartitions") { +withTable("test", "temp") { + sql( +s""" + |CREATE TABLE test(i int) + |PARTITIONED BY (p int) + |STORED AS textfile""".stripMargin) + spark.range(0, 1000, 1).selectExpr("id as col") +.createOrReplaceTempView("temp") + + for (part <- Seq(1, 2, 3, 4)) { +sql(s""" + |INSERT OVERWRITE TABLE test PARTITION (p='$part') Review comment: updated to use two-space indentation like PruneFileSourcePartitionsSuite.
[GitHub] [spark] beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#discussion_r368856747

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##
@@ -331,6 +472,14 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
     }
   }

+  private def collectAggregateExprs(a: Aggregate): Seq[AggregateExpression] = {

Review comment:
OK
[GitHub] [spark] fuwhu commented on a change in pull request #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions
fuwhu commented on a change in pull request #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions URL: https://github.com/apache/spark/pull/26805#discussion_r368856739 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/PruneHiveTablePartitionsSuite.scala ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.hive.execution + +import org.apache.spark.sql.QueryTest +import org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.catalyst.rules.RuleExecutor +import org.apache.spark.sql.hive.test.TestHiveSingleton +import org.apache.spark.sql.test.SQLTestUtils + +class PruneHiveTablePartitionsSuite extends QueryTest with SQLTestUtils with TestHiveSingleton { + + object Optimize extends RuleExecutor[LogicalPlan] { +val batches = + Batch("PruneHiveTablePartitions", Once, +EliminateSubqueryAliases, new PruneHiveTablePartitions(spark)) :: Nil + } + + test("SPARK-15616 statistics pruned after going throuhg PruneHiveTablePartitions") { +withTable("test", "temp") { + sql( +s""" + |CREATE TABLE test(i int) + |PARTITIONED BY (p int) + |STORED AS textfile""".stripMargin) + spark.range(0, 1000, 1).selectExpr("id as col") +.createOrReplaceTempView("temp") + + for (part <- Seq(1, 2, 3, 4)) { +sql(s""" + |INSERT OVERWRITE TABLE test PARTITION (p='$part') + |select col from temp""".stripMargin) + } + val analyzed1 = sql("select i from test where p>0").queryExecution.analyzed + val analyzed2 = sql("select i from test where p=1").queryExecution.analyzed + assert(Optimize.execute(analyzed1).stats.sizeInBytes/4 === Review comment: updated the code style, thanks a lot. :)
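The quoted test is cut off in the middle of its assertion, so the right-hand side of the comparison is not visible here. The visible half divides the size of the `p>0` plan by 4, which suggests the idea that pruning to one of four equally sized partitions should shrink the size estimate to a quarter. A minimal sketch of that arithmetic in plain Scala follows, assuming a toy `Partition` type and an invented per-partition size; none of these names are Spark APIs.

```scala
// Toy model of partition-level statistics. `Partition` and `prunedSize` are
// hypothetical names for illustration, not Spark classes or methods.
final case class Partition(p: Int, sizeInBytes: Long)

object PruningStatsSketch {
  // Size estimate after pruning: sum the sizes of the partitions that satisfy the predicate.
  def prunedSize(parts: Seq[Partition])(keep: Int => Boolean): Long =
    parts.filter(x => keep(x.p)).map(_.sizeInBytes).sum

  def main(args: Array[String]): Unit = {
    // Four partitions with identical contents, mirroring the INSERT OVERWRITE loop.
    // 4096 bytes per partition is an invented figure.
    val parts = Seq(1, 2, 3, 4).map(p => Partition(p, 4096L))
    val all = prunedSize(parts)(_ > 0)  // predicate p > 0 keeps every partition
    val one = prunedSize(parts)(_ == 1) // predicate p = 1 keeps a single partition
    println(s"p>0: $all bytes, p=1: $one bytes")
  }
}
```

In this model the `p>0` estimate is exactly four times the `p=1` estimate only because every partition has the same size.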
[GitHub] [spark] SparkQA commented on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions
SparkQA commented on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions URL: https://github.com/apache/spark/pull/26805#issuecomment-576565376 **[Test build #117166 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117166/testReport)** for PR 26805 at commit [`b1798d5`](https://github.com/apache/spark/commit/b1798d52147b081c7073f3c096eb886a867b921d).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
dongjoon-hyun commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
URL: https://github.com/apache/spark/pull/27242#discussion_r368857201

## File path: dev/scalastyle ##
@@ -17,18 +17,10 @@
 # limitations under the License.
 #

-SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive"}
+SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"

-# NOTE: echo "q" is needed because SBT prompts the user for input on encountering a build file
-# with failure (either resolution or compilation); the "q" makes SBT quit.
-ERRORS=$(echo -e "q\n" \
-    | build/sbt \
-        ${SPARK_PROFILES} \
-        -Pdocker-integration-tests \
-        -Pkubernetes-integration-tests \
-        scalastyle test:scalastyle \
-    | awk '{if($1~/error/)print}' \
-)
+SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive -Pdocker-integration-tests -Pkubernetes-integration-tests"}
+ERRORS=$($SCRIPT_DIR/../build/mvn $SPARK_PROFILES scalastyle:check | grep "^error file")

Review comment:
I checked this. It still fails before this commit. Please try in your environment, @HyukjinKwon.

```
[error] [FATAL] Non-resolvable parent POM: Could not transfer artifact org.apache:apache:pom:18 from/to central ( http://repo.maven.apache.org/maven2): Failed to look for file: http://repo.maven.apache.org/maven2/org/apache/apache/18/apache-18.pom. Return code is: 501 and 'parent.relativePath' points at wrong local POM @ line 22, column 11
```
[GitHub] [spark] maropu opened a new pull request #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given
maropu opened a new pull request #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given
URL: https://github.com/apache/spark/pull/27305

### What changes were proposed in this pull request?

(This is another approach of #27233)

This PR intends to fix a bug in group analytical queries (e.g., GROUPING SETS) when empty input is given. For example, the query below, run on empty input, returns different answers in Spark and PostgreSQL:

```
postgres=# create table gstest_empty (a integer, b integer, v integer);
CREATE TABLE
postgres=# select a, b, sum(v), count(*) from gstest_empty group by grouping sets ((a,b),());
 a | b | sum | count
---+---+-----+-------
   |   |     |     0
(1 row)

scala> sql("""select a, b, sum(v), count(*) from gstest_empty group by grouping sets ((a,b),())""").show
+---+---+------+--------+
|  a|  b|sum(v)|count(1)|
+---+---+------+--------+
+---+---+------+--------+
```

In this case, we should follow the PostgreSQL answer. To fix this, this PR modifies the existing resolution rules (`ResolveGroupingAnalytics` and `ResolveAggregateFunctions`) to rewrite the query as a union of aggregates with and without keys, as follows:

```
scala> sql("""select a, b, sum(v), count(*) from gstest_empty group by grouping sets ((a,b),())""").explain(true)
== Analyzed Logical Plan ==
a: int, b: int, sum(v): bigint, count(1): bigint
Union
:- Aggregate [a#10, b#11, spark_grouping_id#7], [a#10, b#11, sum(cast(v#3 as bigint)) AS sum(v)#5L, count(1) AS count(1)#6L]
:  +- Expand [List(v#3, a#1, b#2, 0)], [v#3, a#10, b#11, spark_grouping_id#7]
:     +- Project [v#3, a#1, b#2]
:        +- Relation[a#1,b#2,v#3] parquet
+- Aggregate [null AS a#10, null AS b#11, sum(cast(v#3 as bigint)) AS sum(v)#5L, count(1) AS count(1)#6L]
   +- Project [v#3]
      +- Relation[a#1,b#2,v#3] parquet
```

NOTE: This PR also updates the existing test in `OptimizeMetadataOnlyQuerySuite`. That test has [the rollup query](https://github.com/apache/spark/pull/27233/files#diff-7d340edd739a1f59d51d5228113fd9edL120), and this PR transforms it into a `union(aggregate with keys, aggregate without keys)` form. The test checks that no metadata-only query happens there, but the `aggregate without keys` side can be optimized by the `OptimizeMetadataOnlyQuery` rule, so `testNotMetadataOnly()` fails. To avoid this, this PR replaces the rollup query with a grouping-set one.

### Why are the changes needed?

For the correct SQL semantics.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

UTs added.
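The union rewrite described above can be sketched without Spark. The following plain-Scala model is only an illustration of the semantics under discussion, not the PR's implementation: `Row`, `Out`, `withKeys`, `withoutKeys`, and `groupingSets` are hypothetical names, and `Option` stands in for SQL NULL.

```scala
// Plain-Scala model (no Spark dependency) of
//   SELECT a, b, sum(v), count(*) FROM t GROUP BY GROUPING SETS ((a, b), ())
// All names are illustrative assumptions, not Spark or Catalyst APIs.
final case class Row(a: Int, b: Int, v: Int)
final case class Out(a: Option[Int], b: Option[Int], sum: Option[Long], count: Long)

object GroupingSetsSketch {
  // Aggregate WITH keys: one output row per distinct (a, b); no rows for empty input.
  def withKeys(rows: Seq[Row]): Seq[Out] =
    rows.groupBy(r => (r.a, r.b)).toSeq.sortBy(_._1).map { case ((a, b), group) =>
      Out(Some(a), Some(b), Some(group.map(_.v.toLong).sum), group.size.toLong)
    }

  // Aggregate WITHOUT keys: always exactly one row.
  // Over zero rows, sum is NULL (None) and count is 0.
  def withoutKeys(rows: Seq[Row]): Out =
    Out(None, None, if (rows.isEmpty) None else Some(rows.map(_.v.toLong).sum), rows.size.toLong)

  // The union form: keyed aggregate followed by the global aggregate.
  def groupingSets(rows: Seq[Row]): Seq[Out] = withKeys(rows) :+ withoutKeys(rows)

  def main(args: Array[String]): Unit = {
    println(groupingSets(Seq.empty))
    println(groupingSets(Seq(Row(1, 2, 10), Row(1, 2, 5))))
  }
}
```

For empty input the keyed side contributes nothing and the keyless side contributes the single all-NULL row with count 0, which is the shape of the PostgreSQL answer quoted above.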
[GitHub] [spark] AmplabJenkins commented on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions
AmplabJenkins commented on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions URL: https://github.com/apache/spark/pull/26805#issuecomment-576565789 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21930/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions
AmplabJenkins commented on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions URL: https://github.com/apache/spark/pull/26805#issuecomment-576565783 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions
AmplabJenkins removed a comment on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions URL: https://github.com/apache/spark/pull/26805#issuecomment-576565789 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21930/ Test PASSed.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
dongjoon-hyun commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
URL: https://github.com/apache/spark/pull/27242#discussion_r368857201

## File path: dev/scalastyle ##
@@ -17,18 +17,10 @@
 # limitations under the License.
 #

-SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive"}
+SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"

-# NOTE: echo "q" is needed because SBT prompts the user for input on encountering a build file
-# with failure (either resolution or compilation); the "q" makes SBT quit.
-ERRORS=$(echo -e "q\n" \
-    | build/sbt \
-        ${SPARK_PROFILES} \
-        -Pdocker-integration-tests \
-        -Pkubernetes-integration-tests \
-        scalastyle test:scalastyle \
-    | awk '{if($1~/error/)print}' \
-)
+SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive -Pdocker-integration-tests -Pkubernetes-integration-tests"}
+ERRORS=$($SCRIPT_DIR/../build/mvn $SPARK_PROFILES scalastyle:check | grep "^error file")

Review comment:
I checked this. It still fails before this commit. Please try checking out the old commit in your environment, @HyukjinKwon.

```
[error] [FATAL] Non-resolvable parent POM: Could not transfer artifact org.apache:apache:pom:18 from/to central ( http://repo.maven.apache.org/maven2): Failed to look for file: http://repo.maven.apache.org/maven2/org/apache/apache/18/apache-18.pom. Return code is: 501 and 'parent.relativePath' points at wrong local POM @ line 22, column 11
```
[GitHub] [spark] AmplabJenkins removed a comment on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions
AmplabJenkins removed a comment on issue #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions URL: https://github.com/apache/spark/pull/26805#issuecomment-576565783 Merged build finished. Test PASSed.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
dongjoon-hyun commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
URL: https://github.com/apache/spark/pull/27242#discussion_r368857201

## File path: dev/scalastyle ##
@@ -17,18 +17,10 @@
 # limitations under the License.
 #

-SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive"}
+SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"

-# NOTE: echo "q" is needed because SBT prompts the user for input on encountering a build file
-# with failure (either resolution or compilation); the "q" makes SBT quit.
-ERRORS=$(echo -e "q\n" \
-    | build/sbt \
-        ${SPARK_PROFILES} \
-        -Pdocker-integration-tests \
-        -Pkubernetes-integration-tests \
-        scalastyle test:scalastyle \
-    | awk '{if($1~/error/)print}' \
-)
+SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive -Pdocker-integration-tests -Pkubernetes-integration-tests"}
+ERRORS=$($SCRIPT_DIR/../build/mvn $SPARK_PROFILES scalastyle:check | grep "^error file")

Review comment:
I checked this with dca838058ffd0e2c01591fd9ab0f192de446d606 again. It still fails before this commit. Please try checking out the old commit in your environment, @HyukjinKwon.

```
[error] [FATAL] Non-resolvable parent POM: Could not transfer artifact org.apache:apache:pom:18 from/to central ( http://repo.maven.apache.org/maven2): Failed to look for file: http://repo.maven.apache.org/maven2/org/apache/apache/18/apache-18.pom. Return code is: 501 and 'parent.relativePath' points at wrong local POM @ line 22, column 11
```
[GitHub] [spark] maropu commented on issue #27233: [SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given
maropu commented on issue #27233: [SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given URL: https://github.com/apache/spark/pull/27233#issuecomment-576566110 Yea, I'm ok with that. Still WIP though, I opened a new PR for another approach in #27305.
[GitHub] [spark] beliefer commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
beliefer commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576567129 retest this please
[GitHub] [spark] SparkQA commented on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given
SparkQA commented on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given URL: https://github.com/apache/spark/pull/27305#issuecomment-576567868 **[Test build #117167 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117167/testReport)** for PR 27305 at commit [`a3e8d89`](https://github.com/apache/spark/commit/a3e8d8982565794e4c5f2434751d15967d1d21b0).
[GitHub] [spark] SparkQA commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
SparkQA commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576567874 **[Test build #117168 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117168/testReport)** for PR 27237 at commit [`0511b8f`](https://github.com/apache/spark/commit/0511b8fc69deaaf8d52575f570ed576756b0f8aa).
[GitHub] [spark] AmplabJenkins commented on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given
AmplabJenkins commented on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given URL: https://github.com/apache/spark/pull/27305#issuecomment-576568326 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21931/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
AmplabJenkins commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576568352 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
AmplabJenkins commented on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576568363 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21932/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given
AmplabJenkins commented on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given URL: https://github.com/apache/spark/pull/27305#issuecomment-576568316 Merged build finished. Test PASSed.
[GitHub] [spark] beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#discussion_r368859956
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##
@@ -148,24 +207,106 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
     val distinctAggs = exprs.flatMap { _.collect {
       case ae: AggregateExpression if ae.isDistinct => ae
     }}
-    // We need at least two distinct aggregates for this rule because aggregation
-    // strategy can handle a single distinct group.
+    // This rule serves two purposes:
+    // One is to rewrite when there exists at least two distinct aggregates. We need at least
+    // two distinct aggregates for this rule because aggregation strategy can handle a single
+    // distinct group.
+    // Another is to expand distinct aggregates which exists filter clause so that we can
+    // evaluate the filter locally.
     // This check can produce false-positives, e.g., SUM(DISTINCT a) & COUNT(DISTINCT a).
-    distinctAggs.size > 1
+    distinctAggs.size >= 1 || distinctAggs.exists(_.filter.isDefined)
   }

   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) => rewrite(a)
+    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) =>
+      val expandAggregate = extractFiltersInDistinctAggregate(a)
+      rewriteDistinctAggregate(expandAggregate)
   }

-  def rewrite(a: Aggregate): Aggregate = {
+  private def extractFiltersInDistinctAggregate(a: Aggregate): Aggregate = {
Review comment:
  For the first suggestion, do you mean I should add a new API to the dsl? The dsl can't support the filter clause yet. For the second suggestion, OK.
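The changed trigger condition quoted in the review above can be illustrated with a minimal sketch. `AggSketch` and `MayNeedToRewriteSketch` are hypothetical stand-ins invented for this illustration, not Catalyst classes; only the two boolean expressions are taken from the quoted diff.

```scala
// Hypothetical stand-in for an aggregate expression: only the two fields the check reads.
final case class AggSketch(isDistinct: Boolean, filter: Option[String])

object MayNeedToRewriteSketch {
  // Condition on the removed line of the diff: at least two distinct aggregates.
  def before(aggs: Seq[AggSketch]): Boolean =
    aggs.count(_.isDistinct) > 1

  // Condition on the added line of the diff, copied as written.
  def after(aggs: Seq[AggSketch]): Boolean = {
    val distinctAggs = aggs.filter(_.isDistinct)
    distinctAggs.size >= 1 || distinctAggs.exists(_.filter.isDefined)
  }

  // A single COUNT(DISTINCT a) FILTER (WHERE id > 1), modelled with the stand-in type.
  val singleFilteredDistinct: Seq[AggSketch] = Seq(AggSketch(isDistinct = true, filter = Some("id > 1")))
}
```

Under these assumptions, a lone filtered distinct aggregate does not trigger the old condition but does trigger the new one. Note that, as written in the diff, the second disjunct can only hold when the first already does, since both are evaluated on the same `distinctAggs` collection.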
[GitHub] [spark] AmplabJenkins removed a comment on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given
AmplabJenkins removed a comment on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given URL: https://github.com/apache/spark/pull/27305#issuecomment-576568316 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given
AmplabJenkins removed a comment on issue #27305: [WIP][SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given URL: https://github.com/apache/spark/pull/27305#issuecomment-576568326 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21931/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
AmplabJenkins removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576568363 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21932/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression
AmplabJenkins removed a comment on issue #27237: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression URL: https://github.com/apache/spark/pull/27237#issuecomment-576568352 Merged build finished. Test PASSed.
[GitHub] [spark] beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#discussion_r368860174
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##
@@ -148,24 +207,106 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
     val distinctAggs = exprs.flatMap { _.collect {
       case ae: AggregateExpression if ae.isDistinct => ae
     }}
-    // We need at least two distinct aggregates for this rule because aggregation
-    // strategy can handle a single distinct group.
+    // This rule serves two purposes:
+    // One is to rewrite when there exists at least two distinct aggregates. We need at least
+    // two distinct aggregates for this rule because aggregation strategy can handle a single
+    // distinct group.
+    // Another is to expand distinct aggregates which exists filter clause so that we can
+    // evaluate the filter locally.
     // This check can produce false-positives, e.g., SUM(DISTINCT a) & COUNT(DISTINCT a).
-    distinctAggs.size > 1
+    distinctAggs.size >= 1 || distinctAggs.exists(_.filter.isDefined)
   }

   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) => rewrite(a)
+    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) =>
+      val expandAggregate = extractFiltersInDistinctAggregate(a)
+      rewriteDistinctAggregate(expandAggregate)
   }

-  def rewrite(a: Aggregate): Aggregate = {
+  private def extractFiltersInDistinctAggregate(a: Aggregate): Aggregate = {
+    val aggExpressions = collectAggregateExprs(a)
+    val (distinctAggExpressions, regularAggExpressions) = aggExpressions.partition(_.isDistinct)
+    if (distinctAggExpressions.exists(_.filter.isDefined)) {
+      // Setup expand for the 'regular' aggregate expressions. Because we will construct a new
+      // aggregate, the children of the distinct aggregates will be changed to the generate
+      // ones, so we need creates new references to avoid collisions between distinct and
+      // regular aggregate children.
+      val regularAggExprs = regularAggExpressions.filter(_.children.exists(!_.foldable))
+      val regularFunChildren = regularAggExprs
+        .flatMap(_.aggregateFunction.children.filter(!_.foldable))
+      val regularFilterAttrs = regularAggExprs.flatMap(_.filterAttributes)
+      val regularAggChildren = (regularFunChildren ++ regularFilterAttrs).distinct
+      val regularAggChildAttrMap = regularAggChildren.map(expressionAttributePair)
+      val regularAggChildAttrLookup = regularAggChildAttrMap.toMap
+      val regularAggMap = regularAggExprs.map {
+        case ae @ AggregateExpression(af, _, _, filter, _) =>
+          val newChildren = af.children.map(c => regularAggChildAttrLookup.getOrElse(c, c))
+          val raf = af.withNewChildren(newChildren).asInstanceOf[AggregateFunction]
+          val filterOpt = filter.map(_.transform {
+            case a: Attribute => regularAggChildAttrLookup.getOrElse(a, a)
+          })
+          val aggExpr = ae.copy(aggregateFunction = raf, filter = filterOpt)
+          (ae, aggExpr)
+      }

-    // Collect all aggregate expressions.
-    val aggExpressions = a.aggregateExpressions.flatMap { e =>
-      e.collect {
-        case ae: AggregateExpression => ae
+      // Setup expand for the distinct aggregate expressions.
+      val distinctAggExprs = distinctAggExpressions.filter(e => e.children.exists(!_.foldable))
+      val (projections, expressionAttrs, aggExprPairs) = distinctAggExprs.map {
+        case ae @ AggregateExpression(af, _, _, filter, _) =>
+          // Why do we need to construct the `exprId` ?
+          // First, In order to reduce costs, it is better to handle the filter clause locally.
+          // e.g. COUNT (DISTINCT a) FILTER (WHERE id > 1), evaluate expression
+          // If(id > 1) 'a else null first, and use the result as output.
+          // Second, If at least two DISTINCT aggregate expression which may references the
+          // same attributes. We need to construct the generate attributes so as the output not
+          // lost. e.g. SUM (DISTINCT a), COUNT (DISTINCT a) FILTER (WHERE id > 1) will output
+          // attribute '_gen_distinct-1 and attribute '_gen_distinct-2 instead of two 'a.
+          // Note: We just need to illusion the expression with filter clause.
+          // The illusionary mechanism may result in multiple distinct aggregations uses
+          // different column, so we still need to call `rewrite`.
+          val exprId = NamedExpression.newExprId.id
+          val unfoldableChildren = af.children.filter(!_.foldable)
+          val exprAttrs = unfoldableChildren.map { e =>
+            (e, AttributeReference(s"_gen_distinct_$exprId", e.dataType,
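The code comment quoted above says that `COUNT (DISTINCT a) FILTER (WHERE id > 1)` can be handled by first evaluating `If(id > 1) 'a else null` and then aggregating the result. The following is a minimal sketch of why the two forms agree, using plain Scala collections rather than Catalyst. `Row`, `FilteredDistinctSketch`, and the sample data are hypothetical names introduced only for this illustration, and SQL NULL is modelled as `None`.

```scala
// Hypothetical row type used only for this illustration; `a` is nullable.
final case class Row(id: Int, a: Option[Int])

object FilteredDistinctSketch {
  // COUNT(DISTINCT a) FILTER (WHERE id > 1): drop rows failing the filter,
  // then count the distinct non-null values of `a`.
  def countDistinctWithFilter(rows: Seq[Row]): Int =
    rows.filter(_.id > 1).flatMap(_.a).distinct.size

  // Rewritten form: project `if (id > 1) a else null` for every row first,
  // then run a plain COUNT(DISTINCT ...), which ignores nulls.
  def countDistinctAfterProjection(rows: Seq[Row]): Int = {
    val projected: Seq[Option[Int]] = rows.map(r => if (r.id > 1) r.a else None)
    projected.flatten.distinct.size
  }

  // Made-up sample data: one row fails the filter, one row has a null `a`.
  val sample: Seq[Row] = Seq(
    Row(1, Some(10)),
    Row(2, Some(10)),
    Row(3, Some(20)),
    Row(4, Some(10)),
    Row(5, None)
  )
}
```

Both functions return the same count on the sample because a row rejected by the filter contributes a null after projection, and nulls are skipped by the distinct count.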
[GitHub] [spark] beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#discussion_r368860880
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##
@@ -148,24 +207,106 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
     val distinctAggs = exprs.flatMap { _.collect {
       case ae: AggregateExpression if ae.isDistinct => ae
     }}
-    // We need at least two distinct aggregates for this rule because aggregation
-    // strategy can handle a single distinct group.
+    // This rule serves two purposes:
+    // One is to rewrite when there exists at least two distinct aggregates. We need at least
+    // two distinct aggregates for this rule because aggregation strategy can handle a single
+    // distinct group.
+    // Another is to expand distinct aggregates which exists filter clause so that we can
+    // evaluate the filter locally.
     // This check can produce false-positives, e.g., SUM(DISTINCT a) & COUNT(DISTINCT a).
-    distinctAggs.size > 1
+    distinctAggs.size >= 1 || distinctAggs.exists(_.filter.isDefined)
   }

   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) => rewrite(a)
+    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) =>
+      val expandAggregate = extractFiltersInDistinctAggregate(a)
+      rewriteDistinctAggregate(expandAggregate)
   }

-  def rewrite(a: Aggregate): Aggregate = {
+  private def extractFiltersInDistinctAggregate(a: Aggregate): Aggregate = {
+    val aggExpressions = collectAggregateExprs(a)
+    val (distinctAggExpressions, regularAggExpressions) = aggExpressions.partition(_.isDistinct)
+    if (distinctAggExpressions.exists(_.filter.isDefined)) {
+      // Setup expand for the 'regular' aggregate expressions. Because we will construct a new
+      // aggregate, the children of the distinct aggregates will be changed to the generate
+      // ones, so we need creates new references to avoid collisions between distinct and
+      // regular aggregate children.
+      val regularAggExprs = regularAggExpressions.filter(_.children.exists(!_.foldable))
+      val regularFunChildren = regularAggExprs
+        .flatMap(_.aggregateFunction.children.filter(!_.foldable))
+      val regularFilterAttrs = regularAggExprs.flatMap(_.filterAttributes)
+      val regularAggChildren = (regularFunChildren ++ regularFilterAttrs).distinct
+      val regularAggChildAttrMap = regularAggChildren.map(expressionAttributePair)
+      val regularAggChildAttrLookup = regularAggChildAttrMap.toMap
+      val regularAggMap = regularAggExprs.map {
Review comment:
  OK
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
HyukjinKwon commented on a change in pull request #27242: [SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
URL: https://github.com/apache/spark/pull/27242#discussion_r368862220
## File path: dev/scalastyle ##
@@ -17,18 +17,10 @@
 # limitations under the License.
 #

-SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive"}
+SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"

-# NOTE: echo "q" is needed because SBT prompts the user for input on encountering a build file
-# with failure (either resolution or compilation); the "q" makes SBT quit.
-ERRORS=$(echo -e "q\n" \
-    | build/sbt \
-        ${SPARK_PROFILES} \
-        -Pdocker-integration-tests \
-        -Pkubernetes-integration-tests \
-        scalastyle test:scalastyle \
-    | awk '{if($1~/error/)print}' \
-)
+SPARK_PROFILES=${1:-"-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive -Pdocker-integration-tests -Pkubernetes-integration-tests"}
+ERRORS=$($SCRIPT_DIR/../build/mvn $SPARK_PROFILES scalastyle:check | grep "^error file")
Review comment:
  Thanks for the pointer, @dongjoon-hyun!
[GitHub] [spark] beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#discussion_r368863640
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##
@@ -148,24 +207,106 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
     val distinctAggs = exprs.flatMap { _.collect {
       case ae: AggregateExpression if ae.isDistinct => ae
     }}
-    // We need at least two distinct aggregates for this rule because aggregation
-    // strategy can handle a single distinct group.
+    // This rule serves two purposes:
+    // One is to rewrite when there exists at least two distinct aggregates. We need at least
+    // two distinct aggregates for this rule because aggregation strategy can handle a single
+    // distinct group.
+    // Another is to expand distinct aggregates which exists filter clause so that we can
+    // evaluate the filter locally.
     // This check can produce false-positives, e.g., SUM(DISTINCT a) & COUNT(DISTINCT a).
-    distinctAggs.size > 1
+    distinctAggs.size >= 1 || distinctAggs.exists(_.filter.isDefined)
   }

   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) => rewrite(a)
+    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) =>
+      val expandAggregate = extractFiltersInDistinctAggregate(a)
+      rewriteDistinctAggregate(expandAggregate)
   }

-  def rewrite(a: Aggregate): Aggregate = {
+  private def extractFiltersInDistinctAggregate(a: Aggregate): Aggregate = {
+    val aggExpressions = collectAggregateExprs(a)
+    val (distinctAggExpressions, regularAggExpressions) = aggExpressions.partition(_.isDistinct)
+    if (distinctAggExpressions.exists(_.filter.isDefined)) {
+      // Setup expand for the 'regular' aggregate expressions. Because we will construct a new
Review comment:
  Thanks.
[GitHub] [spark] yaooqinn commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class
yaooqinn commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class URL: https://github.com/apache/spark/pull/27299#issuecomment-576572236 retest this please
[GitHub] [spark] Ngone51 commented on issue #27302: [SPARK-30506][SQL][DOC] Document for generic file source options/configs
Ngone51 commented on issue #27302: [SPARK-30506][SQL][DOC] Document for generic file source options/configs URL: https://github.com/apache/spark/pull/27302#issuecomment-576572256 > Could you make a link from sql-data-sources-avro.md for pathGlobFilter to docs/sql-data-sources-generic-options.md? @MaxGekk Sure, I will.
[GitHub] [spark] yaooqinn commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces
yaooqinn commented on issue #27300: [SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namespaces URL: https://github.com/apache/spark/pull/27300#issuecomment-576572383 retest this please
[GitHub] [spark] SparkQA commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class
SparkQA commented on issue #27299: [SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInterval class URL: https://github.com/apache/spark/pull/27299#issuecomment-576573058 **[Test build #117170 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117170/testReport)** for PR 27299 at commit [`275f8e6`](https://github.com/apache/spark/commit/275f8e6c477fc6d75be61664fca574008d19ea72).