[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
dongjoon-hyun commented on a change in pull request #32577:
URL: https://github.com/apache/spark/pull/32577#discussion_r634064688

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ##
@@ -544,7 +544,14 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   require(hset != null, "hset could not be null")

-  override def toString: String = s"$child INSET ${hset.mkString("(", ",", ")")}"
+  override def toString: String = {
+    val listString = hset.toSeq
+      .map(elem => Literal(elem, child.dataType).toString)
+      // Sort elements for deterministic behaviours
+      .sorted

Review comment:
Got it. Thank you for checking.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #32576: [SPARK-35429][CORE] Remove commons-httpclient due to EOL and CVEs
dongjoon-hyun commented on pull request #32576: URL: https://github.com/apache/spark/pull/32576#issuecomment-842864930 Thank you for the confirmation. I'll close this according to his advice and history.
[GitHub] [spark] dongjoon-hyun closed pull request #32576: [SPARK-35429][CORE] Remove commons-httpclient due to EOL and CVEs
dongjoon-hyun closed pull request #32576: URL: https://github.com/apache/spark/pull/32576
[GitHub] [spark] dongjoon-hyun commented on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
dongjoon-hyun commented on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842864354 Thank you for the investigation. +1 for the pinning idea.
[GitHub] [spark] sarutak edited a comment on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
sarutak edited a comment on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842860367
@dongjoon-hyun The Dockerfile in `branch-3.0` installs the following packages without specifying a Jinja2 version, yet `make html` under `python/doc` succeeds. Whether the Jinja2 version matters may depend on the version of `Sphinx`.
* sphinx==2.3.1
* mkdocs==1.0.4
* numpy==1.18.1

But it might be good to pin the version of Jinja2 for `branch-3.0` too, just in case. What do you think?
[GitHub] [spark] sarutak edited a comment on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
sarutak edited a comment on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842860367
@dongjoon-hyun The Dockerfile in `branch-3.0` installs the following packages without specifying a Jinja2 version, yet `make html` under `python/doc` succeeds. Whether the Jinja2 version matters may depend on the version of `Sphinx`.
* sphinx==2.3.1
* mkdocs==1.0.4
* numpy==1.18.1

But it might be good to pin the version of Jinja2 for `branch-3.0` too, just in case. What do you think?
[GitHub] [spark] dongjoon-hyun commented on pull request #32578: [SPARK-35431][SQL][TESTS] Sort elements generated by collect_set in SQLQueryTestSuite
dongjoon-hyun commented on pull request #32578: URL: https://github.com/apache/spark/pull/32578#issuecomment-842863780 Merged to master. Thank you, @maropu and @HyukjinKwon .
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
AmplabJenkins removed a comment on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842859767 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138653/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
AmplabJenkins removed a comment on pull request #32577: URL: https://github.com/apache/spark/pull/32577#issuecomment-842859768 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43181/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
AmplabJenkins removed a comment on pull request #32553: URL: https://github.com/apache/spark/pull/32553#issuecomment-842859770 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138655/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
AmplabJenkins removed a comment on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842859771 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138654/
[GitHub] [spark] SparkQA removed a comment on pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
SparkQA removed a comment on pull request #32553: URL: https://github.com/apache/spark/pull/32553#issuecomment-842744214 **[Test build #138655 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138655/testReport)** for PR 32553 at commit [`94fd2d0`](https://github.com/apache/spark/commit/94fd2d094866ddf4e28bffaa5c8c383008a9a0d9).
[GitHub] [spark] dongjoon-hyun closed pull request #32578: [SPARK-35431][SQL][TESTS] Sort elements generated by collect_set in SQLQueryTestSuite
dongjoon-hyun closed pull request #32578: URL: https://github.com/apache/spark/pull/32578
[GitHub] [spark] SparkQA removed a comment on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
SparkQA removed a comment on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842738920 **[Test build #138653 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138653/testReport)** for PR 32494 at commit [`74d73a3`](https://github.com/apache/spark/commit/74d73a3dab2804dfbd85f0c3ce8818ee5bb87780).
[GitHub] [spark] sarutak commented on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
sarutak commented on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842860367
@dongjoon-hyun The Dockerfile in `branch-3.0` installs the following packages without specifying a Jinja2 version, yet `make html` under `python/doc` succeeds. Whether the Jinja2 version matters may depend on the version of `Sphinx`.
* sphinx==2.3.1
* mkdocs==1.0.4
* numpy==1.18.1

But it might be good to pin the version of Jinja2 for `branch-3.0` too, just in case. What do you think?
[GitHub] [spark] AmplabJenkins commented on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
AmplabJenkins commented on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842859767 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138653/
[GitHub] [spark] AmplabJenkins commented on pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
AmplabJenkins commented on pull request #32553: URL: https://github.com/apache/spark/pull/32553#issuecomment-842859770 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138655/
[GitHub] [spark] AmplabJenkins commented on pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
AmplabJenkins commented on pull request #32577: URL: https://github.com/apache/spark/pull/32577#issuecomment-842859768 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43181/
[GitHub] [spark] AmplabJenkins commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
AmplabJenkins commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842859771 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138654/
[GitHub] [spark] SparkQA commented on pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page
SparkQA commented on pull request #32204: URL: https://github.com/apache/spark/pull/32204#issuecomment-842859104 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43186/
[GitHub] [spark] SparkQA commented on pull request #32578: [SPARK-35431][SQL][TESTS] Sort elements generated by collect_set in SQLQueryTestSuite
SparkQA commented on pull request #32578: URL: https://github.com/apache/spark/pull/32578#issuecomment-842858762 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43184/
[GitHub] [spark] wangyum commented on pull request #32576: [SPARK-35429][CORE] Remove commons-httpclient due to EOL and CVEs
wangyum commented on pull request #32576: URL: https://github.com/apache/spark/pull/32576#issuecomment-842858756 We tried to remove commons-httpclient before: https://github.com/apache/spark/pull/31528
[GitHub] [spark] SparkQA commented on pull request #32576: [SPARK-35429][CORE] Remove commons-httpclient due to EOL and CVEs
SparkQA commented on pull request #32576: URL: https://github.com/apache/spark/pull/32576#issuecomment-842855336 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43185/
[GitHub] [spark] sarutak edited a comment on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
sarutak edited a comment on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842854364
> Oh, what I was focused on was the Dockerfile change.

Yes, I understand. I'll check whether the Dockerfile change is necessary for `branch-3.0`, even though #32509 was not necessary there.
[GitHub] [spark] sarutak commented on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
sarutak commented on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842854364
> Oh, what I was focused on was the Dockerfile change.

Yes, I understand. I'll check whether the Dockerfile change is necessary for `branch-3.0`, even though #32509 was not necessary there.
[GitHub] [spark] dongjoon-hyun commented on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
dongjoon-hyun commented on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842851516 Oh, what I was focused on was the `Dockerfile` change.
[GitHub] [spark] SparkQA commented on pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
SparkQA commented on pull request #32553: URL: https://github.com/apache/spark/pull/32553#issuecomment-842848579
**[Test build #138655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138655/testReport)** for PR 32553 at commit [`94fd2d0`](https://github.com/apache/spark/commit/94fd2d094866ddf4e28bffaa5c8c383008a9a0d9).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] maropu commented on pull request #32578: [SPARK-35431][SQL][TESTS] Sort elements generated by collect_set in SQLQueryTestSuite
maropu commented on pull request #32578: URL: https://github.com/apache/spark/pull/32578#issuecomment-842845175
> Oh, @maropu . Could you use a new JIRA ID?

Okay, updated.
[GitHub] [spark] SparkQA commented on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
SparkQA commented on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842841164
**[Test build #138653 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138653/testReport)** for PR 32494 at commit [`74d73a3`](https://github.com/apache/spark/commit/74d73a3dab2804dfbd85f0c3ce8818ee5bb87780).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
SparkQA commented on pull request #32577: URL: https://github.com/apache/spark/pull/32577#issuecomment-842839451 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43181/
[GitHub] [spark] sarutak commented on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
sarutak commented on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842837121 @dongjoon-hyun #32509 was merged to `master` and `branch-3.1`. But I'll check whether we need to merge this to `branch-3.0` too.
[GitHub] [spark] SparkQA removed a comment on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
SparkQA removed a comment on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842742900 **[Test build #138654 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138654/testReport)** for PR 32546 at commit [`3740c52`](https://github.com/apache/spark/commit/3740c52865a867e7b3e6904a0fefc8304870438b).
[GitHub] [spark] maropu commented on a change in pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
maropu commented on a change in pull request #32577:
URL: https://github.com/apache/spark/pull/32577#discussion_r634040501

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ##
@@ -544,7 +544,14 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   require(hset != null, "hset could not be null")

-  override def toString: String = s"$child INSET ${hset.mkString("(", ",", ")")}"
+  override def toString: String = {
+    val listString = hset.toSeq
+      .map(elem => Literal(elem, child.dataType).toString)
+      // Sort elements for deterministic behaviours
+      .sorted

Review comment:
We cannot because `Seq[Any]` cannot be sorted;
```
[error] /Users/maropu/Repositories/spark/spark-master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala:648:8: No implicit Ordering defined for Any.
[error] .sorted
```
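The point in the review comment above can be illustrated with a small standalone sketch. It is not Spark's `InSet` code: `InSetToStringSketch` and `render` are hypothetical names, and plain string interpolation stands in for `Literal(elem, child.dataType).toString`. It only shows why the elements are converted to `String` before `.sorted` is called: Scala provides an implicit `Ordering[String]` but none for `Any`.

```scala
object InSetToStringSketch {
  // Hypothetical helper (not part of Spark): renders a Set[Any] in a
  // deterministic order, independent of the set's iteration order.
  def render(child: String, hset: Set[Any]): String = {
    val listString = hset.toSeq
      // Stand-in for Literal(elem, child.dataType).toString in the PR.
      .map(elem => s"$elem")
      // Compiles because an implicit Ordering[String] exists;
      // calling .sorted on the Seq[Any] directly would not compile.
      .sorted
      .mkString("(", ",", ")")
    s"$child INSET $listString"
  }

  def main(args: Array[String]): Unit = {
    println(render("a", Set[Any](3, 1, 2))) // prints: a INSET (1,2,3)
  }
}
```

Note that the resulting order is lexicographic on the string form (for example, "10" sorts before "2"), which is enough for stable output but is not a numeric ordering.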
[GitHub] [spark] SparkQA commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
SparkQA commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842834872
**[Test build #138654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138654/testReport)** for PR 32546 at commit [`3740c52`](https://github.com/apache/spark/commit/3740c52865a867e7b3e6904a0fefc8304870438b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class UpdatingSessionsExec(`
  * `class UpdatingSessionsIterator(`
[GitHub] [spark] cloud-fan commented on a change in pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
cloud-fan commented on a change in pull request #32553:
URL: https://github.com/apache/spark/pull/32553#discussion_r634039384

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ##
@@ -50,7 +50,14 @@ trait InvokeLike extends Expression with NonSQLExpression {
   def propagateNull: Boolean

-  protected lazy val needNullCheck: Boolean = propagateNull && arguments.exists(_.nullable)
+  def propagateNullForPrimitive: Boolean

Review comment:
yea, I agree. If users want to handle null by themselves, they should use boxed primitive types as the UDF parameter type.
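The distinction drawn in the review comment above, between primitive and boxed parameter types, can be sketched in plain Scala without any Spark dependency. `BoxedNullSketch`, `addPrimitive`, and `addBoxed` are hypothetical names chosen for illustration; this is not the V2 `ScalarFunction` API, only the underlying language-level difference.

```scala
object BoxedNullSketch {
  // Primitive parameters cannot hold null, so a caller has to decide what
  // to do with a null input before this function is ever invoked.
  def addPrimitive(a: Int, b: Int): Int = a + b

  // Boxed parameters can hold null, so the function itself can decide how
  // a null input is handled (here: the result is null).
  def addBoxed(a: java.lang.Integer, b: java.lang.Integer): java.lang.Integer =
    if (a == null || b == null) null
    else Integer.valueOf(a.intValue + b.intValue)

  def main(args: Array[String]): Unit = {
    println(addPrimitive(2, 3))
    println(addBoxed(2, 3))
    println(addBoxed(null, 3))
  }
}
```

With the primitive signature, null handling has to happen outside the function; with the boxed signature, it is the function author's responsibility.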
[GitHub] [spark] maropu commented on pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
maropu commented on pull request #32577: URL: https://github.com/apache/spark/pull/32577#issuecomment-842829444
> BTW, @maropu . I'm just wondering if there is any breaking change from the user's perspective.

Yes, you're right. I updated the `Does this PR introduce any user-facing change?` section in the PR description (No -> Yes). This fix mostly affects explain output strings, so I believe it will rarely break users' applications.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
AmplabJenkins removed a comment on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842821200 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43182/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
AmplabJenkins removed a comment on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842821201 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138657/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32576: [SPARK-35429][CORE] Remove commons-httpclient due to EOL and CVEs
AmplabJenkins removed a comment on pull request #32576: URL: https://github.com/apache/spark/pull/32576#issuecomment-842690102 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA removed a comment on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
SparkQA removed a comment on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842760352 **[Test build #138657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138657/testReport)** for PR 32573 at commit [`db1257d`](https://github.com/apache/spark/commit/db1257dce192e0d4d2e39e83ddfe57d0c727017c).
[GitHub] [spark] SparkQA commented on pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page
SparkQA commented on pull request #32204: URL: https://github.com/apache/spark/pull/32204#issuecomment-842821977 **[Test build #138665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138665/testReport)** for PR 32204 at commit [`52b6ba8`](https://github.com/apache/spark/commit/52b6ba8994747ec6132bb2cb40307cdf09aaa88f).
[GitHub] [spark] SparkQA commented on pull request #32578: [SPARK-35422][SQL][TESTS] Sort elements generated by collect_set in SQLQueryTestSuite
SparkQA commented on pull request #32578: URL: https://github.com/apache/spark/pull/32578#issuecomment-842821760 **[Test build #138663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138663/testReport)** for PR 32578 at commit [`3a8ce14`](https://github.com/apache/spark/commit/3a8ce14f368b278c9461f01b8e8fc70051fba272).
[GitHub] [spark] SparkQA commented on pull request #32576: [SPARK-35429][CORE] Remove commons-httpclient due to EOL and CVEs
SparkQA commented on pull request #32576: URL: https://github.com/apache/spark/pull/32576#issuecomment-842821776 **[Test build #138664 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138664/testReport)** for PR 32576 at commit [`5a5fc37`](https://github.com/apache/spark/commit/5a5fc3779a1b34f60faa6923ce65200dd51bdb29).
[GitHub] [spark] dongjoon-hyun commented on pull request #32578: [SPARK-35422][SQL][TESTS] Sort elements generated by collect_set in SQLQueryTestSuite
dongjoon-hyun commented on pull request #32578: URL: https://github.com/apache/spark/pull/32578#issuecomment-842821361 Oh, @maropu . Could you use a new JIRA ID?
[GitHub] [spark] AmplabJenkins commented on pull request #32573: [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
AmplabJenkins commented on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842821201 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138657/
[GitHub] [spark] AmplabJenkins commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
AmplabJenkins commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842821200 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43182/
[GitHub] [spark] maropu commented on pull request #32578: [SPARK-35422][SQL][TESTS] Sort elements generated by collect_set in SQLQueryTestSuite
maropu commented on pull request #32578: URL: https://github.com/apache/spark/pull/32578#issuecomment-842820689 cc: @dongjoon-hyun
[GitHub] [spark] maropu opened a new pull request #32578: [SPARK-35422][SQL][TESTS] Sort elements generated by collect_set in SQLQueryTestSuite
maropu opened a new pull request #32578:
URL: https://github.com/apache/spark/pull/32578

### What changes were proposed in this pull request?

To pass `subquery/scalar-subquery/scalar-subquery-select.sql` (`SQLQueryTestSuite`) in Scala v2.13, this PR proposes to change the aggregate expr of a test query in the file from `collect_set(...)` to `sort_array(collect_set(...))` because `collect_set` depends on the `mutable.HashSet` implementation and elements in the set are printed in a different order in Scala v2.12/v2.13.

### Why are the changes needed?

To pass the test in Scala v2.13.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually checked.
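The ordering issue described in this PR can be reproduced without Spark. The following is a minimal sketch, not the PR's code: `CollectSetSketch` and `collectSorted` are hypothetical names, and the sort stands in for wrapping `collect_set(...)` in `sort_array(...)`. It relies only on the fact that `mutable.HashSet` makes no promise about iteration order, so any output compared across Scala versions should be sorted first.

```scala
import scala.collection.mutable

object CollectSetSketch {
  // Hypothetical stand-in for collect_set followed by sort_array:
  // deduplicate through a mutable.HashSet, then sort so the result
  // does not depend on the HashSet's iteration order.
  def collectSorted(values: Seq[Int]): Seq[Int] = {
    val set = mutable.HashSet.empty[Int]
    values.foreach(set += _)
    set.toSeq.sorted
  }
}
```

Comparing `set.toSeq` directly against a golden file would tie the test to one HashSet implementation; comparing the sorted sequence does not.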
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32576: [SPARK-35429][CORE] Remove commons-httpclient due to EOL and CVEs
dongjoon-hyun commented on a change in pull request #32576:
URL: https://github.com/apache/spark/pull/32576#discussion_r634032503

## File path: dev/deps/spark-deps-hadoop-3.2-hive-2.3 ##
@@ -35,7 +35,6 @@ commons-compiler/3.1.4//commons-compiler-3.1.4.jar
 commons-compress/1.20//commons-compress-1.20.jar
 commons-crypto/1.1.0//commons-crypto-1.1.0.jar
 commons-dbcp/1.4//commons-dbcp-1.4.jar
-commons-httpclient/3.1//commons-httpclient-3.1.jar

Review comment: Apache Spark has this at least since Apache Spark 1.5.x.
- https://github.com/apache/spark/blob/branch-1.5/dev/deps/spark-deps-hadoop-2.6#L44
[GitHub] [spark] SparkQA commented on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
SparkQA commented on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842817251 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43183/
[GitHub] [spark] SparkQA commented on pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
SparkQA commented on pull request #32577: URL: https://github.com/apache/spark/pull/32577#issuecomment-842816767 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43181/
[GitHub] [spark] sunchao commented on a change in pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
sunchao commented on a change in pull request #32553:
URL: https://github.com/apache/spark/pull/32553#discussion_r634030313

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ##
@@ -50,7 +50,14 @@ trait InvokeLike extends Expression with NonSQLExpression {

   def propagateNull: Boolean

-  protected lazy val needNullCheck: Boolean = propagateNull && arguments.exists(_.nullable)
+  def propagateNullForPrimitive: Boolean

Review comment: OK. Let me remove the flag then, and add an extra comment to `propagateNull`. One issue with this approach is that it only applies to the magic method path, while for `produceResult` users will need to handle the null primitive values explicitly. I think we'll need to document this more carefully, otherwise it could cause confusion.
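The null-handling trade-off in this thread can be illustrated outside Spark. The sketch below is an assumption-laden illustration, not Spark's implementation: `NullPropagationSketch`, `invokePrimitive`, and `invokeBoxed` are hypothetical names. It shows why a function taking a primitive `Int` cannot observe a null argument (the caller has to short-circuit), while a function taking the boxed `java.lang.Integer` can decide for itself what null means.

```scala
object NullPropagationSketch {
  // Primitive parameter: the function can never see null, so the caller
  // propagates null by skipping the call entirely.
  def invokePrimitive(f: Int => Int, arg: java.lang.Integer): java.lang.Integer =
    if (arg == null) null else Int.box(f(arg.intValue))

  // Boxed parameter: null is passed through, and the function body is
  // responsible for handling it.
  def invokeBoxed(
      f: java.lang.Integer => java.lang.Integer,
      arg: java.lang.Integer): java.lang.Integer =
    f(arg)
}
```

With the primitive variant a null input always yields a null output; with the boxed variant the user-supplied function chooses the result.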
[GitHub] [spark] SparkQA commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
SparkQA commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842814991 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43182/
[GitHub] [spark] dongjoon-hyun commented on pull request #32576: [SPARK-35429][CORE] Remove commons-httpclient due to EOL and CVEs
dongjoon-hyun commented on pull request #32576: URL: https://github.com/apache/spark/pull/32576#issuecomment-842813193 ok to test
[GitHub] [spark] dongjoon-hyun commented on pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
dongjoon-hyun commented on pull request #32577: URL: https://github.com/apache/spark/pull/32577#issuecomment-842813070 BTW, @maropu . I'm just wondering if there is any breaking change in the user perspective.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
dongjoon-hyun commented on a change in pull request #32577:
URL: https://github.com/apache/spark/pull/32577#discussion_r634026986

## File path: sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q77.sf100/explain.txt ##
@@ -488,7 +488,7 @@ Input [6]: [wp_web_page_sk#76, sales#85, profit#86, wp_web_page_sk#92, returns#1

 (85) Expand [codegen id : 23]
 Input [5]: [sales#18, returns#36, profit#37, channel#38, id#39]
-Arguments: [ArrayBuffer(sales#18, returns#36, profit#37, channel#38, id#39, 0), ArrayBuffer(sales#18, returns#36, profit#37, channel#38, null, 1), ArrayBuffer(sales#18, returns#36, profit#37, null, null, 3)], [sales#18, returns#36, profit#37, channel#107, id#108, spark_grouping_id#109]
+Arguments: [[sales#18, returns#36, profit#37, channel#38, id#39, 0], [sales#18, returns#36, profit#37, channel#38, null, 1], [sales#18, returns#36, profit#37, null, null, 3]], [sales#18, returns#36, profit#37, channel#107, id#108, spark_grouping_id#109]

Review comment: Nice!
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
dongjoon-hyun commented on a change in pull request #32577:
URL: https://github.com/apache/spark/pull/32577#discussion_r634027215

## File path: sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q18a.sf100/explain.txt ##
@@ -403,7 +403,7 @@ Arguments: [cs_bill_customer_sk#87 ASC NULLS FIRST], false, 0
 Output [5]: [c_customer_sk#95, c_current_cdemo_sk#96, c_current_addr_sk#97, c_birth_month#98, c_birth_year#99]
 Batched: true
 Location [not included in comparison]/{warehouse_dir}/customer]
-PushedFilters: [In(c_birth_month, [9,5,12,4,1,10]), IsNotNull(c_customer_sk), IsNotNull(c_current_cdemo_sk), IsNotNull(c_current_addr_sk)]
+PushedFilters: [In(c_birth_month, [1,10,12,4,5,9]), IsNotNull(c_customer_sk), IsNotNull(c_current_cdemo_sk), IsNotNull(c_current_addr_sk)]

Review comment: Yes, this is the one I worried. The sorting order.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
dongjoon-hyun commented on a change in pull request #32577:
URL: https://github.com/apache/spark/pull/32577#discussion_r634026986

## File path: sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q77.sf100/explain.txt ##
@@ -488,7 +488,7 @@ Input [6]: [wp_web_page_sk#76, sales#85, profit#86, wp_web_page_sk#92, returns#1

 (85) Expand [codegen id : 23]
 Input [5]: [sales#18, returns#36, profit#37, channel#38, id#39]
-Arguments: [ArrayBuffer(sales#18, returns#36, profit#37, channel#38, id#39, 0), ArrayBuffer(sales#18, returns#36, profit#37, channel#38, null, 1), ArrayBuffer(sales#18, returns#36, profit#37, null, null, 3)], [sales#18, returns#36, profit#37, channel#107, id#108, spark_grouping_id#109]
+Arguments: [[sales#18, returns#36, profit#37, channel#38, id#39, 0], [sales#18, returns#36, profit#37, channel#38, null, 1], [sales#18, returns#36, profit#37, null, null, 3]], [sales#18, returns#36, profit#37, channel#107, id#108, spark_grouping_id#109]

Review comment: Oh, is `ArrayBuffer` removed after this PR?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
dongjoon-hyun commented on a change in pull request #32577:
URL: https://github.com/apache/spark/pull/32577#discussion_r634025398

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ##
@@ -638,6 +645,8 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
     val valueSQL = child.sql
     val listSQL = hset.toSeq
       .map(elem => Literal(elem, child.dataType).sql)
+      // Sort elements for deterministic behaviours
+      .sorted

Review comment: Ditto. Can we do `sort` first?
[GitHub] [spark] SparkQA commented on pull request #32573: [SPARK-35425][DOCS] Add note about Jinja2 as a required dependency for document build.
SparkQA commented on pull request #32573:
URL: https://github.com/apache/spark/pull/32573#issuecomment-842810282

**[Test build #138657 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138657/testReport)** for PR 32573 at commit [`db1257d`](https://github.com/apache/spark/commit/db1257dce192e0d4d2e39e83ddfe57d0c727017c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
dongjoon-hyun commented on a change in pull request #32577:
URL: https://github.com/apache/spark/pull/32577#discussion_r634025015

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ##
@@ -544,7 +544,14 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   require(hset != null, "hset could not be null")

-  override def toString: String = s"$child INSET ${hset.mkString("(", ",", ")")}"
+  override def toString: String = {
+    val listString = hset.toSeq
+      .map(elem => Literal(elem, child.dataType).toString)
+      // Sort elements for deterministic behaviours
+      .sorted

Review comment: Can we switch the above two lines to support numeric set in a more natural way?

```scala
      .sorted
      .map(elem => Literal(elem, child.dataType).toString)
```
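The difference between the two orderings in this review comment can be seen with plain Scala collections. This is a minimal sketch under stated assumptions: `InSetSortSketch` and its methods are hypothetical, the values are borrowed from the `c_birth_month` filter in this PR's q18a plan, and the set is typed as `Set[Int]` for illustration. The real `hset` is a `Set[Any]`, for which Scala provides no implicit `Ordering`, so a sort-first variant there would need an explicit ordering that is not shown here.

```scala
object InSetSortSketch {
  // Values borrowed from the c_birth_month filter in the q18a plan.
  val birthMonths: Set[Int] = Set(9, 5, 12, 4, 1, 10)

  // Order used in the diff: convert to strings, then sort lexicographically.
  def mapThenSort(values: Set[Int]): String =
    values.toSeq.map(_.toString).sorted.mkString("[", ",", "]")

  // Order suggested in the review: sort the typed values, then convert.
  def sortThenMap(values: Set[Int]): String =
    values.toSeq.sorted.map(_.toString).mkString("[", ",", "]")
}
```

`mapThenSort(birthMonths)` yields `[1,10,12,4,5,9]`, the same order that appears in the updated plan file, while `sortThenMap(birthMonths)` yields the numeric order `[1,4,5,9,10,12]`. Both are deterministic; they differ only in how natural the output looks for numeric sets.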
[GitHub] [spark] dongjoon-hyun commented on pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
dongjoon-hyun commented on pull request #32577: URL: https://github.com/apache/spark/pull/32577#issuecomment-842804632 Thank you so much, @maropu !
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
AmplabJenkins removed a comment on pull request #32553: URL: https://github.com/apache/spark/pull/32553#issuecomment-842802193 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43177/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
AmplabJenkins removed a comment on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842802196 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43172/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32469: [SPARK-35338][PYTHON] Separate arithmetic operations into data type based structures
AmplabJenkins removed a comment on pull request #32469: URL: https://github.com/apache/spark/pull/32469#issuecomment-842802195 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138656/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32550: [SPARK-35282][SQL] Support AQE side shuffled hash join formula using rule
AmplabJenkins removed a comment on pull request #32550: URL: https://github.com/apache/spark/pull/32550#issuecomment-842802197 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43180/
[GitHub] [spark] SparkQA removed a comment on pull request #32469: [SPARK-35338][PYTHON] Separate arithmetic operations into data type based structures
SparkQA removed a comment on pull request #32469: URL: https://github.com/apache/spark/pull/32469#issuecomment-842744297 **[Test build #138656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138656/testReport)** for PR 32469 at commit [`114908a`](https://github.com/apache/spark/commit/114908a8a9e98848043232ebc2ac966f941316cf).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32557: [SPARK-35411][SQL] Add essential information while serializing TreeNode to json
AmplabJenkins removed a comment on pull request #32557: URL: https://github.com/apache/spark/pull/32557#issuecomment-842802198 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43179/
[GitHub] [spark] SparkQA commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
SparkQA commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842802735 **[Test build #138661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138661/testReport)** for PR 32546 at commit [`2e0787c`](https://github.com/apache/spark/commit/2e0787c9763d00fd6db5794282854fa8eb94283a).
[GitHub] [spark] SparkQA commented on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
SparkQA commented on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842802773 **[Test build #138662 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138662/testReport)** for PR 32494 at commit [`4c73828`](https://github.com/apache/spark/commit/4c73828bc8c1fe30adac63dd9db735524e52e565).
[GitHub] [spark] SparkQA commented on pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
SparkQA commented on pull request #32577: URL: https://github.com/apache/spark/pull/32577#issuecomment-842802720 **[Test build #138660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138660/testReport)** for PR 32577 at commit [`30945c3`](https://github.com/apache/spark/commit/30945c3e9ca143457447ef3b8a7d51a69a4a07a9).
[GitHub] [spark] maropu opened a new pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13
maropu opened a new pull request #32577:
URL: https://github.com/apache/spark/pull/32577

### What changes were proposed in this pull request?

To pass the TPCDS-related plan stability tests in scala-2.13, this PR proposes to fix the two things below:
- (1) Sorts elements in the predicate `InSet` and the source filter `In` for printing their nodes.
- (2) Formats nested collection elements (`Seq`, `Array`, and `Set`) recursively in `TreeNode.argString`.

As for (1), it seems v2.12/v2.13 print `Set` elements in a different order, so we need to sort them explicitly. As for (2), the `Seq` implementation is different between v2.12/v2.13, so we need to format nested `Seq` elements correctly to hide the name of its implementation (see an example below):

```
(74) Expand [codegen id : 20]
Input [5]: [sales#41, RETURNS#42, profit#43, channel#44, id#45]
-Arguments: [ArrayBuffer(sales#41, returns#42, ...<-- scala-2.12
+Arguments: [Vector(sales#41, returns#42, ... <-- scala-2.13
+Arguments: [[(sales#41, returns#42, ... <-- the proposed fix to hide the name of its implementation
```

### Why are the changes needed?

To pass the tests in Scala v2.13.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually checked.
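Item (2) of this PR description, recursive formatting that hides the collection implementation name, can be sketched as follows. This is an illustration of the idea only, not the actual `TreeNode.argString` change: `ArgFormatSketch` and `formatArg` are hypothetical names, and the `[`, `, `, `]` delimiters are assumed from the plan output shown in the review diffs.

```scala
object ArgFormatSketch {
  // Recursively render nested collections with bracket delimiters so the
  // output never contains implementation names such as ArrayBuffer or Vector.
  def formatArg(arg: Any): String = arg match {
    case null => "null"
    case arr: Array[_] => arr.iterator.map(formatArg).mkString("[", ", ", "]")
    case items: Iterable[_] => items.iterator.map(formatArg).mkString("[", ", ", "]")
    case other => other.toString
  }
}
```

Calling `toString` on a nested `Seq` directly would print whichever concrete class the Scala version happens to build, which is the source of the v2.12/v2.13 difference. Note that a `Set` passed through this sketch is still rendered in its own iteration order, so it would additionally need sorting as in item (1).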
[GitHub] [spark] AmplabJenkins commented on pull request #32557: [SPARK-35411][SQL] Add essential information while serializing TreeNode to json
AmplabJenkins commented on pull request #32557: URL: https://github.com/apache/spark/pull/32557#issuecomment-842802198 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43179/
[GitHub] [spark] AmplabJenkins commented on pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
AmplabJenkins commented on pull request #32553: URL: https://github.com/apache/spark/pull/32553#issuecomment-842802193 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43177/
[GitHub] [spark] AmplabJenkins commented on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
AmplabJenkins commented on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842802196 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43172/
[GitHub] [spark] AmplabJenkins commented on pull request #32469: [SPARK-35338][PYTHON] Separate arithmetic operations into data type based structures
AmplabJenkins commented on pull request #32469: URL: https://github.com/apache/spark/pull/32469#issuecomment-842802195 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138656/
[GitHub] [spark] AmplabJenkins commented on pull request #32550: [SPARK-35282][SQL] Support AQE side shuffled hash join formula using rule
AmplabJenkins commented on pull request #32550: URL: https://github.com/apache/spark/pull/32550#issuecomment-842802197 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43180/
[GitHub] [spark] lipzhu commented on pull request #32572: [SPARK-35305][BUILD] Upgrade Zookeeper to 3.7.0
lipzhu commented on pull request #32572: URL: https://github.com/apache/spark/pull/32572#issuecomment-842801439

> Are those CVE applicable to Zookeeper Client , @lipzhu ?

I found https://issues.apache.org/jira/browse/ZOOKEEPER-4278 and https://issues.apache.org/jira/browse/ZOOKEEPER-4272, but they do not affect ZooKeeper 3.7.0.
[GitHub] [spark] SparkQA commented on pull request #32469: [SPARK-35338][PYTHON] Separate arithmetic operations into data type based structures
SparkQA commented on pull request #32469: URL: https://github.com/apache/spark/pull/32469#issuecomment-842800494

**[Test build #138656 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138656/testReport)** for PR 32469 at commit [`114908a`](https://github.com/apache/spark/commit/114908a8a9e98848043232ebc2ac966f941316cf).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] HyukjinKwon closed pull request #32569: [SPARK-35419][PYTHON] Enable spark.sql.execution.pyspark.udf.simplifiedTraceback.enabled by default
HyukjinKwon closed pull request #32569: URL: https://github.com/apache/spark/pull/32569
[GitHub] [spark] HyukjinKwon commented on pull request #32569: [SPARK-35419][PYTHON] Enable spark.sql.execution.pyspark.udf.simplifiedTraceback.enabled by default
HyukjinKwon commented on pull request #32569: URL: https://github.com/apache/spark/pull/32569#issuecomment-842797601 Merged to master.
[GitHub] [spark] SparkQA commented on pull request #32550: [SPARK-35282][SQL] Support AQE side shuffled hash join formula using rule
SparkQA commented on pull request #32550: URL: https://github.com/apache/spark/pull/32550#issuecomment-842797602 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43180/
[GitHub] [spark] itholic commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
itholic commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842796298 Jenkins, retest this please
[GitHub] [spark] SparkQA commented on pull request #32557: [SPARK-35411][SQL] Add essential information while serializing TreeNode to json
SparkQA commented on pull request #32557: URL: https://github.com/apache/spark/pull/32557#issuecomment-842791856 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43179/
[GitHub] [spark] SparkQA commented on pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
SparkQA commented on pull request #32553: URL: https://github.com/apache/spark/pull/32553#issuecomment-842790642 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43177/
[GitHub] [spark] SparkQA commented on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
SparkQA commented on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842790385 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43172/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32469: [SPARK-35338][PYTHON] Separate arithmetic operations into data type based structures
AmplabJenkins removed a comment on pull request #32469: URL: https://github.com/apache/spark/pull/32469#issuecomment-842781687 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43176/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
AmplabJenkins removed a comment on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842781686 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138647/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
AmplabJenkins removed a comment on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842781690 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43175/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32573: [SPARK-35425][DOCS] Add note about Jinja2 as a required dependency for document build.
AmplabJenkins removed a comment on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842781688 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43178/
[GitHub] [spark] SparkQA commented on pull request #32550: [SPARK-35282][SQL] Support AQE side shuffled hash join formula using rule
SparkQA commented on pull request #32550: URL: https://github.com/apache/spark/pull/32550#issuecomment-842785699 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43180/
[GitHub] [spark] cloud-fan commented on a change in pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
cloud-fan commented on a change in pull request #32553: URL: https://github.com/apache/spark/pull/32553#discussion_r633999421

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ##

@@ -50,7 +50,14 @@ trait InvokeLike extends Expression with NonSQLExpression {
   def propagateNull: Boolean
-  protected lazy val needNullCheck: Boolean = propagateNull && arguments.exists(_.nullable)
+  def propagateNullForPrimitive: Boolean

Review comment: This reminds me of the rule `HandleNullInputsForUDF`. For primitive inputs, I don't think we have a choice and we must propagate null (`isPositive(int i)` can't handle null values). Can we detect it automatically instead of adding a new boolean flag?
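One way to picture the "detect it automatically" suggestion is to inspect the target method's parameter types through Java reflection. The sketch below is only an illustration of that idea under stated assumptions, not Spark's implementation: `NullCheckSketch`, `hasPrimitiveParameter`, and `PositiveCheck` are hypothetical names, and how `InvokeLike` would actually resolve the method is not covered here.

```scala
// Hypothetical class standing in for a user function with a primitive parameter.
class PositiveCheck {
  def isPositive(i: Int): Boolean = i > 0
}

object NullCheckSketch {
  // Returns true if any public method with the given name declares a primitive
  // parameter (int, long, double, ...). Such a parameter cannot hold null, so a
  // caller would have to check for null before invoking the method.
  def hasPrimitiveParameter(cls: Class[_], methodName: String): Boolean =
    cls.getMethods
      .filter(_.getName == methodName)
      .exists(_.getParameterTypes.exists(_.isPrimitive))
}
```

For example, `NullCheckSketch.hasPrimitiveParameter(classOf[PositiveCheck], "isPositive")` is true because Scala compiles the `Int` parameter to a JVM `int`, while a method that takes only reference types, such as `String.concat`, yields false.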
[GitHub] [spark] AmplabJenkins commented on pull request #32494: [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation
AmplabJenkins commented on pull request #32494: URL: https://github.com/apache/spark/pull/32494#issuecomment-842781686 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138647/
[GitHub] [spark] AmplabJenkins commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page
AmplabJenkins commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-842781690 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43175/
[GitHub] [spark] AmplabJenkins commented on pull request #32469: [SPARK-35338][PYTHON] Separate arithmetic operations into data type based structures
AmplabJenkins commented on pull request #32469: URL: https://github.com/apache/spark/pull/32469#issuecomment-842781687 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43176/
[GitHub] [spark] AmplabJenkins commented on pull request #32573: [SPARK-35425][DOCS] Add note about Jinja2 as a required dependency for document build.
AmplabJenkins commented on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842781688 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43178/
[GitHub] [spark] viirya commented on a change in pull request #32573: [SPARK-35425][DOCS] Add note about Jinja2 as a required dependency for document build.
viirya commented on a change in pull request #32573: URL: https://github.com/apache/spark/pull/32573#discussion_r633994614

## File path: dev/create-release/spark-rm/Dockerfile ##

@@ -40,7 +40,9 @@ ARG APT_INSTALL="apt-get install --no-install-recommends -y"
 # TODO(SPARK-32407): Sphinx 3.1+ does not correctly index nested classes.
 # See also https://github.com/sphinx-doc/sphinx/issues/7551.
 # We should use the latest Sphinx version once this is fixed.
-ARG PIP_PKGS="sphinx==3.0.4 mkdocs==1.1.2 numpy==1.19.4 pydata_sphinx_theme==0.4.1 ipython==7.19.0 nbsphinx==0.8.0 numpydoc==1.1.0"
+# TODO(SPARK-35375): Jinja2 3.0.0+ causes error when building with Sphinx.
+# See also https://issues.apache.org/jira/browse/SPARK-35375.
+ARG PIP_PKGS="sphinx==3.0.4 mkdocs==1.1.2 numpy==1.19.4 pydata_sphinx_theme==0.4.1 ipython==7.19.0 nbsphinx==0.8.0 numpydoc==1.1.0 jinja2==2.11.3"

Review comment: Oh okay. I'm just curious why it is not the same as in the other place. If we know the version works, then it's fine.
[GitHub] [spark] SparkQA commented on pull request #32557: [SPARK-35411][SQL] Add essential information while serializing TreeNode to json
SparkQA commented on pull request #32557: URL: https://github.com/apache/spark/pull/32557#issuecomment-842779123 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43179/
[GitHub] [spark] SparkQA commented on pull request #32553: [SPARK-35389][SQL] V2 ScalarFunction should support magic method with null arguments
SparkQA commented on pull request #32553: URL: https://github.com/apache/spark/pull/32553#issuecomment-842778626 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43177/
[GitHub] [spark] SparkQA commented on pull request #32573: [SPARK-35425][DOCS] Add note about Jinja2 as a required dependency for document build.
SparkQA commented on pull request #32573: URL: https://github.com/apache/spark/pull/32573#issuecomment-842777210 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43178/