[GitHub] [spark] AmplabJenkins removed a comment on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
AmplabJenkins removed a comment on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833248386 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42710/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32377: [SPARK-35021][SQL] Group exception messages in connector/catalog
AmplabJenkins removed a comment on pull request #32377: URL: https://github.com/apache/spark/pull/32377#issuecomment-833248388 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42712/
[GitHub] [spark] AmplabJenkins commented on pull request #32377: [SPARK-35021][SQL] Group exception messages in connector/catalog
AmplabJenkins commented on pull request #32377: URL: https://github.com/apache/spark/pull/32377#issuecomment-833248388 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42712/
[GitHub] [spark] AmplabJenkins commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
AmplabJenkins commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833248386 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42710/
[GitHub] [spark] dongjoon-hyun commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
dongjoon-hyun commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833247903 Both GitHub and Jenkins seem to show the CNFE consistently.
[GitHub] [spark] sunchao commented on a change in pull request #32410: [SPARK-35286][SQL] Replace SessionState.start with SessionState.setCurrentSessionState
sunchao commented on a change in pull request #32410: URL: https://github.com/apache/spark/pull/32410#discussion_r627092169

## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ##

@@ -190,7 +190,11 @@ private[hive] class HiveClientImpl(
     // For this reason we cannot load the jars added by ADDJarsCommand because of class loader
     // got changed. We reset it to clientLoader.ClassLoader here.
     state.getConf.setClassLoader(clientLoader.classLoader)
-    SessionState.start(state)
+    if (version != hive.v12) {
+      SessionState.setCurrentSessionState(state)

Review comment: Makes sense. I was curious why we call `setCurrentSessionState` here but then call `start` later in `runHive`. It seems that is only used for testing, so all good.
[GitHub] [spark] SparkQA commented on pull request #32377: [SPARK-35021][SQL] Group exception messages in connector/catalog
SparkQA commented on pull request #32377: URL: https://github.com/apache/spark/pull/32377#issuecomment-833243489
[GitHub] [spark] ulysses-you commented on pull request #32450: [SPARK-35282][SQL] Support AQE side shuffled hash join formula
ulysses-you commented on pull request #32450: URL: https://github.com/apache/spark/pull/32450#issuecomment-833242105 cc @maropu @cloud-fan @maryannxue @c21 do you have any thoughts about this new config?
[GitHub] [spark] ulysses-you opened a new pull request #32450: [SPARK-35282][SQL] Support AQE side shuffled hash join formula
ulysses-you opened a new pull request #32450: URL: https://github.com/apache/spark/pull/32450

### What changes were proposed in this pull request?
Use runtime statistics to decide whether a join can be converted to a shuffled hash join.

### Why are the changes needed?
Use AQE runtime statistics to decide whether shuffled hash join can be used instead of sort merge join. Currently, the shuffled hash join selection formula does not work because the number of shuffle partitions is dynamic. Add a new config, `spark.sql.adaptive.shuffledHashJoinLocalMapThreshold`, to decide whether a join can safely be converted to a shuffled hash join.

### Does this PR introduce _any_ user-facing change?
Yes, it adds a new config.

### How was this patch tested?
Added a new test.
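The PR description does not spell out how the threshold is applied. As a rough sketch of one plausible reading, the check below treats `spark.sql.adaptive.shuffledHashJoinLocalMapThreshold` as a per-partition size limit in bytes and allows the conversion only when every shuffle partition fits under it. The class name, method name, and the per-partition interpretation are assumptions made for illustration, not code from the PR.

```java
public class ShuffledHashJoinCheckSketch {

    /**
     * Returns true when every shuffle partition is at most thresholdBytes.
     * Assumption: the threshold is compared against each partition's size in bytes,
     * and a non-positive threshold disables the conversion.
     */
    public static boolean canUseShuffledHashJoin(long[] partitionBytes, long thresholdBytes) {
        if (thresholdBytes <= 0) {
            return false;
        }
        for (long size : partitionBytes) {
            if (size > thresholdBytes) {
                return false; // one oversized partition is enough to rule out a local hash map
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long threshold = 64L << 20;                       // hypothetical 64 MiB limit
        long[] balanced = {10L << 20, 20L << 20, 30L << 20};
        long[] skewed = {10L << 20, 200L << 20};
        System.out.println(canUseShuffledHashJoin(balanced, threshold));
        System.out.println(canUseShuffledHashJoin(skewed, threshold));
    }
}
```

Checking runtime partition sizes rather than a static estimate is what makes the decision independent of the partition count, which the description identifies as the problem with the existing formula.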
[GitHub] [spark] dongjoon-hyun closed pull request #32449: [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case
dongjoon-hyun closed pull request #32449: URL: https://github.com/apache/spark/pull/32449
[GitHub] [spark] dongjoon-hyun commented on pull request #32449: [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case
dongjoon-hyun commented on pull request #32449: URL: https://github.com/apache/spark/pull/32449#issuecomment-833236789 GitHub Action passed. Merged to master for Apache Spark 3.2.0.
[GitHub] [spark] dongjoon-hyun commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
dongjoon-hyun commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833236491 Ya, it looks weird because it's CNFE.
```
Caused by: sbt.ForkMain$ForkError: java.lang.ClassNotFoundException: parquet.DefaultSource
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
```
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
dongjoon-hyun commented on a change in pull request #32446: URL: https://github.com/apache/spark/pull/32446#discussion_r627087786

## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ##

@@ -273,7 +273,9 @@ private[hive] class HiveClientImpl(
     if (clientLoader.cachedHive != null) {
       clientLoader.cachedHive.asInstanceOf[Hive]
     } else {
-      val c = Hive.get(conf)
+      // don't register all Hive permanent functions in Hive's FunctionRegistry since Spark loads
+      // them through direct HMS API calls
+      val c = Hive.getWithFastCheck(conf, false)

Review comment: Thanks.
[GitHub] [spark] sunchao commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
sunchao commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833234313 The test failures look weird. I tried `HiveCharVarcharTestSuite` locally and it all passed.
[GitHub] [spark] sunchao commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
sunchao commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833233542

> Out of curiosity, why does Hive need to register all permanent functions (not built-in) UDFs when initiating Hive object? Most of these are not used, isn't?

The permanent functions that get loaded are actually shared across Hive sessions (`Hive` as well as `FunctionRegistry` live within an HS2 instance), and they are loaded only once unless a reload is requested explicitly, for example with the `RELOAD FUNCTIONS` command.
[GitHub] [spark] SparkQA commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
SparkQA commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833232979 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42710/
[GitHub] [spark] SparkQA commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
SparkQA commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833230443 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42710/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
AmplabJenkins removed a comment on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833227124 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138189/
[GitHub] [spark] AmplabJenkins commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
AmplabJenkins commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833227124 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138189/
[GitHub] [spark] SparkQA removed a comment on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
SparkQA removed a comment on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833208705 **[Test build #138189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138189/testReport)** for PR 32446 at commit [`aa6dde8`](https://github.com/apache/spark/commit/aa6dde8919356fb06f702f50334ff3ac2d3efcfa).
[GitHub] [spark] SparkQA commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
SparkQA commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833226927 **[Test build #138189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138189/testReport)** for PR 32446 at commit [`aa6dde8`](https://github.com/apache/spark/commit/aa6dde8919356fb06f702f50334ff3ac2d3efcfa).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32377: [SPARK-35021][SQL] Group exception messages in connector/catalog
SparkQA commented on pull request #32377: URL: https://github.com/apache/spark/pull/32377#issuecomment-833225332 **[Test build #138191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138191/testReport)** for PR 32377 at commit [`5106ed0`](https://github.com/apache/spark/commit/5106ed085171971503ad664d9c2d98da54efc4c1).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32442: [SPARK-35283][SQL] Support query some DDL with CTES
AmplabJenkins removed a comment on pull request #32442: URL: https://github.com/apache/spark/pull/32442#issuecomment-833224996 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42711/
[GitHub] [spark] AmplabJenkins commented on pull request #32442: [SPARK-35283][SQL] Support query some DDL with CTES
AmplabJenkins commented on pull request #32442: URL: https://github.com/apache/spark/pull/32442#issuecomment-833224996 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42711/
[GitHub] [spark] SparkQA commented on pull request #32442: [SPARK-35283][SQL] Support query some DDL with CTES
SparkQA commented on pull request #32442: URL: https://github.com/apache/spark/pull/32442#issuecomment-833224977
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32441: [SPARK-35318][SQL] Hide internal view properties for describe table cmd
AmplabJenkins removed a comment on pull request #32441: URL: https://github.com/apache/spark/pull/32441#issuecomment-833224031 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42709/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32449: [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case
AmplabJenkins removed a comment on pull request #32449: URL: https://github.com/apache/spark/pull/32449#issuecomment-833224034 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42708/
[GitHub] [spark] AmplabJenkins commented on pull request #32441: [SPARK-35318][SQL] Hide internal view properties for describe table cmd
AmplabJenkins commented on pull request #32441: URL: https://github.com/apache/spark/pull/32441#issuecomment-833224031 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42709/
[GitHub] [spark] AmplabJenkins commented on pull request #32449: [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case
AmplabJenkins commented on pull request #32449: URL: https://github.com/apache/spark/pull/32449#issuecomment-833224034 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42708/
[GitHub] [spark] SparkQA commented on pull request #32449: [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case
SparkQA commented on pull request #32449: URL: https://github.com/apache/spark/pull/32449#issuecomment-833219878 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42708/
[GitHub] [spark] SparkQA commented on pull request #32441: [SPARK-35318][SQL] Hide internal view properties for describe table cmd
SparkQA commented on pull request #32441: URL: https://github.com/apache/spark/pull/32441#issuecomment-833218892
[GitHub] [spark] sunchao commented on a change in pull request #32407: [SPARK-35261][SQL] Support static magic method for stateless ScalarFunction
sunchao commented on a change in pull request #32407: URL: https://github.com/apache/spark/pull/32407#discussion_r627073317

## File path: sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt ##

@@ -0,0 +1,60 @@
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+Java scalar function (long + long) -> long result_nullable = true codegen = true:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+--------------------------------------------------------------------------------------------------------
+with long_add_default                  52937          54250        1181         9.4         105.9       1.0X
+with long_add_magic                    20527          20975         683        24.4          41.1       2.6X
+with long_add_static_magic             20513          20944         447        24.4          41.0       2.6X
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+Java scalar function (long + long) -> long result_nullable = false codegen = true:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+--------------------------------------------------------------------------------------------------------
+with long_add_default                  54502          54710         208         9.2         109.0       1.0X
+with long_add_magic                    21790          22062         414        22.9          43.6       2.5X
+with long_add_static_magic             19014          19715         626        26.3          38.0       2.9X
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+Java scalar function (long + long) -> long result_nullable = true codegen = false:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+--------------------------------------------------------------------------------------------------------
+with long_add_default                  64039          64306         293         7.8         128.1       1.0X
+with long_add_magic                   199121         199232         144         2.5         398.2       0.3X
+with long_add_static_magic            197914         199933        1892         2.5         395.8       0.3X
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+Java scalar function (long + long) -> long result_nullable = false codegen = false:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+--------------------------------------------------------------------------------------------------------
+with long_add_default                  60515          60887         453         8.3         121.0       1.0X
+with long_add_magic                   201052         202036         957         2.5         402.1       0.3X
+with long_add_static_magic            202037         202639         584         2.5         404.1       0.3X
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+scalar function (long + long) -> long result_nullable = true codegen = true:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+--------------------------------------------------------------------------------------------------------
+with long_add_default                  62105          62467         341         8.1         124.2       1.0X
+with long_add_magic                    20721          22705        1729        24.1          41.4       3.0X
+

Review comment: For Scala, no, since it'll always use `Invoke` and so there is no difference. We have `long_add_static_magic` for Java UDFs above.
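The derived columns in the benchmark output above follow from the best time by simple arithmetic. The sketch below recomputes them for the first `long_add_magic` row. The row count of 500 million is not stated in the quoted output; it is inferred here from the Best Time and Per Row columns, and the class and method names are made up for illustration.

```java
public class BenchmarkColumnsSketch {

    /** Nanoseconds per row: best wall time converted from ms to ns, divided by the row count. */
    public static double perRowNs(double bestMs, long rows) {
        return bestMs * 1_000_000.0 / rows;
    }

    /** Throughput in millions of rows per second. */
    public static double rateMillionsPerSec(double bestMs, long rows) {
        return rows / (bestMs / 1000.0) / 1_000_000.0;
    }

    /** Speedup relative to the baseline case (values below 1.0 mean slower than baseline). */
    public static double relative(double baselineBestMs, double bestMs) {
        return baselineBestMs / bestMs;
    }

    public static void main(String[] args) {
        long rows = 500_000_000L;       // assumption: inferred, not stated in the output
        double defaultBestMs = 52937;   // long_add_default, result_nullable = true, codegen = true
        double magicBestMs = 20527;     // long_add_magic, same configuration

        System.out.printf("per row: %.1f ns%n", perRowNs(magicBestMs, rows));
        System.out.printf("rate: %.1f M/s%n", rateMillionsPerSec(magicBestMs, rows));
        System.out.printf("relative: %.1fX%n", relative(defaultBestMs, magicBestMs));
    }
}
```

The same ratio applied to the `codegen = false` tables gives roughly 64039 / 199121, or about 0.3X, which is why the magic-method variants show up as slower than the default path there.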
[GitHub] [spark] sunchao commented on a change in pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
sunchao commented on a change in pull request #32446: URL: https://github.com/apache/spark/pull/32446#discussion_r627072706

## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ##

@@ -273,7 +273,9 @@ private[hive] class HiveClientImpl(
     if (clientLoader.cachedHive != null) {
       clientLoader.cachedHive.asInstanceOf[Hive]
     } else {
-      val c = Hive.get(conf)
+      // don't register all Hive permanent functions in Hive's FunctionRegistry since Spark loads
+      // them through direct HMS API calls
+      val c = Hive.getWithFastCheck(conf, false)

Review comment: Got you. Yes, it is unrecoverable. I'll update the JIRA to `BUG`.
[GitHub] [spark] c21 commented on a change in pull request #32430: [SPARK-35133][SQL] Explain codegen works with AQE
c21 commented on a change in pull request #32430: URL: https://github.com/apache/spark/pull/32430#discussion_r627071113

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala ##

@@ -105,13 +106,20 @@ package object debug {
    * @param plan the query plan for codegen
    * @return Sequence of WholeStageCodegen subtrees and corresponding codegen
    */
-  def codegenStringSeq(plan: SparkPlan): Seq[(String, String, ByteCodeStats)] = {
+  def codegenStringSeq(
+    plan: SparkPlan,
+    sparkSession: SparkSession): Seq[(String, String, ByteCodeStats)] = {
     val codegenSubtrees = new collection.mutable.HashSet[WholeStageCodegenExec]()

     def findSubtrees(plan: SparkPlan): Unit = {
       plan foreach {
         case s: WholeStageCodegenExec =>
           codegenSubtrees += s
+        case p: AdaptiveSparkPlanExec =>
+          // Find subtrees from original input plan of AQE.

Review comment: Yes, this doesn't match the actual plan for `df.explain("codegen")` if `df` has already been executed. The problem is that the final plan, `AdaptiveSparkPlanExec.executedPlan`, uses `ShuffleQueryStageExec` to wrap the whole sub-plan under that shuffle. Example:
```
spark.range(5).select(col("id").as("key"), col("id").as("value")).groupBy('key).agg(max('value))
```
```
AdaptiveSparkPlan isFinalPlan=true
+- == Final Plan ==
   *(2) HashAggregate(keys=[key#2L], functions=[max(value#3L)], output=[key#2L, max(value)#9L])
   +- CustomShuffleReader coalesced
      +- ShuffleQueryStage 0
         +- Exchange hashpartitioning(key#2L, 5), ENSURE_REQUIREMENTS, [id=#28]
            +- *(1) HashAggregate(keys=[key#2L], functions=[partial_max(value#3L)], output=[key#2L, max#13L])
               +- *(1) Project [id#0L AS key#2L, id#0L AS value#3L]
                  +- *(1) Range (0, 5, step=1, splits=2)
```
The partial aggregate `HashAggregate` is wrapped inside `ShuffleQueryStage`, so it cannot be pattern matched to do the explain. One way to work around this is to add pattern matching for `ShuffleQueryStageExec` as well. But in any case we need to re-run the physical plan preparation rules if `AdaptiveSparkPlan.isFinalPlan=false`.
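The wrapping problem described in this comment can be illustrated with a toy plan tree. The sketch below is not Spark code: `Node`, `Codegen`, and `StageWrapper` are hypothetical stand-ins, and it assumes (as the comment describes) that a stage node hides its sub-plan from the ordinary child traversal, so the traversal has to match the wrapper explicitly to reach the codegen subtree underneath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PlanTraversalSketch {

    /** Hypothetical, simplified plan node; not a Spark class. */
    public static class Node {
        public final String name;
        public final List<Node> children;

        public Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Stand-in for a whole-stage-codegen subtree root, the kind of node we want to collect. */
    public static class Codegen extends Node {
        public Codegen(String name, Node... children) {
            super(name, children);
        }
    }

    /** Stand-in for a query stage: a leaf whose wrapped sub-plan is not among its children. */
    public static class StageWrapper extends Node {
        public final Node wrapped;

        public StageWrapper(String name, Node wrapped) {
            super(name);
            this.wrapped = wrapped;
        }
    }

    /** Collects names of codegen subtrees, optionally descending into wrapped stage plans. */
    public static List<String> codegenSubtrees(Node root, boolean unwrapStages) {
        List<String> found = new ArrayList<>();
        collect(root, unwrapStages, found);
        return found;
    }

    private static void collect(Node node, boolean unwrapStages, List<String> found) {
        if (node instanceof Codegen) {
            found.add(node.name);
        }
        if (unwrapStages && node instanceof StageWrapper) {
            collect(((StageWrapper) node).wrapped, unwrapStages, found);
        }
        for (Node child : node.children) {
            collect(child, unwrapStages, found);
        }
    }

    /** Mirrors the shape of the example plan: final aggregate over a stage wrapping the partial one. */
    public static Node examplePlan() {
        Node partialAgg = new Codegen("codegen stage 1", new Node("HashAggregate (partial)"));
        Node stage = new StageWrapper("ShuffleQueryStage 0", new Node("Exchange", partialAgg));
        return new Codegen("codegen stage 2",
            new Node("HashAggregate (final)", new Node("CustomShuffleReader", stage)));
    }

    public static void main(String[] args) {
        System.out.println(codegenSubtrees(examplePlan(), false)); // misses the stage under the wrapper
        System.out.println(codegenSubtrees(examplePlan(), true));  // finds both codegen stages
    }
}
```

With `unwrapStages` set to false only the outer codegen stage is found, which corresponds to the missing partial aggregate in the explain output; matching the wrapper explicitly recovers it.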
[GitHub] [spark] SparkQA commented on pull request #32442: [SPARK-35283][SQL] Support query some DDL with CTES
SparkQA commented on pull request #32442: URL: https://github.com/apache/spark/pull/32442#issuecomment-833209789 **[Test build #138190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138190/testReport)** for PR 32442 at commit [`09a04b0`](https://github.com/apache/spark/commit/09a04b0ec2aa7b606e03d142cfca55e8bdafa8cd).
[GitHub] [spark] SparkQA commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
SparkQA commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833208705 **[Test build #138189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138189/testReport)** for PR 32446 at commit [`aa6dde8`](https://github.com/apache/spark/commit/aa6dde8919356fb06f702f50334ff3ac2d3efcfa).
[GitHub] [spark] viirya commented on pull request #32404: [SPARK-35278][SQL] Invoke should find the method with correct number of parameters
viirya commented on pull request #32404: URL: https://github.com/apache/spark/pull/32404#issuecomment-833208196

> > For example, calling a method func(input: Object) with argument Tuple2 will be disallowed.
>
> This should be allowed, because we do provide the concrete parameter type (`Tuple2`). What should be forbidden is something like `func(s: String)` and the parameter type is `Object`.
>
> I think I get your point, we should probably use a util function to look up the method, instead of writing the code by ourselves, to handle cases like search the inheritance hierarchy, argument type casting, etc.
>
> How about https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/reflect/MethodUtils.html#getMatchingMethod-java.lang.Class-java.lang.String-java.lang.Class...-

It looks promising. I was not aware there is a util function we can use. So my question is: should we use it in the backports as well, or only replace our code with the util in master?
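The lookup rule discussed in the quoted text (accept `func(input: Object)` for a `Tuple2` argument, reject `func(s: String)` for an `Object` argument) can be sketched with plain Java reflection from Scala. This is a simplified illustration, not the code in the PR and not the Commons Lang utility: `Target`, `takesObject`, `takesString`, and `findMethod` are hypothetical names, and the sketch deliberately ignores primitives, boxing, varargs, and ranking among several matching overloads.

```scala
import java.lang.reflect.Method

// Simplified sketch: ignores primitives/boxing, varargs, and overload ranking.
object MethodLookupSketch {
  // Hypothetical target class with two single-argument methods.
  class Target {
    def takesObject(input: Object): String = "object"
    def takesString(s: String): String = "string"
  }

  // Find a public method by name whose declared parameter types can accept
  // the given argument types: same arity, and each parameter type is
  // assignable from the corresponding argument type.
  def findMethod(cls: Class[_], name: String, argTypes: Seq[Class[_]]): Option[Method] =
    cls.getMethods.find { m =>
      m.getName == name &&
      m.getParameterCount == argTypes.length &&
      m.getParameterTypes.zip(argTypes).forall { case (param, arg) =>
        param.isAssignableFrom(arg)
      }
    }
}
```

With these definitions, looking up `takesObject` with `classOf[Tuple2[_, _]]` succeeds because `Object` is assignable from any reference type, while looking up `takesString` with `classOf[Object]` returns `None`. A library utility would additionally handle the cases this sketch skips, which is the motivation given in the thread for not hand-writing the lookup.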
[GitHub] [spark] dongjoon-hyun commented on pull request #32449: [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case
dongjoon-hyun commented on pull request #32449: URL: https://github.com/apache/spark/pull/32449#issuecomment-833208108 Thank you, @HyukjinKwon !
[GitHub] [spark] viirya commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
viirya commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833207628 retest this please
[GitHub] [spark] SparkQA commented on pull request #32441: [SPARK-35318][SQL] Hide internal view properties for describe table cmd
SparkQA commented on pull request #32441: URL: https://github.com/apache/spark/pull/32441#issuecomment-833206570 **[Test build #138188 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138188/testReport)** for PR 32441 at commit [`10af6c8`](https://github.com/apache/spark/commit/10af6c8f3a74e40c6fce82283db13dc011b3ac8a).
[GitHub] [spark] SparkQA commented on pull request #32449: [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case
SparkQA commented on pull request #32449: URL: https://github.com/apache/spark/pull/32449#issuecomment-833206537 **[Test build #138187 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138187/testReport)** for PR 32449 at commit [`ebc4182`](https://github.com/apache/spark/commit/ebc41825ba3b41cccb4b9264a78b9f03c8ad9f10).
[GitHub] [spark] viirya commented on pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match
viirya commented on pull request #32413: URL: https://github.com/apache/spark/pull/32413#issuecomment-833205906

> shall we consider [#32404 (comment)](https://github.com/apache/spark/pull/32404#issuecomment-833173900) ?

It looks promising. Do you think we should use it even in backports? Or just use it in master only?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32407: [SPARK-35261][SQL] Support static magic method for stateless ScalarFunction
AmplabJenkins removed a comment on pull request #32407: URL: https://github.com/apache/spark/pull/32407#issuecomment-833205596 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138183/
[GitHub] [spark] AmplabJenkins commented on pull request #32407: [SPARK-35261][SQL] Support static magic method for stateless ScalarFunction
AmplabJenkins commented on pull request #32407: URL: https://github.com/apache/spark/pull/32407#issuecomment-833205596 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138183/
[GitHub] [spark] imback82 commented on pull request #32447: [SPARK-34701][SQL][FOLLOW-UP] Children/innerChildren should be mutually exclusive for AlterViewAsCommand/CreateViewCommand
imback82 commented on pull request #32447: URL: https://github.com/apache/spark/pull/32447#issuecomment-833204227 cc @cloud-fan
[GitHub] [spark] SparkQA removed a comment on pull request #32407: [SPARK-35261][SQL] Support static magic method for stateless ScalarFunction
SparkQA removed a comment on pull request #32407: URL: https://github.com/apache/spark/pull/32407#issuecomment-833101335 **[Test build #138183 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138183/testReport)** for PR 32407 at commit [`5a0b243`](https://github.com/apache/spark/commit/5a0b243df600fddc241b7aea17117972a1b9bb1d).
[GitHub] [spark] SparkQA commented on pull request #32407: [SPARK-35261][SQL] Support static magic method for stateless ScalarFunction
SparkQA commented on pull request #32407: URL: https://github.com/apache/spark/pull/32407#issuecomment-833195577 **[Test build #138183 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138183/testReport)** for PR 32407 at commit [`5a0b243`](https://github.com/apache/spark/commit/5a0b243df600fddc241b7aea17117972a1b9bb1d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] HyukjinKwon commented on pull request #32395: [SPARK-35270][SQL][CORE] Remove the use of guava in order to upgrade guava version to 27
HyukjinKwon commented on pull request #32395: URL: https://github.com/apache/spark/pull/32395#issuecomment-833195384 Did you do something like https://github.com/apache/spark/pull/32400#issuecomment-831051189 too? If it's done, feel free to rebase, which should retrigger the test.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code
HyukjinKwon commented on a change in pull request #31776:
URL: https://github.com/apache/spark/pull/31776#discussion_r627050278

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala ##

@@ -105,23 +105,28 @@ class ParquetFilters(
       fieldType: ParquetSchemaType)
   private case class ParquetSchemaType(
-      originalType: OriginalType,
+      logicalTypeAnnotation: LogicalTypeAnnotation,
       primitiveTypeName: PrimitiveTypeName,
-      length: Int,
-      decimalMetadata: DecimalMetadata)
-
-  private val ParquetBooleanType = ParquetSchemaType(null, BOOLEAN, 0, null)
-  private val ParquetByteType = ParquetSchemaType(INT_8, INT32, 0, null)
-  private val ParquetShortType = ParquetSchemaType(INT_16, INT32, 0, null)
-  private val ParquetIntegerType = ParquetSchemaType(null, INT32, 0, null)
-  private val ParquetLongType = ParquetSchemaType(null, INT64, 0, null)
-  private val ParquetFloatType = ParquetSchemaType(null, FLOAT, 0, null)
-  private val ParquetDoubleType = ParquetSchemaType(null, DOUBLE, 0, null)
-  private val ParquetStringType = ParquetSchemaType(UTF8, BINARY, 0, null)
-  private val ParquetBinaryType = ParquetSchemaType(null, BINARY, 0, null)
-  private val ParquetDateType = ParquetSchemaType(DATE, INT32, 0, null)
-  private val ParquetTimestampMicrosType = ParquetSchemaType(TIMESTAMP_MICROS, INT64, 0, null)
-  private val ParquetTimestampMillisType = ParquetSchemaType(TIMESTAMP_MILLIS, INT64, 0, null)
+      length: Int)

Review comment:
cc @wangyum FYI
[GitHub] [spark] HyukjinKwon commented on pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code
HyukjinKwon commented on pull request #31776: URL: https://github.com/apache/spark/pull/31776#issuecomment-833194844 I guess it's fine.
[GitHub] [spark] linhongliu-db commented on pull request #32441: [SPARK-35318][SQL] Hide internal view properties for describe table cmd
linhongliu-db commented on pull request #32441: URL: https://github.com/apache/spark/pull/32441#issuecomment-833194000 @HyukjinKwon @maropu, thanks for reviewing. Also cc @cloud-fan.
[GitHub] [spark] maropu commented on pull request #32441: [SPARK-35318][SQL] Hide internal view properties for describe table cmd
maropu commented on pull request #32441: URL: https://github.com/apache/spark/pull/32441#issuecomment-833193408

I've checked the failure and I think it is not related to this PR.

NOTE: I noticed that the failure happened because `tpcds/q6.sql` sorts by the `cnt` column only: https://github.com/apache/spark/blob/a0c76a8755a148e2bd774edcda12fe20f2f38c75/sql/core/src/test/resources/tpcds/q6.sql#L20

Actually, `tpcds/q6.sql` and `tpcds-v2.7.0/q6.sql` are almost the same, and `tpcds-v2.7.0/q6.sql` sorts by both `cnt` and `a.ca_state`: https://github.com/apache/spark/blob/a0c76a8755a148e2bd774edcda12fe20f2f38c75/sql/core/src/test/resources/tpcds-v2.7.0/q6.sql#L22

So I think it would be better to remove `tpcds/q6.sql` for stable testing. I'll check whether the other queries have the same issue and then make a PR to fix it later. Anyway, the fix itself in this PR looks fine, too.
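A small Scala sketch may help show why sorting on `cnt` alone can make a golden-file test unstable, presumably because rows with equal `cnt` can come back in any order. The rows below are made up for illustration and are not the actual q6 output; `Row`, `runA`, `runB`, `byCnt`, and `byCntAndState` are hypothetical names.

```scala
object SortTieSketch {
  final case class Row(state: String, cnt: Int)

  // The same rows arriving in two different orders, e.g. from two test runs.
  val runA: Seq[Row] = Seq(Row("CA", 10), Row("NY", 10), Row("TX", 12))
  val runB: Seq[Row] = Seq(Row("NY", 10), Row("CA", 10), Row("TX", 12))

  // Sorting on cnt alone: tied rows keep whatever order they arrived in.
  def byCnt(rows: Seq[Row]): Seq[Row] = rows.sortBy(_.cnt)

  // Adding state as a tiebreaker gives a total order for these rows.
  def byCntAndState(rows: Seq[Row]): Seq[Row] = rows.sortBy(r => (r.cnt, r.state))
}
```

Here `byCnt(runA)` and `byCnt(runB)` differ in the order of the two rows with `cnt = 10`, while `byCntAndState` returns the same sequence for both inputs. That matches the observation in the comment that the v2.7.0 variant, which sorts by both columns, is the more stable one to test against.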
[GitHub] [spark] dongjoon-hyun opened a new pull request #32449: [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case
dongjoon-hyun opened a new pull request #32449: URL: https://github.com/apache/spark/pull/32449 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
dongjoon-hyun commented on a change in pull request #32446:
URL: https://github.com/apache/spark/pull/32446#discussion_r627045025

## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ##

@@ -273,7 +273,9 @@ private[hive] class HiveClientImpl(
       if (clientLoader.cachedHive != null) {
         clientLoader.cachedHive.asInstanceOf[Hive]
       } else {
-        val c = Hive.get(conf)
+        // don't register all Hive permanent functions in Hive's FunctionRegistry since Spark loads
+        // them through direct HMS API calls
+        val c = Hive.getWithFastCheck(conf, false)

Review comment:
Yes, the question was whether the previous `get_all_functions` failure blocked the rest of Spark's operations. If it blocks all further Spark operations, the issue should have a `BUG` type.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32447: [SPARK-34701][SQL][FOLLOW-UP] Children/innerChildren should be mutually exclusive for AlterViewAsCommand/CreateViewCommand
AmplabJenkins removed a comment on pull request #32447: URL: https://github.com/apache/spark/pull/32447#issuecomment-833187648 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42706/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32439: [SPARK-35298][SQL] Migrate to transformWithPruning for rules in Optimizer.scala
AmplabJenkins removed a comment on pull request #32439: URL: https://github.com/apache/spark/pull/32439#issuecomment-833187652 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42707/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match
AmplabJenkins removed a comment on pull request #32413: URL: https://github.com/apache/spark/pull/32413#issuecomment-833187650 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138181/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
AmplabJenkins removed a comment on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833187647 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138184/
[GitHub] [spark] AmplabJenkins commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
AmplabJenkins commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833187647 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138184/
[GitHub] [spark] AmplabJenkins commented on pull request #32447: [SPARK-34701][SQL][FOLLOW-UP] Children/innerChildren should be mutually exclusive for AlterViewAsCommand/CreateViewCommand
AmplabJenkins commented on pull request #32447: URL: https://github.com/apache/spark/pull/32447#issuecomment-833187648 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42706/
[GitHub] [spark] AmplabJenkins commented on pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match
AmplabJenkins commented on pull request #32413: URL: https://github.com/apache/spark/pull/32413#issuecomment-833187650 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138181/
[GitHub] [spark] AmplabJenkins commented on pull request #32439: [SPARK-35298][SQL] Migrate to transformWithPruning for rules in Optimizer.scala
AmplabJenkins commented on pull request #32439: URL: https://github.com/apache/spark/pull/32439#issuecomment-833187652 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42707/
[GitHub] [spark] zhengruifeng commented on pull request #32350: [SPARK-35231][SQL] logical.Range override maxRowsPerPartition
zhengruifeng commented on pull request #32350: URL: https://github.com/apache/spark/pull/32350#issuecomment-833187219 @wangyum @maropu Thanks for reviewing! I think I made this PR too complex, and I will follow @wangyum's comment and use a simpler test suite.
[GitHub] [spark] HyukjinKwon commented on pull request #32448: [SPARK-35290][SQL] Use StructType merging for unionByName with null filling
HyukjinKwon commented on pull request #32448: URL: https://github.com/apache/spark/pull/32448#issuecomment-833186519 cc @viirya FYI
[GitHub] [spark] zhengruifeng commented on a change in pull request #32350: [SPARK-35231][SQL] logical.Range override maxRowsPerPartition
zhengruifeng commented on a change in pull request #32350:
URL: https://github.com/apache/spark/pull/32350#discussion_r627042329

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ##

@@ -1618,12 +1618,18 @@ object EliminateLimits extends Rule[LogicalPlan] {
   private def canEliminate(limitExpr: Expression, child: LogicalPlan): Boolean = {
     limitExpr.foldable && child.maxRows.exists { _ <= limitExpr.eval().asInstanceOf[Int] }
   }
+  private def canEliminateLocalLimit(localLimitExpr: Expression, child: LogicalPlan): Boolean = {
+    localLimitExpr.foldable &&
+      child.maxRowsPerPartition.exists { _ <= localLimitExpr.eval().asInstanceOf[Int] }
+  }
   def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
     case Limit(l, child) if canEliminate(l, child) => child
     case GlobalLimit(l, child) if canEliminate(l, child) => child
+    case LocalLimit(l, child) if !plan.isStreaming && canEliminateLocalLimit(l, child) =>

Review comment:
> It is not possible that a user's query reaches this optimization path now?

An end user's query should not reach this path, I think. This path is only for adding a _similar_ test in `CombiningLimitsSuite`.

> In a streaming case, maxRowsPerPartition can be filled? (we need the condition !plan.isStreaming here?)

`org.apache.spark.sql.streaming.StreamSuite.SPARK-30657: streaming limit optimization from StreamingLocalLimitExec to LocalLimitExec` fails if this condition is not added.
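The idea behind `canEliminateLocalLimit` in the diff above can be sketched on a toy plan model. This is not Catalyst code: `Plan`, `Source`, `LocalLimit`, and `eliminate` are hypothetical stand-ins, and the sketch leaves out the `foldable` check and the `!plan.isStreaming` guard discussed in the comment. It only shows the core rule that a per-partition limit is redundant when the child's known per-partition row bound already fits under it.

```scala
object LocalLimitSketch {
  sealed trait Plan {
    // Upper bound on rows per partition, when one is known.
    def maxRowsPerPartition: Option[Long]
  }

  // Hypothetical leaf that may know its per-partition bound.
  final case class Source(maxRowsPerPartition: Option[Long]) extends Plan

  // Keeps at most `limit` rows in each partition.
  final case class LocalLimit(limit: Int, child: Plan) extends Plan {
    def maxRowsPerPartition: Option[Long] =
      Some(child.maxRowsPerPartition.fold(limit.toLong)(m => math.min(m, limit.toLong)))
  }

  // The limit is redundant when the child can never exceed it.
  def canEliminateLocalLimit(limit: Int, child: Plan): Boolean =
    child.maxRowsPerPartition.exists(_ <= limit)

  // Remove redundant local limits, top down.
  def eliminate(plan: Plan): Plan = plan match {
    case LocalLimit(l, child) if canEliminateLocalLimit(l, child) => eliminate(child)
    case LocalLimit(l, child) => LocalLimit(l, eliminate(child))
    case other => other
  }
}
```

In this toy model a limit of 10 over a source bounded at 5 rows per partition is dropped, while a limit of 3 over the same source, or any limit over a source with no known bound, is kept.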
[GitHub] [spark] dongjoon-hyun commented on pull request #32443: [SPARK-35319][K8S][BUILD] Upgrade K8s client to 5.3.1
dongjoon-hyun commented on pull request #32443: URL: https://github.com/apache/spark/pull/32443#issuecomment-833185996 Thank you, @HyukjinKwon !
[GitHub] [spark] dongjoon-hyun closed pull request #32443: [SPARK-35319][K8S][BUILD] Upgrade K8s client to 5.3.1
dongjoon-hyun closed pull request #32443: URL: https://github.com/apache/spark/pull/32443
[GitHub] [spark] dongjoon-hyun commented on pull request #32443: [SPARK-35319][K8S][BUILD] Upgrade K8s client to 5.3.1
dongjoon-hyun commented on pull request #32443: URL: https://github.com/apache/spark/pull/32443#issuecomment-833185672 Thank you so much, @viirya. GitHub Actions and Jenkins passed, and the manual K8s IT result is attached to the PR description. Merged to master for Apache Spark 3.2.0.
[GitHub] [spark] HyukjinKwon commented on pull request #32441: [SPARK-35318][SQL] Hide internal view properties for describe table cmd
HyukjinKwon commented on pull request #32441: URL: https://github.com/apache/spark/pull/32441#issuecomment-833185051 Looks good but seems like there's one test failure to fix (https://github.com/linhongliu-db/spark/runs/2507928605?check_suite_focus=true)
[GitHub] [spark] SparkQA commented on pull request #32447: [SPARK-34701][SQL][FOLLOW-UP] Children/innerChildren should be mutually exclusive for AlterViewAsCommand/CreateViewCommand
SparkQA commented on pull request #32447: URL: https://github.com/apache/spark/pull/32447#issuecomment-833183792
[GitHub] [spark] SparkQA commented on pull request #32439: [SPARK-35298][SQL] Migrate to transformWithPruning for rules in Optimizer.scala
SparkQA commented on pull request #32439: URL: https://github.com/apache/spark/pull/32439#issuecomment-833182162 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42707/
[GitHub] [spark] zhengruifeng edited a comment on pull request #32415: [SPARK-35295][ML] Replace fully com.github.fommil.netlib by dev.ludovic.netlib:2.0
zhengruifeng edited a comment on pull request #32415: URL: https://github.com/apache/spark/pull/32415#issuecomment-833180239 @srowen @luhenry Thanks for pinging me! I believe that the GMM Python test case is quite unstable. I found that even [the way to compute the sum of weights](https://github.com/apache/spark/pull/26735#issuecomment-568451629) can cause different convergence curves. I think this GMM case has taken too much effort to verify, and we should use a more stable case instead.
[GitHub] [spark] cloud-fan commented on a change in pull request #32430: [SPARK-35133][SQL] Explain codegen works with AQE
cloud-fan commented on a change in pull request #32430: URL: https://github.com/apache/spark/pull/32430#discussion_r627037126

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala

@@ -105,13 +106,20 @@ package object debug {
    * @param plan the query plan for codegen
    * @return Sequence of WholeStageCodegen subtrees and corresponding codegen
    */
-  def codegenStringSeq(plan: SparkPlan): Seq[(String, String, ByteCodeStats)] = {
+  def codegenStringSeq(
+      plan: SparkPlan,
+      sparkSession: SparkSession): Seq[(String, String, ByteCodeStats)] = {
     val codegenSubtrees = new collection.mutable.HashSet[WholeStageCodegenExec]()
     def findSubtrees(plan: SparkPlan): Unit = {
       plan foreach {
         case s: WholeStageCodegenExec =>
           codegenSubtrees += s
+        case p: AdaptiveSparkPlanExec =>
+          // Find subtrees from original input plan of AQE.

Review comment:
This doesn't match the actual plan. Why can't we use `AdaptiveSparkPlanExec.executedPlan`?
[GitHub] [spark] cloud-fan commented on a change in pull request #32430: [SPARK-35133][SQL] Explain codegen works with AQE
cloud-fan commented on a change in pull request #32430: URL: https://github.com/apache/spark/pull/32430#discussion_r627036985

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala

@@ -105,13 +106,20 @@ package object debug {
    * @param plan the query plan for codegen
    * @return Sequence of WholeStageCodegen subtrees and corresponding codegen
    */
-  def codegenStringSeq(plan: SparkPlan): Seq[(String, String, ByteCodeStats)] = {
+  def codegenStringSeq(
+      plan: SparkPlan,
+      sparkSession: SparkSession): Seq[(String, String, ByteCodeStats)] = {
     val codegenSubtrees = new collection.mutable.HashSet[WholeStageCodegenExec]()
     def findSubtrees(plan: SparkPlan): Unit = {
       plan foreach {
         case s: WholeStageCodegenExec =>
           codegenSubtrees += s
+        case p: AdaptiveSparkPlanExec =>
+          // Find subtrees from original input plan of AQE.

Review comment:
This doesn't match the actual plan. Why can't we use `AdaptiveSparkPlanExec.executedPlan`?
[GitHub] [spark] zhengruifeng commented on pull request #32415: [SPARK-35295][ML] Replace fully com.github.fommil.netlib by dev.ludovic.netlib:2.0
zhengruifeng commented on pull request #32415: URL: https://github.com/apache/spark/pull/32415#issuecomment-833180239 @srowen @luhenry Thanks for pinging me! I believe that the GMM Python test case is quite unstable. I found that the [way to compute the sum of weights](https://github.com/apache/spark/pull/26735#issuecomment-568451629) can even cause different convergence curves. I think this GMM case has taken too much effort to verify, and we should use a more stable case instead.
[GitHub] [spark] cloud-fan commented on a change in pull request #32430: [SPARK-35133][SQL] Explain codegen works with AQE
cloud-fan commented on a change in pull request #32430: URL: https://github.com/apache/spark/pull/32430#discussion_r627036460

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala

@@ -73,14 +74,14 @@ package object debug {
    * @param plan the query plan for codegen
    * @return single String containing all WholeStageCodegen subtrees and corresponding codegen
    */
-  def codegenString(plan: SparkPlan): String = {
+  def codegenString(plan: SparkPlan, sparkSession: SparkSession): String = {

Review comment:
shall we simply strip these AQE wrapper nodes at the beginning using `AdaptiveSparkPlanHelper.stripAQEPlan`?
[GitHub] [spark] SparkQA removed a comment on pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match
SparkQA removed a comment on pull request #32413: URL: https://github.com/apache/spark/pull/32413#issuecomment-833064296 **[Test build #138181 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138181/testReport)** for PR 32413 at commit [`d1aa94d`](https://github.com/apache/spark/commit/d1aa94deb74b9aa98bf2c38030f1a8c7c0e11dbb).
[GitHub] [spark] SparkQA commented on pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match
SparkQA commented on pull request #32413: URL: https://github.com/apache/spark/pull/32413#issuecomment-833179955 **[Test build #138181 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138181/testReport)** for PR 32413 at commit [`d1aa94d`](https://github.com/apache/spark/commit/d1aa94deb74b9aa98bf2c38030f1a8c7c0e11dbb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on a change in pull request #32407: [SPARK-35261][SQL] Support static magic method for stateless ScalarFunction
cloud-fan commented on a change in pull request #32407: URL: https://github.com/apache/spark/pull/32407#discussion_r627035596

## File path: sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt

@@ -0,0 +1,60 @@
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+Java scalar function (long + long) -> long result_nullable = true codegen = true:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------
+with long_add_default                  52937          54250        1181         9.4         105.9       1.0X
+with long_add_magic                    20527          20975         683        24.4          41.1       2.6X
+with long_add_static_magic             20513          20944         447        24.4          41.0       2.6X
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+Java scalar function (long + long) -> long result_nullable = false codegen = true:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------
+with long_add_default                  54502          54710         208         9.2         109.0       1.0X
+with long_add_magic                    21790          22062         414        22.9          43.6       2.5X
+with long_add_static_magic             19014          19715         626        26.3          38.0       2.9X
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+Java scalar function (long + long) -> long result_nullable = true codegen = false:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------
+with long_add_default                  64039          64306         293         7.8         128.1       1.0X
+with long_add_magic                   199121         199232         144         2.5         398.2       0.3X
+with long_add_static_magic            197914         199933        1892         2.5         395.8       0.3X
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+Java scalar function (long + long) -> long result_nullable = false codegen = false:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------
+with long_add_default                  60515          60887         453         8.3         121.0       1.0X
+with long_add_magic                   201052         202036         957         2.5         402.1       0.3X
+with long_add_static_magic            202037         202639         584         2.5         404.1       0.3X
+
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+scalar function (long + long) -> long result_nullable = true codegen = true:
+                               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------
+with long_add_default                  62105          62467         341         8.1         124.2       1.0X
+with long_add_magic                    20721          22705        1729        24.1          41.4       3.0X
+

Review comment:
do we have result for `long_add_static_magic` when codegen is on?
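A note on reading the Relative column in the results quoted above: the figures are consistent with dividing the best time of the baseline row (`long_add_default`) by the best time of each case. The Java sketch below checks that arithmetic against a few rows. The class and method names are illustrative and not part of Spark, and the formula is an assumption inferred from the quoted numbers, not taken from the benchmark code.

```java
import java.util.Locale;

public class RelativeColumnSketch {
    // Assumed formula: baseline best time divided by this case's best time.
    public static double relative(double baselineBestMs, double caseBestMs) {
        return baselineBestMs / caseBestMs;
    }

    // Formats a ratio with one decimal and an "X" suffix, as the table does.
    public static String format(double ratio) {
        return String.format(Locale.ROOT, "%.1fX", ratio);
    }

    public static void main(String[] args) {
        // Best Time(ms) values copied from the quoted results.
        // result_nullable = true, codegen = true: default vs. magic
        System.out.println(format(relative(52937, 20527)));
        // result_nullable = false, codegen = true: default vs. static magic
        System.out.println(format(relative(54502, 19014)));
        // result_nullable = true, codegen = false: default vs. magic
        System.out.println(format(relative(64039, 199121)));
    }
}
```

Under that assumption the three calls reproduce the 2.6X, 2.9X and 0.3X entries in the corresponding rows, which is why the codegen-off rows read as a slowdown and not a speedup.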
[GitHub] [spark] maropu commented on pull request #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
maropu commented on pull request #28097: URL: https://github.com/apache/spark/pull/28097#issuecomment-833178667

> I found a regression here. Before this PR, the UI shows both the logical and physical plan, now it only has the physical plan. Shall we update the formatted EXPLAIN to show both the logical and physical plan?

okay, I'll open a PR to fix it.
[GitHub] [spark] wForget commented on pull request #32395: [SPARK-35270][SQL][CORE] Remove the use of guava in order to upgrade guava version to 27
wForget commented on pull request #32395: URL: https://github.com/apache/spark/pull/32395#issuecomment-833178638 @HyukjinKwon I have enabled it. How do I rerun these checks?
[GitHub] [spark] cloud-fan commented on pull request #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
cloud-fan commented on pull request #28097: URL: https://github.com/apache/spark/pull/28097#issuecomment-833177979 I think the same issue happens in `df.explain(...)`. If I want to see the formatted result, I can't see the logical plan.
[GitHub] [spark] jerqi commented on pull request #32426: [SPARK-35297][CORE][DOC][MINOR] Modify the comment about the executor
jerqi commented on pull request #32426: URL: https://github.com/apache/spark/pull/32426#issuecomment-833177929

> Have you checked that the other places don't have a similar issue

I have checked that the other places don't have a similar issue
[GitHub] [spark] SparkQA removed a comment on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
SparkQA removed a comment on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833148555 **[Test build #138184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138184/testReport)** for PR 32446 at commit [`aa6dde8`](https://github.com/apache/spark/commit/aa6dde8919356fb06f702f50334ff3ac2d3efcfa).
[GitHub] [spark] cloud-fan commented on pull request #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
cloud-fan commented on pull request #28097: URL: https://github.com/apache/spark/pull/28097#issuecomment-833177424 I found a regression here. Before this PR, the UI shows both the logical and physical plan, now it only has the physical plan. Shall we update the formatted EXPLAIN to show both the logical and physical plan?
[GitHub] [spark] SparkQA commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
SparkQA commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833177153 **[Test build #138184 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138184/testReport)** for PR 32446 at commit [`aa6dde8`](https://github.com/apache/spark/commit/aa6dde8919356fb06f702f50334ff3ac2d3efcfa).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match
cloud-fan commented on pull request #32413: URL: https://github.com/apache/spark/pull/32413#issuecomment-833174648 shall we consider https://github.com/apache/spark/pull/32404#issuecomment-833173900 ?
[GitHub] [spark] cloud-fan commented on pull request #32404: [SPARK-35278][SQL] Invoke should find the method with correct number of parameters
cloud-fan commented on pull request #32404: URL: https://github.com/apache/spark/pull/32404#issuecomment-833173900

> For example, calling a method func(input: Object) with argument Tuple2 will be disallowed.

This should be allowed, because we do provide the concrete parameter type (`Tuple2`). What should be forbidden is something like `func(s: String)` and the parameter type is `Object`.

I think I get your point: we should probably use a util function to look up the method, instead of writing the code ourselves, to handle cases like searching the inheritance hierarchy, argument type casting, etc. How about https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/reflect/MethodUtils.html#getMatchingMethod-java.lang.Class-java.lang.String-java.lang.Class...-
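To illustrate the distinction drawn in the comment above, here is a minimal Java sketch using only `java.lang.reflect`. It shows that an exact-signature lookup (`Class.getMethod`) rejects a call to `func(Object)` when the argument type is more specific, while an assignability-based lookup accepts it and still rejects passing an `Object` where a `String` is declared. `MethodLookupSketch`, `Target`, `exactLookupSucceeds` and `findCompatibleMethod` are hypothetical names, `AbstractMap.SimpleEntry` stands in for `Tuple2`, and the matching rule is a simplification for illustration, not what Spark or commons-lang `MethodUtils` actually implements.

```java
import java.lang.reflect.Method;
import java.util.AbstractMap;

public class MethodLookupSketch {
    // Hypothetical target class with one permissive and one strict method.
    public static class Target {
        public String func(Object input) { return "func(" + input + ")"; }
        public int len(String s) { return s.length(); }
    }

    // Exact lookup: succeeds only if the declared parameter types equal argTypes.
    public static boolean exactLookupSucceeds(Class<?> cls, String name, Class<?>... argTypes) {
        try {
            cls.getMethod(name, argTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Hypothetical helper: returns the first public method with this name whose
    // declared parameter types can accept the given argument types, or null.
    // It does not handle primitives/boxing or pick the most specific overload.
    public static Method findCompatibleMethod(Class<?> cls, String name, Class<?>... argTypes) {
        for (Method m : cls.getMethods()) {
            if (!m.getName().equals(name)) continue;
            Class<?>[] params = m.getParameterTypes();
            if (params.length != argTypes.length) continue;
            boolean compatible = true;
            for (int i = 0; i < params.length; i++) {
                if (!params[i].isAssignableFrom(argTypes[i])) {
                    compatible = false;
                    break;
                }
            }
            if (compatible) return m;
        }
        return null;
    }

    public static void main(String[] args) {
        Class<?> tupleLike = AbstractMap.SimpleEntry.class; // stand-in for Tuple2
        // func(Object) called with a concrete argument type
        System.out.println(exactLookupSucceeds(Target.class, "func", tupleLike));
        System.out.println(findCompatibleMethod(Target.class, "func", tupleLike) != null);
        // len(String) called with an argument known only as Object
        System.out.println(findCompatibleMethod(Target.class, "len", Object.class) != null);
    }
}
```

The gaps listed in the helper's comment (boxing, overload specificity, inherited non-public methods) are the kind of cases that make a library utility attractive over hand-written lookup code.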
[GitHub] [spark] beliefer commented on pull request #32367: [SPARK-35020][SQL] Group exception messages in catalyst/util
beliefer commented on pull request #32367: URL: https://github.com/apache/spark/pull/32367#issuecomment-833172759 cc @cloud-fan
[GitHub] [spark] allisonwang-db commented on a change in pull request #32377: [SPARK-35021][SQL] Group exception messages in connector/catalog
allisonwang-db commented on a change in pull request #32377: URL: https://github.com/apache/spark/pull/32377#discussion_r627029240

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala

@@ -1351,4 +1352,39 @@ private[spark] object QueryCompilationErrors {
     new AnalysisException(
       s"Ambiguous field name: $fieldName. Found multiple columns that can match: $names")
   }
+
+  def cannotConvertBucketWithSortColumnsToTransformError(spec: BucketSpec): Throwable = {
+    new AnalysisException(
+      s"Cannot convert bucketing with sort columns to a transform: $spec")
+  }
+
+  def cannotConvertTransformsToPartitionColumnsError(nonIdTransforms: Seq[Transform]): Throwable = {
+    new AnalysisException("Transforms cannot be converted to partition columns: " +
+      nonIdTransforms.map(_.describe).mkString(", "))
+  }
+
+  def cannotPartitionByNestedColumnError(reference: NamedReference): Throwable = {
+    new AnalysisException(s"Cannot partition by nested column: $reference")
+  }
+
+  def cannotUseCatalogError(plugin: CatalogPlugin, msg: String): Throwable = {
+    new AnalysisException(s"Cannot use catalog ${plugin.name}: $msg")
+  }
+
+  def invalidIdentifierAsItHasMoreThanTwoNamePartsError(

Review comment:
How about `identifierHavingMoreThanTwoNamePartsError`

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala

@@ -1351,4 +1352,39 @@ private[spark] object QueryCompilationErrors {
     new AnalysisException(
       s"Ambiguous field name: $fieldName. Found multiple columns that can match: $names")
   }
+
+  def cannotConvertBucketWithSortColumnsToTransformError(spec: BucketSpec): Throwable = {
+    new AnalysisException(
+      s"Cannot convert bucketing with sort columns to a transform: $spec")
+  }
+
+  def cannotConvertTransformsToPartitionColumnsError(nonIdTransforms: Seq[Transform]): Throwable = {
+    new AnalysisException("Transforms cannot be converted to partition columns: " +
+      nonIdTransforms.map(_.describe).mkString(", "))
+  }
+
+  def cannotPartitionByNestedColumnError(reference: NamedReference): Throwable = {
+    new AnalysisException(s"Cannot partition by nested column: $reference")
+  }
+
+  def cannotUseCatalogError(plugin: CatalogPlugin, msg: String): Throwable = {
+    new AnalysisException(s"Cannot use catalog ${plugin.name}: $msg")
+  }
+
+  def invalidIdentifierAsItHasMoreThanTwoNamePartsError(
+      quoted: String, identifier: String): Throwable = {
+    new AnalysisException(s"$quoted is not a valid $identifier as it has more than 2 name parts.")
+  }
+
+  def emptyMultipartIdentifierError(): Throwable = {
+    new AnalysisException("multi-part identifier cannot be empty.")
+  }
+
+  def cannotCreateTablesWithNullTypeError(): Throwable = {
+    new AnalysisException(s"Cannot create tables with ${NullType.simpleString} type.")
+  }
+
+  def functionUnsupportedInV1CatalogError(): Throwable = {

Review comment:
functionUnsupportedInV2CatalogError
[GitHub] [spark] maropu commented on pull request #31899: [SPARK-34525][SQL][DOCS] Update documentation for various DDLs to reflect alternative key value notation
maropu commented on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-833172402 @nikriek Could you resolve the conflict? Also, could you fix the same issue in https://github.com/apache/spark/blame/master/docs/sql-ref-syntax-hive-format.md#L33, too?
[GitHub] [spark] SparkQA commented on pull request #32447: [SPARK-34701][SQL][FOLLOW-UP] Children/innerChildren should be mutually exclusive for AlterViewAsCommand/CreateViewCommand
SparkQA commented on pull request #32447: URL: https://github.com/apache/spark/pull/32447#issuecomment-833168412 **[Test build #138185 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138185/testReport)** for PR 32447 at commit [`2e8253d`](https://github.com/apache/spark/commit/2e8253d599986ad437ef5e448e7f61e316889567).
[GitHub] [spark] SparkQA commented on pull request #32439: [SPARK-35298][SQL] Migrate to transformWithPruning for rules in Optimizer.scala
SparkQA commented on pull request #32439: URL: https://github.com/apache/spark/pull/32439#issuecomment-833168435 **[Test build #138186 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138186/testReport)** for PR 32439 at commit [`f376055`](https://github.com/apache/spark/commit/f376055d7309000c2e8253b5fc765665d6d14d8a).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
AmplabJenkins removed a comment on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833167655 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42705/
[GitHub] [spark] viirya commented on pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match
viirya commented on pull request #32413: URL: https://github.com/apache/spark/pull/32413#issuecomment-833167726 Thanks @dongjoon-hyun @srowen @maropu. Let me wait a few more days. I will merge before the end of this week.
[GitHub] [spark] AmplabJenkins commented on pull request #32448: [SPARK-35290][SQL] Use StructType merging for unionByName with null filling
AmplabJenkins commented on pull request #32448: URL: https://github.com/apache/spark/pull/32448#issuecomment-833167741 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
AmplabJenkins commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833167655 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42705/
[GitHub] [spark] dongjoon-hyun commented on pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match
dongjoon-hyun commented on pull request #32413: URL: https://github.com/apache/spark/pull/32413#issuecomment-833162546 Also, it's okay to wait for @cloud-fan 's comment if this is not that urgent.
[GitHub] [spark] SparkQA commented on pull request #32446: [SPARK-35321][SQL] don't register Hive permanent functions when creating Hive client
SparkQA commented on pull request #32446: URL: https://github.com/apache/spark/pull/32446#issuecomment-833162400
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match
dongjoon-hyun edited a comment on pull request #32413: URL: https://github.com/apache/spark/pull/32413#issuecomment-833162052 I think you can merge this, @viirya. All comments seem to be addressed (including this one: https://github.com/apache/spark/pull/32413#discussion_r627018026)