[GitHub] [spark] xCASx commented on a change in pull request #27230: [SPARK-27868][CORE][FOLLOWUP] Recover the default value to -1 again
xCASx commented on a change in pull request #27230: [SPARK-27868][CORE][FOLLOWUP] Recover the default value to -1 again
URL: https://github.com/apache/spark/pull/27230#discussion_r367807661

## File path: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java ##
@@ -108,8 +108,12 @@ public int numConnectionsPerPeer() {
     return conf.getInt(SPARK_NETWORK_IO_NUMCONNECTIONSPERPEER_KEY, 1);
   }

-  /** Requested maximum length of the queue of incoming connections. Default is 64. */
-  public int backLog() { return conf.getInt(SPARK_NETWORK_IO_BACKLOG_KEY, 64); }
+  /**
+   * Requested maximum length of the queue of incoming connections. If < 1,

Review comment:
[Here is how it is implemented](https://github.com/apache/spark/blob/09ed64d795d3199a94e175273fff6fcea6b52131/common/network-common/src/main/java/org/apache/spark/network/server/TransportServer.java#L117):

```java
if (conf.backLog() > 0) {
  bootstrap.option(ChannelOption.SO_BACKLOG, conf.backLog());
}
```

I've been thinking about the wording. If we use `0 or negative` instead of `< 1`, it may seem that the separately mentioned `0` is a special case that differs from `negative` values. It's not a big deal for me; if you'd like, I can change it to `0 or negative`.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
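To make the `< 1` semantics discussed in this review comment concrete, here is a minimal, dependency-free sketch. It only mirrors the `conf.backLog() > 0` guard quoted above; `BacklogSketch`, `backLog(Map)` and `explicitBacklog` are hypothetical names introduced for illustration, and a plain `Map` stands in for the real configuration provider. The key `spark.shuffle.io.backLog` and the default of -1 are taken from the docs change in this pull request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

// Illustrative sketch only; the class and method names are hypothetical.
final class BacklogSketch {
    // Stand-in for the config lookup: reads the key, falling back to -1.
    static int backLog(Map<String, String> conf) {
        return Integer.parseInt(conf.getOrDefault("spark.shuffle.io.backLog", "-1"));
    }

    // Mirrors the `conf.backLog() > 0` guard: any value below 1 (zero or
    // negative alike) means "do not set the backlog option explicitly".
    static OptionalInt explicitBacklog(int configured) {
        return configured > 0 ? OptionalInt.of(configured) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(explicitBacklog(backLog(conf))); // key not set
        conf.put("spark.shuffle.io.backLog", "0");
        System.out.println(explicitBacklog(backLog(conf))); // zero
        conf.put("spark.shuffle.io.backLog", "256");
        System.out.println(explicitBacklog(backLog(conf))); // positive
    }
}
```

In this sketch zero and negative values take the same branch, which is the point the comment makes about preferring `< 1` over `0 or negative`.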
[GitHub] [spark] xCASx commented on a change in pull request #27230: [SPARK-27868][CORE][FOLLOWUP] Recover the default value to -1 again
xCASx commented on a change in pull request #27230: [SPARK-27868][CORE][FOLLOWUP] Recover the default value to -1 again
URL: https://github.com/apache/spark/pull/27230#discussion_r367807728

## File path: docs/configuration.md ##
@@ -844,13 +844,14 @@ Apart from these, the following properties are also available, and may be useful
 spark.shuffle.io.backLog
- 64
+ -1
 Length of the accept queue for the shuffle service. For large applications, this value may need to be increased, so that incoming connections are not dropped if the service cannot keep up with a large number of connections arriving in a short period of time. This needs to be configured wherever the shuffle service itself is running, which may be outside of the
-application (see spark.shuffle.service.enabled option below).
+application (see spark.shuffle.service.enabled option below). If set below 1,

Review comment:
yes
[GitHub] [spark] AmplabJenkins removed a comment on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method
AmplabJenkins removed a comment on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method URL: https://github.com/apache/spark/pull/27254#issuecomment-575517283 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575517341 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21684/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575517330 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575517330 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575517341 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21684/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method
AmplabJenkins removed a comment on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method URL: https://github.com/apache/spark/pull/27254#issuecomment-575517293 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21683/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method
AmplabJenkins commented on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method URL: https://github.com/apache/spark/pull/27254#issuecomment-575517293 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21683/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method
AmplabJenkins commented on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method URL: https://github.com/apache/spark/pull/27254#issuecomment-575517283 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method
SparkQA commented on issue #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method URL: https://github.com/apache/spark/pull/27254#issuecomment-575516860 **[Test build #116913 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116913/testReport)** for PR 27254 at commit [`abea260`](https://github.com/apache/spark/commit/abea260fc25123142b9f3ec951ebe8c146010fc5).
[GitHub] [spark] SparkQA commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
SparkQA commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575516868 **[Test build #116914 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116914/testReport)** for PR 27249 at commit [`4577e93`](https://github.com/apache/spark/commit/4577e933f4c55b9fa8a9e6c7f04696457369a49d).
[GitHub] [spark] SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575516885 **[Test build #116915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116915/testReport)** for PR 27058 at commit [`5462c0c`](https://github.com/apache/spark/commit/5462c0ce605cf95e608c945e4ca8bff358740f50).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
AmplabJenkins removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575515533 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116898/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
AmplabJenkins removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575515524 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
AmplabJenkins commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575515533 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116898/ Test FAILed.
[GitHub] [spark] AngersZhuuuu commented on issue #27106: [SPARK-30435][DOC] Update doc of Supported Hive Features
AngersZh commented on issue #27106: [SPARK-30435][DOC] Update doc of Supported Hive Features
URL: https://github.com/apache/spark/pull/27106#issuecomment-575515771

> Gentle ping, @AngersZh .

Updated. How does the current version look?
[GitHub] [spark] zhengruifeng opened a new pull request #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method
zhengruifeng opened a new pull request #27254: [SPARK-30543][ML][PYSPARK] RandomForest add Param bootstrap to control sampling method
URL: https://github.com/apache/spark/pull/27254

### What changes were proposed in this pull request?
Add a param `bootstrap` to control whether bootstrap samples are used.

### Why are the changes needed?
Currently, RF with numTrees=1 builds a tree directly on the original dataset, while with numTrees>1 it uses bootstrap samples to build the trees. This design exists so that a DecisionTreeModel can be trained through the RandomForest implementation; however, it is somewhat strange.
In Scikit-Learn, there is a param [bootstrap](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier) to control whether bootstrap samples are used.

### Does this PR introduce any user-facing change?
Yes, a new param is added.

### How was this patch tested?
Existing test suites.
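As a rough illustration of what a `bootstrap` switch controls, the sketch below chooses the training rows for one tree. It is not Spark's implementation: `BootstrapSketch` and `sampleForTree` are hypothetical names, and plain uniform sampling with replacement is assumed here as the simplest form of bootstrapping, which may differ from the sampling scheme Spark actually uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; not the RandomForest code touched by this PR.
final class BootstrapSketch {
    // Returns the rows one tree would be trained on.
    // bootstrap = false: the original dataset, unchanged.
    // bootstrap = true:  data.size() rows drawn uniformly with replacement
    //                    (an assumed, simplified bootstrap scheme).
    static <T> List<T> sampleForTree(List<T> data, boolean bootstrap, long seed) {
        if (!bootstrap) {
            return new ArrayList<>(data);
        }
        Random rng = new Random(seed);
        List<T> sample = new ArrayList<>(data.size());
        for (int i = 0; i < data.size(); i++) {
            sample.add(data.get(rng.nextInt(data.size())));
        }
        return sample;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sampleForTree(rows, false, 42L)); // original rows
        System.out.println(sampleForTree(rows, true, 42L));  // resampled rows
    }
}
```

With an explicit flag, the sampling behaviour no longer depends on whether `numTrees` is 1 or larger, which is the inconsistency the description above points out.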
[GitHub] [spark] AmplabJenkins commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
AmplabJenkins commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575515524 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
SparkQA removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575474393 **[Test build #116898 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116898/testReport)** for PR 27249 at commit [`0467b1e`](https://github.com/apache/spark/commit/0467b1eec3b7877236c3c054649f5453d45d3ab3).
[GitHub] [spark] SparkQA commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
SparkQA commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575515394
**[Test build #116898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116898/testReport)** for PR 27249 at commit [`0467b1e`](https://github.com/apache/spark/commit/0467b1eec3b7877236c3c054649f5453d45d3ab3).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
AmplabJenkins removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575514985 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21682/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
AmplabJenkins removed a comment on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575514976 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
AmplabJenkins commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575514976 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax
AmplabJenkins commented on issue #27249: [SPARK-30019][SQL] Add ALTER TABLE SET OWNER syntax URL: https://github.com/apache/spark/pull/27249#issuecomment-575514985 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21682/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system
AmplabJenkins removed a comment on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system URL: https://github.com/apache/spark/pull/26953#issuecomment-575512692 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark
AmplabJenkins removed a comment on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark URL: https://github.com/apache/spark/pull/27251#issuecomment-575512786 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark
AmplabJenkins removed a comment on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark URL: https://github.com/apache/spark/pull/27251#issuecomment-575512798 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21681/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system
AmplabJenkins removed a comment on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system URL: https://github.com/apache/spark/pull/26953#issuecomment-575512706 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116893/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark
AmplabJenkins commented on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark URL: https://github.com/apache/spark/pull/27251#issuecomment-575512786 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system
AmplabJenkins commented on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system URL: https://github.com/apache/spark/pull/26953#issuecomment-575512692 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system
AmplabJenkins commented on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system URL: https://github.com/apache/spark/pull/26953#issuecomment-575512706 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116893/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark
AmplabJenkins commented on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark URL: https://github.com/apache/spark/pull/27251#issuecomment-575512798 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21681/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark
SparkQA commented on issue #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark URL: https://github.com/apache/spark/pull/27251#issuecomment-575512313 **[Test build #116912 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116912/testReport)** for PR 27251 at commit [`438c8c9`](https://github.com/apache/spark/commit/438c8c9e7513e1ac0af491129fa68f857e83aec0).
[GitHub] [spark] SparkQA commented on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system
SparkQA commented on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system URL: https://github.com/apache/spark/pull/26953#issuecomment-575512053
**[Test build #116893 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116893/testReport)** for PR 26953 at commit [`2d70042`](https://github.com/apache/spark/commit/2d700426f804efeecc36aac10870390bfebd2be8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system
SparkQA removed a comment on issue #26953: [SPARK-30306][CORE][PYTHON] Instrument Python UDF execution time and throughput metrics using Spark Metrics system URL: https://github.com/apache/spark/pull/26953#issuecomment-575464231 **[Test build #116893 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116893/testReport)** for PR 26953 at commit [`2d70042`](https://github.com/apache/spark/commit/2d700426f804efeecc36aac10870390bfebd2be8).
[GitHub] [spark] HeartSaVioR commented on issue #26675: [SPARK-30041][SQL][WEBUI] Add Codegen Stage Id to Stage DAG visualization in Web UI
HeartSaVioR commented on issue #26675: [SPARK-30041][SQL][WEBUI] Add Codegen Stage Id to Stage DAG visualization in Web UI URL: https://github.com/apache/spark/pull/26675#issuecomment-575511884 @cloud-fan > @HeartSaVioR org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite seems still flaky. Shall we create a JIRA ticket to investigate it further? Thanks for pinging. I'll create a new JIRA issue to track this.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575511352 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116894/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575511342 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575511352 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116894/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575511342 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
SparkQA removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575465976 **[Test build #116894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116894/testReport)** for PR 27130 at commit [`0bb628f`](https://github.com/apache/spark/commit/0bb628f93399c2331155508eec60730277e5426c).
[GitHub] [spark] SparkQA commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
SparkQA commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575510864 **[Test build #116894 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116894/testReport)** for PR 27130 at commit [`0bb628f`](https://github.com/apache/spark/commit/0bb628f93399c2331155508eec60730277e5426c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark
dongjoon-hyun commented on a change in pull request #27251: [SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark
URL: https://github.com/apache/spark/pull/27251#discussion_r367799321

## File path: python/pyspark/sql/dataframe.py ##
@@ -605,6 +605,22 @@ def take(self, num):
         """
         return self.limit(num).collect()
+    @ignore_unicode_prefix
+    @since(3.0)
+    def tail(self, num):
+        """
+        Returns the last `num` rows in the Dataset.

Review comment:
Shall we match the format with line 601?
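As context for the docstring under review, here is a minimal plain-Python sketch of how "last `num` rows" differs from the existing `take`. It uses an ordinary list as a stand-in for a DataFrame and is not the PySpark implementation; the helper names `take_rows` and `tail_rows` are hypothetical.

```python
from collections import deque
from itertools import islice


def take_rows(rows, num):
    """Return the first `num` rows (the behaviour `take` documents)."""
    return list(islice(rows, num))


def tail_rows(rows, num):
    """Return the last `num` rows, preserving their original order."""
    if num <= 0:
        return []
    # A bounded deque keeps only the most recent `num` items while iterating,
    # so the whole input never has to be held in memory at once.
    return list(deque(rows, maxlen=num))


rows = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
first_two = take_rows(rows, 2)
last_two = tail_rows(rows, 2)
print(first_two)
print(last_two)
```

In this sketch, asking for more rows than exist simply returns everything; whether the real `DataFrame.tail` behaves the same way is not stated in the review thread.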
[GitHub] [spark] AngersZhuuuu removed a comment on issue #27106: [SPARK-30435][DOC] Update doc of Supported Hive Features
AngersZhuuuu removed a comment on issue #27106: [SPARK-30435][DOC] Update doc of Supported Hive Features
URL: https://github.com/apache/spark/pull/27106#issuecomment-575501029

> Gentle ping, @AngersZhuuuu.

After installing

```
sudo gem install jekyll jekyll-redirect-from rouge
sudo pip install sphinx pypandoc mkdocs
sudo Rscript -e 'install.packages(c("knitr", "devtools", "rmarkdown"), repos="https://cloud.r-project.org/")'
sudo Rscript -e 'devtools::install_version("roxygen2", version = "5.0.1", repos="https://cloud.r-project.org/")'
sudo Rscript -e 'devtools::install_version("testthat", version = "1.0.2", repos="https://cloud.r-project.org/")'
```

and running `jekyll build` under docs, I get the error below:

```
[info] Set current project to spark-parent (in build file:/Users/angerszhu/Documents/project/AngersZhu/spark/)
Could not create file /Users/angerszhu/Documents/project/AngersZhu/spark/common/sketch/target/streams/$global/projectDescriptors/$global/streams/outjava.io.IOException: No such file or directory
    at sbt.ErrorHandling$.translate(ErrorHandling.scala:10)
    at sbt.IO$.touch(IO.scala:210)
    at sbt.std.Streams$$anon$3$$anon$2.make(Streams.scala:129)
    at sbt.std.Streams$$anon$3$$anon$2.text(Streams.scala:113)
    at sbt.std.Streams$$anon$3$$anon$2.log(Streams.scala:124)
    at sbt.std.TaskStreams$class.log(Streams.scala:56)
    at sbt.std.Streams$$anon$3$$anon$2.log$lzycompute(Streams.scala:102)
    at sbt.std.Streams$$anon$3$$anon$2.log(Streams.scala:102)
    at sbt.Classpaths$$anonfun$depMap$1.apply(Defaults.scala:1680)
    at sbt.Classpaths$$anonfun$depMap$1.apply(Defaults.scala:1679)
    at scala.Function4$$anonfun$tupled$1.apply(Function4.scala:35)
    at scala.Function4$$anonfun$tupled$1.apply(Function4.scala:34)
    at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
    at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
    at sbt.std.Transform$$anon$4.work(System.scala:63)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
    at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
    at sbt.Execute.work(Execute.scala:237)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
    at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
    at sbt.CompletionService$$anon$2.call(CompletionService.scala:28)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: No such file or directory
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.createNewFile(File.java:1012)
    at sbt.IO$$anonfun$1.apply$mcZ$sp(IO.scala:210)
    at sbt.IO$$anonfun$1.apply(IO.scala:210)
    at sbt.IO$$anonfun$1.apply(IO.scala:210)
    at sbt.ErrorHandling$.translate(ErrorHandling.scala:10)
    at sbt.IO$.touch(IO.scala:210)
    at sbt.std.Streams$$anon$3$$anon$2.make(Streams.scala:129)
    at sbt.std.Streams$$anon$3$$anon$2.text(Streams.scala:113)
    at sbt.std.Streams$$anon$3$$anon$2.log(Streams.scala:124)
    at sbt.std.TaskStreams$class.log(Streams.scala:56)
    at sbt.std.Streams$$anon$3$$anon$2.log$lzycompute(Streams.scala:102)
    at sbt.std.Streams$$anon$3$$anon$2.log(Streams.scala:102)
    at sbt.Classpaths$$anonfun$depMap$1.apply(Defaults.scala:1680)
    at sbt.Classpaths$$anonfun$depMap$1.apply(Defaults.scala:1679)
    at scala.Function4$$anonfun$tupled$1.apply(Function4.scala:35)
    at scala.Function4$$anonfun$tupled$1.apply(Function4.scala:34)
    at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
    at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
    at sbt.std.Transform$$anon$4.work(System.scala:63)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
    at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
    at sbt.Execute.work(Execute.scala:237)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
    at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
    at
```
[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575508247 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116911/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575508198 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575508203 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21680/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575508244 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
SparkQA removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575507740 **[Test build #116911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116911/testReport)** for PR 27058 at commit [`52953ec`](https://github.com/apache/spark/commit/52953ec6b3dec4dd6d48105df2f1ef889cd9e75d).
[GitHub] [spark] gengliangwang commented on a change in pull request #26675: [SPARK-30041][SQL][WEBUI] Add Codegen Stage Id to Stage DAG visualization in Web UI
gengliangwang commented on a change in pull request #26675: [SPARK-30041][SQL][WEBUI] Add Codegen Stage Id to Stage DAG visualization in Web UI
URL: https://github.com/apache/spark/pull/26675#discussion_r367798842

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ##
@@ -206,7 +206,11 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializ
    * for visualization.
    */
   protected final def executeQuery[T](query: => T): T = {
-    RDDOperationScope.withScope(sparkContext, nodeName, false, true) {
+    val nodeNameScope = this match {

Review comment:
How about overriding the `nodeName` of `WholeStageCodegenExec`? I think it is hacky to add special handling here.
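The design choice raised in this review comment can be illustrated with a small sketch, written in Python for brevity rather than Scala. The classes `PlanSketch` and `WholeStageCodegenSketch`, the `codegen_stage_id` attribute, and the display format are assumptions made for illustration and are not Spark's actual code. The point is that the subclass overrides its own name, so the base class needs no type-specific branch.

```python
class PlanSketch:
    """Hypothetical stand-in for a plan node; not Spark's SparkPlan."""

    @property
    def node_name(self):
        # Default display name: derived from the concrete class name.
        return type(self).__name__

    def execute_query(self, query):
        # The base class never inspects its concrete type; it only asks
        # for node_name and uses it as the scope label.
        scope = self.node_name
        return scope, query()


class WholeStageCodegenSketch(PlanSketch):
    """Hypothetical subclass that carries a codegen stage id."""

    def __init__(self, codegen_stage_id):
        self.codegen_stage_id = codegen_stage_id

    @property
    def node_name(self):
        # The subclass supplies the stage id itself (format is an assumption).
        return f"WholeStageCodegen ({self.codegen_stage_id})"


plain = PlanSketch().execute_query(lambda: 1)
staged = WholeStageCodegenSketch(3).execute_query(lambda: "rows")
print(plain)
print(staged)
```

The alternative in the diff, matching on `this` inside `executeQuery`, would put knowledge of one subclass into the base class, which is the special handling the reviewer calls hacky.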
[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575508198 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
AmplabJenkins removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575507826 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575508233 **[Test build #116911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116911/testReport)** for PR 27058 at commit [`52953ec`](https://github.com/apache/spark/commit/52953ec6b3dec4dd6d48105df2f1ef889cd9e75d). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
AmplabJenkins removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575507832 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116890/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575508244 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575508247 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116911/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575508203 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21680/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575507740 **[Test build #116911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116911/testReport)** for PR 27058 at commit [`52953ec`](https://github.com/apache/spark/commit/52953ec6b3dec4dd6d48105df2f1ef889cd9e75d).
[GitHub] [spark] AmplabJenkins commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
AmplabJenkins commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575507832 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116890/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
AmplabJenkins commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575507826 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
SparkQA removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575443364 **[Test build #116890 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116890/testReport)** for PR 26921 at commit [`1ff1dcd`](https://github.com/apache/spark/commit/1ff1dcd97e4cae8b76212f31d72a13a72ec89895).
[GitHub] [spark] SparkQA commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
SparkQA commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575506839 **[Test build #116890 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116890/testReport)** for PR 26921 at commit [`1ff1dcd`](https://github.com/apache/spark/commit/1ff1dcd97e4cae8b76212f31d72a13a72ec89895). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575505179 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21679/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575505172 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575505179 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21679/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575505172 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
SparkQA commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575504852 **[Test build #116910 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116910/testReport)** for PR 27253 at commit [`93367a7`](https://github.com/apache/spark/commit/93367a77aa787b67a3bc775a42d3cf820007a0f0).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575503306 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116908/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575503296 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
AmplabJenkins commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575503399 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
SparkQA removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575502708 **[Test build #116908 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116908/testReport)** for PR 27253 at commit [`20e6e5e`](https://github.com/apache/spark/commit/20e6e5ea6ece1276f9f6fa909e7121d94d924ec3).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
AmplabJenkins removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575503410 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116889/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
AmplabJenkins removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575503399 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
AmplabJenkins commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575503410 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116889/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575503059 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint
AmplabJenkins removed a comment on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint URL: https://github.com/apache/spark/pull/27252#issuecomment-575503151 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint
AmplabJenkins removed a comment on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint URL: https://github.com/apache/spark/pull/27252#issuecomment-575503157 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21678/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins removed a comment on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575503068 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21677/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint
AmplabJenkins commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint URL: https://github.com/apache/spark/pull/27252#issuecomment-575503151 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575503296 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint
AmplabJenkins commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint URL: https://github.com/apache/spark/pull/27252#issuecomment-575503157 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21678/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575503306 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116908/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
SparkQA commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575503290 **[Test build #116908 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116908/testReport)** for PR 27253 at commit [`20e6e5e`](https://github.com/apache/spark/commit/20e6e5ea6ece1276f9f6fa909e7121d94d924ec3). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
SparkQA removed a comment on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575441741 **[Test build #116889 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116889/testReport)** for PR 26921 at commit [`fb47ea9`](https://github.com/apache/spark/commit/fb47ea97da77d4b732558348681777c5abf1da60).
[GitHub] [spark] SparkQA commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
SparkQA commented on issue #26921: [SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework URL: https://github.com/apache/spark/pull/26921#issuecomment-575502864 **[Test build #116889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116889/testReport)** for PR 26921 at commit [`fb47ea9`](https://github.com/apache/spark/commit/fb47ea97da77d4b732558348681777c5abf1da60). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575503059 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
AmplabJenkins commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575503068 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21677/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
SparkQA commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575502708 **[Test build #116908 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116908/testReport)** for PR 27253 at commit [`20e6e5e`](https://github.com/apache/spark/commit/20e6e5ea6ece1276f9f6fa909e7121d94d924ec3).
[GitHub] [spark] SparkQA commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint
SparkQA commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint URL: https://github.com/apache/spark/pull/27252#issuecomment-575502734 **[Test build #116909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116909/testReport)** for PR 27252 at commit [`048a0ec`](https://github.com/apache/spark/commit/048a0ecc65763c6feaa939938e2dec6f4040d939).
[GitHub] [spark] maropu edited a comment on issue #27233: [SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given
maropu edited a comment on issue #27233: [SPARK-29701][SQL] Correct behaviours of group analytical queries when empty input given URL: https://github.com/apache/spark/pull/27233#issuecomment-575498955 In my first try, I did so (I modified the code in `ResolveGroupingAnalytics`), but I couldn't cleanly fix the resolution code for [the Filter/Sort cases](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L588-L602) and `ResolveAggregateFunctions`. If we should handle this case on the analyzer side, I'm going to try again based on that approach.
[GitHub] [spark] beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#discussion_r367793817

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##

@@ -151,21 +197,101 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
     // We need at least two distinct aggregates for this rule because aggregation
     // strategy can handle a single distinct group.
     // This check can produce false-positives, e.g., SUM(DISTINCT a) & COUNT(DISTINCT a).
-    distinctAggs.size > 1
+    distinctAggs.size >= 1
   }
   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
     case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) => rewrite(a)
   }
   def rewrite(a: Aggregate): Aggregate = {
+    val expandAggregate = expandDistinctAggregateWithFilter(a)
+    rewriteDistinctAggregate(expandAggregate)
+  }
-    // Collect all aggregate expressions.
-    val aggExpressions = a.aggregateExpressions.flatMap { e =>
-      e.collect {
-        case ae: AggregateExpression => ae
+  private def expandDistinctAggregateWithFilter(a: Aggregate): Aggregate = {
+    val aggExpressions = collectAggregateExprs(a)
+    val (distinctAggExpressions, regularAggExpressions) = aggExpressions.partition(_.isDistinct)
+    if (distinctAggExpressions.exists(_.filter.isDefined)) {
+      // Setup expand for the 'regular' aggregate expressions.
+      val regularAggExprs = regularAggExpressions.filter(e => e.children.exists(!_.foldable))
+      val regularFunChildren = regularAggExprs
+        .flatMap(_.aggregateFunction.children.filter(!_.foldable))
+      val regularFilterAttrs = regularAggExprs.flatMap(_.filterAttributes)
+      val regularAggChildren = (regularFunChildren ++ regularFilterAttrs).distinct
+      val regularAggChildAttrMap = regularAggChildren.map(expressionAttributePair)
+      val regularAggChildAttrLookup = regularAggChildAttrMap.toMap
+      val regularOperatorMap = regularAggExprs.map {
+        case ae @ AggregateExpression(af, _, _, filter, _) =>
+          val newChildren = af.children.map(c => regularAggChildAttrLookup.getOrElse(c, c))
+          val raf = af.withNewChildren(newChildren).asInstanceOf[AggregateFunction]
+          val filterOpt = filter.map(_.transform {
+            case a: Attribute => regularAggChildAttrLookup.getOrElse(a, a)
+          })
+          val aggExpr = ae.copy(aggregateFunction = raf, filter = filterOpt)
+          (ae, aggExpr)
+      }
+
+      // Setup expand for the distinct aggregate expressions.
+      val distinctAggExprs = distinctAggExpressions.filter(e => e.children.exists(!_.foldable))
+      val rewriteDistinctOperatorMap = distinctAggExprs.map {

Review comment: OK. Thanks!
[GitHub] [spark] JkSelf commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
JkSelf commented on issue #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253#issuecomment-575501740 @cloud-fan @maryannxue Please help review this PR. Thanks.
[GitHub] [spark] JkSelf opened a new pull request #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
JkSelf opened a new pull request #27253: [SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments URL: https://github.com/apache/spark/pull/27253

### What changes were proposed in this pull request?
Resolve the remaining comments in [PR#27226](https://github.com/apache/spark/pull/27226).

### Why are the changes needed?
Resolve the comments.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing unit tests.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575501041 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575501047 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21676/ Test PASSed.
[GitHub] [spark] wangyum commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint
wangyum commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint URL: https://github.com/apache/spark/pull/27252#issuecomment-575501130

PostgreSQL and Hive support this feature:

```sql
postgres=# EXPLAIN select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1);
                                 QUERY PLAN
----------------------------------------------------------------------------
 Nested Loop (cost=0.00..69.77 rows=90 width=16)
   -> Seq Scan on spark_29231_2 t2 (cost=0.00..35.50 rows=10 width=4)
        Filter: (c1 = 1)
   -> Materialize (cost=0.00..33.17 rows=9 width=16)
        -> Seq Scan on spark_29231_1 t1 (cost=0.00..33.12 rows=9 width=16)
             Filter: (c1 = 1)
(6 rows)
```

```sql
hive> explain select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1);
Warning: Map Join MAPJOIN[11][bigTable=?] in task 'Stage-3:MAPRED' is a cross product
OK
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-3 depends on stages: Stage-4
  Stage-0 depends on stages: Stage-3

STAGE PLANS:
  Stage: Stage-4
    Map Reduce Local Work
      Alias -> Map Local Tables:
        $hdt$_0:t1
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        $hdt$_0:t1
          TableScan
            alias: t1
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Filter Operator
              predicate: (c1 = 1L) (type: boolean)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Select Operator
                expressions: c2 (type: bigint)
                outputColumnNames: _col1
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                HashTable Sink Operator
                  keys:
                    0
                    1

  Stage: Stage-3
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t2
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Filter Operator
              predicate: (UDFToLong(c1) = 1) (type: boolean)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Select Operator
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Map Join Operator
                  condition map:
                       Inner Join 0 to 1
                  keys:
                    0
                    1
                  outputColumnNames: _col1
                  Statistics: Num rows: 1 Data size: 1 Basic stats: PARTIAL Column stats: NONE
                  Select Operator
                    expressions: 1L (type: bigint), _col1 (type: bigint)
                    outputColumnNames: _col0, _col1
                    Statistics: Num rows: 1 Data size: 1 Basic stats: PARTIAL Column stats: NONE
                    File Output Operator
                      compressed: false
                      Statistics: Num rows: 1 Data size: 1 Basic stats: PARTIAL Column stats: NONE
                      table:
                          input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                          output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                          serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Execution mode: vectorized
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

Time taken: 0.2 seconds, Fetched: 69 row(s)
```
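Both plans above apply the literal predicate to the `t2` side of the join as well (`Filter: (c1 = 1)` in PostgreSQL, `predicate: (UDFToLong(c1) = 1)` in Hive). As a rough illustration of that kind of inference, and not Spark's actual Catalyst code, the following minimal Java sketch treats expressions as plain strings and derives `b = literal` from `a = b` and `a = literal`. The names `CastConstraintSketch`, `isLiteral`, and `inferFromEqualities` are hypothetical, and the sketch assumes literals always appear on the right-hand side.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CastConstraintSketch {

    /** True if the expression is an integer literal such as "1". */
    static boolean isLiteral(String expr) {
        return expr.matches("-?\\d+");
    }

    /**
     * Each equality is a {left, right} pair of expression strings.
     * From "a = b" and "a = literal" this derives "b = literal".
     * Assumption: literals only ever appear on the right-hand side.
     */
    static Set<String> inferFromEqualities(List<String[]> equalities) {
        Set<String> inferred = new LinkedHashSet<>();
        for (String[] bound : equalities) {
            if (!isLiteral(bound[1])) {
                continue; // only "expr = literal" can be propagated
            }
            for (String[] eq : equalities) {
                if (isLiteral(eq[1])) {
                    continue; // only "expr = expr" links two expressions
                }
                if (eq[0].equals(bound[0])) {
                    inferred.add(eq[1] + " = " + bound[1]);
                } else if (eq[1].equals(bound[0])) {
                    inferred.add(eq[0] + " = " + bound[1]);
                }
            }
        }
        return inferred;
    }

    public static void main(String[] args) {
        List<String[]> equalities = new ArrayList<>();
        // The join condition, with the implicit cast written out.
        equalities.add(new String[] {"t1.c1", "cast(t2.c1 as bigint)"});
        // The literal predicate on t1.
        equalities.add(new String[] {"t1.c1", "1"});
        System.out.println(inferFromEqualities(equalities));
    }
}
```

The derived constraint on the cast expression is what would let a filter be placed on `t2` before the join; how a real optimizer represents and propagates such constraints is outside the scope of this sketch.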
[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575501041 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT URL: https://github.com/apache/spark/pull/27058#issuecomment-575501047 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21676/ Test PASSed.
[GitHub] [spark] wangyum opened a new pull request #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint
wangyum opened a new pull request #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint
URL: https://github.com/apache/spark/pull/27252

### What changes were proposed in this pull request?

This PR adds support for inferring constraints from cast equality constraints. For example:

```scala
scala> spark.sql("create table spark_29231_1(c1 bigint, c2 bigint)")
res0: org.apache.spark.sql.DataFrame = []

scala> spark.sql("create table spark_29231_2(c1 int, c2 bigint)")
res1: org.apache.spark.sql.DataFrame = []

scala> spark.sql("select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1)").explain
== Physical Plan ==
*(2) Project [c1#5L, c2#6L]
+- *(2) BroadcastHashJoin [c1#5L], [cast(c1#7 as bigint)], Inner, BuildRight
   :- *(2) Project [c1#5L, c2#6L]
   :  +- *(2) Filter (isnotnull(c1#5L) AND (c1#5L = 1))
   :     +- *(2) ColumnarToRow
   :        +- FileScan parquet default.spark_29231_1[c1#5L,c2#6L] Batched: true, DataFilters: [isnotnull(c1#5L), (c1#5L = 1)], Format: Parquet, Location: InMemoryFileIndex[file:/root/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehouse/spark_29231_1], PartitionFilters: [], PushedFilters: [IsNotNull(c1), EqualTo(c1,1)], ReadSchema: struct
   +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint))), [id=#209]
      +- *(1) Project [c1#7]
         +- *(1) Filter isnotnull(c1#7)
            +- *(1) ColumnarToRow
               +- FileScan parquet default.spark_29231_2[c1#7] Batched: true, DataFilters: [isnotnull(c1#7)], Format: Parquet, Location: InMemoryFileIndex[file:/root/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehouse/spark_29231_2], PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: struct
```

After this PR:

```scala
scala> spark.sql("select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1)").explain
== Physical Plan ==
*(2) Project [c1#0L, c2#1L]
+- *(2) BroadcastHashJoin [c1#0L], [cast(c1#2 as bigint)], Inner, BuildRight
   :- *(2) Project [c1#0L, c2#1L]
   :  +- *(2) Filter (isnotnull(c1#0L) AND (c1#0L = 1))
   :     +- *(2) ColumnarToRow
   :        +- FileScan parquet default.spark_29231_1[c1#0L,c2#1L] Batched: true, DataFilters: [isnotnull(c1#0L), (c1#0L = 1)], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/spark/spark-warehouse/spark_29231_1], PartitionFilters: [], PushedFilters: [IsNotNull(c1), EqualTo(c1,1)], ReadSchema: struct
   +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint))), [id=#99]
      +- *(1) Project [c1#2]
         +- *(1) Filter ((cast(c1#2 as bigint) = 1) AND isnotnull(c1#2))
            +- *(1) ColumnarToRow
               +- FileScan parquet default.spark_29231_2[c1#2] Batched: true, DataFilters: [(cast(c1#2 as bigint) = 1), isnotnull(c1#2)], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/spark/spark-warehouse/spark_29231_2], PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: struct
```

### Why are the changes needed?

Improve query performance: the inferred filter `cast(c1 as bigint) = 1` is applied to `spark_29231_2` before the join.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Unit test.
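
For readers who want the idea behind the PR description above in isolation, here is a minimal sketch of the inference step. It is a toy model over plain strings, not Spark's Catalyst implementation: the class name `CastConstraintSketch`, the `infer` method, and the string encoding of expressions are all hypothetical and chosen only for illustration. Given the join condition `cast(t2.c1 as bigint) = t1.c1` and the filter `t1.c1 = 1`, it derives the extra filter `cast(t2.c1 as bigint) = 1`, which matches the new `Filter` on `spark_29231_2` in the "After this PR" plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model (assumption): an equality constraint is a String[]{left, right},
// and a cast expression is any string that starts with "cast(".
public class CastConstraintSketch {

    static boolean isCast(String expr) {
        return expr.startsWith("cast(");
    }

    // For every constraint of the shape cast(x) = attr, substitute cast(x)
    // for attr on the left side of the other equality constraints.
    public static List<String> infer(List<String[]> constraints) {
        Set<String> inferred = new LinkedHashSet<>();
        for (String[] link : constraints) {
            // Only the shape "cast(...) = plain attribute" is handled here.
            if (!isCast(link[0]) || isCast(link[1])) {
                continue;
            }
            for (String[] other : constraints) {
                if (other == link) {
                    continue; // do not combine a constraint with itself
                }
                if (other[0].equals(link[1])) {
                    inferred.add(link[0] + " = " + other[1]);
                }
            }
        }
        return new ArrayList<>(inferred);
    }

    public static void main(String[] args) {
        List<String[]> given = Arrays.asList(
            new String[] {"cast(t2.c1 as bigint)", "t1.c1"}, // join condition
            new String[] {"t1.c1", "1"});                    // filter on t1
        System.out.println(infer(given));
    }
}
```

The sketch handles only one orientation of each equality and ignores types, nullability, and non-equality predicates, all of which a real optimizer rule has to consider.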