[GitHub] [spark] JkSelf commented on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec
JkSelf commented on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec URL: https://github.com/apache/spark/pull/26846#issuecomment-564423425 Please help to retest. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HeartSaVioR edited a comment on issue #26821: [SPARK-20656][CORE]Support Incremental parsing of event logs in SHS
HeartSaVioR edited a comment on issue #26821: [SPARK-20656][CORE]Support Incremental parsing of event logs in SHS URL: https://github.com/apache/spark/pull/26821#issuecomment-564422644

@shahidki31 No, I didn't intend to persuade you to close this. I just wanted to make sure we have a clear picture of the full implementation before dealing with each part, but it's OK with me if you'd like to go ahead with the current solution, as I think I can extend it with snapshotting. I can take a look at the current solution, but you still need to persuade at least one committer to push this forward.

Btw, we'd better clarify the performance test in detail. It should include at least:

* size of the event log file for the initial load
* elapsed time for the initial load
* count/size of the events added (mostly about size)
* elapsed time for loading the additional events

To me, the statement in your PR description sounds like skipping 2G (via read and drop) takes around 2 secs, which is still not ideal (as we know how to do it better), though I agree that's still a huge improvement.
[GitHub] [spark] HeartSaVioR commented on issue #26821: [SPARK-20656][CORE]Support Incremental parsing of event logs in SHS
HeartSaVioR commented on issue #26821: [SPARK-20656][CORE]Support Incremental parsing of event logs in SHS URL: https://github.com/apache/spark/pull/26821#issuecomment-564422644

@shahidki31 No, I didn't intend to persuade you to close this. I just wanted to make sure we have a clear picture of the full implementation before dealing with each part, but it's OK with me if you'd like to go ahead with the current solution, as I think I can extend it with snapshotting. I could take a look at the current solution, but you still need to persuade at least one committer to push this forward.

Btw, we'd better clarify the performance test in detail. It should include at least:

* size of the event log file for the initial load
* elapsed time for the initial load
* count/size of the events added (mostly about size)
* elapsed time for loading the additional events

To me, the statement in your PR description sounds like skipping 2G (via read and drop) takes around 2 secs, which is still not ideal (as we know how to do it better), though I agree that's still a huge improvement.
[GitHub] [spark] yaooqinn commented on issue #26847: [SPARK-30214][SQL] Support COMMENT ON syntax
yaooqinn commented on issue #26847: [SPARK-30214][SQL] Support COMMENT ON syntax URL: https://github.com/apache/spark/pull/26847#issuecomment-564422183 An earlier discussion can be found here: https://github.com/apache/spark/pull/26806 Thanks again.
[GitHub] [spark] yaooqinn commented on issue #26847: [SPARK-30214][SQL] Support COMMENT ON syntax
yaooqinn commented on issue #26847: [SPARK-30214][SQL] Support COMMENT ON syntax URL: https://github.com/apache/spark/pull/26847#issuecomment-564421490 cc @cloud-fan @maropu, thanks for reviewing this.
[GitHub] [spark] yaooqinn opened a new pull request #26847: [SPARK-30214][SQL] Support COMMENT ON syntax
yaooqinn opened a new pull request #26847: [SPARK-30214][SQL] Support COMMENT ON syntax URL: https://github.com/apache/spark/pull/26847

### What changes were proposed in this pull request?

With the new catalog v2 design, some properties become reserved, e.g. `location` and `comment`. We are going to disallow setting reserved properties directly through dbproperties or tblproperties, to avoid conflicts with their related sub-clauses or specific commands.

For `comment`, there is no existing syntax to alter it yet, so this pull request adds it:

```sql
COMMENT ON (DATABASE|SCHEMA|NAMESPACE) ... IS ...
COMMENT ON TABLE ... IS ...
```

This follows the practice of PostgreSQL and Presto:
https://www.postgresql.org/docs/12/sql-comment.html
https://prestosql.io/docs/current/sql/comment.html

### Why are the changes needed?

Upcoming feature support.

### Does this PR introduce any user-facing change?

Yes, it adds new syntax.

### How was this patch tested?

Added UTs.
[GitHub] [spark] shahidki31 edited a comment on issue #26821: [SPARK-20656][CORE]Support Incremental parsing of event logs in SHS
shahidki31 edited a comment on issue #26821: [SPARK-20656][CORE]Support Incremental parsing of event logs in SHS URL: https://github.com/apache/spark/pull/26821#issuecomment-564417301

> From what I tested locally, filtering by lines roughly takes about 30 seconds for file sizes ranging from 10MB to 400MB, while skipping by bytes only takes 2 ms for a 400MB file.

Hi @oopDaniel, actually I tried both approaches, and skipping bytes seems more complicated, as we need to handle more edge cases. Also, I tested this PR with a 2GB event log file, and I think loading the UI took around 2 seconds (?), including filtering and replaying. That said, it is not difficult to add skipping by bytes: all we need to do is track a bytes-read parameter instead of a lines-read parameter and handle the edge cases.

@HeartSaVioR I think the approach you are taking is great, as it handles restarting the SHS. But if we can review this PR for incremental parsing first, extending it to snapshotting would be easier, I guess. If there is a working PR for that, I can close this.
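To make the two approaches in the comment above concrete, here is a minimal sketch in plain Scala, independent of Spark. It is not the code from this PR: `IncrementalRead`, `readAfterLines`, and `readAfterBytes` are hypothetical names, and the sketch assumes a newline-delimited log whose newly appended portion fits in memory. It also shows one of the edge cases mentioned: a partially written last line.

```scala
import java.io.RandomAccessFile
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical helper, not part of Spark: contrasts "filter by lines" with "skip by bytes".
object IncrementalRead {

  /** Approach 1: re-read the whole file and drop the lines that were already processed. */
  def readAfterLines(path: Path, linesAlreadyRead: Int): Seq[String] = {
    val reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)
    try {
      Iterator.continually(reader.readLine())
        .takeWhile(_ != null)
        .drop(linesAlreadyRead) // every skipped line is still read and decoded
        .toList
    } finally reader.close()
  }

  /**
   * Approach 2: seek to a saved byte offset and read only the new bytes.
   * Returns the complete new lines and the offset to resume from next time.
   */
  def readAfterBytes(path: Path, bytesAlreadyRead: Long): (Seq[String], Long) = {
    val file = new RandomAccessFile(path.toFile, "r")
    try {
      file.seek(bytesAlreadyRead)
      // Assumption: the appended portion is small enough to hold in memory.
      val remaining = new Array[Byte]((file.length() - bytesAlreadyRead).toInt)
      file.readFully(remaining)
      // Edge case: the writer may have flushed only part of the last line,
      // so consume bytes only up to and including the last newline.
      val completeBytes = remaining.lastIndexOf('\n'.toByte) + 1
      val text = new String(remaining, 0, completeBytes, StandardCharsets.UTF_8)
      // Blank lines are dropped, which is fine for one-JSON-object-per-line logs.
      val lines = text.split("\n").filter(_.nonEmpty).toSeq
      (lines, bytesAlreadyRead + completeBytes)
    } finally file.close()
  }
}
```

The line-based variant is simpler but does work proportional to the whole file on every call, and it cannot tell a half-written last line from a complete one. The byte-based variant does work proportional only to the new data, at the cost of tracking the offset and the partial-line case.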
[GitHub] [spark] SparkQA commented on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType
SparkQA commented on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType URL: https://github.com/apache/spark/pull/26644#issuecomment-564419192 **[Test build #115158 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115158/testReport)** for PR 26644 at commit [`e976297`](https://github.com/apache/spark/commit/e97629726397cbb047b9486e47540f87029abb30).
[GitHub] [spark] dongjoon-hyun edited a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
dongjoon-hyun edited a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564417855 FYI, `make-distribution.sh` is also used in the following Jenkins job. I triggered it to make sure that our Jenkins is also ready for this change. If it fails, we need to file a separate JIRA issue for that. Let's see.
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots/2743/
[GitHub] [spark] dongjoon-hyun commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
dongjoon-hyun commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564417855 FYI, `make-distribution.sh` is also used in the following Jenkins job. I triggered it to make sure that our Jenkins is also ready for this change. If it fails, we need to file a separate JIRA issue for that.
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots/2743/
[GitHub] [spark] zhengruifeng edited a comment on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures
zhengruifeng edited a comment on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures URL: https://github.com/apache/spark/pull/26803#issuecomment-564413213

@srowen I implemented a simple `treeAggregateByKey` [here](https://github.com/apache/spark/compare/master...zhengruifeng:treeAggByKey?expand=1), and ran several local tests like:

```scala
val rdd = sc.range(0, 1, 1, 100)
val rdd2 = rdd.map{i => (i % 10, i)}
val rdd3 = rdd2.treeAggregateByKey(0.0, new HashPartitioner(3))(_+_, _+_, 3)
rdd3.collect
```

They ran successfully.

![image](https://user-images.githubusercontent.com/7322292/70601039-2f8b0480-1c2c-11ea-8cc3-579dd8eb7946.png)
![image](https://user-images.githubusercontent.com/7322292/70601026-29952380-1c2c-11ea-8449-c27c431b448c.png)

BTW, it is reasonable to call `compress` before the reduce tasks, so maybe an `aggregateByKeyWithinPartitions` is needed? Then we could call `compress` after the local aggregation within each partition. I guess these may be commonly useful functions; should we add them to RDD/PairRDD?
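The idea of aggregating by key inside each partition first, and only then merging across partitions, can be sketched with plain Scala collections. This is not the `treeAggregateByKey` from the linked branch and uses no Spark API: partitions are modelled as in-memory sequences, and `LocalAggSketch`, `aggregateWithinPartition`, and `mergePartitions` are hypothetical names chosen for illustration.

```scala
// Hypothetical sketch: partitions are plain Seqs, no Spark involved.
object LocalAggSketch {

  /** Aggregate values by key inside a single partition; no data leaves the partition. */
  def aggregateWithinPartition[K, V, U](partition: Seq[(K, V)], zero: U)(
      seqOp: (U, V) => U): Map[K, U] =
    partition.foldLeft(Map.empty[K, U]) { case (acc, (k, v)) =>
      acc.updated(k, seqOp(acc.getOrElse(k, zero), v))
    }

  /** Merge the per-partition results, as the reduce side would after a shuffle. */
  def mergePartitions[K, U](partials: Seq[Map[K, U]])(combOp: (U, U) => U): Map[K, U] =
    partials.flatten
      .groupBy(_._1)
      .map { case (k, kvs) => k -> kvs.map(_._2).reduce(combOp) }

  /** Example: sum 0 until 100 by `i % 10`, with the data split into 4 partitions. */
  def demo(): Map[Int, Double] = {
    val partitions = (0 until 100).map(i => (i % 10, i.toDouble)).grouped(25).toList
    val partials = partitions.map(p => aggregateWithinPartition(p, 0.0)(_ + _))
    mergePartitions(partials)(_ + _)
  }
}
```

In this picture, a `compress`-like step would be applied to each value of the per-partition maps, between `aggregateWithinPartition` and `mergePartitions`, so that smaller summaries are shipped to the merge step.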
[GitHub] [spark] cloud-fan commented on a change in pull request #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces
cloud-fan commented on a change in pull request #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces URL: https://github.com/apache/spark/pull/26773#discussion_r356439923

## File path: core/src/test/scala/org/apache/spark/SparkContextSuite.scala ##
@@ -233,6 +233,42 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
     }
   }

+  test("SPARK-30126: addFile when file path contains spaces with recursive works") {
+    withTempDir { dir =>
+      try {
+        val sep = File.separator
+        val tmpDir = Utils.createTempDir(dir.getAbsolutePath + sep + "test space")
+        val tmpConfFile1 = File.createTempFile("test", ".conf", tmpDir)

Review comment: Can we also add a space in the file name to prove it works?
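As background for why spaces in paths tend to need dedicated tests, here is a small sketch using only `java.io.File` and `java.net.URI`. It is not the fix from this PR and does not claim to reproduce the cause of SPARK-30126; it only illustrates one common way such failures arise: a raw path with a space is not a valid URI string, while converting through `File` percent-encodes it. `PathWithSpaces` and its methods are hypothetical names.

```scala
import java.io.File
import java.net.{URI, URISyntaxException}

// Hypothetical helper, for illustration only.
object PathWithSpaces {

  /** Returns false when the raw string cannot be parsed as a URI (e.g. unescaped spaces). */
  def parsesAsUri(rawPath: String): Boolean =
    try { new URI(rawPath); true }
    catch { case _: URISyntaxException => false }

  /** Going through File percent-encodes characters such as spaces. */
  def toFileUri(rawPath: String): URI = new File(rawPath).toURI
}
```

With a path such as `/tmp/test space/my file.conf`, `parsesAsUri` returns false, whereas `toFileUri` yields a URI whose raw path contains `%20` and whose decoded `getPath` still contains the original spaces. A test that puts a space in both the directory and the file name, as the review comment suggests, exercises both positions.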
[GitHub] [spark] AmplabJenkins removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
AmplabJenkins removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564417286 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115145/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
AmplabJenkins removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564417274 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
AmplabJenkins commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564417286 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115145/ Test PASSed.
[GitHub] [spark] shahidki31 commented on issue #26821: [SPARK-20656][CORE]Support Incremental parsing of event logs in SHS
shahidki31 commented on issue #26821: [SPARK-20656][CORE]Support Incremental parsing of event logs in SHS URL: https://github.com/apache/spark/pull/26821#issuecomment-564417301

> From what I tested locally, filtering by lines roughly takes about 30 seconds for file sizes ranging from 10MB to 400MB, while skipping by bytes only takes 2 ms for a 400MB file.

Hi @oopDaniel, actually I tried both approaches, and skipping bytes seems more complicated, as we need to handle more edge cases. Also, I tested this PR with a 2GB event log file, and I think loading the UI took around 2 seconds (?), including filtering and replaying. That said, it is not difficult to add skipping by bytes: all we need to do is track a bytes-read parameter instead of a lines-read parameter and handle the edge cases.

@HeartSaVioR I think the approach you are taking is great, as it handles restarting the SHS. But if we can review this PR for incremental parsing first, extending it to snapshotting would be easier, I guess. If there is a working PR for that, I can close this.
[GitHub] [spark] AmplabJenkins commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
AmplabJenkins commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564417274 Merged build finished. Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #26779: [SPARK-30150][SQL] ADD FILE, ADD JAR, LIST FILE & LIST JAR Command do not accept quoted path
cloud-fan commented on a change in pull request #26779: [SPARK-30150][SQL] ADD FILE, ADD JAR, LIST FILE & LIST JAR Command do not accept quoted path URL: https://github.com/apache/spark/pull/26779#discussion_r356438818

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlParserSuite.scala ##
@@ -259,4 +259,22 @@ class SparkSqlParserSuite extends AnalysisTest {
       parser.parsePlan("ALTER SCHEMA foo SET DBPROPERTIES ('x' = 'y')"))
     assertEqual("DESC DATABASE foo", parser.parsePlan("DESC SCHEMA foo"))
   }
+
+  test("manage resources") {
+    assertEqual("ADD FILE abc.txt", AddFileCommand("abc.txt"))
+    assertEqual("ADD FILE \'abc.txt\'", AddFileCommand("abc.txt"))

Review comment: Shall we use `'`? It's easier to read.
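The behaviour under test above, where `ADD FILE abc.txt` and `ADD FILE 'abc.txt'` resolve to the same path, can be illustrated with a tiny standalone helper. This is only a sketch of the idea and not how Spark's SQL parser handles it; `UnquotePath.unquote` is a hypothetical name, and the choice to accept both single and double quotes is an assumption.

```scala
// Hypothetical helper, for illustration only: not Spark's parser logic.
object UnquotePath {

  /** Strips one pair of matching single or double quotes around a path argument. */
  def unquote(arg: String): String = {
    val t = arg.trim
    val quoted = t.length >= 2 &&
      (t.head == '\'' || t.head == '"') &&
      t.last == t.head
    // Unbalanced or absent quotes are left untouched.
    if (quoted) t.substring(1, t.length - 1) else t
  }
}
```

Note that inside a Scala string literal a single quote needs no backslash, so `"ADD FILE 'abc.txt'"` is equivalent to the escaped form in the diff, which is the readability point of the review comment.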
[GitHub] [spark] dongjoon-hyun commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
dongjoon-hyun commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564416361 It's merged now. Please proceed to the docker issue and ping me if you need my help.
[GitHub] [spark] SparkQA commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
SparkQA commented on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564416473 **[Test build #115145 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115145/testReport)** for PR 26844 at commit [`d395904`](https://github.com/apache/spark/commit/d395904b113c0159f866fec3f2f3257313a355c8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
SparkQA removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564383404 **[Test build #115145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115145/testReport)** for PR 26844 at commit [`d395904`](https://github.com/apache/spark/commit/d395904b113c0159f866fec3f2f3257313a355c8).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType
AmplabJenkins removed a comment on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType URL: https://github.com/apache/spark/pull/26644#issuecomment-564415612 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19969/ Test PASSed.
[GitHub] [spark] dongjoon-hyun closed pull request #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
dongjoon-hyun closed pull request #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844
[GitHub] [spark] AmplabJenkins removed a comment on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType
AmplabJenkins removed a comment on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType URL: https://github.com/apache/spark/pull/26644#issuecomment-564415605 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly
AmplabJenkins removed a comment on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly URL: https://github.com/apache/spark/pull/26845#issuecomment-564415423 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19965/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly
AmplabJenkins removed a comment on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly URL: https://github.com/apache/spark/pull/26845#issuecomment-564415419 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces
AmplabJenkins removed a comment on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces URL: https://github.com/apache/spark/pull/26773#issuecomment-564415544 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19968/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
AmplabJenkins removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564415463 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19966/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec
AmplabJenkins removed a comment on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec URL: https://github.com/apache/spark/pull/26846#issuecomment-564415460 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19964/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
AmplabJenkins removed a comment on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas URL: https://github.com/apache/spark/pull/26780#issuecomment-564415495 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19967/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
AmplabJenkins removed a comment on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas URL: https://github.com/apache/spark/pull/26780#issuecomment-564415481 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh
AmplabJenkins removed a comment on issue #26844: [SPARK-30211][INFRA] Use python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564415450 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType
AmplabJenkins commented on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType URL: https://github.com/apache/spark/pull/26644#issuecomment-564415605 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec
AmplabJenkins removed a comment on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec URL: https://github.com/apache/spark/pull/26846#issuecomment-564415447 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces
AmplabJenkins removed a comment on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces URL: https://github.com/apache/spark/pull/26773#issuecomment-564415538 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType
AmplabJenkins commented on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType URL: https://github.com/apache/spark/pull/26644#issuecomment-564415612 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19969/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh
AmplabJenkins commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564415450 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
AmplabJenkins commented on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas URL: https://github.com/apache/spark/pull/26780#issuecomment-564415495 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19967/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
AmplabJenkins commented on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas URL: https://github.com/apache/spark/pull/26780#issuecomment-564415481 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh
AmplabJenkins commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564415463 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19966/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec
AmplabJenkins commented on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec URL: https://github.com/apache/spark/pull/26846#issuecomment-564415460 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19964/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces
AmplabJenkins commented on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces URL: https://github.com/apache/spark/pull/26773#issuecomment-564415538 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec
AmplabJenkins commented on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec URL: https://github.com/apache/spark/pull/26846#issuecomment-564415447 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly
AmplabJenkins commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly URL: https://github.com/apache/spark/pull/26845#issuecomment-564415423 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19965/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces
AmplabJenkins commented on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces URL: https://github.com/apache/spark/pull/26773#issuecomment-564415544 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19968/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly
AmplabJenkins commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly URL: https://github.com/apache/spark/pull/26845#issuecomment-564415419 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces
SparkQA commented on issue #26773: [SPARK-30126][CORE] sparkContext.addFile and sparkContext.addJar fails when file path contains spaces URL: https://github.com/apache/spark/pull/26773#issuecomment-564414528 **[Test build #115157 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115157/testReport)** for PR 26773 at commit [`6c276ff`](https://github.com/apache/spark/commit/6c276ff14a75867e8ddd5e0f042894ca992ff572).
[GitHub] [spark] SparkQA commented on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
SparkQA commented on issue #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas URL: https://github.com/apache/spark/pull/26780#issuecomment-564414526 **[Test build #115156 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115156/testReport)** for PR 26780 at commit [`827e409`](https://github.com/apache/spark/commit/827e409a737382b2f8e42cf885d0d6b82f325b2d).
[GitHub] [spark] SparkQA commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh
SparkQA commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564414518 **[Test build #115154 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115154/testReport)** for PR 26844 at commit [`a3a7695`](https://github.com/apache/spark/commit/a3a76959b131f109f891d6022d9d18aaad1a0ea3).
[GitHub] [spark] SparkQA commented on issue #26817: [SPARK-30192][SQL] support column position in DS v2
SparkQA commented on issue #26817: [SPARK-30192][SQL] support column position in DS v2 URL: https://github.com/apache/spark/pull/26817#issuecomment-564414517 **[Test build #115155 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115155/testReport)** for PR 26817 at commit [`49434e4`](https://github.com/apache/spark/commit/49434e465e5c296e3f99f4b3914a9ce0bf8f874a).
[GitHub] [spark] SparkQA commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly
SparkQA commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly URL: https://github.com/apache/spark/pull/26845#issuecomment-564414490 **[Test build #115153 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115153/testReport)** for PR 26845 at commit [`13fd4f0`](https://github.com/apache/spark/commit/13fd4f05e3449fd79d97876ad731588125e99a15).
[GitHub] [spark] SparkQA commented on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec
SparkQA commented on issue #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec URL: https://github.com/apache/spark/pull/26846#issuecomment-564414338 **[Test build #115152 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115152/testReport)** for PR 26846 at commit [`7ab903c`](https://github.com/apache/spark/commit/7ab903c4c77b5c4d439571d060de9a59c2eccf6d).
[GitHub] [spark] dongjoon-hyun removed a comment on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh
dongjoon-hyun removed a comment on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564414062 Let's revert the doc~
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh
dongjoon-hyun commented on a change in pull request #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#discussion_r356435745
## File path: docs/building-spark.md ##
@@ -66,7 +66,7 @@ with Maven profile settings and so on like the direct Maven build. Example: ./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes
-This will build Spark distribution along with Python pip and R packages. For more information on usage, run `./dev/make-distribution.sh --help`
+This will build Spark distribution along with Python pip and R packages. (Note that build with Python pip package requires Python 3.6). For more information on usage, run `./dev/make-distribution.sh --help`
Review comment: ~Ur, shall we revert this line since there is Python 3.7 and 3.8?~ Oops, Never mind.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh
dongjoon-hyun commented on a change in pull request #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#discussion_r356435745
## File path: docs/building-spark.md ##
@@ -66,7 +66,7 @@ with Maven profile settings and so on like the direct Maven build. Example: ./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes
-This will build Spark distribution along with Python pip and R packages. For more information on usage, run `./dev/make-distribution.sh --help`
+This will build Spark distribution along with Python pip and R packages. (Note that build with Python pip package requires Python 3.6). For more information on usage, run `./dev/make-distribution.sh --help`
Review comment: Ur, shall we revert this line since there is Python 3.7 and 3.8?
[GitHub] [spark] cloud-fan commented on issue #26811: [SPARK-29600][SQL] array_contains built in function is not backward compatible in 3.0
cloud-fan commented on issue #26811: [SPARK-29600][SQL] array_contains built in function is not backward compatible in 3.0 URL: https://github.com/apache/spark/pull/26811#issuecomment-564414076 @amanomer do you know which PR causes the compatibility issue? We need to see if it's intentional or not. If it's intentional, there should be a migration guide.
[GitHub] [spark] dongjoon-hyun commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh
dongjoon-hyun commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564414062 Let's revert the doc~
[GitHub] [spark] Fokko commented on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType
Fokko commented on issue #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType URL: https://github.com/apache/spark/pull/26644#issuecomment-564413559 Rebased onto master
[GitHub] [spark] wangyum commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh
wangyum commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564413141 > Do you want to do that separately? Then, I'll merge this first. Yes. I'd like to do that separately.
[GitHub] [spark] zhengruifeng commented on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures
zhengruifeng commented on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures URL: https://github.com/apache/spark/pull/26803#issuecomment-564413213 @srowen I implemented a simple `treeAggregateByKey` [here](https://github.com/apache/spark/compare/master...zhengruifeng:treeAggByKey?expand=1), and ran several local tests like:
```scala
val rdd = sc.range(0, 1, 1, 100)
val rdd2 = rdd.map { i => (i % 10, i) }
val rdd3 = rdd2.treeAggregateByKey(0.0, new HashPartitioner(3))(_ + _, _ + _, 2)
rdd3.collect
```
They ran successfully. ![image](https://user-images.githubusercontent.com/7322292/70599939-67dd1380-1c29-11ea-8fa8-146972395ef5.png) BTW, it is reasonable to call `compress` before the reduce tasks, so maybe an `aggregateByKeyWithinPartitions` is needed? Then we can call `compress` after the local aggregation within each partition. I guess these may be common functions, so should we add this method to RDD/PairRDD?
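To make the idea in the comment above concrete, here is a minimal sketch of a per-key tree aggregation. It is not the implementation linked in the comment: plain Scala collections stand in for RDD partitions, and the names `TreeAggregateByKeySketch`, `aggregateWithinPartition`, `mergeMaps` and the `fanIn` parameter are assumptions made for illustration (the real call takes a partitioner and what looks like a depth argument instead).

```scala
// Sketch only: plain Scala collections stand in for RDD partitions; no Spark API is used.
object TreeAggregateByKeySketch {

  // Local step: fold every value of one partition into a per-key accumulator.
  def aggregateWithinPartition[K, V, U](part: Seq[(K, V)], zero: U)(
      seqOp: (U, V) => U): Map[K, U] =
    part.foldLeft(Map.empty[K, U]) { case (acc, (k, v)) =>
      acc.updated(k, seqOp(acc.getOrElse(k, zero), v))
    }

  // Merge two per-key maps, combining accumulators that share a key.
  def mergeMaps[K, U](a: Map[K, U], b: Map[K, U])(combOp: (U, U) => U): Map[K, U] =
    b.foldLeft(a) { case (acc, (k, u)) =>
      acc.updated(k, acc.get(k).map(existing => combOp(existing, u)).getOrElse(u))
    }

  // Tree step: repeatedly merge groups of `fanIn` partial results until one remains.
  def treeAggregateByKey[K, V, U](partitions: Seq[Seq[(K, V)]], zero: U, fanIn: Int = 2)(
      seqOp: (U, V) => U, combOp: (U, U) => U): Map[K, U] = {
    require(fanIn >= 2, "fanIn must be at least 2")
    var level: Vector[Map[K, U]] =
      partitions.map(p => aggregateWithinPartition(p, zero)(seqOp)).toVector
    while (level.size > 1) {
      level = level
        .grouped(fanIn)
        .map(group => group.reduce((a, b) => mergeMaps(a, b)(combOp)))
        .toVector
    }
    level.headOption.getOrElse(Map.empty[K, U])
  }

  def main(args: Array[String]): Unit = {
    // 100 values keyed by i % 10, split into chunks that play the role of partitions.
    val data = (0L until 100L).map(i => (i % 10, i))
    val partitions = data.grouped(7).toSeq
    val sums = treeAggregateByKey(partitions, 0.0)(_ + _, _ + _)
    println(sums.toSeq.sortBy(_._1))
  }
}
```

The local step is also where a hook such as the proposed `aggregateByKeyWithinPartitions` would sit: a summary could be compressed right after `aggregateWithinPartition` and before any merge between partitions.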
[GitHub] [spark] Fokko commented on a change in pull request #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
Fokko commented on a change in pull request #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas URL: https://github.com/apache/spark/pull/26780#discussion_r356433990
## File path: python/pyspark/sql/avro/functions.py ##
@@ -30,9 +30,10 @@ @since(3.0) def from_avro(data, jsonFormatSchema, options={}): """
-Converts a binary column of avro format into its corresponding catalyst value. The specified
-schema must match the read data, otherwise the behavior is undefined: it may fail or return
-arbitrary result.
+Converts a binary column of avro format into its corresponding catalyst value. If a writer's
Review comment: Thanks
[GitHub] [spark] Fokko commented on a change in pull request #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas
Fokko commented on a change in pull request #26780: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas URL: https://github.com/apache/spark/pull/26780#discussion_r356433702
## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/functions.scala ##
@@ -45,9 +45,10 @@ object functions { } /**
- * Converts a binary column of avro format into its corresponding catalyst value. The specified
- * schema must match the read data, otherwise the behavior is undefined: it may fail or return
- * arbitrary result.
+ * Converts a binary column of avro format into its corresponding catalyst value. If a writer's
+ * schema is provided in the options, a different (but compatible) schema can be used for reading.
Review comment: Good point, thanks
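As a rough illustration of what reading with "a different (but compatible) schema" means, the sketch below projects a record decoded with the writer's schema onto a reader's field list, following the general Avro resolution idea that extra writer fields are dropped and missing reader fields fall back to a default. It is a simplified model, not the PR's code or the Avro library: `AvroResolutionSketch`, `ReaderField` and `resolve` are hypothetical names, and records are modelled as plain maps.

```scala
// Sketch only: a toy model of reader/writer schema resolution, not the Avro or Spark API.
object AvroResolutionSketch {

  // Hypothetical, simplified reader-schema field: a name plus an optional default value.
  final case class ReaderField(name: String, default: Option[Any])

  // Project a record decoded with the writer's schema onto the reader's fields.
  // Writer-only fields are ignored; reader-only fields need a default, otherwise we fail.
  def resolve(
      written: Map[String, Any],
      readerFields: Seq[ReaderField]): Either[String, Map[String, Any]] =
    readerFields.foldLeft[Either[String, Map[String, Any]]](Right(Map.empty)) {
      case (Right(acc), field) =>
        written.get(field.name).orElse(field.default) match {
          case Some(value) => Right(acc.updated(field.name, value))
          case None => Left(s"reader field '${field.name}' has no value and no default")
        }
      case (failure, _) => failure
    }

  def main(args: Array[String]): Unit = {
    val written = Map[String, Any]("id" -> 1, "legacy" -> "x")
    val reader = Seq(ReaderField("id", None), ReaderField("country", Some("unknown")))
    println(resolve(written, reader))
  }
}
```

A reader field with neither a written value nor a default is the incompatible case, which is why the sketch returns an `Either` rather than silently filling in a value.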
[GitHub] [spark] cloud-fan commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
cloud-fan commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions URL: https://github.com/apache/spark/pull/26808#discussion_r356432760
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala ##
@@ -254,7 +254,8 @@ object StatFunctions extends Logging { stats.toLowerCase(Locale.ROOT) match { case "count" => (child: Expression) => Count(child).toAggregateExpression() case "mean" => (child: Expression) => Average(child).toAggregateExpression()
- case "stddev" => (child: Expression) => StddevSamp(child).toAggregateExpression()
+ case "stddev" => (child: Expression) =>
+StddevSamp("stddev_samp", child).toAggregateExpression()
Review comment: shall we use `stddev` here?
[GitHub] [spark] cloud-fan commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
cloud-fan commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions URL: https://github.com/apache/spark/pull/26808#discussion_r356432466
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala ##
@@ -254,7 +254,8 @@ object StatFunctions extends Logging { stats.toLowerCase(Locale.ROOT) match { case "count" => (child: Expression) => Count(child).toAggregateExpression() case "mean" => (child: Expression) => Average(child).toAggregateExpression()
- case "stddev" => (child: Expression) => StddevSamp(child).toAggregateExpression()
+ case "stddev" => (child: Expression) =>
+StddevSamp("stddev_samp", child).toAggregateExpression()
Review comment: can we create a `object StddevSamp` and define the default name in the `apply` method?
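The suggestion in this review comment can be sketched as follows: a companion object adds a one-argument `apply` that supplies the default function name, so existing call sites such as `StddevSamp(child)` keep compiling while aliases pass a name explicitly. The types here are stand-ins and not Spark's classes: `StddevSampLike` and `DefaultName` are hypothetical, and the child expression is modelled as a plain `String`.

```scala
// Sketch only: stand-in types illustrating a default name defined in a companion `apply`.
object StddevAliasSketch {

  // Hypothetical stand-in for an aggregate expression that carries its SQL function name.
  final case class StddevSampLike(funcName: String, child: String) {
    def prettyName: String = funcName
  }

  object StddevSampLike {
    // Name used when the caller does not ask for an alias.
    val DefaultName = "stddev_samp"

    // Extra overload next to the synthetic two-argument `apply` of the case class.
    def apply(child: String): StddevSampLike = StddevSampLike(DefaultName, child)
  }

  def main(args: Array[String]): Unit = {
    val byDefault = StddevSampLike("price")         // name comes from the companion
    val aliased = StddevSampLike("stddev", "price") // name chosen by the call site
    println(byDefault.prettyName)
    println(aliased.prettyName)
  }
}
```

With this shape, a call site like the `"stddev"` case in the diff above would only need the two-argument form when it actually wants a name other than the default.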
[GitHub] [spark] AmplabJenkins removed a comment on issue #26817: [SPARK-30192][SQL] support column position in DS v2
AmplabJenkins removed a comment on issue #26817: [SPARK-30192][SQL] support column position in DS v2 URL: https://github.com/apache/spark/pull/26817#issuecomment-564410295 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19963/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26817: [SPARK-30192][SQL] support column position in DS v2
AmplabJenkins removed a comment on issue #26817: [SPARK-30192][SQL] support column position in DS v2 URL: https://github.com/apache/spark/pull/26817#issuecomment-564410289 Merged build finished. Test PASSed.
[GitHub] [spark] wangyum commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh
wangyum commented on issue #26844: [SPARK-30211][INFRA] Switch python to python3 in make-distribution.sh URL: https://github.com/apache/spark/pull/26844#issuecomment-564410098 SPARK-29672 changed it: https://github.com/apache/spark/pull/26330/files#diff-8cf6167d58ce775a08acafcfe6f40966
[GitHub] [spark] AmplabJenkins commented on issue #26817: [SPARK-30192][SQL] support column position in DS v2
AmplabJenkins commented on issue #26817: [SPARK-30192][SQL] support column position in DS v2 URL: https://github.com/apache/spark/pull/26817#issuecomment-564410289 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26817: [SPARK-30192][SQL] support column position in DS v2
AmplabJenkins commented on issue #26817: [SPARK-30192][SQL] support column position in DS v2 URL: https://github.com/apache/spark/pull/26817#issuecomment-564410295 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19963/ Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
cloud-fan commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
URL: https://github.com/apache/spark/pull/26808#discussion_r356432239

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala ##
@@ -794,12 +794,18 @@ class ExpressionParserSuite extends AnalysisTest {
 }

 test("Support respect nulls keywords for first_value and last_value") {
-assertEqual("first_value(a ignore nulls)", First('a, Literal(true)).toAggregateExpression())
-assertEqual("first_value(a respect nulls)", First('a, Literal(false)).toAggregateExpression())
-assertEqual("first_value(a)", First('a, Literal(false)).toAggregateExpression())
-assertEqual("last_value(a ignore nulls)", Last('a, Literal(true)).toAggregateExpression())
-assertEqual("last_value(a respect nulls)", Last('a, Literal(false)).toAggregateExpression())
-assertEqual("last_value(a)", Last('a, Literal(false)).toAggregateExpression())
+assertEqual("first_value(a ignore nulls)",
+ First('a, Literal(true)).toAggregateExpression())

Review comment: ditto, unnecessary change
[GitHub] [spark] cloud-fan commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
cloud-fan commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
URL: https://github.com/apache/spark/pull/26808#discussion_r356432177

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/LastTestSuite.scala ##
@@ -24,7 +24,8 @@ import org.apache.spark.sql.types.IntegerType
 class LastTestSuite extends SparkFunSuite {
 val input = AttributeReference("input", IntegerType, nullable = true)()
 val evaluator = DeclarativeAggregateEvaluator(Last(input, Literal(false)), Seq(input))
- val evaluatorIgnoreNulls = DeclarativeAggregateEvaluator(Last(input, Literal(true)), Seq(input))
+ val evaluatorIgnoreNulls = DeclarativeAggregateEvaluator(
+Last(input, Literal(true)), Seq(input))

Review comment: nit: unnecessary change
[GitHub] [spark] JkSelf opened a new pull request #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec
JkSelf opened a new pull request #26846: [SPARK-30213] [SQL]Remove the mutable status in ShuffleQueryStageExec
URL: https://github.com/apache/spark/pull/26846

### What changes were proposed in this pull request?
Currently `ShuffleQueryStageExec` contains mutable state, e.g. the `mapOutputStatisticsFuture` variable, so that state is not easy to pass along when we copy `ShuffleQueryStageExec`. This PR moves the `mapOutputStatisticsFuture` variable from `ShuffleQueryStageExec` to `ShuffleExchangeExec`, so that its value can be passed along when copying.

### Why are the changes needed?
To remove the mutable state in `ShuffleQueryStageExec`.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing UTs.
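The copying problem described in this PR can be sketched in isolation. The classes below (`Exchange`, `MutableStage`, `Stage`) are hypothetical stand-ins, not Spark's `ShuffleExchangeExec` or `ShuffleQueryStageExec`, and the sketch only illustrates one plausible reading of the change: a `var` declared in a case class body is not a constructor argument, so `copy` silently resets it, whereas state cached on the child node survives because the copied stage still references the same child.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical stand-ins; not Spark's actual plan nodes.
final case class Exchange(id: Int) {
  // Computed at most once per Exchange instance, then cached.
  lazy val statsFuture: Future[Long] = Future { id.toLong * 100 }
}

// Before: the stage keeps state outside its constructor arguments.
final case class MutableStage(id: Int, plan: Exchange) {
  var statsFuture: Option[Future[Long]] = None // reset to None by copy()
}

// After: the stage holds no state of its own and delegates to its child.
final case class Stage(id: Int, plan: Exchange) {
  def statsFuture: Future[Long] = plan.statsFuture
}

object CopyDemo extends App {
  val before = MutableStage(0, Exchange(1))
  before.statsFuture = Some(before.plan.statsFuture)
  // The copy does not carry the mutable field over.
  assert(before.copy(id = 1).statsFuture.isEmpty)

  val after = Stage(0, Exchange(1))
  // The copy shares the same child, hence the same Future instance.
  assert(after.copy(id = 1).statsFuture eq after.statsFuture)
}
```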
[GitHub] [spark] HeartSaVioR commented on issue #24128: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files
HeartSaVioR commented on issue #24128: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files URL: https://github.com/apache/spark/pull/24128#issuecomment-564409487 @uncleGen Hi, do you plan to go ahead with your idea? I have been thinking about this issue, and your idea seems to be a realistic solution that doesn't introduce too many changes. We may also want to find a solution that could deal with most cases, but for now it would be great even with only your idea. If you're not planning to do it, would you mind if I pick your idea up?
[GitHub] [spark] SparkQA commented on issue #26817: [SPARK-30192][SQL] support column position in DS v2
SparkQA commented on issue #26817: [SPARK-30192][SQL] support column position in DS v2 URL: https://github.com/apache/spark/pull/26817#issuecomment-564409357 **[Test build #115151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115151/testReport)** for PR 26817 at commit [`be2f1b8`](https://github.com/apache/spark/commit/be2f1b8ecaea22eae09077e2ed28d0995fd5f587).
[GitHub] [spark] PavithraRamachandran commented on issue #26384: [SPARK-29460] [WEBUI]Add tooltip for Jobs page
PavithraRamachandran commented on issue #26384: [SPARK-29460] [WEBUI]Add tooltip for Jobs page URL: https://github.com/apache/spark/pull/26384#issuecomment-564408513 @srowen could you help me re-trigger the PR build?
[GitHub] [spark] HeartSaVioR commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly
HeartSaVioR commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly URL: https://github.com/apache/spark/pull/26845#issuecomment-564407397 Ah yes I had been investigating the issue. Thanks for reminding! Will update soon.
[GitHub] [spark] amanomer commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
amanomer commented on a change in pull request #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
URL: https://github.com/apache/spark/pull/26808#discussion_r356428632

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##
@@ -198,7 +198,7 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
 // Select the result of the first aggregate in the last aggregate.
 val result = AggregateExpression(
- aggregate.First(evalWithinGroup(regularGroupId, operator.toAttribute), Literal(true)),
+ new aggregate.First(evalWithinGroup(regularGroupId, operator.toAttribute), Literal(true)),

Review comment: Done
[GitHub] [spark] AmplabJenkins removed a comment on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures
AmplabJenkins removed a comment on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures URL: https://github.com/apache/spark/pull/26803#issuecomment-564405114 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115147/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures
AmplabJenkins removed a comment on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures URL: https://github.com/apache/spark/pull/26803#issuecomment-564405106 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform
AmplabJenkins removed a comment on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform URL: https://github.com/apache/spark/pull/26832#issuecomment-564405200 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform
AmplabJenkins commented on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform URL: https://github.com/apache/spark/pull/26832#issuecomment-564405203 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115146/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform
AmplabJenkins commented on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform URL: https://github.com/apache/spark/pull/26832#issuecomment-564405200 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform
AmplabJenkins removed a comment on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform URL: https://github.com/apache/spark/pull/26832#issuecomment-564405203 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115146/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform
SparkQA removed a comment on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform URL: https://github.com/apache/spark/pull/26832#issuecomment-564387974 **[Test build #115146 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115146/testReport)** for PR 26832 at commit [`a2d2c51`](https://github.com/apache/spark/commit/a2d2c517997fe27cb9073121ccee78d890d04e6f).
[GitHub] [spark] SparkQA commented on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform
SparkQA commented on issue #26832: [SPARK-30202][ML][PYSPARK] impl QuantileTransform URL: https://github.com/apache/spark/pull/26832#issuecomment-564404957 **[Test build #115146 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115146/testReport)** for PR 26832 at commit [`a2d2c51`](https://github.com/apache/spark/commit/a2d2c517997fe27cb9073121ccee78d890d04e6f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures
AmplabJenkins commented on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures URL: https://github.com/apache/spark/pull/26803#issuecomment-564405114 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115147/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures
AmplabJenkins commented on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures URL: https://github.com/apache/spark/pull/26803#issuecomment-564405106 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures
SparkQA removed a comment on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures URL: https://github.com/apache/spark/pull/26803#issuecomment-564387975 **[Test build #115147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115147/testReport)** for PR 26803 at commit [`5283294`](https://github.com/apache/spark/commit/5283294c72c9ab0586391e3c799a0cf2adede637).
[GitHub] [spark] SparkQA commented on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures
SparkQA commented on issue #26803: [SPARK-30178][ML] RobustScaler support large numFeatures URL: https://github.com/apache/spark/pull/26803#issuecomment-564404790 **[Test build #115147 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115147/testReport)** for PR 26803 at commit [`5283294`](https://github.com/apache/spark/commit/5283294c72c9ab0586391e3c799a0cf2adede637).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
SparkQA commented on issue #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions URL: https://github.com/apache/spark/pull/26808#issuecomment-564404094 **[Test build #115150 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115150/testReport)** for PR 26808 at commit [`a2d75de`](https://github.com/apache/spark/commit/a2d75de6d4ee273fbc451effdf4b6a9303232085).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26840: [SPARK-30038][SQL] DESCRIBE FUNCTION should do multi-catalog resolution
dongjoon-hyun commented on a change in pull request #26840: [SPARK-30038][SQL] DESCRIBE FUNCTION should do multi-catalog resolution
URL: https://github.com/apache/spark/pull/26840#discussion_r356423436

## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ##
@@ -1799,6 +1799,18 @@ class DataSourceV2SQLSuite
 }
 }

+ test("DESCRIBE FUNCTION not valid v1 namespace") {
+val e = intercept[AnalysisException] {
+ sql(s"DESCRIBE FUNCTION testcat.ns1.ns2.fun")

Review comment: `s"` -> `"`.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26840: [SPARK-30038][SQL] DESCRIBE FUNCTION should do multi-catalog resolution
dongjoon-hyun commented on a change in pull request #26840: [SPARK-30038][SQL] DESCRIBE FUNCTION should do multi-catalog resolution
URL: https://github.com/apache/spark/pull/26840#discussion_r356423491

## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ##
@@ -1799,6 +1799,18 @@ class DataSourceV2SQLSuite
 }
 }

+ test("DESCRIBE FUNCTION not valid v1 namespace") {
+val e = intercept[AnalysisException] {
+ sql(s"DESCRIBE FUNCTION testcat.ns1.ns2.fun")
+}
+assert(e.message.contains("DESCRIBE FUNCTION is only supported in v1 catalog"))
+
+val e1 = intercept[AnalysisException] {
+ sql(s"DESCRIBE FUNCTION default.ns1.ns2.fun")

Review comment: `s"` -> `"`.
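The two `s"` -> `"` comments rest on a small point about Scala string interpolation, sketched here for reference. Nothing in this snippet comes from the PR beyond the SQL text itself, and the `catalog` value is a hypothetical name introduced only for illustration. When a literal contains no `$` references, the `s` prefix has no effect on the resulting string, so a plain literal is the clearer choice.

```scala
object InterpolatorSketch extends App {
  val plain = "DESCRIBE FUNCTION testcat.ns1.ns2.fun"
  // The `s` prefix does nothing here: there is nothing to substitute.
  val interpolated = s"DESCRIBE FUNCTION testcat.ns1.ns2.fun"
  assert(plain == interpolated)

  // The interpolator is only needed when the literal references a value.
  val catalog = "testcat" // hypothetical variable, for illustration only
  assert(s"DESCRIBE FUNCTION $catalog.ns1.ns2.fun" == plain)
}
```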
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26840: [SPARK-30038][SQL] DESCRIBE FUNCTION should do multi-catalog resolution
dongjoon-hyun commented on a change in pull request #26840: [SPARK-30038][SQL] DESCRIBE FUNCTION should do multi-catalog resolution
URL: https://github.com/apache/spark/pull/26840#discussion_r356423053

## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ##
@@ -472,6 +472,19 @@ class ResolveSessionCatalog(
 tableName.asTableIdentifier,
 propertyKey)

+case DescribeFunctionStatement(CatalogAndIdentifierParts(catalog, functionName), extended) =>
+ val functionIdentifier = if (isSessionCatalog(catalog)) {
+ functionName match {

Review comment: ? indentation from line 477 to 483?
[GitHub] [spark] AmplabJenkins removed a comment on issue #26512: [SPARK-29493][SQL] Arrow MapType support
AmplabJenkins removed a comment on issue #26512: [SPARK-29493][SQL] Arrow MapType support URL: https://github.com/apache/spark/pull/26512#issuecomment-564400557 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
AmplabJenkins removed a comment on issue #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions URL: https://github.com/apache/spark/pull/26808#issuecomment-564400567 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19961/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
AmplabJenkins removed a comment on issue #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions URL: https://github.com/apache/spark/pull/26808#issuecomment-564400558 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26512: [SPARK-29493][SQL] Arrow MapType support
AmplabJenkins removed a comment on issue #26512: [SPARK-29493][SQL] Arrow MapType support URL: https://github.com/apache/spark/pull/26512#issuecomment-564400566 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/19962/ Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #26817: [SPARK-30192][SQL] support column position in DS v2
cloud-fan commented on a change in pull request #26817: [SPARK-30192][SQL] support column position in DS v2
URL: https://github.com/apache/spark/pull/26817#discussion_r356422627

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java ##
@@ -113,7 +114,30 @@ static TableChange addColumn(
 DataType dataType,
 boolean isNullable,
 String comment) {
-return new AddColumn(fieldNames, dataType, isNullable, comment);
+return new AddColumn(fieldNames, dataType, isNullable, comment, null);
+ }
+
+ /**
+ * Create a TableChange for adding a column.
+ *
+ * If the field already exists, the change will result in an {@link IllegalArgumentException}.
+ * If the new field is nested and its parent does not exist or is not a struct, the change will
+ * result in an {@link IllegalArgumentException}.

Review comment: This doc is for catalog implementations. They should throw `IllegalArgumentException` if something goes wrong.
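The contract quoted in this Javadoc can be sketched from the point of view of a catalog implementation. Everything below is hypothetical: `DataType`, `IntType`, `StructType`, and `AddColumnSketch.addColumn` are minimal stand-ins written for this illustration, not Spark's types or the `TableChange` API. The sketch only shows the three failure cases the doc names (field already exists, parent missing, parent not a struct), each surfacing as an `IllegalArgumentException`.

```scala
// Hypothetical minimal schema model; not Spark's StructType.
sealed trait DataType
case object IntType extends DataType
final case class StructType(fields: Map[String, DataType]) extends DataType

object AddColumnSketch {
  /** Adds the column at path `fieldNames` (e.g. Seq("point", "z")) to `schema`. */
  def addColumn(schema: StructType, fieldNames: Seq[String], dt: DataType): StructType =
    fieldNames match {
      case Seq(name) =>
        // Leaf of the path: the field must not exist yet.
        if (schema.fields.contains(name))
          throw new IllegalArgumentException(s"Field already exists: $name")
        StructType(schema.fields + (name -> dt))
      case parent +: rest =>
        // Nested path: the parent must exist and must be a struct.
        schema.fields.get(parent) match {
          case Some(s: StructType) =>
            StructType(schema.fields + (parent -> addColumn(s, rest, dt)))
          case Some(_) =>
            throw new IllegalArgumentException(s"Not a struct: $parent")
          case None =>
            throw new IllegalArgumentException(s"Parent does not exist: $parent")
        }
      case _ =>
        throw new IllegalArgumentException("Empty field name")
    }
}
```

For example, adding `Seq("point", "z")` to a schema whose `point` field is a struct succeeds, while adding `Seq("id")` when `id` already exists, or `Seq("id", "a")` when `id` is not a struct, throws.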
[GitHub] [spark] AmplabJenkins commented on issue #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions
AmplabJenkins commented on issue #26808: [SPARK-30184][SQL] Implement a helper method for aliasing functions URL: https://github.com/apache/spark/pull/26808#issuecomment-564400558 Merged build finished. Test PASSed.