[GitHub] [spark] AmplabJenkins removed a comment on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning
AmplabJenkins removed a comment on pull request #29014: URL: https://github.com/apache/spark/pull/29014#issuecomment-661650757 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/126213/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #29166: [SPARK-32372][SQL] ResolveReferences.dedupRight should only rewrite attributes for ancestor nodes of the conflict plan
maropu commented on a change in pull request #29166: URL: https://github.com/apache/spark/pull/29166#discussion_r457854335 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1237,20 +1249,44 @@ class Analyzer( if (conflictPlans.isEmpty) { right } else { -val attributeRewrites = AttributeMap(conflictPlans.flatMap { - case (oldRelation, newRelation) => oldRelation.output.zip(newRelation.output)}) -val conflictPlanMap = conflictPlans.toMap -// transformDown so that we can replace all the old Relations in one turn due to -// the reason that `conflictPlans` are also collected in pre-order. -right transformDown { - case r => conflictPlanMap.getOrElse(r, r) -} transformUp { - case other => other transformExpressions { +rewritePlan(right, conflictPlans.toMap)._1 + } +} + +private def rewritePlan(plan: LogicalPlan, conflictPlanMap: Map[LogicalPlan, LogicalPlan]) + : (LogicalPlan, mutable.ArrayBuffer[(Attribute, Attribute)]) = { + val attrMapping = new mutable.ArrayBuffer[(Attribute, Attribute)]() + if (conflictPlanMap.contains(plan)) { +// If the plan is the one that conflict the with left one, we'd +// just replace it with the new plan and collect the rewrite +// attributes for the parent node. +val newRelation = conflictPlanMap(plan) +attrMapping ++= plan.output.zip(newRelation.output) +newRelation -> attrMapping + } else { +var newPlan = plan.mapChildren { child => Review comment: nit: `val` is better. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning
AmplabJenkins removed a comment on pull request #29014: URL: https://github.com/apache/spark/pull/29014#issuecomment-661650753 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning
AmplabJenkins commented on pull request #29014: URL: https://github.com/apache/spark/pull/29014#issuecomment-661650753 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor
AmplabJenkins removed a comment on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-661649619 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/126212/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning
SparkQA removed a comment on pull request #29014: URL: https://github.com/apache/spark/pull/29014#issuecomment-661598793 **[Test build #126213 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126213/testReport)** for PR 29014 at commit [`54c2235`](https://github.com/apache/spark/commit/54c2235e480e6673fb8ab84341a338d68970c3f3). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor
AmplabJenkins removed a comment on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-661649608 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning
SparkQA commented on pull request #29014: URL: https://github.com/apache/spark/pull/29014#issuecomment-661649994 **[Test build #126213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126213/testReport)** for PR 29014 at commit [`54c2235`](https://github.com/apache/spark/commit/54c2235e480e6673fb8ab84341a338d68970c3f3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class ExecutorProcessLost(` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor
AmplabJenkins commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-661649608 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
AmplabJenkins removed a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661648937 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
AmplabJenkins commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661648937 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type
cloud-fan commented on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661648916 LGTM if tests pass. One last question: Hive also fails to create view if the column is null type? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
SparkQA removed a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661503811 **[Test build #126209 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126209/testReport)** for PR 29172 at commit [`27bd6d5`](https://github.com/apache/spark/commit/27bd6d55ff85e4deecbb36eb04185382cca256d3). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor
SparkQA removed a comment on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-661598746 **[Test build #126212 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126212/testReport)** for PR 29032 at commit [`71633f0`](https://github.com/apache/spark/commit/71633f0e433b4c81ffd626f1522327b0d6d21759). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
SparkQA commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661648372 **[Test build #126209 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126209/testReport)** for PR 29172 at commit [`27bd6d5`](https://github.com/apache/spark/commit/27bd6d55ff85e4deecbb36eb04185382cca256d3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor
SparkQA commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-661648257 **[Test build #126212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126212/testReport)** for PR 29032 at commit [`71633f0`](https://github.com/apache/spark/commit/71633f0e433b4c81ffd626f1522327b0d6d21759). * This patch **fails PySpark pip packaging tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class ExecutorDecommissionInfo(message: String, isHostDecommissioned: Boolean)` * ` case class DecommissionExecutor(executorId: String, decommissionInfo: ExecutorDecommissionInfo)` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29175: [SPARK-32377][SQL][2.4] CaseInsensitiveMap should be deterministic for addition
AmplabJenkins removed a comment on pull request #29175: URL: https://github.com/apache/spark/pull/29175#issuecomment-661645796 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29175: [SPARK-32377][SQL][2.4] CaseInsensitiveMap should be deterministic for addition
AmplabJenkins commented on pull request #29175: URL: https://github.com/apache/spark/pull/29175#issuecomment-661645796 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29175: [SPARK-32377][SQL][2.4] CaseInsensitiveMap should be deterministic for addition
SparkQA commented on pull request #29175: URL: https://github.com/apache/spark/pull/29175#issuecomment-661645518 **[Test build #126221 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126221/testReport)** for PR 29175 at commit [`91683e2`](https://github.com/apache/spark/commit/91683e2e161a150b0521c8289234a233b73cd98c). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HeartSaVioR commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files
HeartSaVioR commented on pull request #28422: URL: https://github.com/apache/spark/pull/28422#issuecomment-661645101 FYI, I've initiated the discussion around this in dev@ mailing list, how to deal with "latestFirst" option and metadata growing. https://lists.apache.org/thread.html/r08e3a8d7df74354b38d19ffdebe1afe7fa73c2f611f0a812a867dffb%40%3Cdev.spark.apache.org%3E This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29175: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun commented on a change in pull request #29175: URL: https://github.com/apache/spark/pull/29175#discussion_r457847560 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/CaseInsensitiveMapSuite.scala ## @@ -0,0 +1,30 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.util + +import org.apache.spark.SparkFunSuite + +class CaseInsensitiveMapSuite extends SparkFunSuite { Review comment: I want to keep this test in `catalyst` module, but cannot find a proper test suite. I'm open to all suggestion for a better test suite if exists. ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/CaseInsensitiveMapSuite.scala ## @@ -0,0 +1,30 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.util + +import org.apache.spark.SparkFunSuite + +class CaseInsensitiveMapSuite extends SparkFunSuite { Review comment: I want to keep this test in `catalyst` module, but cannot find a proper test suite. I'm open to any suggestion for a better test suite if exists. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29175: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun commented on a change in pull request #29175: URL: https://github.com/apache/spark/pull/29175#discussion_r457847560 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/CaseInsensitiveMapSuite.scala ## @@ -0,0 +1,30 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.util + +import org.apache.spark.SparkFunSuite + +class CaseInsensitiveMapSuite extends SparkFunSuite { Review comment: I want to keep this test in `catalyst`, but cannot find a proper one. I'm open to all suggestion for a better test suite if exists. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun opened a new pull request #29175: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun opened a new pull request #29175: URL: https://github.com/apache/spark/pull/29175 ### What changes were proposed in this pull request? This PR aims to fix `CaseInsensitiveMap` to be deterministic for addition. This is a backporting of https://github.com/apache/spark/pull/29172 . ### Why are the changes needed? ```scala import org.apache.spark.sql.catalyst.util.CaseInsensitiveMap var m = CaseInsensitiveMap(Map.empty[String, String]) Seq(("paTh", "1"), ("PATH", "2"), ("Path", "3"), ("patH", "4"), ("path", "5")).foreach { kv => m = (m + kv).asInstanceOf[CaseInsensitiveMap[String]] println(m.get("path")) } ``` **BEFORE** ``` Some(1) Some(2) Some(3) Some(4) Some(1) ``` **AFTER** ``` Some(1) Some(2) Some(3) Some(4) Some(5) ``` ### Does this PR introduce _any_ user-facing change? Yes, but this is a bug fix on non-deterministic behavior. ### How was this patch tested? Pass the newly added test case. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29153: [SPARK-32310][ML][PySpark][WIP] ML params default value parity in feature and tuning
AmplabJenkins commented on pull request #29153: URL: https://github.com/apache/spark/pull/29153#issuecomment-661643494 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29153: [SPARK-32310][ML][PySpark][WIP] ML params default value parity in feature and tuning
AmplabJenkins removed a comment on pull request #29153: URL: https://github.com/apache/spark/pull/29153#issuecomment-661643494 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29153: [SPARK-32310][ML][PySpark][WIP] ML params default value parity in feature and tuning
SparkQA commented on pull request #29153: URL: https://github.com/apache/spark/pull/29153#issuecomment-661643229 **[Test build #126220 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126220/testReport)** for PR 29153 at commit [`1586e30`](https://github.com/apache/spark/commit/1586e3079b442daf5ab5332a3d690f218df423cc). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661639927 Merged to master/3.0. I'll make a backporting PR for `branch-2.4`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661639361 Thank you all, @cloud-fan , @HyukjinKwon , @viirya , @maropu . I'll merge this PR because the first PR already passed GitHub Action. The second commit is only about `scala-2.13` which is not complied in Jenkins/GitHub Action currently. Its compilation is verified locally. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun closed pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun closed pull request #29172: URL: https://github.com/apache/spark/pull/29172 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] sachinaraballi commented on a change in pull request #29174: Addition of missing EXECUTOR_INSTANCES configuration key
sachinaraballi commented on a change in pull request #29174: URL: https://github.com/apache/spark/pull/29174#discussion_r457841580 ## File path: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java ## @@ -70,6 +70,8 @@ public static final String EXECUTOR_EXTRA_JAVA_OPTIONS = "spark.executor.extraJavaOptions"; /** Configuration key for the executor native library path. */ public static final String EXECUTOR_EXTRA_LIBRARY_PATH = "spark.executor.extraLibraryPath"; + /** Configuration key for the number of executor instances. */ + public static final String EXECUTOR_INSTANCES = "spark.executor.instances"; Review comment: Actually in my scenario ``` sparkLauncher.setConf("spark.executor.instances", executors); sparkLauncher.setConf(SparkLauncher.EXECUTOR_MEMORY, executorMemory); sparkLauncher.setConf(SparkLauncher.DRIVER_MEMORY, driverMemory ); sparkLauncher.setConf(SparkLauncher.EXECUTOR_CORES, executorCores); ``` There is no option in SparkLauncher for executors. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] sachinaraballi commented on a change in pull request #29174: Addition of missing EXECUTOR_INSTANCES configuration key
sachinaraballi commented on a change in pull request #29174: URL: https://github.com/apache/spark/pull/29174#discussion_r457841580 ## File path: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java ## @@ -70,6 +70,8 @@ public static final String EXECUTOR_EXTRA_JAVA_OPTIONS = "spark.executor.extraJavaOptions"; /** Configuration key for the executor native library path. */ public static final String EXECUTOR_EXTRA_LIBRARY_PATH = "spark.executor.extraLibraryPath"; + /** Configuration key for the number of executor instances. */ + public static final String EXECUTOR_INSTANCES = "spark.executor.instances"; Review comment: Actually in my scenario ```suggestion sparkLauncher.setConf("spark.executor.instances", executors); sparkLauncher.setConf(SparkLauncher.EXECUTOR_MEMORY, executorMemory); sparkLauncher.setConf(SparkLauncher.DRIVER_MEMORY, driverMemory ); sparkLauncher.setConf(SparkLauncher.EXECUTOR_CORES, executorCores); ``` There is no option in SparkLauncher for executors. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon edited a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
HyukjinKwon edited a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661637178 I thought it's better to be safe always for `originalMap` to contain the original map as is, and now it's broken. `originalMap` is an exposed property in `CaseInsensitiveMap`. Now once you combine `CaseInsensitiveMap` with another entry, e.g. `caseInsensitiveMap + ("option" -> 1)`, the `originalMap` removes the duplicates such as `Option`, `oPtion`, etc. Maybe I am too much cautious about it. I agree that it's not going to cause an external problem for now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon edited a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
HyukjinKwon edited a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661637178 I thought it's better to be safe always for `originalMap` to contain the original map as is, and now it's broken. `originalMap` is an exposed property in `CaseInsensitiveMap`. Now once you combine `CaseInsensitiveMap` with another entry, e.g. `(caseInsensitiveMap + ("option" -> 1)`, the `originalMap` removes the duplicates such as `Option`, `oPtion`, etc. Maybe I am too much cautious about it. I agree that it's not going to cause an external problem for now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
HyukjinKwon commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661637178 I thought it's better to be safe to always for `originalMap` to contain the original map as is, and now it's broken. `originalMap` is an exposed property in `CaseInsensitiveMap`. Now once you combine `CaseInsensitiveMap` with another entry, e.g. `(caseInsensitiveMap + ("option" -> 1)`, the `originalMap` removes the duplicates such as `Option`, `oPtion`,etc. Maybe I am too much cautious about it. I agree that it's not going to cause a problem for now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon edited a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
HyukjinKwon edited a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661637178 I thought it's better to be safe to always for `originalMap` to contain the original map as is, and now it's broken. `originalMap` is an exposed property in `CaseInsensitiveMap`. Now once you combine `CaseInsensitiveMap` with another entry, e.g. `(caseInsensitiveMap + ("option" -> 1)`, the `originalMap` removes the duplicates such as `Option`, `oPtion`,etc. Maybe I am too much cautious about it. I agree that it's not going to cause an external problem for now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] viirya commented on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests
viirya commented on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661635973 Thanks @HyukjinKwon This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] viirya commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
viirya commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661635735 > originalMap will lose the duplicated instances when it adds now. Not sure why this is a concern? I think `originalMap` is immutable. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29166: [SPARK-32372][SQL] ResolveReferences.dedupRight should only rewrite attributes for ancestor nodes of the conflict plan
cloud-fan commented on a change in pull request #29166: URL: https://github.com/apache/spark/pull/29166#discussion_r457836362 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1237,20 +1249,44 @@ class Analyzer( if (conflictPlans.isEmpty) { right } else { -val attributeRewrites = AttributeMap(conflictPlans.flatMap { - case (oldRelation, newRelation) => oldRelation.output.zip(newRelation.output)}) -val conflictPlanMap = conflictPlans.toMap -// transformDown so that we can replace all the old Relations in one turn due to -// the reason that `conflictPlans` are also collected in pre-order. -right transformDown { - case r => conflictPlanMap.getOrElse(r, r) -} transformUp { - case other => other transformExpressions { +rewritePlan(right, conflictPlans.toMap)._1 + } +} + +private def rewritePlan(plan: LogicalPlan, conflictPlanMap: Map[LogicalPlan, LogicalPlan]) Review comment: This rewriting is much more surgical than before, +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #29166: [SPARK-32372][SQL] ResolveReferences.dedupRight should only rewrite attributes for ancestor nodes of the conflict plan
cloud-fan commented on pull request #29166: URL: https://github.com/apache/spark/pull/29166#issuecomment-661633497 cc @viirya This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
AmplabJenkins commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661632836 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
AmplabJenkins removed a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661632836 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
SparkQA commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661632576 **[Test build #126219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126219/testReport)** for PR 29172 at commit [`a6dc25d`](https://github.com/apache/spark/commit/a6dc25dcacd34b315bf67d8bb1d28e52f2dec3bb). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #28764: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET
maropu commented on a change in pull request #28764: URL: https://github.com/apache/spark/pull/28764#discussion_r457835108 ## File path: sql/core/src/test/resources/sql-functions/sql-expression-schema.md ## @@ -291,6 +291,7 @@ | org.apache.spark.sql.catalyst.expressions.Uuid | uuid | SELECT uuid() | struct | | org.apache.spark.sql.catalyst.expressions.WeekDay | weekday | SELECT weekday('2009-07-30') | struct | | org.apache.spark.sql.catalyst.expressions.WeekOfYear | weekofyear | SELECT weekofyear('2008-02-20') | struct | +| org.apache.spark.sql.catalyst.expressions.WidthBucket | width_bucket | SELECT width_bucket(5.3, 0.2, 10.6, 5) | struct | Review comment: yea, it looks better. I'll fix it later. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661632212 Thank you, @HyukjinKwon . @cloud-fan and @HyukjinKwon . I updated Scala-2.13 code consistently with the same code logically. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun commented on a change in pull request #29172: URL: https://github.com/apache/spark/pull/29172#discussion_r457834857 ## File path: sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/util/CaseInsensitiveMap.scala ## @@ -40,7 +40,7 @@ class CaseInsensitiveMap[T] private (val originalMap: Map[String, T]) extends Ma keyLowerCasedMap.contains(k.toLowerCase(Locale.ROOT)) override def updated[B1 >: T](key: String, value: B1): Map[String, B1] = { -new CaseInsensitiveMap[B1](originalMap + (key -> value)) +new CaseInsensitiveMap[B1](originalMap.filter(!_._1.equalsIgnoreCase(key)) + (key -> value)) Review comment: This one is also reusing the logic at `removed` at line 49. I also verified that Scala-2.13 compilation. For now, Scala-2.13 testing is not possible. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #29174: Addition of missing EXECUTOR_INSTANCES configuration key
HyukjinKwon commented on a change in pull request #29174: URL: https://github.com/apache/spark/pull/29174#discussion_r457834593 ## File path: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java ## @@ -70,6 +70,8 @@ public static final String EXECUTOR_EXTRA_JAVA_OPTIONS = "spark.executor.extraJavaOptions"; /** Configuration key for the executor native library path. */ public static final String EXECUTOR_EXTRA_LIBRARY_PATH = "spark.executor.extraLibraryPath"; + /** Configuration key for the number of executor instances. */ + public static final String EXECUTOR_INSTANCES = "spark.executor.instances"; Review comment: Can you use this constant in other places? For example, `core/src/main/scala/org/apache/spark/internal/config/package.scala` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29167: [SPARK-32374][SQL] Disallow setting properties when creating temporary views
cloud-fan commented on a change in pull request #29167: URL: https://github.com/apache/spark/pull/29167#discussion_r457834146 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala ## @@ -266,6 +266,17 @@ abstract class SQLViewSuite extends QueryTest with SQLTestUtils { } } + test("SPARK-32374: disallow setting properties for CREATE TEMPORARY VIEW") { +withTempView("myabcdview") { + val e = intercept[AnalysisException] { +sql("CREATE TEMPORARY VIEW myabcdview TBLPROPERTIES ('a' = 'b') AS SELECT * FROM jt") + } + assert(e.message.contains( +"Operation not allowed: CREATE TEMPORARY VIEW ... " + + "TBLPROPERTIES (property_name = property_value, ...)")) Review comment: how about: `Operation not allowed: TBLPROPERTIES can't coexist with CREATE TEMPORARY VIEW` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29167: [SPARK-32374][SQL] Disallow setting properties when creating temporary views
cloud-fan commented on a change in pull request #29167: URL: https://github.com/apache/spark/pull/29167#discussion_r457833897 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -3515,6 +3515,13 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging } } +val properties = ctx.tablePropertyList.asScala.headOption.map(visitPropertyKeyValues) + .getOrElse(Map.empty) +if (ctx.TEMPORARY != null && !properties.isEmpty) { + operationNotAllowed( +"CREATE TEMPORARY VIEW ... TBLPROPERTIES (property_name = property_value, ...)", ctx) Review comment: I think we should move that check to the parser as well (in a separate PR). If we support view in DS v2, we shouldn't duplicated the check in both v1 and v2 commands. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group
AmplabJenkins removed a comment on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661630958 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29174: Addition of missing EXECUTOR_INSTANCES configuration key
AmplabJenkins commented on pull request #29174: URL: https://github.com/apache/spark/pull/29174#issuecomment-661630835 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29174: Addition of missing EXECUTOR_INSTANCES configuration key
AmplabJenkins removed a comment on pull request #29174: URL: https://github.com/apache/spark/pull/29174#issuecomment-661630521 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group
AmplabJenkins commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661630958 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group
SparkQA commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661630661 **[Test build #126218 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126218/testReport)** for PR 28977 at commit [`38e5ff4`](https://github.com/apache/spark/commit/38e5ff423084005424461daa8c5085ff32a41a06). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29174: Addition of missing EXECUTOR_INSTANCES configuration key
AmplabJenkins commented on pull request #29174: URL: https://github.com/apache/spark/pull/29174#issuecomment-661630521 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29173: Fix permission problem in client.prepareLocalResources() while spark.…
HyukjinKwon commented on pull request #29173: URL: https://github.com/apache/spark/pull/29173#issuecomment-661630520 @sekingme, please file a JIRA and format the PR title correctly. See also http://spark.apache.org/contributing.html. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group
HyukjinKwon commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661629804 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon closed pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
HyukjinKwon closed pull request #29154: URL: https://github.com/apache/spark/pull/29154 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
HyukjinKwon commented on pull request #29154: URL: https://github.com/apache/spark/pull/29154#issuecomment-661629688 Merged to master, branch-3.0 and branch-2.4. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] sachinaraballi opened a new pull request #29174: Addition of missing EXECUTOR_INSTANCES configuration key
sachinaraballi opened a new pull request #29174: URL: https://github.com/apache/spark/pull/29174 spark.executor.instances is missing in the constant list ### What changes were proposed in this pull request? The constant EXECUTOR_INSTANCES is missing in the sparkLauncher list. Currently we need to manually add this parameter. ### Why are the changes needed? No ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? N/A This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29173: Fix permission problem in client.prepareLocalResources() while spark.…
AmplabJenkins commented on pull request #29173: URL: https://github.com/apache/spark/pull/29173#issuecomment-661628872 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29173: Fix permission problem in client.prepareLocalResources() while spark.…
AmplabJenkins removed a comment on pull request #29173: URL: https://github.com/apache/spark/pull/29173#issuecomment-661628601 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29173: Fix permission problem in client.prepareLocalResources() while spark.…
AmplabJenkins commented on pull request #29173: URL: https://github.com/apache/spark/pull/29173#issuecomment-661628601 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661628411 Thank you, @cloud-fan . BTW, hold on one second. I found that I need to update Scala-2.13 code path~ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
cloud-fan commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661627579 I think this patch is easier to reason about, and we should have done it in the first place. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] sekingme opened a new pull request #29173: Fix permission problem in client.prepareLocalResources() while spark.…
sekingme opened a new pull request #29173: URL: https://github.com/apache/spark/pull/29173 …yarn.stagingDir and spark.yarn.archive points to different file system ### What changes were proposed in this pull request? if spark.yarn.archive and spark.yarn.stagingDir points to a different file system, the directory of spark.yarn.archive needs to be upload to a destPath(started with spark.yarn.stagingDir). But the default permission set to the destPath is 644(APP_FILE_PERMISSION). This permission is OK with file case, but not with directory. So, fix the permission problem after coping a directory in org.apache.spark.deploy.yarn.copyFileToRemote(): `2020-07-20 14:53:36,488 [main] INFO org.apache.spark.deploy.yarn.Client - Setting up container launch context for our AM 2020-07-20 14:53:36,490 [main] INFO org.apache.spark.deploy.yarn.Client - Setting up the launch environment for our AM container 2020-07-20 14:53:36,494 [main] INFO org.apache.spark.deploy.yarn.Client - Preparing resources for our AM container 2020-07-20 14:53:36,582 [main] INFO org.apache.spark.deploy.yarn.Client - Uploading resource hdfs://n-fed/user/oozie/share/lib/lib_2020060819522/sparksql -> hdfs://test-hadoop/data/spark/stagingDir/.sparkStaging/application_1591344376643_55140/sparksql 2020-07-20 14:53:48,171 [main] INFO org.apache.spark.deploy.yarn.Client - Uploading resource hdfs://n-fed/user/oozie/share/lib/lib_2020060819522/sparksql/javax.inject-2.4.0-b34.jar -> hdfs://test-hadoop/data/spark/stagingDir/.sparkStaging/application_1591344376643_55140/javax.inject-2.4.0-b34.jar` ### Why are the changes needed? Permission denied error happens without this changes: `Diagnostics: Permission denied: user=jsq, access=READ_EXECUTE, inode="/data/spark/stagingDir/.sparkStaging/application_1591344376643_55140/sparksql":jsq:hdfs:drw-r--r-- at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1780) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1764) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1738) at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getListingInt(FSDirStatAndListingOp.java:76) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4565) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1104) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:642) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2211) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2207) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2205)` ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Unit test is passed and the patch has been tested in practice with our spark cluster. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results
dongjoon-hyun commented on pull request #29158: URL: https://github.com/apache/spark/pull/29158#issuecomment-661624613 +1, late LGTM. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type
AmplabJenkins removed a comment on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661623951 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/126211/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type
SparkQA removed a comment on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661576957 **[Test build #126211 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126211/testReport)** for PR 29152 at commit [`4d8d088`](https://github.com/apache/spark/commit/4d8d0888277fa1cf02317d2f3cc0e037eabbdfea). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type
AmplabJenkins commented on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661623947 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type
AmplabJenkins removed a comment on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661623947 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type
SparkQA commented on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661623860 **[Test build #126211 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126211/testReport)** for PR 29152 at commit [`4d8d088`](https://github.com/apache/spark/commit/4d8d0888277fa1cf02317d2f3cc0e037eabbdfea). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
SparkQA removed a comment on pull request #29154: URL: https://github.com/apache/spark/pull/29154#issuecomment-661613910 **[Test build #126217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126217/testReport)** for PR 29154 at commit [`96fa931`](https://github.com/apache/spark/commit/96fa9318bfb2a4d94f0f2347f4f2b86e81940143). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun edited a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661616433 For @HyukjinKwon comments, - `CaseInsensitiveMap` is already using modified `originalMap` at `-` operation. Technically, `originalMap` is still immutable, and `+` operation can use modified `originalMap` too in the same reason. In short, both `-` and `+` operation should return the desirable map object for case-insensitivity. Also, this PR makes the class more consistent. - In general, `DataFrameReader` [example](https://github.com/apache/spark/pull/29172#issuecomment-661604611) is orthogonal to this PR because this PR is focusing on `CaseInsensitiveMap` only. - For the example, I believe it's not a good practice. For that alternative approach, we need to add an implicit assumption into the Apache Spark code-base. And, it'll be fragile when we add a new code. - We should warn that `.toMap` should not be used always. - All concatenation should start with `collection.immutable.ListMap.empty[String, String]` always. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun edited a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661616433 For @HyukjinKwon comments, - `CaseInsensitiveMap` is already using modified `originalMap` at `-` operation. Technically, `originalMap` is still immutable, and `+` operation can use modified `originalMap` too in the same reason. In short, both `-` and `+` operation should return the desirable map object for case-insensitivity with this kind of `indeterministic` behavior. Also, this PR makes the `CaseInsensitiveMap` class more consistent. - In general, `DataFrameReader` [example](https://github.com/apache/spark/pull/29172#issuecomment-661604611) is orthogonal to this PR because this PR is focusing on `CaseInsensitiveMap` only. - For the example, I believe it's not a good practice. For that alternative approach, we need to add an implicit assumption into the Apache Spark code-base. And, it'll be fragile when we add a new code. - We should warn that `.toMap` should not be used always. - All concatenation should start with `collection.immutable.ListMap.empty[String, String]` always. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
AmplabJenkins commented on pull request #29154: URL: https://github.com/apache/spark/pull/29154#issuecomment-661617108 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
AmplabJenkins removed a comment on pull request #29154: URL: https://github.com/apache/spark/pull/29154#issuecomment-661617108 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661616433 For @HyukjinKwon comments, - `CaseInsensitiveMap` is already modifying `originalMap` at `-` operation. `originalMap` is not immutable already. I believe `+` operation can modify it too in the same reason. Both `-` and `+` operation returns the desirable `CaseInsensitiveMap`. - In general, `DataFrameReader` [example](https://github.com/apache/spark/pull/29172#issuecomment-661604611) is orthogonal to this PR because this PR is focusing on `CaseInsensitiveMap` only. - For the example, I believe it's not a good practice. For that alternative approach, we need to add an implicit assumption into the Apache Spark code-base. And, it'll be fragile when we add a new code. - We should warn that `.toMap` should not be used always. - All concatenation should start with `collection.immutable.ListMap.empty[String, String]` always. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HeartSaVioR commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue
HeartSaVioR commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-661617632 cc. @tdas @zsxwing @jose-torres as it's technically not possible to merge only from @xuanyuanking 's review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #29142: [SPARK-32360][SQL] Add MaxMinBy to support eliminate sorts
dongjoon-hyun commented on pull request #29142: URL: https://github.com/apache/spark/pull/29142#issuecomment-661618176 cc @gatorsmile This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun edited a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661616433 For @HyukjinKwon comments, - `CaseInsensitiveMap` is already using modified `originalMap` at `-` operation. Technically, `originalMap` is still immutable, and `+` operation can use modified `originalMap` too in the same reason. In short, both `-` and `+` operation should return the desirable map object for case-insensitivity with this kind of `indeterministic` behavior. Also, this PR makes the class more consistent. - In general, `DataFrameReader` [example](https://github.com/apache/spark/pull/29172#issuecomment-661604611) is orthogonal to this PR because this PR is focusing on `CaseInsensitiveMap` only. - For the example, I believe it's not a good practice. For that alternative approach, we need to add an implicit assumption into the Apache Spark code-base. And, it'll be fragile when we add a new code. - We should warn that `.toMap` should not be used always. - All concatenation should start with `collection.immutable.ListMap.empty[String, String]` always. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
SparkQA commented on pull request #29154: URL: https://github.com/apache/spark/pull/29154#issuecomment-661616997 **[Test build #126217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126217/testReport)** for PR 29154 at commit [`96fa931`](https://github.com/apache/spark/commit/96fa9318bfb2a4d94f0f2347f4f2b86e81940143). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun edited a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661616433 For @HyukjinKwon comments, - `CaseInsensitiveMap` is already modifying `originalMap` at `-` operation. Technically, `originalMap` is immutable, but `+` operation can modify it too in the same reason. In short, both `-` and `+` operation should return the desirable map object for case-insensitivity. Also, this PR makes the class more consistent. - In general, `DataFrameReader` [example](https://github.com/apache/spark/pull/29172#issuecomment-661604611) is orthogonal to this PR because this PR is focusing on `CaseInsensitiveMap` only. - For the example, I believe it's not a good practice. For that alternative approach, we need to add an implicit assumption into the Apache Spark code-base. And, it'll be fragile when we add a new code. - We should warn that `.toMap` should not be used always. - All concatenation should start with `collection.immutable.ListMap.empty[String, String]` always. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition
dongjoon-hyun edited a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661616433 For @HyukjinKwon comments, - `CaseInsensitiveMap` is already using modified `originalMap` at `-` operation. Technically, `originalMap` is immutable, but `+` operation can modify it too in the same reason. In short, both `-` and `+` operation should return the desirable map object for case-insensitivity. Also, this PR makes the class more consistent. - In general, `DataFrameReader` [example](https://github.com/apache/spark/pull/29172#issuecomment-661604611) is orthogonal to this PR because this PR is focusing on `CaseInsensitiveMap` only. - For the example, I believe it's not a good practice. For that alternative approach, we need to add an implicit assumption into the Apache Spark code-base. And, it'll be fragile when we add a new code. - We should warn that `.toMap` should not be used always. - All concatenation should start with `collection.immutable.ListMap.empty[String, String]` always. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29165: [SPARK-32368][SQL] pathGlobFilter, recursiveFileLookup and basePath should respect case insensitivity
HyukjinKwon commented on pull request #29165: URL: https://github.com/apache/spark/pull/29165#issuecomment-661615530 Thank you @dongjoon-hyun ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29171: Spark works despite SSL certificate in keystore has expired
HyukjinKwon commented on pull request #29171: URL: https://github.com/apache/spark/pull/29171#issuecomment-661614936 Please file an issue in JIRA, and read http://spark.apache.org/contributing.html This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon closed pull request #29171: Spark works despite SSL certificate in keystore has expired
HyukjinKwon closed pull request #29171: URL: https://github.com/apache/spark/pull/29171 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results
HyukjinKwon commented on pull request #29158: URL: https://github.com/apache/spark/pull/29158#issuecomment-661614495 Merged to master and branch-3.0. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
AmplabJenkins removed a comment on pull request #29154: URL: https://github.com/apache/spark/pull/29154#issuecomment-661614373 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
AmplabJenkins commented on pull request #29154: URL: https://github.com/apache/spark/pull/29154#issuecomment-661614373 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon closed pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results
HyukjinKwon closed pull request #29158: URL: https://github.com/apache/spark/pull/29158 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
SparkQA commented on pull request #29154: URL: https://github.com/apache/spark/pull/29154#issuecomment-661613910 **[Test build #126217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126217/testReport)** for PR 29154 at commit [`96fa931`](https://github.com/apache/spark/commit/96fa9318bfb2a4d94f0f2347f4f2b86e81940143). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29154: [MINOR][DOCS] add link for Debugging your Application in running-on-yarn.html#launching-spark-on-yarn
HyukjinKwon commented on pull request #29154: URL: https://github.com/apache/spark/pull/29154#issuecomment-661613716 Jenkins test this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests
HyukjinKwon commented on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661612967 Let me take a quick look if there are any other good easy way to do this. If there' no, let's go for this way This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Fix flakiness in pip package testing in Jenkins
dongjoon-hyun commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-661612523 +1 for the decision. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests
HyukjinKwon commented on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661612446 Yeah, +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Fix flakiness in pip package testing in Jenkins
AmplabJenkins removed a comment on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-661611893 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Fix flakiness in pip package testing in Jenkins
AmplabJenkins commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-661611893 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Fix flakiness in pip package testing in Jenkins
HyukjinKwon commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-661611708 Thank you @srowen and @dongjoon-hyun. I am pretty sure it's going to prevent or reduce the flakiness. I will merge as soon as the tests pass. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Fix flakiness in pip package testing in Jenkins
SparkQA commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-661611654 **[Test build #126216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126216/testReport)** for PR 29117 at commit [`51187d1`](https://github.com/apache/spark/commit/51187d1e012ea8e6259492d125037a32ea75c1f1). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #29117: [SPARK-32363][PYTHON][BUILD] Fix flakiness in pip package testing in Jenkins
HyukjinKwon commented on a change in pull request #29117: URL: https://github.com/apache/spark/pull/29117#discussion_r457816625 ## File path: dev/run-pip-tests ## @@ -63,10 +63,9 @@ fi PYSPARK_VERSION=$(python3 -c "exec(open('python/pyspark/version.py').read());print(__version__)") PYSPARK_DIST="$FWDIR/python/dist/pyspark-$PYSPARK_VERSION.tar.gz" # The pip install options we use for all the pip commands -PIP_OPTIONS="--user --upgrade --no-cache-dir --force-reinstall " +PIP_OPTIONS="--upgrade --no-cache-dir --force-reinstall --ignore-installed" # Test both regular user and edit/dev install modes. -PIP_COMMANDS=("pip install $PIP_OPTIONS $PYSPARK_DIST" - "pip install $PIP_OPTIONS -e python/") +PIP_COMMANDS=("pip install $PIP_OPTIONS $PYSPARK_DIST") Review comment: I think we don't have any documentation about editable mode for dev purpose in PySpark. I will remove this for now. I believe the installation test from the official distribute is good enough. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org