[GitHub] [spark] cloud-fan commented on a change in pull request #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
cloud-fan commented on a change in pull request #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
URL: https://github.com/apache/spark/pull/25601#discussion_r320583736

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowNamespacesExec.scala ##
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.catalog.v2.CatalogV2Implicits.NamespaceHelper
+import org.apache.spark.sql.catalog.v2.SupportsNamespaces
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.encoders.RowEncoder
+import org.apache.spark.sql.catalyst.expressions.{Attribute, GenericRowWithSchema}
+import org.apache.spark.sql.catalyst.util.StringUtils
+import org.apache.spark.sql.execution.LeafExecNode
+
+/**
+ * Physical plan node for showing namespaces.
+ */
+case class ShowNamespacesExec(
+    output: Seq[Attribute],
+    catalog: SupportsNamespaces,
+    namespace: Option[Seq[String]],
+    pattern: Option[String])
+    extends LeafExecNode {
+  override protected def doExecute(): RDD[InternalRow] = {
+    val namespaces = namespace.map{ ns =>
+      if (ns.nonEmpty) {

Review comment:
> should we list the root namespaces or call listNamespaces with an empty array?

I think these 2 are the same?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
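The question in this review comment (whether listing the root namespaces and calling `listNamespaces` with an empty array give the same result) can be illustrated with a small sketch. Everything below is hypothetical: `ToyNamespaceCatalog` is an in-memory stand-in rather than Spark's `SupportsNamespaces`, and it simply assumes that an empty parent means the root. Whether a real catalog behaves that way depends on its implementation.

```scala
object NamespaceSketch {
  // Hypothetical in-memory catalog; not Spark's SupportsNamespaces.
  final class ToyNamespaceCatalog(all: Seq[Seq[String]]) {
    // Root namespaces: every distinct first-level name.
    def listNamespaces(): Seq[Seq[String]] =
      all.map(_.take(1)).distinct

    // Direct children of `parent`; an empty parent is treated as the root (assumption).
    def listNamespaces(parent: Seq[String]): Seq[Seq[String]] =
      all.filter(ns => ns.length > parent.length && ns.startsWith(parent))
        .map(_.take(parent.length + 1))
        .distinct
  }

  // Mirrors the branch under review: an absent or empty namespace lists the roots.
  def namespacesToShow(
      catalog: ToyNamespaceCatalog,
      namespace: Option[Seq[String]]): Seq[Seq[String]] =
    namespace match {
      case Some(ns) if ns.nonEmpty => catalog.listNamespaces(ns)
      case _ => catalog.listNamespaces()
    }

  val catalog = new ToyNamespaceCatalog(Seq(Seq("a"), Seq("a", "b"), Seq("c", "d")))
}
```

Under that assumption the `if (ns.nonEmpty)` branch is only a dispatch convenience: `namespacesToShow(catalog, None)` and `namespacesToShow(catalog, Some(Seq.empty))` return the same roots.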
[GitHub] [spark] SparkQA commented on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is created by H
SparkQA commented on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is created by Hive
URL: https://github.com/apache/spark/pull/25542#issuecomment-527753394

**[Test build #110096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110096/testReport)** for PR 25542 at commit [`b87a687`](https://github.com/apache/spark/commit/b87a68727cd768f9826aafc8bb0fa4e0c41c3ae9).
[GitHub] [spark] cloud-fan commented on a change in pull request #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
cloud-fan commented on a change in pull request #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
URL: https://github.com/apache/spark/pull/25601#discussion_r320583051

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala ##
@@ -779,6 +779,21 @@ class DDLParserSuite extends AnalysisTest {
       ShowTablesStatement(Some(Seq("tbl")), Some("*dog*")))
   }

+  test("show namespaces") {

Review comment:
cc @xianyinxin can you add similar parser tests for DELETE/UPDATE as well?
[GitHub] [spark] cloud-fan commented on a change in pull request #25512: [SPARK-28782][SQL] Generator support in aggregate expressions
cloud-fan commented on a change in pull request #25512: [SPARK-28782][SQL] Generator support in aggregate expressions
URL: https://github.com/apache/spark/pull/25512#discussion_r320582127

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ##
@@ -2018,6 +2018,58 @@ class Analyzer(
         throw new AnalysisException("Only one generator allowed per select clause but found " +
           generators.size + ": " + generators.map(toPrettySQL).mkString(", "))

+      case Aggregate(_, aggList, _) if aggList.exists(hasNestedGenerator) =>
+        val nestedGenerator = aggList.find(hasNestedGenerator).get
+        throw new AnalysisException("Generators are not supported when it's nested in " +
+          "expressions, but got: " + toPrettySQL(trimAlias(nestedGenerator)))
+
+      case Aggregate(_, aggList, _) if aggList.count(hasGenerator) > 1 =>
+        val generators = aggList.filter(hasGenerator).map(trimAlias)
+        throw new AnalysisException("Only one generator allowed per aggregate clause but found " +
+          generators.size + ": " + generators.map(toPrettySQL).mkString(", "))
+
+      case agg @ Aggregate(groupList, aggList, child) if aggList.forall {
+          case AliasedGenerator(generator, _, _) => generator.resolved

Review comment:
nit: `case AliasedGenerator(_, _, _) => true`. `AliasedGenerator` guarantees the generator is resolved.
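The nit relies on a general property of Scala extractors: if `unapply` only succeeds for resolved generators, a `generator.resolved` check in the case body is redundant. A minimal sketch follows, using a toy expression model. `Expr`, `Gen`, `Alias`, `Attr` and `ResolvedGenerator` are hypothetical names for illustration, not Catalyst's classes, and the sketch does not claim to reproduce how `AliasedGenerator` is implemented.

```scala
object ExtractorSketch {
  // Toy expression model; hypothetical, not Catalyst's classes.
  sealed trait Expr { def resolved: Boolean }
  final case class Gen(name: String, resolved: Boolean) extends Expr
  final case class Alias(child: Expr, name: String) extends Expr {
    def resolved: Boolean = child.resolved
  }
  final case class Attr(name: String) extends Expr {
    val resolved: Boolean = true
  }

  // Extractor that matches only generators that are already resolved,
  // mirroring the guarantee the review comment attributes to AliasedGenerator.
  object ResolvedGenerator {
    def unapply(e: Expr): Option[(Gen, Seq[String])] = e match {
      case Alias(g: Gen, name) if g.resolved => Some((g, Seq(name)))
      case g: Gen if g.resolved => Some((g, Nil))
      case _ => None
    }
  }

  // Because the extractor already checks resolution, the case body can be `true`.
  def allResolved(exprs: Seq[Expr]): Boolean = exprs.forall {
    case ResolvedGenerator(_, _) => true
    case other => other.resolved
  }
}
```

An unresolved generator falls through to the second case, where `other.resolved` is `false`, so the overall result is the same as with the explicit check.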
[GitHub] [spark] maropu commented on a change in pull request #25602: [SPARK-28613][SQL] Add config option for limiting uncompressed result size in SQL
maropu commented on a change in pull request #25602: [SPARK-28613][SQL] Add config option for limiting uncompressed result size in SQL
URL: https://github.com/apache/spark/pull/25602#discussion_r320581474

## File path: docs/configuration.md ##
@@ -163,6 +163,16 @@ of the most common options to set are:
 out-of-memory errors.
+
+ spark.sql.driver.maxUncompressedResultSize

Review comment:
How about the name `maxCollectSize`?
[GitHub] [spark] maropu commented on a change in pull request #25602: [SPARK-28613][SQL] Add config option for limiting uncompressed result size in SQL
maropu commented on a change in pull request #25602: [SPARK-28613][SQL] Add config option for limiting uncompressed result size in SQL
URL: https://github.com/apache/spark/pull/25602#discussion_r320580346

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SizeLimitingByteArrayDecoder.scala ##
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import java.io.{ByteArrayInputStream, DataInputStream}
+
+import org.apache.spark.{SparkEnv, SparkException}
+import org.apache.spark.internal.{config, Logging}
+import org.apache.spark.io.CompressionCodec
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.UnsafeRow
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.util.Utils
+
+/**
+ * Provides methods for converting compressed byte arrays back to UnsafeRows.
+ * Additionally, can enforce a limit on the total, decoded size of all decoded UnsafeRows.
+ * Enforcing the limit is controlled via a sql config and if it is turned on the encoder will
+ * throw a SparkException when the limit is reached.
+ */
+private[spark] class SizeLimitingByteArrayDecoder(
+    nFields: Int,
+    sqlConf: SQLConf) extends Logging {
+  private var totalUncompressedResultSize = 0L
+  private val maxUncompressedResultSize = sqlConf.maxUncompressedResultSize
+
+  /**
+   * Decodes the byte arrays back to UnsafeRows and puts them into buffer.
+   */
+  def decodeUnsafeRows(bytes: Array[Byte]): Iterator[InternalRow] = {
+    val codec = CompressionCodec.createCodec(SparkEnv.get.conf)
+    val bis = new ByteArrayInputStream(bytes)
+    val ins = new DataInputStream(codec.compressedInputStream(bis))
+
+    new Iterator[InternalRow] {
+      private var sizeOfNextRow = ins.readInt()
+
+      override def hasNext: Boolean = sizeOfNextRow >= 0
+
+      override def next(): InternalRow = {
+        ensureCanFetchMoreResults(sizeOfNextRow)
+        val bs = new Array[Byte](sizeOfNextRow)
+        ins.readFully(bs)
+        val row = new UnsafeRow(nFields)
+        row.pointTo(bs, sizeOfNextRow)
+        sizeOfNextRow = ins.readInt()
+        row
+      }
+    }
+  }
+
+  private def ensureCanFetchMoreResults(sizeOfNextRow: Int): Unit = {
+    totalUncompressedResultSize += sizeOfNextRow

Review comment:
We cannot check the actual data size before encoding for collect?
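The decode-side check in the diff above can be sketched without any Spark dependencies. The version below is only an illustration under stated assumptions: it skips compression, uses plain byte arrays instead of `UnsafeRow`, terminates the stream with a `-1` length, and throws a made-up `ResultTooLargeException` once the running total exceeds the limit. The quoted diff is cut off before the actual comparison, so the strict `>` here is a guess.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

object SizeLimitSketch {
  // Hypothetical exception type; the PR's doc comment mentions SparkException instead.
  final class ResultTooLargeException(msg: String) extends RuntimeException(msg)

  // Writes each record as <length><bytes> and terminates the stream with -1.
  def encode(records: Seq[Array[Byte]]): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val out = new DataOutputStream(bos)
    records.foreach { r => out.writeInt(r.length); out.write(r) }
    out.writeInt(-1)
    out.flush()
    bos.toByteArray
  }

  // Lazily decodes records, failing once the running total exceeds maxBytes.
  def decode(bytes: Array[Byte], maxBytes: Long): Iterator[Array[Byte]] = {
    val in = new DataInputStream(new ByteArrayInputStream(bytes))
    new Iterator[Array[Byte]] {
      private var total = 0L
      private var sizeOfNext = in.readInt()

      override def hasNext: Boolean = sizeOfNext >= 0

      override def next(): Array[Byte] = {
        if (!hasNext) throw new NoSuchElementException("end of stream")
        // Account for the record before allocating its buffer.
        total += sizeOfNext
        if (total > maxBytes) {
          throw new ResultTooLargeException(s"decoded $total bytes, limit is $maxBytes")
        }
        val buf = new Array[Byte](sizeOfNext)
        in.readFully(buf)
        sizeOfNext = in.readInt()
        buf
      }
    }
  }
}
```

Because the length prefix is read before the payload, the limit is enforced before the buffer for the offending record is allocated. That is the property a decode-side check offers, as opposed to the encode-side check the reviewer asks about.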
[GitHub] [spark] cloud-fan commented on a change in pull request #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
cloud-fan commented on a change in pull request #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
URL: https://github.com/apache/spark/pull/25601#discussion_r320578981

## File path: docs/sql-keywords.md ##
@@ -179,6 +179,7 @@ Below is a list of all the keywords in Spark SQL.
  MONTH | reserved | non-reserved | reserved
  MONTHS | non-reserved | non-reserved | non-reserved
  MSCK | non-reserved | non-reserved | non-reserved
+ NAMESPACES | non-reserved | non-reserved | non-reserved

Review comment:
cc @xianyinxin , we should also add DELETE and UPDATE. Can you open a PR to do it?
[GitHub] [spark] SparkQA commented on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is created by H
SparkQA commented on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is created by Hive
URL: https://github.com/apache/spark/pull/25542#issuecomment-527747604

**[Test build #110095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110095/testReport)** for PR 25542 at commit [`873cdf4`](https://github.com/apache/spark/commit/873cdf452484e2f1eef4b31b76049ea8216a3325).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which i
AmplabJenkins removed a comment on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is created by Hive
URL: https://github.com/apache/spark/pull/25542#issuecomment-527747091

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15109/
Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is create
AmplabJenkins commented on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is created by Hive
URL: https://github.com/apache/spark/pull/25542#issuecomment-527747091

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15109/
Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which i
AmplabJenkins removed a comment on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is created by Hive
URL: https://github.com/apache/spark/pull/25542#issuecomment-527747082

Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is create
AmplabJenkins commented on issue #25542: [SPARK-28840][SQL][test-hadoop3.2] conf.getClassLoader in SparkSQLCLIDriver should be avoided as it returns the UDFClassLoader which is created by Hive
URL: https://github.com/apache/spark/pull/25542#issuecomment-527747082

Merged build finished. Test PASSed.
[GitHub] [spark] maropu edited a comment on issue #25464: [SPARK-28746][SQL] Add partitionby hint for sql queries
maropu edited a comment on issue #25464: [SPARK-28746][SQL] Add partitionby hint for sql queries
URL: https://github.com/apache/spark/pull/25464#issuecomment-527742988

```
- df.hint("REPARTITION", 1, $"c".expr)
```
We need `.expr`? I think this is not intuitive for users... Can you try https://github.com/apache/spark/pull/25464#discussion_r319788105?
[GitHub] [spark] maropu commented on issue #25464: [SPARK-28746][SQL] Add partitionby hint for sql queries
maropu commented on issue #25464: [SPARK-28746][SQL] Add partitionby hint for sql queries
URL: https://github.com/apache/spark/pull/25464#issuecomment-527742988

```
- df.hint("REPARTITION", 1, $"c".expr)
```
We need `.expr`? I think this is not intuitive for users... Can you support `df.hint("REPARTITION", 1, $"c")`?
[GitHub] [spark] wypoon commented on a change in pull request #25659: [SPARK-28770][CORE][TESTS]Ignore SparkListenerStageExecutorMetrics in testApplicationReplay test
wypoon commented on a change in pull request #25659: [SPARK-28770][CORE][TESTS]Ignore SparkListenerStageExecutorMetrics in testApplicationReplay test
URL: https://github.com/apache/spark/pull/25659#discussion_r320574368

## File path: core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala ##
@@ -218,8 +219,17 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
     // Verify the same events are replayed in the same order
     assert(sc.eventLogger.isDefined)
-    val originalEvents = sc.eventLogger.get.loggedEvents
-    val replayedEvents = eventMonster.loggedEvents
+    implicit val format = DefaultFormats
+    def exceptCase(a: JValue) = ( a \ "Event").extract[String] match {
+      // If we are logging stage executor metrics, there is a bulk call to logEvent with
+      // SparkListenerStageExecutorMetrics events via a Map.foreach. The Map.foreach bulk
+      // operation may not log the events with the same order. So here we should not compare
+      // SparkListenerStageExecutorMetrics here.

Review comment:
This comment is based on my earlier analysis in the JIRA, which is not quite correct.
[GitHub] [spark] wypoon commented on issue #25659: [SPARK-28770][CORE][TESTS]Ignore SparkListenerStageExecutorMetrics in testApplicationReplay test
wypoon commented on issue #25659: [SPARK-28770][CORE][TESTS]Ignore SparkListenerStageExecutorMetrics in testApplicationReplay test
URL: https://github.com/apache/spark/pull/25659#issuecomment-527741025

`SparkListenerStageExecutorMetrics` were introduced in SPARK-23429 (https://github.com/apache/spark/pull/21221/). By design, executor metrics update events are not logged by the `EventLoggingListener`. Instead, the listener keeps track of the per stage peaks for any of the executors and the driver for which it has received metrics. On stage completion, the peaks for the stage are logged in `SparkListenerStageExecutorMetrics` events for each of these executors and driver.

Since executor metrics update events are not logged in the event log, they do not get replayed. Thus the listener for the replay never sees metrics updates. It is therefore valid to exclude `SparkListenerStageExecutorMetrics` events from both the original and the replay for the purpose of comparison.

However, instead of excluding all `SparkListenerStageExecutorMetrics` events from both the original `EventLoggingListener` and the replay listener, we can have a finer-grained fix, which I have proposed in https://github.com/apache/spark/pull/25673/ for comparison. It should be sufficient to exclude any `SparkListenerStageExecutorMetrics` events for the driver. This is because with SPARK-26329 (https://github.com/apache/spark/pull/23767/), executor metrics are also sent in task end events (which do get replayed), so the `EventLoggingListener` always receives metrics for the executors (just not necessarily for the driver), and thus `SparkListenerStageExecutorMetrics` events for the executors always get logged.
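The finer-grained comparison described in this comment can be sketched with a toy event model. The names below (`Event`, `TaskEnd`, `StageExecutorMetrics`, `comparable`) are hypothetical and are not Spark's listener classes or the code in the linked pull request. The `"driver"` identifier is likewise an assumption made for the sketch.

```scala
object ReplayCompareSketch {
  // Toy event model; hypothetical, not Spark's listener event classes.
  sealed trait Event
  final case class TaskEnd(taskId: Long) extends Event
  final case class StageExecutorMetrics(execId: String, stageId: Int) extends Event

  // Assumed identifier for the driver's entry.
  val DriverId = "driver"

  // Drops only the driver's per-stage metrics events; executor ones are kept.
  def comparable(events: Seq[Event]): Seq[Event] = events.filter {
    case StageExecutorMetrics(DriverId, _) => false
    case _ => true
  }
}
```

Applying `comparable` to both the original and the replayed event lists keeps the executor metrics events in the comparison while tolerating a driver entry that appears on only one side.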
[GitHub] [spark] maropu commented on issue #25642: [SPARK-28916][SQL] Split subexpression elimination functions code
maropu commented on issue #25642: [SPARK-28916][SQL] Split subexpression elimination functions code
URL: https://github.com/apache/spark/pull/25642#issuecomment-527739492

To make the title more precise, how about adding `For Generate[Mutable|Unsafe]Projection` in it?
[GitHub] [spark] HeartSaVioR commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
HeartSaVioR commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
URL: https://github.com/apache/spark/pull/23952#issuecomment-527739361

I tend to agree; I just wanted to say that without a UT we may have a regression even after this patch.
[GitHub] [spark] maropu commented on a change in pull request #25642: [SPARK-28916][SQL] Split subexpression elimination functions code
maropu commented on a change in pull request #25642: [SPARK-28916][SQL] Split subexpression elimination functions code
URL: https://github.com/apache/spark/pull/25642#discussion_r320571433

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ##
@@ -993,6 +994,14 @@ class CodegenContext {
     }
   }

+  /**
+   * Returns the code for subexpression elimination after splitting it if necessary.
+   */
+  def subexprFunctionsCode: String = {
+    // Wholestage codegen does not allow subexpression elimination

Review comment:
If so, how about `assert(currentVars != null && subexprFunctions.isEmpty)` for strict checks?
[GitHub] [spark] SparkQA commented on issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer
SparkQA commented on issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer
URL: https://github.com/apache/spark/pull/22138#issuecomment-527738983

**[Test build #110094 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110094/testReport)** for PR 22138 at commit [`74a6cbf`](https://github.com/apache/spark/commit/74a6cbfbcb60b53fae62ff2111d85539d81130f6).
[GitHub] [spark] AmplabJenkins commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-v
AmplabJenkins commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem
URL: https://github.com/apache/spark/pull/25653#issuecomment-527738022

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110091/
Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover
AmplabJenkins removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem
URL: https://github.com/apache/spark/pull/25653#issuecomment-527738022

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110091/
Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover
AmplabJenkins removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem
URL: https://github.com/apache/spark/pull/25653#issuecomment-527738019

Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-v
AmplabJenkins commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem
URL: https://github.com/apache/spark/pull/25653#issuecomment-527738019

Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version
SparkQA commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem
URL: https://github.com/apache/spark/pull/25653#issuecomment-527737941

**[Test build #110091 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110091/testReport)** for PR 25653 at commit [`2d32ea3`](https://github.com/apache/spark/commit/2d32ea36f3cd05016854fbf421881559fc405a26).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi
SparkQA removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem URL: https://github.com/apache/spark/pull/25653#issuecomment-527734031 **[Test build #110091 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110091/testReport)** for PR 25653 at commit [`2d32ea3`](https://github.com/apache/spark/commit/2d32ea36f3cd05016854fbf421881559fc405a26).
[GitHub] [spark] maropu commented on issue #25642: [SPARK-28916][SQL] Split subexpression elimination functions code
maropu commented on issue #25642: [SPARK-28916][SQL] Split subexpression elimination functions code URL: https://github.com/apache/spark/pull/25642#issuecomment-527737573 btw (this is off-topic, though), does the HashAggregateExec code for common subexpression elimination have the same issue? It also expands all the generated code for CSE in a single method now: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala#L287
[GitHub] [spark] maropu commented on a change in pull request #25642: [SPARK-28916][SQL] Split subexpression elimination functions code
maropu commented on a change in pull request #25642: [SPARK-28916][SQL] Split subexpression elimination functions code URL: https://github.com/apache/spark/pull/25642#discussion_r320570104 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ## @@ -993,6 +994,14 @@ class CodegenContext { } } + /** + * Returns the code for subexpression elimination after splitting it if necessary. + */ + def subexprFunctionsCode: String = { +// Wholestage codegen does not allow subexpression elimination Review comment: btw (this is off-topic, though), does the `HashAggregateExec` code for common subexpression elimination have the same issue? It also expands all the generated code for CSE in a single method now: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala#L287
[GitHub] [spark] HeartSaVioR commented on a change in pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer
HeartSaVioR commented on a change in pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer URL: https://github.com/apache/spark/pull/22138#discussion_r320569961 ## File path: docs/structured-streaming-kafka-integration.md ## @@ -430,20 +430,70 @@ The following configurations are optional: ### Consumer Caching It's time-consuming to initialize Kafka consumers, especially in streaming scenarios where processing time is a key factor. -Because of this, Spark caches Kafka consumers on executors. The caching key is built up from the following information: +Because of this, Spark pools Kafka consumers on executors, by leveraging Apache Commons Pool. + +The caching key is built up from the following information: + * Topic name * Topic partition * Group ID -The size of the cache is limited by spark.kafka.consumer.cache.capacity (default: 64). -If this threshold is reached, it tries to remove the least-used entry that is currently not in use. -If it cannot be removed, then the cache will keep growing. In the worst case, the cache will grow to -the max number of concurrent tasks that can run in the executor (that is, number of tasks slots), -after which it will never reduce. +The following properties are available to configure the consumer pool: Review comment: Same missing. Will fix.
[GitHub] [spark] HeartSaVioR commented on a change in pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer
HeartSaVioR commented on a change in pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer URL: https://github.com/apache/spark/pull/22138#discussion_r320569927 ## File path: docs/structured-streaming-kafka-integration.md ## @@ -430,20 +430,70 @@ The following configurations are optional: ### Consumer Caching It's time-consuming to initialize Kafka consumers, especially in streaming scenarios where processing time is a key factor. -Because of this, Spark caches Kafka consumers on executors. The caching key is built up from the following information: +Because of this, Spark pools Kafka consumers on executors, by leveraging Apache Commons Pool. + +The caching key is built up from the following information: + * Topic name * Topic partition * Group ID -The size of the cache is limited by spark.kafka.consumer.cache.capacity (default: 64). -If this threshold is reached, it tries to remove the least-used entry that is currently not in use. -If it cannot be removed, then the cache will keep growing. In the worst case, the cache will grow to -the max number of concurrent tasks that can run in the executor (that is, number of tasks slots), -after which it will never reduce. +The following properties are available to configure the consumer pool: + + +Property NameDefaultMeaning + + spark.kafka.consumer.cache.capacity + The maximum number of consumers cached. Please note that it's a soft limit. + 64 + + + spark.kafka.consumer.cache.timeout + The minimum amount of time a consumer may sit idle in the pool before it is eligible for eviction by the evictor. + 5m (5 minutes) + + + spark.kafka.consumer.cache.jmx.enable + Enable or disable JMX for pools created with this configuration instance. Statistics of the pool are available via JMX instance. + The prefix of JMX name is set to "kafka010-cached-simple-kafka-consumer-pool". 
+ + false + + + +The size of the pool is limited by spark.kafka.consumer.cache.capacity, +but it works as "soft-limit" to not block Spark tasks. + +Idle eviction thread periodically removes some consumers which are not used. If this threshold is reached when borrowing, +it tries to remove the least-used entry that is currently not in use. + +If it cannot be removed, then the pool will keep growing. In the worst case, the pool will grow to +the max number of concurrent tasks that can run in the executor (that is, number of tasks slots). + +If a task fails for any reason, the new task is executed with a newly created Kafka consumer for safety reasons. +At the same time, we invalidate all consumers in pool which have same caching key, to remove consumer which was used +in failed execution. Consumers which any other tasks are using will not be closed, but will be invalidated as well +when they are returned into pool. -If a task fails for any reason the new task is executed with a newly created Kafka consumer for safety reasons. -At the same time the cached Kafka consumer which was used in the failed execution will be invalidated. Here it has to -be emphasized it will not be closed if any other task is using it. +Along with consumers, Spark pools the records fetched from Kafka separately, to let Kafka consumers stateless in point +of Spark's view, and maximize the efficiency of pooling. It leverages same cache key with Kafka consumers pool. +Note that it doesn't leverage Apache Commons Pool due to the difference of characteristics. + +The following properties are available to configure the fetched data pool: + + +Property NameDefaultMeaning + + spark.kafka.consumer.fetchedData.cache.timeout Review comment: My bad - just a copy-and-paste error. Will fix all missing things.
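For readers following the doc diff quoted above, a minimal sketch of how the pool properties might be set from application code. The property names are copied from the diff under review and could change before the PR is merged; the values, the app name, and the time format used for `spark.kafka.consumer.fetchedData.cache.timeout` are assumptions for illustration only.

```scala
import org.apache.spark.SparkConf

object KafkaConsumerPoolConfSketch {
  // Property names come from the documentation diff quoted in this thread;
  // the values below are illustrative, not recommendations.
  val conf: SparkConf = new SparkConf()
    .setAppName("kafka-consumer-pool-sketch") // hypothetical app name
    // Soft limit on pooled consumers (the diff lists 64 as the default).
    .set("spark.kafka.consumer.cache.capacity", "128")
    // Minimum idle time before a consumer may be evicted (diff default: 5m).
    .set("spark.kafka.consumer.cache.timeout", "10m")
    // Expose pool statistics over JMX (diff default: false).
    .set("spark.kafka.consumer.cache.jmx.enable", "true")
    // The default for this property is not shown in the quoted diff;
    // the value and its format are assumptions.
    .set("spark.kafka.consumer.fetchedData.cache.timeout", "10m")
}
```

Because the capacity is described as a soft limit, raising it only changes when eviction on borrow starts; per the quoted text, the pool can still grow up to the number of task slots on the executor.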
[GitHub] [spark] AmplabJenkins removed a comment on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime…
AmplabJenkins removed a comment on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime… URL: https://github.com/apache/spark/pull/25673#issuecomment-527736573 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime…
AmplabJenkins removed a comment on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime… URL: https://github.com/apache/spark/pull/25673#issuecomment-527736576 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15108/ Test PASSed.
[GitHub] [spark] sujith71955 commented on issue #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on issue #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#issuecomment-527736816 gentle ping @dongjoon-hyun @maropu @dilipbiswal @HyukjinKwon
[GitHub] [spark] AmplabJenkins commented on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime…
AmplabJenkins commented on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime… URL: https://github.com/apache/spark/pull/25673#issuecomment-527736576 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15108/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime…
AmplabJenkins commented on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime… URL: https://github.com/apache/spark/pull/25673#issuecomment-527736573 Merged build finished. Test PASSed.
[GitHub] [spark] maropu commented on a change in pull request #25642: [SPARK-28916][SQL] Split subexpression elimination functions code
maropu commented on a change in pull request #25642: [SPARK-28916][SQL] Split subexpression elimination functions code URL: https://github.com/apache/spark/pull/25642#discussion_r320569117 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ## @@ -993,6 +994,15 @@ class CodegenContext { } } + /** + * Returns the code for subexpression elimination after splitting it if necessary. + */ + def subexprFunctionsCode: String = { +// Wholestage codegen does not allow subexpression elimination: in that case, subexprFunctions Review comment: looks ok, too.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735914 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110083/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735907 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735914 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110083/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
SparkQA removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527719700 **[Test build #110083 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110083/testReport)** for PR 25601 at commit [`9f738cf`](https://github.com/apache/spark/commit/9f738cf6708af6f4c5bf8bbaa981d7c69f3bdbd3).
[GitHub] [spark] AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735907 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
SparkQA commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735819 **[Test build #110083 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110083/testReport)** for PR 25601 at commit [`9f738cf`](https://github.com/apache/spark/commit/9f738cf6708af6f4c5bf8bbaa981d7c69f3bdbd3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] HeartSaVioR commented on issue #25629: [SPARK-28931][CORE][TESTS] Fix couple of bugs in FsHistoryProviderSuite
HeartSaVioR commented on issue #25629: [SPARK-28931][CORE][TESTS] Fix couple of bugs in FsHistoryProviderSuite URL: https://github.com/apache/spark/pull/25629#issuecomment-52773 cc. @felixcheung as this patch is a part of work in SPARK-28869 (found this during working)
[GitHub] [spark] SparkQA commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
SparkQA commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735533 **[Test build #110093 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110093/testReport)** for PR 25601 at commit [`672f526`](https://github.com/apache/spark/commit/672f526c8833c51b18c0c23934050c10e3bb7f5f).
[GitHub] [spark] SparkQA commented on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime…
SparkQA commented on issue #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime… URL: https://github.com/apache/spark/pull/25673#issuecomment-527735516 **[Test build #110092 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110092/testReport)** for PR 25673 at commit [`0b0adb0`](https://github.com/apache/spark/commit/0b0adb0b7a508f3e1902b838382e478a1435cfde).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735062 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15107/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException
AmplabJenkins commented on issue #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException URL: https://github.com/apache/spark/pull/25672#issuecomment-527734989 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735062 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527735066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15107/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException
AmplabJenkins removed a comment on issue #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException URL: https://github.com/apache/spark/pull/25672#issuecomment-527734891 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException
AmplabJenkins commented on issue #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException URL: https://github.com/apache/spark/pull/25672#issuecomment-527734891 Can one of the admins verify this patch?
[GitHub] [spark] wypoon opened a new pull request #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime…
wypoon opened a new pull request #25673: [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometime… URL: https://github.com/apache/spark/pull/25673
…s fail.

### What changes were proposed in this pull request?
In `testApplicationReplay`, filter out stage executor metrics for the driver from the original application events before comparing with the events logged on replay.

### Why are the changes needed?
`testApplicationReplay` fails if the application runs long enough for the driver to send an executor metrics update. This causes stage executor metrics to be written for the driver. However, executor metrics updates are not logged, and thus not replayed. Therefore, no stage executor metrics for the driver are logged on replay.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Existing unit tests.
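The filtering step described in this PR can be sketched as follows. This is a simplified illustration and not the code from the patch: `Event`, `StageExecutorMetrics`, `OtherEvent`, and `dropDriverStageMetrics` are hypothetical stand-ins for Spark's listener event types, and the driver identifier `"driver"` is an assumption.

```scala
// Hypothetical, simplified event model; the real suite works with Spark listener events.
sealed trait Event
final case class StageExecutorMetrics(execId: String, stageId: Int) extends Event
final case class OtherEvent(name: String) extends Event

object ReplayFilterSketch {
  // Assumed executor id under which the driver reports its metrics.
  val DriverId = "driver"

  // Drop stage executor metrics reported for the driver, keep everything else.
  def dropDriverStageMetrics(events: Seq[Event]): Seq[Event] =
    events.filter {
      case StageExecutorMetrics(DriverId, _) => false
      case _ => true
    }
}
```

Applying the filter only to the original events, and then comparing with the replayed events, mirrors the reasoning in the description: the driver's stage executor metrics exist only on the original side, so they are the one difference that has to be removed before the comparison.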
[GitHub] [spark] AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527734234 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110081/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527734230 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
SparkQA removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527718215 **[Test build #110081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110081/testReport)** for PR 25601 at commit [`ba1e7f4`](https://github.com/apache/spark/commit/ba1e7f477a50408eac0e1795d21dd56ac5c4ed5d).
[GitHub] [spark] AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527734230 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527734234 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110081/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
SparkQA commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527734126 **[Test build #110081 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110081/testReport)** for PR 25601 at commit [`ba1e7f4`](https://github.com/apache/spark/commit/ba1e7f477a50408eac0e1795d21dd56ac5c4ed5d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] HeartSaVioR commented on issue #25670: [SPARK-28869][CORE] Roll over event log files
HeartSaVioR commented on issue #25670: [SPARK-28869][CORE] Roll over event log files URL: https://github.com/apache/spark/pull/25670#issuecomment-527734068 Yes, filed https://issues.apache.org/jira/browse/SPARK-28967 and submitted a patch #25672
[GitHub] [spark] SparkQA commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version
SparkQA commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem URL: https://github.com/apache/spark/pull/25653#issuecomment-527734031 **[Test build #110091 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110091/testReport)** for PR 25653 at commit [`2d32ea3`](https://github.com/apache/spark/commit/2d32ea36f3cd05016854fbf421881559fc405a26).
[GitHub] [spark] SparkQA commented on issue #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException
SparkQA commented on issue #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException URL: https://github.com/apache/spark/pull/25672#issuecomment-527734023 **[Test build #110090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110090/testReport)** for PR 25672 at commit [`75dd865`](https://github.com/apache/spark/commit/75dd86553772ca840312e4225be44340f5830f81).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover
AmplabJenkins removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem URL: https://github.com/apache/spark/pull/25653#issuecomment-527733690 Merged build finished. Test PASSed.
[GitHub] [spark] HeartSaVioR opened a new pull request #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException
HeartSaVioR opened a new pull request #25672: [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException URL: https://github.com/apache/spark/pull/25672

### What changes were proposed in this pull request?

This patch fixes a bug that throws ConcurrentModificationException when a job with 0 partitions is submitted via DAGScheduler.

### Why are the changes needed?

Without this patch, a structured streaming query throws ConcurrentModificationException, as in the stack trace below:

```
19/09/04 09:48:49 ERROR AsyncEventQueue: Listener EventLoggingListener threw an exception
java.util.ConcurrentModificationException
	at java.util.Hashtable$Enumerator.next(Hashtable.java:1387)
	at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$6.next(Wrappers.scala:424)
	at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$6.next(Wrappers.scala:420)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at scala.collection.TraversableLike.map(TraversableLike.scala:237)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
	at scala.collection.AbstractTraversable.map(Traversable.scala:108)
	at org.apache.spark.util.JsonProtocol$.mapToJson(JsonProtocol.scala:514)
	at org.apache.spark.util.JsonProtocol$.$anonfun$propertiesToJson$1(JsonProtocol.scala:520)
	at scala.Option.map(Option.scala:163)
	at org.apache.spark.util.JsonProtocol$.propertiesToJson(JsonProtocol.scala:519)
	at org.apache.spark.util.JsonProtocol$.jobStartToJson(JsonProtocol.scala:155)
	at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:79)
	at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:149)
	at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:217)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:37)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:99)
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:84)
	at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:102)
	at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:102)
	at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:97)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:93)
	at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1319)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:93)
```

Please refer to https://issues.apache.org/jira/browse/SPARK-28967 for a detailed reproducer.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Newly added UT. Also manually tested by running a simple structured streaming query in spark-shell.
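The pull request title above speaks of including a cloned version of `properties` in the event. The message does not show the patch itself, so the following is only a sketch of the general technique, under stated assumptions: `JobStartEvent` and `newJobStartEvent` are hypothetical names, and `Properties.clone()` is one possible way to take the copy (the actual PR may clone differently). The point is that the event carries its own snapshot, so a listener thread iterating over it is not affected by later writes to the caller's live `Properties`.

```java
import java.util.Properties;

public class PropertiesSnapshotSketch {
    // Hypothetical stand-in for an event handed to an asynchronous listener.
    static final class JobStartEvent {
        final int jobId;
        final Properties properties; // snapshot owned by the event, may be null

        JobStartEvent(int jobId, Properties properties) {
            this.jobId = jobId;
            this.properties = properties;
        }
    }

    // Build the event from a shallow copy instead of the caller's live object.
    static JobStartEvent newJobStartEvent(int jobId, Properties live) {
        Properties snapshot = (live == null) ? null : (Properties) live.clone();
        return new JobStartEvent(jobId, snapshot);
    }

    public static void main(String[] args) {
        Properties live = new Properties();
        live.setProperty("spark.job.description", "demo");

        JobStartEvent event = newJobStartEvent(0, live);

        // The submitting thread keeps mutating its own object afterwards.
        live.setProperty("added.later", "x");

        // The snapshot inside the event does not see the later write.
        System.out.println(event.properties.containsKey("added.later")); // prints "false"
    }
}
```

A shallow copy is enough here because the keys and values are immutable strings; it trades one extra allocation per event for freedom from synchronizing the listener with the submitting thread.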
[GitHub] [spark] AmplabJenkins removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover
AmplabJenkins removed a comment on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem URL: https://github.com/apache/spark/pull/25653#issuecomment-527733694 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15106/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-v
AmplabJenkins commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem URL: https://github.com/apache/spark/pull/25653#issuecomment-527733690 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-v
AmplabJenkins commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem URL: https://github.com/apache/spark/pull/25653#issuecomment-527733694 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15106/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527732600 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110080/ Test FAILed.
[GitHub] [spark] HyukjinKwon commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-ver
HyukjinKwon commented on issue #25653: [SPARK-28954][SQL] In SparkSQL CLI, pass extra jar through hive hive conf HIVEAUXJARS, we just use SessionResourceLoader API to cover multi-version problem URL: https://github.com/apache/spark/pull/25653#issuecomment-527732683 ok to test
[GitHub] [spark] AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527732595 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
SparkQA removed a comment on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527716760 **[Test build #110080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110080/testReport)** for PR 25601 at commit [`9974a58`](https://github.com/apache/spark/commit/9974a58d65bce948e5ab6f27ec64aa7c587598f0).
[GitHub] [spark] AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527732600 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110080/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
AmplabJenkins commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527732595 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables
SparkQA commented on issue #25601: [SPARK-28856][SQL] Implement SHOW DATABASES for Data Source V2 Tables URL: https://github.com/apache/spark/pull/25601#issuecomment-527732510 **[Test build #110080 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110080/testReport)** for PR 25601 at commit [`9974a58`](https://github.com/apache/spark/commit/9974a58d65bce948e5ab6f27ec64aa7c587598f0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions
SparkQA commented on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions URL: https://github.com/apache/spark/pull/25666#issuecomment-527732607 **[Test build #110089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110089/testReport)** for PR 25666 at commit [`30f37d0`](https://github.com/apache/spark/commit/30f37d0444b361e5f91251f069b87a34c3433432).
[GitHub] [spark] felixcheung commented on a change in pull request #25647: [SPARK-28946][R][DOCS] Add some more information about building SparkR on Windows
felixcheung commented on a change in pull request #25647: [SPARK-28946][R][DOCS] Add some more information about building SparkR on Windows URL: https://github.com/apache/spark/pull/25647#discussion_r320566075

## File path: R/WINDOWS.md ##

@@ -20,25 +20,28 @@ license: |
 To build SparkR on Windows, the following steps are required
-1. Install R (>= 3.1) and [Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
+1. Make sure `bash` is available and in `PATH` if you already have a built-in `bash` on Windows. If you do not have, install [Cygwin](https://www.cygwin.com/).
+
+2. Install R (>= 3.1) and [Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
 include Rtools and R in `PATH`. Note that support for R prior to version 3.4 is deprecated as of Spark 3.0.0.
-2. Install
-[JDK8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and set
+3. Install [JDK](https://www.oracle.com/technetwork/java/javase/downloads) that SparkR supports - see `R/pkg/DESCRIPTION`, and set
 `JAVA_HOME` in the system environment variables.
-3. Download and install [Maven](https://maven.apache.org/download.html). Also include the `bin`
+4. Download and install [Maven](https://maven.apache.org/download.html). Also include the `bin`

Review comment: we meant the `maven version` part
[GitHub] [spark] AmplabJenkins removed a comment on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions
AmplabJenkins removed a comment on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions URL: https://github.com/apache/spark/pull/20965#issuecomment-527732247 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15105/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions
AmplabJenkins removed a comment on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions URL: https://github.com/apache/spark/pull/25666#issuecomment-527732228 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions
AmplabJenkins removed a comment on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions URL: https://github.com/apache/spark/pull/20965#issuecomment-527732243 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions
AmplabJenkins removed a comment on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions URL: https://github.com/apache/spark/pull/25666#issuecomment-527732233 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15104/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions
AmplabJenkins commented on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions URL: https://github.com/apache/spark/pull/20965#issuecomment-527732243 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions
AmplabJenkins commented on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions URL: https://github.com/apache/spark/pull/25666#issuecomment-527732233 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15104/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions
AmplabJenkins commented on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions URL: https://github.com/apache/spark/pull/20965#issuecomment-527732247 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15105/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions
AmplabJenkins commented on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions URL: https://github.com/apache/spark/pull/25666#issuecomment-527732228 Merged build finished. Test PASSed.
[GitHub] [spark] HyukjinKwon commented on issue #25667: [SPARK-28963][BUILD] Fall back to archive.apache.org in build/mvn for older releases
HyukjinKwon commented on issue #25667: [SPARK-28963][BUILD] Fall back to archive.apache.org in build/mvn for older releases URL: https://github.com/apache/spark/pull/25667#issuecomment-527731485 Merged to master.
[GitHub] [spark] HyukjinKwon closed pull request #25667: [SPARK-28963][BUILD] Fall back to archive.apache.org in build/mvn for older releases
HyukjinKwon closed pull request #25667: [SPARK-28963][BUILD] Fall back to archive.apache.org in build/mvn for older releases URL: https://github.com/apache/spark/pull/25667
[GitHub] [spark] HyukjinKwon commented on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions
HyukjinKwon commented on issue #25666: [SQL][SPARK-28962] Provide index argument to filter lambda functions URL: https://github.com/apache/spark/pull/25666#issuecomment-527731578 ok to test
[GitHub] [spark] SparkQA commented on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions
SparkQA commented on issue #20965: [SPARK-21870][SQL] Split aggregation code into small functions URL: https://github.com/apache/spark/pull/20965#issuecomment-527731196 **[Test build #110088 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110088/testReport)** for PR 20965 at commit [`d1e06a6`](https://github.com/apache/spark/commit/d1e06a600d262469542ac671bf4ae85a4da32706).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes
AmplabJenkins removed a comment on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes URL: https://github.com/apache/spark/pull/25389#issuecomment-527731072 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes
AmplabJenkins removed a comment on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes URL: https://github.com/apache/spark/pull/25389#issuecomment-527731075 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110076/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes
AmplabJenkins commented on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes URL: https://github.com/apache/spark/pull/25389#issuecomment-527731075 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110076/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25661: [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar"
AmplabJenkins removed a comment on issue #25661: [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar" URL: https://github.com/apache/spark/pull/25661#issuecomment-527730824 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15103/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes
AmplabJenkins commented on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes URL: https://github.com/apache/spark/pull/25389#issuecomment-527731072 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25661: [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar"
AmplabJenkins removed a comment on issue #25661: [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar" URL: https://github.com/apache/spark/pull/25661#issuecomment-527730820 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25661: [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar"
AmplabJenkins commented on issue #25661: [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar" URL: https://github.com/apache/spark/pull/25661#issuecomment-527730820 Merged build finished. Test PASSed.
[GitHub] [spark] felixcheung commented on issue #25670: [SPARK-28869][CORE] Roll over event log files
felixcheung commented on issue #25670: [SPARK-28869][CORE] Roll over event log files
URL: https://github.com/apache/spark/pull/25670#issuecomment-527730845

> > java.util.ConcurrentModificationException
>
> This is also occurring with current master branch. You can reproduce it with below query in spark-shell, with continuously pushing records to topic1 and topic2.
>
> ```
> val bootstrapServers = "localhost:9092"
> val checkpointLocation = "/tmp/SPARK-28869-testing"
> val sourceTopics = "topic1"
> val sourceTopics2 = "topic2"
> val targetTopic = "topic3"
>
> val df = spark.readStream.format("kafka").option("kafka.bootstrap.servers", bootstrapServers).option("subscribe", sourceTopics).option("startingOffsets", "earliest").load()
>
> val df2 = spark.readStream.format("kafka").option("kafka.bootstrap.servers", bootstrapServers).option("subscribe", sourceTopics2).option("startingOffsets", "earliest").load()
>
> df.union(df2).writeStream.format("kafka").option("kafka.bootstrap.servers", bootstrapServers).option("checkpointLocation", checkpointLocation).option("topic", targetTopic).start()
> ```

Is there a separate JIRA on this?
[GitHub] [spark] SparkQA removed a comment on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes
SparkQA removed a comment on issue #25389: [SPARK-28657][CORE] Fix currentContext Instance failed sometimes URL: https://github.com/apache/spark/pull/25389#issuecomment-527705718 **[Test build #110076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110076/testReport)** for PR 25389 at commit [`6dced2a`](https://github.com/apache/spark/commit/6dced2af9c6311bd9bd3390abc12ebe522b39213).
[GitHub] [spark] AmplabJenkins commented on issue #25661: [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar"
AmplabJenkins commented on issue #25661: [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar" URL: https://github.com/apache/spark/pull/25661#issuecomment-527730824 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15103/ Test PASSed.
[GitHub] [spark] HyukjinKwon closed pull request #25649: [SPARK-28694][EXAMPLES]Add Java/Scala StructuredKerberizedKafkaWordCount examples
HyukjinKwon closed pull request #25649: [SPARK-28694][EXAMPLES]Add Java/Scala StructuredKerberizedKafkaWordCount examples URL: https://github.com/apache/spark/pull/25649