[GitHub] [spark] HyukjinKwon commented on issue #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter
HyukjinKwon commented on issue #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter URL: https://github.com/apache/spark/pull/27888#issuecomment-599354017 The fix seems fine except the comments above. @dbtsai, WDYT about disabling nested pruning by default for Spark 3.0 (see https://github.com/apache/spark/pull/27888#pullrequestreview-373986169)? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter
HyukjinKwon commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter URL: https://github.com/apache/spark/pull/27888#discussion_r392789070
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala ##
@@ -176,15 +178,38 @@ private[parquet] class ParquetRowConverter(
    */
   def currentRecord: InternalRow = currentRow
 
+
   // Converters for each field.
   private[this] val fieldConverters: Array[Converter with HasParentContainerUpdater] = {
-    parquetType.getFields.asScala.map { parquetField =>
-      val fieldIndex = catalystType.fieldIndex(parquetField.getName)
-      val catalystField = catalystType(fieldIndex)
-      // Converted field value should be set to the `fieldIndex`-th cell of `currentRow`
-      newConverter(parquetField, catalystField.dataType, new RowUpdater(currentRow, fieldIndex))
-    }.toArray
-  }
+
+    // (SPARK-31116) There is an issue when schema pruning is enabled, so we keep original codes
+    if (schemaPruning) {
+      // (SPARK-31116) For letter case issue, create name to field index based on case sensitivity
+      val catalystFieldNameToIndex = if (caseSensitive) {
+        catalystType.fieldNames.zipWithIndex.toMap
+      } else {
+        CaseInsensitiveMap(catalystType.fieldNames.zipWithIndex.toMap)
+      }
+      parquetType.getFields.asScala.map { parquetField =>
+        val fieldIndex = catalystFieldNameToIndex.getOrElse(parquetField.getName,
+          throw new IllegalArgumentException(
+            s"${parquetField.getName} does not exist. " +
+              s"Available: ${catalystType.fieldNames.mkString(", ")}")
+        )
+        val catalystField = catalystType(fieldIndex)
+        // Converted field value should be set to the `fieldIndex`-th cell of `currentRow`
+        newConverter(parquetField, catalystField.dataType, new RowUpdater(currentRow, fieldIndex))
+      }.toArray
+    } else {
+      parquetType.getFields.asScala.zip(catalystType).zipWithIndex.map {
+        case ((parquetFieldType, catalystField), ordinal) =>
+          // Converted field value should be set to the `ordinal`-th cell of `currentRow`
+          newConverter(
+            parquetFieldType, catalystField.dataType, new RowUpdater(currentRow, ordinal))
+      }.toArray
+    }
Review comment: I actually asked to keep the original code as it was at https://github.com/apache/spark/pull/27888#discussion_r391979749, although it apparently works the same. I am okay with removing this branch if we're very sure it works identically.
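Whether the two branches in the diff "work identically" depends on whether the file-side fields arrive in the same order as the requested schema. The following is a minimal, dependency-free sketch of the two lookup strategies, not the code from the pull request: `Schema`, `byName`, and `byOrdinal` are hypothetical names, and a lowercased key map stands in for Spark's `CaseInsensitiveMap`.

```scala
object FieldMappingSketch {
  // Hypothetical stand-in for a Catalyst struct: just an ordered list of field names.
  final case class Schema(fieldNames: Vector[String])

  /** Resolve each file-side field to an index in the requested schema by name. */
  def byName(fileFields: Seq[String], requested: Schema, caseSensitive: Boolean): Seq[Int] = {
    // Assumption: lowercasing with Locale.ROOT approximates case-insensitive matching.
    def norm(s: String): String =
      if (caseSensitive) s else s.toLowerCase(java.util.Locale.ROOT)
    val nameToIndex: Map[String, Int] = requested.fieldNames.map(norm).zipWithIndex.toMap
    fileFields.map { name =>
      nameToIndex.getOrElse(norm(name),
        throw new IllegalArgumentException(
          s"$name does not exist. Available: ${requested.fieldNames.mkString(", ")}"))
    }
  }

  /** Resolve by position: the i-th file field is written to the i-th cell. */
  def byOrdinal(fileFields: Seq[String], requested: Schema): Seq[Int] =
    fileFields.zip(requested.fieldNames).zipWithIndex.map { case (_, ordinal) => ordinal }
}
```

In this sketch the two strategies agree whenever both sides list the same fields in the same order, and they diverge as soon as the order differs. Name-based lookup also fails loudly on an unknown field, while positional lookup silently pairs whatever is at that position.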
[GitHub] [spark] kimtkyeom commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter
kimtkyeom commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter URL: https://github.com/apache/spark/pull/27888#discussion_r392788006
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala ##
@@ -176,15 +178,38 @@ private[parquet] class ParquetRowConverter(
    */
   def currentRecord: InternalRow = currentRow
 
+
   // Converters for each field.
   private[this] val fieldConverters: Array[Converter with HasParentContainerUpdater] = {
-    parquetType.getFields.asScala.map { parquetField =>
-      val fieldIndex = catalystType.fieldIndex(parquetField.getName)
-      val catalystField = catalystType(fieldIndex)
-      // Converted field value should be set to the `fieldIndex`-th cell of `currentRow`
-      newConverter(parquetField, catalystField.dataType, new RowUpdater(currentRow, fieldIndex))
-    }.toArray
-  }
+
+    // (SPARK-31116) There is an issue when schema pruning is enabled, so we keep original codes
+    if (schemaPruning) {
+      // (SPARK-31116) For letter case issue, create name to field index based on case sensitivity
+      val catalystFieldNameToIndex = if (caseSensitive) {
+        catalystType.fieldNames.zipWithIndex.toMap
+      } else {
+        CaseInsensitiveMap(catalystType.fieldNames.zipWithIndex.toMap)
+      }
+      parquetType.getFields.asScala.map { parquetField =>
+        val fieldIndex = catalystFieldNameToIndex.getOrElse(parquetField.getName,
+          throw new IllegalArgumentException(
+            s"${parquetField.getName} does not exist. " +
+              s"Available: ${catalystType.fieldNames.mkString(", ")}")
+        )
+        val catalystField = catalystType(fieldIndex)
+        // Converted field value should be set to the `fieldIndex`-th cell of `currentRow`
+        newConverter(parquetField, catalystField.dataType, new RowUpdater(currentRow, fieldIndex))
+      }.toArray
+    } else {
+      parquetType.getFields.asScala.zip(catalystType).zipWithIndex.map {
+        case ((parquetFieldType, catalystField), ordinal) =>
+          // Converted field value should be set to the `ordinal`-th cell of `currentRow`
+          newConverter(
+            parquetFieldType, catalystField.dataType, new RowUpdater(currentRow, ordinal))
+      }.toArray
+    }
Review comment: I added this code because of https://github.com/apache/spark/pull/27888#discussion_r391979749. Did I misunderstand the comment?
[GitHub] [spark] AmplabJenkins removed a comment on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels
AmplabJenkins removed a comment on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels URL: https://github.com/apache/spark/pull/27895#issuecomment-599350825 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels
AmplabJenkins removed a comment on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels URL: https://github.com/apache/spark/pull/27895#issuecomment-599350829 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24563/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599350818 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599350818 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599350822 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24564/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599350822 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24564/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels
AmplabJenkins commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels URL: https://github.com/apache/spark/pull/27895#issuecomment-599350825 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels
AmplabJenkins commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels URL: https://github.com/apache/spark/pull/27895#issuecomment-599350829 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24563/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels
SparkQA commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels URL: https://github.com/apache/spark/pull/27895#issuecomment-599350500 **[Test build #119833 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119833/testReport)** for PR 27895 at commit [`6107389`](https://github.com/apache/spark/commit/6107389d399ae8a6d34799c9ea166f45503bef86).
[GitHub] [spark] SparkQA commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
SparkQA commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599350499 **[Test build #119834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119834/testReport)** for PR 27882 at commit [`eb841a5`](https://github.com/apache/spark/commit/eb841a5148a56a4e814148898408704f40dadcba).
[GitHub] [spark] viirya commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter
viirya commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter URL: https://github.com/apache/spark/pull/27888#discussion_r392786863
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala ##
@@ -176,15 +178,38 @@ private[parquet] class ParquetRowConverter(
    */
   def currentRecord: InternalRow = currentRow
 
+
   // Converters for each field.
   private[this] val fieldConverters: Array[Converter with HasParentContainerUpdater] = {
-    parquetType.getFields.asScala.map { parquetField =>
-      val fieldIndex = catalystType.fieldIndex(parquetField.getName)
-      val catalystField = catalystType(fieldIndex)
-      // Converted field value should be set to the `fieldIndex`-th cell of `currentRow`
-      newConverter(parquetField, catalystField.dataType, new RowUpdater(currentRow, fieldIndex))
-    }.toArray
-  }
+
+    // (SPARK-31116) There is an issue when schema pruning is enabled, so we keep original codes
+    if (schemaPruning) {
+      // (SPARK-31116) For letter case issue, create name to field index based on case sensitivity
+      val catalystFieldNameToIndex = if (caseSensitive) {
+        catalystType.fieldNames.zipWithIndex.toMap
+      } else {
+        CaseInsensitiveMap(catalystType.fieldNames.zipWithIndex.toMap)
+      }
+      parquetType.getFields.asScala.map { parquetField =>
+        val fieldIndex = catalystFieldNameToIndex.getOrElse(parquetField.getName,
+          throw new IllegalArgumentException(
+            s"${parquetField.getName} does not exist. " +
+              s"Available: ${catalystType.fieldNames.mkString(", ")}")
+        )
+        val catalystField = catalystType(fieldIndex)
+        // Converted field value should be set to the `fieldIndex`-th cell of `currentRow`
+        newConverter(parquetField, catalystField.dataType, new RowUpdater(currentRow, fieldIndex))
+      }.toArray
+    } else {
+      parquetType.getFields.asScala.zip(catalystType).zipWithIndex.map {
+        case ((parquetFieldType, catalystField), ordinal) =>
+          // Converted field value should be set to the `ordinal`-th cell of `currentRow`
+          newConverter(
+            parquetFieldType, catalystField.dataType, new RowUpdater(currentRow, ordinal))
+      }.toArray
+    }
Review comment: Why add this part of the code? It seems to me that the code above, inside `if (schemaPruning) { ... }`, looks reasonable.
[GitHub] [spark] viirya commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter
viirya commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter URL: https://github.com/apache/spark/pull/27888#discussion_r392785966
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala ##
@@ -176,15 +178,38 @@ private[parquet] class ParquetRowConverter(
    */
   def currentRecord: InternalRow = currentRow
 
+
Review comment: nit: remove unnecessary blank line.
[GitHub] [spark] huaxingao commented on a change in pull request #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels
huaxingao commented on a change in pull request #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels URL: https://github.com/apache/spark/pull/27895#discussion_r392786552
## File path: mllib/src/main/scala/org/apache/spark/ml/stat/ANOVATest.scala ##
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.stat
+
+import org.apache.commons.math3.distribution.FDistribution
+
+import org.apache.spark.annotation.Since
+import org.apache.spark.ml.feature.LabeledPoint
+import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT}
+import org.apache.spark.ml.util.SchemaUtils
+import org.apache.spark.sql._
+import org.apache.spark.sql.functions.col
+import org.apache.spark.util.collection.OpenHashMap
+
+
+/**
+ * ANOVA Test
+ */
+@Since("3.1.0")
+object ANOVATest {
+
+  /** Used to construct output schema of tests */
+  private case class ANOVAResult(
+      pValues: Vector,
+      degreesOfFreedom: Array[Long],
+      fValues: Vector)
+
+  /**
+   * @param dataset DataFrame of categorical labels and continuous features.
+   * @param featuresCol Name of features column in dataset, of type `Vector` (`VectorUDT`)
+   * @param labelCol Name of label column in dataset, of any numerical type
+   * @return DataFrame containing the test result for every feature against the label.
+   *         This DataFrame will contain a single Row with the following fields:
+   *          - `pValues: Vector`
+   *          - `degreesOfFreedom: Array[Long]`
+   *          - `fValues: Vector`
+   *         Each of these fields has one value per feature.
+   */
+  @Since("3.1.0")
+  def test(dataset: DataFrame, featuresCol: String, labelCol: String): DataFrame = {
+    val spark = dataset.sparkSession
+    val testResults = testClassification(dataset, featuresCol, labelCol)
+    val pValues: Vector = Vectors.dense(testResults.map(_.pValue))
+    val degreesOfFreedom: Array[Long] = testResults.map(_.degreesOfFreedom)
+    val fValues: Vector = Vectors.dense(testResults.map(_.statistic))
+    spark.createDataFrame(
+      Seq(new ANOVAResult(pValues, degreesOfFreedom, fValues)))
+  }
+
+  /**
+   * @param dataset DataFrame of categorical labels and continuous features.
+   * @param featuresCol Name of features column in dataset, of type `Vector` (`VectorUDT`)
+   * @param labelCol Name of label column in dataset, of any numerical type
+   * @return Array containing the ANOVATestResult for every feature against the
+   *         label.
+   */
+  private[ml] def testClassification(
+      dataset: Dataset[_],
+      featuresCol: String,
+      labelCol: String): Array[SelectionTestResult] = {
+
+    val spark = dataset.sparkSession
+    import spark.implicits._
+
+    SchemaUtils.checkColumnType(dataset.schema, featuresCol, new VectorUDT)
+    SchemaUtils.checkNumericType(dataset.schema, labelCol)
+
+    val labeledPointRdd = dataset.select(col("label").cast("double"), col("features"))
Review comment: Fixed. Thanks!
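For readers unfamiliar with the statistic that `ANOVATest` reports per feature, here is a minimal sketch of a one-way ANOVA F-value for a single continuous feature against a categorical label. It is plain Scala rather than the Spark implementation under review. `AnovaSketch` and `fStatistic` are hypothetical names, and returning both degrees of freedom is an assumption, since the diff does not show which value `degreesOfFreedom` holds. The p-value, which the real code presumably derives from the imported `FDistribution`, is left out to keep the sketch dependency-free.

```scala
object AnovaSketch {
  /**
   * One-way ANOVA for one feature.
   * @param samples pairs of (label, featureValue)
   * @return (fValue, dfBetween, dfWithin)
   */
  def fStatistic(samples: Seq[(Double, Double)]): (Double, Long, Long) = {
    val n = samples.size
    // Group feature values by class label.
    val groups: Seq[Seq[Double]] = samples.groupBy(_._1).values.map(_.map(_._2)).toSeq
    val k = groups.size
    require(k >= 2 && n > k, "need at least two classes and more samples than classes")

    val grandMean = samples.map(_._2).sum / n
    // Variation of the class means around the grand mean.
    val ssBetween = groups.map { g =>
      val mean = g.sum / g.size
      g.size * math.pow(mean - grandMean, 2)
    }.sum
    // Variation of the values around their own class mean.
    val ssWithin = groups.map { g =>
      val mean = g.sum / g.size
      g.map(x => math.pow(x - mean, 2)).sum
    }.sum

    val dfBetween = (k - 1).toLong
    val dfWithin = (n - k).toLong
    ((ssBetween / dfBetween) / (ssWithin / dfWithin), dfBetween, dfWithin)
  }
}
```

For two classes with feature values 1, 2, 3 and 4, 5, 6, the between-class sum of squares is 13.5 and the within-class sum is 4, giving F = 13.5 with 1 and 4 degrees of freedom.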
[GitHub] [spark] AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-599347352 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-599347356 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24562/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-599347352 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-599347356 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24562/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
SparkQA commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-599347058 **[Test build #119832 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119832/testReport)** for PR 27616 at commit [`13aa51b`](https://github.com/apache/spark/commit/13aa51b245ecabf5ab38ba2b446196db5a79cb4e).
[GitHub] [spark] maropu commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
maropu commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599344938 cc: @wangyum
[GitHub] [spark] maropu commented on a change in pull request #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
maropu commented on a change in pull request #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#discussion_r392782401
## File path: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala ##
@@ -402,24 +402,14 @@ class CliSuite extends SparkFunSuite with BeforeAndAfterAll with Logging {
   }
 
   test("SPARK-30049 Should not complain for quotes in commented lines") {
-    runCliWithin(1.minute)(
+    runCliWithin(3.minute)(
       """SELECT concat('test', 'comment') -- someone's comment here
         |;""".stripMargin -> "testcomment"
     )
-  }
-
-  test("SPARK-30049 Should not complain for quotes in commented with multi-lines") {
-    runCliWithin(1.minute)(
-      """SELECT concat('test', 'comment') -- someone's comment here \\
-        | comment continues here with single ' quote \\
-        | extra ' \\
-        |;""".stripMargin -> "testcomment"
-    )
-    runCliWithin(1.minute)(
-      """SELECT concat('test', 'comment') -- someone's comment here \\
-        | comment continues here with single ' quote \\
-        | extra ' \\
-        | ;""".stripMargin -> "testcomment"
Review comment: Why did you remove the existing tests instead of adding new tests?
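The retained test expects that an apostrophe inside a `--` comment does not open a string literal. A minimal sketch of that idea follows, assuming a simple character scanner. It is not the Spark CLI's parser: `SqlSplitSketch` and `splitStatements` are hypothetical names, and the sketch ignores backslash escapes, bracketed comments, and the backslash-continued comments exercised by the removed multi-line tests.

```scala
object SqlSplitSketch {
  /** Split a script on ';' found outside quotes and outside `--` line comments. */
  def splitStatements(script: String): List[String] = {
    val out = List.newBuilder[String]
    val cur = new StringBuilder
    var quote: Option[Char] = None // Some(quoteChar) while inside a string literal
    var inComment = false          // true from `--` until the end of the line
    var i = 0
    while (i < script.length) {
      val c = script.charAt(i)
      if (inComment) {
        // Comment text is dropped, so quotes inside it are never seen as literals.
        if (c == '\n') { inComment = false; cur.append(c) }
      } else if (quote.isDefined) {
        if (quote.contains(c)) quote = None
        cur.append(c)
      } else if (c == '-' && i + 1 < script.length && script.charAt(i + 1) == '-') {
        inComment = true
        i += 1 // skip the second dash
      } else if (c == '\'' || c == '"') {
        quote = Some(c)
        cur.append(c)
      } else if (c == ';') {
        out += cur.toString.trim
        cur.clear()
      } else {
        cur.append(c)
      }
      i += 1
    }
    val tail = cur.toString.trim
    if (tail.nonEmpty) out += tail
    out.result().filter(_.nonEmpty)
  }
}
```

Applied to the statement from the test above, the scanner drops `-- someone's comment here` and yields the single statement `SELECT concat('test', 'comment')`. A semicolon inside a quoted literal does not split the statement.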
[GitHub] [spark] AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599344119 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599344123 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24561/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599344119 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599344123 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24561/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
SparkQA commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599343820 **[Test build #119831 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119831/testReport)** for PR 27278 at commit [`b6bd2d4`](https://github.com/apache/spark/commit/b6bd2d46ceb2539dd8e99e07268ce9e8fd6e6558).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-599336486 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24560/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-599336481 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-599336486 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24560/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-599336481 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
SparkQA commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-599336222 **[Test build #119830 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119830/testReport)** for PR 27916 at commit [`2f24ae0`](https://github.com/apache/spark/commit/2f24ae0365d21412dfdcb998bf9fafa513f39cd5).
[GitHub] [spark] kachayev commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
kachayev commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-599335665 Fixed tests.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599333723 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599333726 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24559/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599333726 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24559/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599333723 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
SparkQA commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599333526 **[Test build #119829 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119829/testReport)** for PR 27278 at commit [`1ee0b24`](https://github.com/apache/spark/commit/1ee0b2433523aa6a9494daa816b018b7cb875777).
[GitHub] [spark] HyukjinKwon commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
HyukjinKwon commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions. URL: https://github.com/apache/spark/pull/27278#issuecomment-599332466 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599329436 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24558/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599329427 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599329427 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599329436 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24558/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
SparkQA commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599329197 **[Test build #119828 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119828/testReport)** for PR 27921 at commit [`96e8c58`](https://github.com/apache/spark/commit/96e8c5853ac2dc0104f7cb77e85f86b2a6afff7a).
[GitHub] [spark] sarutak removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
sarutak removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599326541 I'll fix the conflict.
[GitHub] [spark] cloud-fan commented on a change in pull request #27893: [SPARK-31134][SQL] optimize skew join after shuffle partitions are coalesced
cloud-fan commented on a change in pull request #27893: [SPARK-31134][SQL] optimize skew join after shuffle partitions are coalesced URL: https://github.com/apache/spark/pull/27893#discussion_r392766089
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
## @@ -328,6 +287,48 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
   }
 }
 
+private object ShuffleStage {
+  def unapply(plan: SparkPlan): Option[ShuffleStageInfo] = plan match {
+    case s: ShuffleQueryStageExec =>
+      val mapStats = getMapStats(s)
+      val sizes = mapStats.bytesByPartitionId
+      val partitions = sizes.zipWithIndex.map {
+        case (size, i) => CoalescedPartitionSpec(i, i + 1) -> size
+      }
+      Some(ShuffleStageInfo(s, mapStats, partitions))
+
+    case CustomShuffleReaderExec(s: ShuffleQueryStageExec, partitionSpecs, _) =>
+      val mapStats = getMapStats(s)
+      val sizes = mapStats.bytesByPartitionId
+      val partitions = partitionSpecs.map {
+        case spec @ CoalescedPartitionSpec(start, end) =>
+          var sum = 0L
Review comment: `slice` will create a new array, which is less efficient.
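The trade-off behind this review comment can be sketched in isolation. The snippet below is only an illustration, not the code from the PR: `PartitionSizeSum`, `sumRange`, and `sumRangeWithSlice` are hypothetical names, and `sizes` stands in for `mapStats.bytesByPartitionId`. It assumes the half-open range `[start, end)` used by `CoalescedPartitionSpec(start, end)` in the diff above.

```scala
object PartitionSizeSum {
  // Sums sizes(start) .. sizes(end - 1) with an index loop.
  // No intermediate collection is allocated.
  def sumRange(sizes: Array[Long], start: Int, end: Int): Long = {
    var sum = 0L
    var i = start
    while (i < end) {
      sum += sizes(i)
      i += 1
    }
    sum
  }

  // Same result, but `slice` first copies the range into a new array.
  def sumRangeWithSlice(sizes: Array[Long], start: Int, end: Int): Long =
    sizes.slice(start, end).sum

  def main(args: Array[String]): Unit = {
    val sizes = Array(10L, 20L, 30L, 40L) // hypothetical per-partition byte sizes
    println(sumRange(sizes, 1, 3))          // 20 + 30
    println(sumRangeWithSlice(sizes, 1, 3)) // same value, one extra allocation
  }
}
```

Both helpers return the same total; the loop version just avoids one array copy per partition spec, which is presumably why the `var sum = 0L` form was kept in the patch.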
[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599326597 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599326600 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119823/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599326597 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599326419 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599326425 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119825/ Test PASSed.
[GitHub] [spark] sarutak commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
sarutak commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599326541 I'll fix the conflict.
[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599326600 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119823/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
SparkQA removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599289509 **[Test build #119823 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119823/testReport)** for PR 27882 at commit [`b219f0a`](https://github.com/apache/spark/commit/b219f0af6713c77b162db50b72f0b42d2c420818).
[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599326425 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119825/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
SparkQA commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599326321 **[Test build #119825 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119825/testReport)** for PR 27920 at commit [`d69d271`](https://github.com/apache/spark/commit/d69d27126c06ea9782c5422a5435809a062b8ed7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599324864 Build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599324867 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24557/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599326419 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
SparkQA removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599314095 **[Test build #119825 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119825/testReport)** for PR 27920 at commit [`d69d271`](https://github.com/apache/spark/commit/d69d27126c06ea9782c5422a5435809a062b8ed7).
[GitHub] [spark] SparkQA commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
SparkQA commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599326192 **[Test build #119827 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119827/testReport)** for PR 27921 at commit [`7e591a7`](https://github.com/apache/spark/commit/7e591a70855a93f32e2edee7ae9eb204e4f77be4).
[GitHub] [spark] SparkQA commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
SparkQA commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599326224 **[Test build #119823 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119823/testReport)** for PR 27882 at commit [`b219f0a`](https://github.com/apache/spark/commit/b219f0af6713c77b162db50b72f0b42d2c420818). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599324867 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24557/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921#issuecomment-599324864 Build finished. Test PASSed.
[GitHub] [spark] sarutak opened a new pull request #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js
sarutak opened a new pull request #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js URL: https://github.com/apache/spark/pull/27921

### What changes were proposed in this pull request?

Refactor `streaming-page.js` by making the on-click timeline action customizable.

### Why are the changes needed?

In the current implementation, `streaming-page.js` is used by both the Streaming page and the Structured Streaming page, but the implementation of the on-click timeline action depends strongly on the Streaming page. The Structured Streaming page doesn't define an on-click action for now, but it's better to remove the dependency for the future. Originally, I made this change to fix `SPARK-31128`, but #27883 resolved it. So now this is just a refactoring.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually tested with the following code and confirmed there is no regression and no error in the debug console in Firefox.

For Structured Streaming:

```
spark.readStream.format("socket").options(Map("host"->"localhost", "port"->"8765")).load.writeStream.format("console").start
```

Then I visited the Structured Streaming page, and there was no error in the debug console when I clicked a point in the timeline.

For Spark Streaming:

```
import org.apache.spark.streaming._
val ssc = new StreamingContext(sc, Seconds(1))
// Bind the stream to a name so that foreachRDD below has something to refer to
val dstream = ssc.socketTextStream("localhost", 8765)
dstream.foreachRDD(rdd => rdd.foreach(println))
ssc.start
```

Then I visited the Streaming page and confirmed that scrolling down and highlighting work well, and there was no error in the debug console when I clicked a point in the timeline.
[GitHub] [spark] zhengruifeng commented on a change in pull request #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels
zhengruifeng commented on a change in pull request #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels URL: https://github.com/apache/spark/pull/27895#discussion_r392760983 ## File path: mllib/src/main/scala/org/apache/spark/ml/stat/ANOVATest.scala ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.stat + +import org.apache.commons.math3.distribution.FDistribution + +import org.apache.spark.annotation.Since +import org.apache.spark.ml.feature.LabeledPoint +import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT} +import org.apache.spark.ml.util.SchemaUtils +import org.apache.spark.sql._ +import org.apache.spark.sql.functions.col +import org.apache.spark.util.collection.OpenHashMap + + +/** + * ANOVA Test + */ +@Since("3.1.0") +object ANOVATest { + + /** Used to construct output schema of tests */ + private case class ANOVAResult( + pValues: Vector, + degreesOfFreedom: Array[Long], + fValues: Vector) + + /** + * @param dataset DataFrame of categorical labels and continuous features. 
+ * @param featuresCol Name of features column in dataset, of type `Vector` (`VectorUDT`) + * @param labelCol Name of label column in dataset, of any numerical type + * @return DataFrame containing the test result for every feature against the label. + * This DataFrame will contain a single Row with the following fields: + * - `pValues: Vector` + * - `degreesOfFreedom: Array[Long]` + * - `fValues: Vector` + * Each of these fields has one value per feature. + */ + @Since("3.1.0") + def test(dataset: DataFrame, featuresCol: String, labelCol: String): DataFrame = { +val spark = dataset.sparkSession +val testResults = testClassification(dataset, featuresCol, labelCol) +val pValues: Vector = Vectors.dense(testResults.map(_.pValue)) +val degreesOfFreedom: Array[Long] = testResults.map(_.degreesOfFreedom) +val fValues: Vector = Vectors.dense(testResults.map(_.statistic)) +spark.createDataFrame( + Seq(new ANOVAResult(pValues, degreesOfFreedom, fValues))) + } + + /** + * @param dataset DataFrame of categorical labels and continuous features. + * @param featuresCol Name of features column in dataset, of type `Vector` (`VectorUDT`) + * @param labelCol Name of label column in dataset, of any numerical type + * @return Array containing the ANOVATestResult for every feature against the + * label. 
+ */ + private[ml] def testClassification( + dataset: Dataset[_], + featuresCol: String, + labelCol: String): Array[SelectionTestResult] = { + +val spark = dataset.sparkSession +import spark.implicits._ + +SchemaUtils.checkColumnType(dataset.schema, featuresCol, new VectorUDT) +SchemaUtils.checkNumericType(dataset.schema, labelCol) + +val labeledPointRdd = dataset.select(col("label").cast("double"), col("features")) + .as[(Double, Vector)] + .rdd.map { case (label, features) => LabeledPoint(label, features) } + +val numFeatures = labeledPointRdd.first().features.size

Review comment: nit: we can compute `numSamples` and `numClasses` together:

```scala
val numFeatures = MetadataUtils.getNumFeatures(dataset, featuresCol)
// countDistinct gives the number of classes, count gives the number of samples
val Row(numClasses: Long, numSamples: Long) =
  dataset.select(countDistinct(labelCol), count(labelCol)).head
```
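For readers following the review, the quantities discussed here (`numSamples`, `numClasses`, the F-value and the degrees of freedom) can be illustrated with a textbook one-way ANOVA for a single feature. This is a minimal plain-Scala sketch, not the code in the PR: the object `AnovaSketch` and its method `fValue` are hypothetical names, and the PR's `ANOVATest` may organize the computation and report degrees of freedom differently.

```scala
// Hypothetical illustration only; not part of Spark.
object AnovaSketch {

  /**
   * Textbook one-way ANOVA for one continuous feature against categorical labels.
   * Returns (fValue, dfBetween, dfWithin). Assumes at least two classes and
   * more samples than classes, so that both degrees of freedom are positive.
   */
  def fValue(labels: Seq[Double], values: Seq[Double]): (Double, Long, Long) = {
    require(labels.length == values.length && labels.nonEmpty, "inputs must align")

    // Group the feature values by class label
    val groups: Map[Double, Seq[Double]] =
      labels.zip(values).groupBy(_._1).map { case (k, v) => k -> v.map(_._2) }

    val numSamples = values.length.toLong
    val numClasses = groups.size.toLong
    val grandMean = values.sum / numSamples

    // Between-group sum of squares: class size times squared distance of class mean
    val ssBetween = groups.values.map { g =>
      val mean = g.sum / g.length
      g.length * math.pow(mean - grandMean, 2)
    }.sum

    // Within-group sum of squares: squared distance of each value from its class mean
    val ssWithin = groups.values.map { g =>
      val mean = g.sum / g.length
      g.map(v => math.pow(v - mean, 2)).sum
    }.sum

    val dfBetween = numClasses - 1
    val dfWithin = numSamples - numClasses
    ((ssBetween / dfBetween) / (ssWithin / dfWithin), dfBetween, dfWithin)
  }
}
```

For example, `AnovaSketch.fValue(Seq(0.0, 0.0, 0.0, 1.0, 1.0, 1.0), Seq(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))` has class means 2 and 5 around a grand mean of 3.5. The p-value would then come from the upper tail of an F distribution with these degrees of freedom, which the sketch leaves out.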
[GitHub] [spark] zhengruifeng commented on a change in pull request #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels
zhengruifeng commented on a change in pull request #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels URL: https://github.com/apache/spark/pull/27895#discussion_r392761159 ## File path: mllib/src/main/scala/org/apache/spark/ml/stat/ANOVATest.scala ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.stat + +import org.apache.commons.math3.distribution.FDistribution + +import org.apache.spark.annotation.Since +import org.apache.spark.ml.feature.LabeledPoint +import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT} +import org.apache.spark.ml.util.SchemaUtils +import org.apache.spark.sql._ +import org.apache.spark.sql.functions.col +import org.apache.spark.util.collection.OpenHashMap + + +/** + * ANOVA Test + */ +@Since("3.1.0") +object ANOVATest { + + /** Used to construct output schema of tests */ + private case class ANOVAResult( + pValues: Vector, + degreesOfFreedom: Array[Long], + fValues: Vector) + + /** + * @param dataset DataFrame of categorical labels and continuous features. 
+ * @param featuresCol Name of features column in dataset, of type `Vector` (`VectorUDT`) + * @param labelCol Name of label column in dataset, of any numerical type + * @return DataFrame containing the test result for every feature against the label. + * This DataFrame will contain a single Row with the following fields: + * - `pValues: Vector` + * - `degreesOfFreedom: Array[Long]` + * - `fValues: Vector` + * Each of these fields has one value per feature. + */ + @Since("3.1.0") + def test(dataset: DataFrame, featuresCol: String, labelCol: String): DataFrame = { +val spark = dataset.sparkSession +val testResults = testClassification(dataset, featuresCol, labelCol) +val pValues: Vector = Vectors.dense(testResults.map(_.pValue)) +val degreesOfFreedom: Array[Long] = testResults.map(_.degreesOfFreedom) +val fValues: Vector = Vectors.dense(testResults.map(_.statistic)) +spark.createDataFrame( + Seq(new ANOVAResult(pValues, degreesOfFreedom, fValues))) + } + + /** + * @param dataset DataFrame of categorical labels and continuous features. + * @param featuresCol Name of features column in dataset, of type `Vector` (`VectorUDT`) + * @param labelCol Name of label column in dataset, of any numerical type + * @return Array containing the ANOVATestResult for every feature against the + * label. + */ + private[ml] def testClassification( + dataset: Dataset[_], + featuresCol: String, + labelCol: String): Array[SelectionTestResult] = { + +val spark = dataset.sparkSession +import spark.implicits._ + +SchemaUtils.checkColumnType(dataset.schema, featuresCol, new VectorUDT) +SchemaUtils.checkNumericType(dataset.schema, labelCol) + +val labeledPointRdd = dataset.select(col("label").cast("double"), col("features")) Review comment: "label" -> `labelCol` "features" -> `featuresCol`
[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599314402 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599314405 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24555/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-599314385 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24556/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-599314378 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599314405 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24555/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-599314385 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24556/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599314402 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-599314378 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
SparkQA commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599314095 **[Test build #119825 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119825/testReport)** for PR 27920 at commit [`d69d271`](https://github.com/apache/spark/commit/d69d27126c06ea9782c5422a5435809a062b8ed7).
[GitHub] [spark] SparkQA commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
SparkQA commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-599314096 **[Test build #119826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119826/testReport)** for PR 27019 at commit [`d303b2f`](https://github.com/apache/spark/commit/d303b2fe2bc1ddc0b1e98c2028fb92322a0bb976).
[GitHub] [spark] maropu commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
maropu commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec URL: https://github.com/apache/spark/pull/27019#issuecomment-599313299 cc: @cloud-fan
[GitHub] [spark] AmplabJenkins removed a comment on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions
AmplabJenkins removed a comment on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions URL: https://github.com/apache/spark/pull/25827#issuecomment-599312956 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions
AmplabJenkins commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions URL: https://github.com/apache/spark/pull/25827#issuecomment-599312961 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24554/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599307680 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions
AmplabJenkins removed a comment on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions URL: https://github.com/apache/spark/pull/25827#issuecomment-599312961 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24554/ Test PASSed.
[GitHub] [spark] maropu commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
maropu commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599312972 ok to test
[GitHub] [spark] AmplabJenkins commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions
AmplabJenkins commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions URL: https://github.com/apache/spark/pull/25827#issuecomment-599312956 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions
SparkQA commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions URL: https://github.com/apache/spark/pull/25827#issuecomment-599312685 **[Test build #119824 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119824/testReport)** for PR 25827 at commit [`543c016`](https://github.com/apache/spark/commit/543c0167dab23ece2e4db232c0fd7d4c9e5eeb8e).
[GitHub] [spark] maropu commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions
maropu commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions URL: https://github.com/apache/spark/pull/25827#issuecomment-599312055 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599307401 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599307680 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920#issuecomment-599307401 Can one of the admins verify this patch?
[GitHub] [spark] javierivanov opened a new pull request #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.
javierivanov opened a new pull request #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment. URL: https://github.com/apache/spark/pull/27920

### What changes were proposed in this pull request?

This PR resets the `insideComment` flag to false on a newline, fixing the issue introduced by SPARK-30049.

### Why are the changes needed?

Before SPARK-30049, a comment containing an unclosed quote produced the following error:

```
spark-sql> SELECT 1 -- someone's comment here
         > ;
Error in query:
extraneous input ';' expecting (line 2, pos 0)

== SQL ==
SELECT 1 -- someone's comment here
;
^^^
```

This happened because the `splitSemiColon` method had no flag for comment sections, so quotes inside comments were not ignored. SPARK-30049 added that flag and fixed the issue, but introduced the following problem:

```
spark-sql> select
         >   1,
         >   -- two
         >   2;
Error in query:
mismatched input '' expecting {'(', 'ADD', 'AFTER', 'ALL', 'ALTER', ...}(line 3, pos 2)

== SQL ==
select
  1,
--^^^
```

This is caused by the `insideComment` flag not being turned off at a newline.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

The previous tests using line continuation (`\`) were removed, and a test for inline comments within a query was added.
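The flag handling described in this PR can be sketched in plain Scala. This is a simplified illustration under stated assumptions, not the actual Spark CLI code: the object name `StatementSplitter` is hypothetical, escape characters are ignored, and only `--` line comments are handled. The point it shows is that a newline ends a `--` comment, so `insideComment` has to be reset there.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical, simplified splitter; the real splitSemiColon in Spark differs in detail.
object StatementSplitter {

  /** Splits input on ';' while ignoring semicolons inside quotes or "--" comments. */
  def splitSemiColon(line: String): List[String] = {
    var insideSingleQuote = false
    var insideDoubleQuote = false
    var insideComment = false
    var beginIndex = 0
    val statements = ListBuffer.empty[String]

    for (index <- 0 until line.length) {
      val c = line.charAt(index)
      if (c == '\n') {
        // A "--" comment only runs to the end of its line, so reset the flag here
        insideComment = false
      } else if (insideComment) {
        // Quotes and semicolons inside a comment are ignored
      } else if (c == '\'' && !insideDoubleQuote) {
        insideSingleQuote = !insideSingleQuote
      } else if (c == '"' && !insideSingleQuote) {
        insideDoubleQuote = !insideDoubleQuote
      } else if (c == '-' && !insideSingleQuote && !insideDoubleQuote &&
          index + 1 < line.length && line.charAt(index + 1) == '-') {
        insideComment = true
      } else if (c == ';' && !insideSingleQuote && !insideDoubleQuote) {
        statements += line.substring(beginIndex, index)
        beginIndex = index + 1
      }
    }
    // Keep any trailing text that is not terminated by a semicolon
    if (beginIndex < line.length) statements += line.substring(beginIndex)
    statements.toList
  }
}
```

With the reset in place, both inputs from the description split into a single statement: the unclosed quote in `someone's` is skipped because it sits inside a comment, and the `2;` after `-- two` is still seen because the comment ended at the newline.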
[GitHub] [spark] viirya commented on a change in pull request #27901: [SPARK-31146][SQL] Leverage the helper method for aliasing in built-in SQL expressions
viirya commented on a change in pull request #27901: [SPARK-31146][SQL] Leverage the helper method for aliasing in built-in SQL expressions
URL: https://github.com/apache/spark/pull/27901#discussion_r392753020

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala ##
@@ -55,7 +55,7 @@ import org.apache.spark.util.Utils
 case class CallMethodViaReflection(children: Seq[Expression])
   extends Expression with CodegenFallback {
-  override def prettyName: String = "reflect"
+  override def prettyName: String = getTagValue(FunctionRegistry.FUNC_ALIAS).getOrElse("reflect")

Review comment: Yes, if code relies on the output column name of `selectExpr("java_method(...)")`, it could break, because the name is now `java_method(...)`. This is correct, I think. If we don't guarantee the output column name, it is OK.
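The aliasing pattern under review can be sketched in a few lines. The following is a Python analogy, not Spark's Scala API: the `tags` dictionary, the `FUNC_ALIAS` key value, and the `lookup` helper are all hypothetical stand-ins for the tag mechanism that the diff refers to. It shows why the output column name changes: the expression reports the name it was invoked under, and falls back to `reflect` only when no alias was recorded.

```python
from dataclasses import dataclass, field

# Hypothetical tag key; stands in for FunctionRegistry.FUNC_ALIAS in the diff.
FUNC_ALIAS = "functionAliasName"


@dataclass
class CallMethodViaReflection:
    """Toy expression node carrying optional tags (assumed structure)."""
    children: list = field(default_factory=list)
    tags: dict = field(default_factory=dict)

    @property
    def pretty_name(self) -> str:
        # Prefer the alias the function was called with, else the default name.
        return self.tags.get(FUNC_ALIAS, "reflect")


def lookup(name: str) -> CallMethodViaReflection:
    """Hypothetical registry lookup that records the name used by the caller."""
    expr = CallMethodViaReflection()
    expr.tags[FUNC_ALIAS] = name
    return expr
```

Under these assumptions, `lookup("java_method").pretty_name` yields `java_method`, while an expression built without the tag still reports `reflect`, which is the behaviour change the review comment discusses.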
[GitHub] [spark] huaxingao commented on issue #27919: [parkSPARK-30954][ML][SPARKR]Make class name the same as file name
huaxingao commented on issue #27919: [parkSPARK-30954][ML][SPARKR]Make class name the same as file name URL: https://github.com/apache/spark/pull/27919#issuecomment-599304708 @kevinyu98 Hi Kevin, thanks for working on this. I think you missed ```DecisionTreeREgressionWrapper.scala```. Also, the title of your PR is not right ```[parkSPARK-30954]```
[GitHub] [spark] beliefer commented on issue #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams
beliefer commented on issue #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams URL: https://github.com/apache/spark/pull/27898#issuecomment-599304484 @HyukjinKwon Could you take a look at this PR?
[GitHub] [spark] sarutak commented on issue #27849: [SPARK-31081][UI][SQL] Make the display of stageId/stageAttemptId/taskId of sql metrics configurable in UI
sarutak commented on issue #27849: [SPARK-31081][UI][SQL] Make the display of stageId/stageAttemptId/taskId of sql metrics configurable in UI URL: https://github.com/apache/spark/pull/27849#issuecomment-599303912 @gengliangwang I'll try to add the checkbox.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599302162 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599302165 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119821/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599302162 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599302165 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119821/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
SparkQA removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector URL: https://github.com/apache/spark/pull/27882#issuecomment-599278543 **[Test build #119821 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119821/testReport)** for PR 27882 at commit [`b4d30f9`](https://github.com/apache/spark/commit/b4d30f97a0d4bd40bc689f7d1fd961b8ce66d123).