[GitHub] [spark] HyukjinKwon commented on issue #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter

2020-03-15 Thread GitBox
HyukjinKwon commented on issue #27888: [SPARK-31116][SQL] Fix nested schema 
case-sensitivity in ParquetRowConverter
URL: https://github.com/apache/spark/pull/27888#issuecomment-599354017
 
 
   The fix seems fine except the comments above. @dbtsai, WDYT about disabling 
nested pruning by default for Spark 3.0 (see 
https://github.com/apache/spark/pull/27888#pullrequestreview-373986169)?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter

2020-03-15 Thread GitBox
HyukjinKwon commented on a change in pull request #27888: [SPARK-31116][SQL] 
Fix nested schema case-sensitivity in ParquetRowConverter
URL: https://github.com/apache/spark/pull/27888#discussion_r392789070
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
 ##
 @@ -176,15 +178,38 @@ private[parquet] class ParquetRowConverter(
*/
   def currentRecord: InternalRow = currentRow
 
+
   // Converters for each field.
   private[this] val fieldConverters: Array[Converter with 
HasParentContainerUpdater] = {
-parquetType.getFields.asScala.map { parquetField =>
-  val fieldIndex = catalystType.fieldIndex(parquetField.getName)
-  val catalystField = catalystType(fieldIndex)
-  // Converted field value should be set to the `fieldIndex`-th cell of 
`currentRow`
-  newConverter(parquetField, catalystField.dataType, new 
RowUpdater(currentRow, fieldIndex))
-}.toArray
-  }
+
+// (SPARK-31116) There is an issue when schema pruning is enabled, so we 
keep original codes
+if (schemaPruning) {
+  // (SPARK-31116) For letter case issue, create name to field index based 
on case sensitivity
+  val catalystFieldNameToIndex = if (caseSensitive) {
+catalystType.fieldNames.zipWithIndex.toMap
+  } else {
+CaseInsensitiveMap(catalystType.fieldNames.zipWithIndex.toMap)
+  }
+  parquetType.getFields.asScala.map { parquetField =>
+val fieldIndex = 
catalystFieldNameToIndex.getOrElse(parquetField.getName,
+  throw new IllegalArgumentException(
+s"${parquetField.getName} does not exist. " +
+  s"Available: ${catalystType.fieldNames.mkString(", ")}")
+)
+val catalystField = catalystType(fieldIndex)
+// Converted field value should be set to the `fieldIndex`-th cell of 
`currentRow`
+newConverter(parquetField, catalystField.dataType, new 
RowUpdater(currentRow, fieldIndex))
+  }.toArray
+} else {
+  parquetType.getFields.asScala.zip(catalystType).zipWithIndex.map {
+case ((parquetFieldType, catalystField), ordinal) =>
+  // Converted field value should be set to the `ordinal`-th cell of 
`currentRow`
+  newConverter(
+parquetFieldType, catalystField.dataType, new 
RowUpdater(currentRow, ordinal))
+  }.toArray
+}
 
 Review comment:
   I actually asked to keep the original code as it was at 
https://github.com/apache/spark/pull/27888#discussion_r391979749, although it 
apparently works the same.
   I am okay with removing this branch if we're very sure it works identically.
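
   To make the "works identically" question concrete, here is a minimal sketch in plain Scala (no Spark classes) contrasting the two ways of resolving a field index that appear in the diff: lookup by name and lookup by position. The object `FieldIndexSketch` and its methods are hypothetical stand-ins, not code from the PR, and lower-casing the names is only an assumed approximation of what `CaseInsensitiveMap` does.

```scala
object FieldIndexSketch {
  import java.util.Locale

  // Hypothetical stand-in for the name-to-index map built in the diff.
  // In the case-insensitive branch, names are lower-cased on both sides (an assumption).
  def nameToIndex(fieldNames: Seq[String], caseSensitive: Boolean): String => Option[Int] = {
    if (caseSensitive) {
      val m = fieldNames.zipWithIndex.toMap
      name => m.get(name)
    } else {
      val m = fieldNames.map(_.toLowerCase(Locale.ROOT)).zipWithIndex.toMap
      name => m.get(name.toLowerCase(Locale.ROOT))
    }
  }

  // Name-based resolution: for each file field, find its index in the requested schema.
  def byName(
      fileFields: Seq[String],
      catalystFields: Seq[String],
      caseSensitive: Boolean): Seq[Int] = {
    val lookup = nameToIndex(catalystFields, caseSensitive)
    fileFields.map { f =>
      lookup(f).getOrElse(throw new IllegalArgumentException(
        s"$f does not exist. Available: ${catalystFields.mkString(", ")}"))
    }
  }

  // Positional resolution: the i-th file field is paired with the i-th requested field.
  def byPosition(fileFields: Seq[String], catalystFields: Seq[String]): Seq[Int] =
    fileFields.zip(catalystFields).zipWithIndex.map { case (_, ordinal) => ordinal }
}
```

   In this sketch the two strategies agree whenever both schemas list the same fields in the same order, and they diverge once the order differs, so whether the `else` branch can be dropped depends on whether that ordering is guaranteed when pruning is disabled.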





[GitHub] [spark] kimtkyeom commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter

2020-03-15 Thread GitBox
kimtkyeom commented on a change in pull request #27888: [SPARK-31116][SQL] Fix 
nested schema case-sensitivity in ParquetRowConverter
URL: https://github.com/apache/spark/pull/27888#discussion_r392788006
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
 ##
 @@ -176,15 +178,38 @@ private[parquet] class ParquetRowConverter(
*/
   def currentRecord: InternalRow = currentRow
 
+
   // Converters for each field.
   private[this] val fieldConverters: Array[Converter with 
HasParentContainerUpdater] = {
-parquetType.getFields.asScala.map { parquetField =>
-  val fieldIndex = catalystType.fieldIndex(parquetField.getName)
-  val catalystField = catalystType(fieldIndex)
-  // Converted field value should be set to the `fieldIndex`-th cell of 
`currentRow`
-  newConverter(parquetField, catalystField.dataType, new 
RowUpdater(currentRow, fieldIndex))
-}.toArray
-  }
+
+// (SPARK-31116) There is an issue when schema pruning is enabled, so we 
keep original codes
+if (schemaPruning) {
+  // (SPARK-31116) For letter case issue, create name to field index based 
on case sensitivity
+  val catalystFieldNameToIndex = if (caseSensitive) {
+catalystType.fieldNames.zipWithIndex.toMap
+  } else {
+CaseInsensitiveMap(catalystType.fieldNames.zipWithIndex.toMap)
+  }
+  parquetType.getFields.asScala.map { parquetField =>
+val fieldIndex = 
catalystFieldNameToIndex.getOrElse(parquetField.getName,
+  throw new IllegalArgumentException(
+s"${parquetField.getName} does not exist. " +
+  s"Available: ${catalystType.fieldNames.mkString(", ")}")
+)
+val catalystField = catalystType(fieldIndex)
+// Converted field value should be set to the `fieldIndex`-th cell of 
`currentRow`
+newConverter(parquetField, catalystField.dataType, new 
RowUpdater(currentRow, fieldIndex))
+  }.toArray
+} else {
+  parquetType.getFields.asScala.zip(catalystType).zipWithIndex.map {
+case ((parquetFieldType, catalystField), ordinal) =>
+  // Converted field value should be set to the `ordinal`-th cell of 
`currentRow`
+  newConverter(
+parquetFieldType, catalystField.dataType, new 
RowUpdater(currentRow, ordinal))
+  }.toArray
+}
 
 Review comment:
   I added this code because of 
https://github.com/apache/spark/pull/27888#discussion_r391979749. Did I 
misunderstand the comment?





[GitHub] [spark] AmplabJenkins removed a comment on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27895: [SPARK-31138][ML] Add ANOVA 
Selector for continuous features and categorical labels
URL: https://github.com/apache/spark/pull/27895#issuecomment-599350825
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27895: [SPARK-31138][ML] Add ANOVA 
Selector for continuous features and categorical labels
URL: https://github.com/apache/spark/pull/27895#issuecomment-599350829
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24563/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract 
Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599350818
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add 
abstract Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599350818
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract 
Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599350822
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24564/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add 
abstract Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599350822
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24564/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector 
for continuous features and categorical labels
URL: https://github.com/apache/spark/pull/27895#issuecomment-599350825
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector 
for continuous features and categorical labels
URL: https://github.com/apache/spark/pull/27895#issuecomment-599350829
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24563/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels

2020-03-15 Thread GitBox
SparkQA commented on issue #27895: [SPARK-31138][ML] Add ANOVA Selector for 
continuous features and categorical labels
URL: https://github.com/apache/spark/pull/27895#issuecomment-599350500
 
 
   **[Test build #119833 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119833/testReport)**
 for PR 27895 at commit 
[`6107389`](https://github.com/apache/spark/commit/6107389d399ae8a6d34799c9ea166f45503bef86).





[GitHub] [spark] SparkQA commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
SparkQA commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599350499
 
 
   **[Test build #119834 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119834/testReport)**
 for PR 27882 at commit 
[`eb841a5`](https://github.com/apache/spark/commit/eb841a5148a56a4e814148898408704f40dadcba).





[GitHub] [spark] viirya commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter

2020-03-15 Thread GitBox
viirya commented on a change in pull request #27888: [SPARK-31116][SQL] Fix 
nested schema case-sensitivity in ParquetRowConverter
URL: https://github.com/apache/spark/pull/27888#discussion_r392786863
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
 ##
 @@ -176,15 +178,38 @@ private[parquet] class ParquetRowConverter(
*/
   def currentRecord: InternalRow = currentRow
 
+
   // Converters for each field.
   private[this] val fieldConverters: Array[Converter with 
HasParentContainerUpdater] = {
-parquetType.getFields.asScala.map { parquetField =>
-  val fieldIndex = catalystType.fieldIndex(parquetField.getName)
-  val catalystField = catalystType(fieldIndex)
-  // Converted field value should be set to the `fieldIndex`-th cell of 
`currentRow`
-  newConverter(parquetField, catalystField.dataType, new 
RowUpdater(currentRow, fieldIndex))
-}.toArray
-  }
+
+// (SPARK-31116) There is an issue when schema pruning is enabled, so we 
keep original codes
+if (schemaPruning) {
+  // (SPARK-31116) For letter case issue, create name to field index based 
on case sensitivity
+  val catalystFieldNameToIndex = if (caseSensitive) {
+catalystType.fieldNames.zipWithIndex.toMap
+  } else {
+CaseInsensitiveMap(catalystType.fieldNames.zipWithIndex.toMap)
+  }
+  parquetType.getFields.asScala.map { parquetField =>
+val fieldIndex = 
catalystFieldNameToIndex.getOrElse(parquetField.getName,
+  throw new IllegalArgumentException(
+s"${parquetField.getName} does not exist. " +
+  s"Available: ${catalystType.fieldNames.mkString(", ")}")
+)
+val catalystField = catalystType(fieldIndex)
+// Converted field value should be set to the `fieldIndex`-th cell of 
`currentRow`
+newConverter(parquetField, catalystField.dataType, new 
RowUpdater(currentRow, fieldIndex))
+  }.toArray
+} else {
+  parquetType.getFields.asScala.zip(catalystType).zipWithIndex.map {
+case ((parquetFieldType, catalystField), ordinal) =>
+  // Converted field value should be set to the `ordinal`-th cell of 
`currentRow`
+  newConverter(
+parquetFieldType, catalystField.dataType, new 
RowUpdater(currentRow, ordinal))
+  }.toArray
+}
 
 Review comment:
   Why add this part of the code? It seems to me that the code above, inside 
`if (schemaPruning) { ... }`, looks reasonable.





[GitHub] [spark] viirya commented on a change in pull request #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter

2020-03-15 Thread GitBox
viirya commented on a change in pull request #27888: [SPARK-31116][SQL] Fix 
nested schema case-sensitivity in ParquetRowConverter
URL: https://github.com/apache/spark/pull/27888#discussion_r392785966
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
 ##
 @@ -176,15 +178,38 @@ private[parquet] class ParquetRowConverter(
*/
   def currentRecord: InternalRow = currentRow
 
+
 
 Review comment:
   nit: remove unnecessary blank line.





[GitHub] [spark] huaxingao commented on a change in pull request #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels

2020-03-15 Thread GitBox
huaxingao commented on a change in pull request #27895: [SPARK-31138][ML] Add 
ANOVA Selector for continuous features and categorical labels
URL: https://github.com/apache/spark/pull/27895#discussion_r392786552
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/ml/stat/ANOVATest.scala
 ##
 @@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.stat
+
+import org.apache.commons.math3.distribution.FDistribution
+
+import org.apache.spark.annotation.Since
+import org.apache.spark.ml.feature.LabeledPoint
+import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT}
+import org.apache.spark.ml.util.SchemaUtils
+import org.apache.spark.sql._
+import org.apache.spark.sql.functions.col
+import org.apache.spark.util.collection.OpenHashMap
+
+
+/**
+ * ANOVA Test
+ */
+@Since("3.1.0")
+object ANOVATest {
+
+  /** Used to construct output schema of tests */
+  private case class ANOVAResult(
+  pValues: Vector,
+  degreesOfFreedom: Array[Long],
+  fValues: Vector)
+
+  /**
+   * @param dataset  DataFrame of categorical labels and continuous features.
+   * @param featuresCol  Name of features column in dataset, of type `Vector` 
(`VectorUDT`)
+   * @param labelCol  Name of label column in dataset, of any numerical type
+   * @return DataFrame containing the test result for every feature against 
the label.
+   * This DataFrame will contain a single Row with the following 
fields:
+   *  - `pValues: Vector`
+   *  - `degreesOfFreedom: Array[Long]`
+   *  - `fValues: Vector`
+   * Each of these fields has one value per feature.
+   */
+  @Since("3.1.0")
+  def test(dataset: DataFrame, featuresCol: String, labelCol: String): 
DataFrame = {
+val spark = dataset.sparkSession
+val testResults = testClassification(dataset, featuresCol, labelCol)
+val pValues: Vector = Vectors.dense(testResults.map(_.pValue))
+val degreesOfFreedom: Array[Long] = testResults.map(_.degreesOfFreedom)
+val fValues: Vector = Vectors.dense(testResults.map(_.statistic))
+spark.createDataFrame(
+  Seq(new ANOVAResult(pValues, degreesOfFreedom, fValues)))
+  }
+
+  /**
+   * @param dataset  DataFrame of categorical labels and continuous features.
+   * @param featuresCol  Name of features column in dataset, of type `Vector` 
(`VectorUDT`)
+   * @param labelCol  Name of label column in dataset, of any numerical type
+   * @return Array containing the ANOVATestResult for every feature against the
+   * label.
+   */
+  private[ml] def testClassification(
+  dataset: Dataset[_],
+  featuresCol: String,
+  labelCol: String): Array[SelectionTestResult] = {
+
+val spark = dataset.sparkSession
+import spark.implicits._
+
+SchemaUtils.checkColumnType(dataset.schema, featuresCol, new VectorUDT)
+SchemaUtils.checkNumericType(dataset.schema, labelCol)
+
+val labeledPointRdd = dataset.select(col("label").cast("double"), 
col("features"))
 
 Review comment:
   Fixed. Thanks!
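
   For readers unfamiliar with the statistic behind the `fValues` field documented above, here is a minimal sketch of a one-way ANOVA F-value for a single continuous feature grouped by a categorical label. It is plain Scala over in-memory sequences and is not the implementation in the PR (the quoted diff is truncated before that point); the object `AnovaSketch` and its return shape are hypothetical. The p-value step, which the PR's import of `FDistribution` suggests is computed from the F distribution, is left out.

```scala
object AnovaSketch {
  /**
   * One-way ANOVA for a single feature.
   * Returns (fValue, degrees of freedom between groups, degrees of freedom within groups).
   * Assumes at least two distinct labels and more samples than labels.
   */
  def fStatistic(values: Seq[Double], labels: Seq[Double]): (Double, Long, Long) = {
    require(values.length == labels.length, "values and labels must have the same length")
    // Split the feature values into one group per distinct label.
    val groups = values.zip(labels).groupBy(_._2).values.map(_.map(_._1)).toSeq
    val n = values.length
    val k = groups.length
    val grandMean = values.sum / n

    // Variation of the group means around the grand mean.
    val ssBetween = groups.map { g =>
      val mean = g.sum / g.length
      g.length * math.pow(mean - grandMean, 2)
    }.sum

    // Variation of each sample around its own group mean.
    val ssWithin = groups.map { g =>
      val mean = g.sum / g.length
      g.map(v => math.pow(v - mean, 2)).sum
    }.sum

    val dfBetween = k - 1
    val dfWithin = n - k
    val f = (ssBetween / dfBetween) / (ssWithin / dfWithin)
    (f, dfBetween.toLong, dfWithin.toLong)
  }
}
```

   A distributed version would presumably accumulate the per-label sums and squared sums per feature rather than materialising the groups, but the arithmetic would be the same.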





[GitHub] [spark] AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the 
user guide for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/27616#issuecomment-599347352
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the 
user guide for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/27616#issuecomment-599347356
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24562/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide 
for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/27616#issuecomment-599347352
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide 
for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/27616#issuecomment-599347356
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24562/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution

2020-03-15 Thread GitBox
SparkQA commented on issue #27616: [SPARK-30864] [SQL]add the user guide for 
Adaptive Query Execution
URL: https://github.com/apache/spark/pull/27616#issuecomment-599347058
 
 
   **[Test build #119832 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119832/testReport)**
 for PR 27616 at commit 
[`13aa51b`](https://github.com/apache/spark/commit/13aa51b245ecabf5ab38ba2b446196db5a79cb4e).





[GitHub] [spark] maropu commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
maropu commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse 
when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599344938
 
 
   cc: @wangyum





[GitHub] [spark] maropu commented on a change in pull request #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
maropu commented on a change in pull request #27920: [SPARK-31102][SQL] 
Spark-sql fails to parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#discussion_r392782401
 
 

 ##
 File path: 
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
 ##
 @@ -402,24 +402,14 @@ class CliSuite extends SparkFunSuite with 
BeforeAndAfterAll with Logging {
   }
 
   test("SPARK-30049 Should not complain for quotes in commented lines") {
-runCliWithin(1.minute)(
+runCliWithin(3.minute)(
   """SELECT concat('test', 'comment') -- someone's comment here
 |;""".stripMargin -> "testcomment"
 )
-  }
-
-  test("SPARK-30049 Should not complain for quotes in commented with 
multi-lines") {
-runCliWithin(1.minute)(
-  """SELECT concat('test', 'comment') -- someone's comment here \\
-| comment continues here with single ' quote \\
-| extra ' \\
-|;""".stripMargin -> "testcomment"
-)
-runCliWithin(1.minute)(
-  """SELECT concat('test', 'comment') -- someone's comment here \\
-|   comment continues here with single ' quote \\
-|   extra ' \\
-|   ;""".stripMargin -> "testcomment"
 
 Review comment:
   Why did you remove the existing tests instead of adding new tests?





[GitHub] [spark] AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27278: 
[SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599344119
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27278: 
[SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599344123
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24561/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] 
Add percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599344119
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] 
Add percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599344123
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24561/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
SparkQA commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add 
percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599343820
 
 
   **[Test build #119831 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119831/testReport)**
 for PR 27278 at commit 
[`b6bd2d4`](https://github.com/apache/spark/commit/b6bd2d46ceb2539dd8e99e07268ce9e8fd6e6558).





[GitHub] [spark] AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27916: [SPARK-30532] 
DataFrameStatFunctions to work with TABLE.COLUMN syntax
URL: https://github.com/apache/spark/pull/27916#issuecomment-599336486
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24560/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27916: [SPARK-30532] 
DataFrameStatFunctions to work with TABLE.COLUMN syntax
URL: https://github.com/apache/spark/pull/27916#issuecomment-599336481
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions 
to work with TABLE.COLUMN syntax
URL: https://github.com/apache/spark/pull/27916#issuecomment-599336486
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24560/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions 
to work with TABLE.COLUMN syntax
URL: https://github.com/apache/spark/pull/27916#issuecomment-599336481
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax

2020-03-15 Thread GitBox
SparkQA commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work 
with TABLE.COLUMN syntax
URL: https://github.com/apache/spark/pull/27916#issuecomment-599336222
 
 
   **[Test build #119830 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119830/testReport)**
 for PR 27916 at commit 
[`2f24ae0`](https://github.com/apache/spark/commit/2f24ae0365d21412dfdcb998bf9fafa513f39cd5).





[GitHub] [spark] kachayev commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax

2020-03-15 Thread GitBox
kachayev commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to 
work with TABLE.COLUMN syntax
URL: https://github.com/apache/spark/pull/27916#issuecomment-599335665
 
 
   Fixed tests.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27278: 
[SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599333723
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27278: 
[SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599333726
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24559/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] 
Add percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599333726
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24559/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] 
Add percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599333723
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
SparkQA commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add 
percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599333526
 
 
   **[Test build #119829 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119829/testReport)**
 for PR 27278 at commit 
[`1ee0b24`](https://github.com/apache/spark/commit/1ee0b2433523aa6a9494daa816b018b7cb875777).





[GitHub] [spark] HyukjinKwon commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2020-03-15 Thread GitBox
HyukjinKwon commented on issue #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add 
percentile_approx DSL functions.
URL: https://github.com/apache/spark/pull/27278#issuecomment-599332466
 
 
   retest this please





[GitHub] [spark] AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor 
the on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599329436
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24558/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor 
the on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599329427
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the 
on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599329427
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the 
on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599329436
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24558/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
SparkQA commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click 
timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599329197
 
 
   **[Test build #119828 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119828/testReport)**
 for PR 27921 at commit 
[`96e8c58`](https://github.com/apache/spark/commit/96e8c5853ac2dc0104f7cb77e85f86b2a6afff7a).





[GitHub] [spark] sarutak removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
sarutak removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the 
on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599326541
 
 
   I'll fix the conflict.





[GitHub] [spark] cloud-fan commented on a change in pull request #27893: [SPARK-31134][SQL] optimize skew join after shuffle partitions are coalesced

2020-03-15 Thread GitBox
cloud-fan commented on a change in pull request #27893: [SPARK-31134][SQL] 
optimize skew join after shuffle partitions are coalesced
URL: https://github.com/apache/spark/pull/27893#discussion_r392766089
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
 ##
 @@ -328,6 +287,48 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
   }
 }
 
+private object ShuffleStage {
+  def unapply(plan: SparkPlan): Option[ShuffleStageInfo] = plan match {
+case s: ShuffleQueryStageExec =>
+  val mapStats = getMapStats(s)
+  val sizes = mapStats.bytesByPartitionId
+  val partitions = sizes.zipWithIndex.map {
+case (size, i) => CoalescedPartitionSpec(i, i + 1) -> size
+  }
+  Some(ShuffleStageInfo(s, mapStats, partitions))
+
+case CustomShuffleReaderExec(s: ShuffleQueryStageExec, partitionSpecs, _) =>
+  val mapStats = getMapStats(s)
+  val sizes = mapStats.bytesByPartitionId
+  val partitions = partitionSpecs.map {
+case spec @ CoalescedPartitionSpec(start, end) =>
+  var sum = 0L
 
 Review comment:
   `slice` will create a new array, which is less efficient.
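
   For illustration only, a minimal sketch of the difference. The object and method names here are hypothetical and not part of the PR; `sizes` stands in for `mapStats.bytesByPartitionId`, and `[start, end)` for the range of a `CoalescedPartitionSpec`.

```scala
// Hypothetical helpers, not code from the PR.
object RangeSum {
  // Copies the range into a new array of length (end - start), then sums it.
  def sumWithSlice(sizes: Array[Long], start: Int, end: Int): Long =
    sizes.slice(start, end).sum

  // Sums the half-open range [start, end) in place, with no intermediate array.
  def sumInPlace(sizes: Array[Long], start: Int, end: Int): Long = {
    var sum = 0L
    var i = start
    while (i < end) {
      sum += sizes(i)
      i += 1
    }
    sum
  }
}
```

   Both return the same value for a valid range, e.g. `RangeSum.sumInPlace(Array(10L, 20L, 30L, 40L), 1, 3)` gives `50`; the loop version just avoids one allocation per partition spec.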





[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add 
abstract Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599326597
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add 
abstract Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599326600
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119823/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract 
Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599326597
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql 
fails to parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599326419
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql 
fails to parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599326425
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119825/
   Test PASSed.





[GitHub] [spark] sarutak commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
sarutak commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click 
timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599326541
 
 
   I'll fix the conflict.





[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract 
Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599326600
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119823/
   Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
SparkQA removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract 
Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599289509
 
 
   **[Test build #119823 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119823/testReport)**
 for PR 27882 at commit 
[`b219f0a`](https://github.com/apache/spark/commit/b219f0af6713c77b162db50b72f0b42d2c420818).





[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to 
parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599326425
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119825/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
SparkQA commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse 
when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599326321
 
 
   **[Test build #119825 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119825/testReport)**
 for PR 27920 at commit 
[`d69d271`](https://github.com/apache/spark/commit/d69d27126c06ea9782c5422a5435809a062b8ed7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor 
the on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599324864
 
 
   Build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27921: [SPARK-31161][WEBUI] Refactor 
the on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599324867
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24557/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to 
parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599326419
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
SparkQA removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails 
to parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599314095
 
 
   **[Test build #119825 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119825/testReport)**
 for PR 27920 at commit 
[`d69d271`](https://github.com/apache/spark/commit/d69d27126c06ea9782c5422a5435809a062b8ed7).





[GitHub] [spark] SparkQA commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
SparkQA commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click 
timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599326192
 
 
   **[Test build #119827 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119827/testReport)**
 for PR 27921 at commit 
[`7e591a7`](https://github.com/apache/spark/commit/7e591a70855a93f32e2edee7ae9eb204e4f77be4).





[GitHub] [spark] SparkQA commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
SparkQA commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599326224
 
 
   **[Test build #119823 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119823/testReport)**
 for PR 27882 at commit 
[`b219f0a`](https://github.com/apache/spark/commit/b219f0af6713c77b162db50b72f0b42d2c420818).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the 
on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599324867
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24557/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27921: [SPARK-31161][WEBUI] Refactor the 
on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921#issuecomment-599324864
 
 
   Build finished. Test PASSed.





[GitHub] [spark] sarutak opened a new pull request #27921: [SPARK-31161][WEBUI] Refactor the on-click timeline action in streagming-page.js

2020-03-15 Thread GitBox
sarutak opened a new pull request #27921: [SPARK-31161][WEBUI] Refactor the 
on-click timeline action in streagming-page.js
URL: https://github.com/apache/spark/pull/27921
 
 
   
   
   ### What changes were proposed in this pull request?
   
   Refactor `streaming-page.js` by making on-click timeline action customizable.
   
   ### Why are the changes needed?
   
   In the current implementation, `streaming-page.js` is used by both the Streaming 
page and the Structured Streaming page, but the implementation of the on-click 
timeline action depends strongly on the Streaming page.
   The Structured Streaming page doesn't define the on-click action for now, but 
it's better to remove the dependency for the future.
   
   Originally, I made this change to fix `SPARK-31128`, but #27883 resolved it.
   So, now this is just a refactoring.
   
   ### Does this PR introduce any user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   I tested manually with the following code and confirmed there are no regressions 
and no errors in the debug console in Firefox.
   
   For Structured Streaming:
   ```
   spark.readStream.format("socket").options(Map("host"->"localhost", 
"port"->"8765")).load.writeStream.format("console").start
   ```
   Then I visited the Structured Streaming page, and there were no errors in the 
debug console when I clicked a point in the timeline.
   
   For Spark Streaming:
   ```
   import org.apache.spark.streaming._
   val ssc = new StreamingContext(sc, Seconds(1))
   val dstream = ssc.socketTextStream("localhost", 8765)
   dstream.foreachRDD(rdd => rdd.foreach(println))
   ssc.start
   ```
   Then I visited the Streaming page and confirmed that scrolling down and 
highlighting work well and that there were no errors in the debug console when 
I clicked a point in the timeline.





[GitHub] [spark] zhengruifeng commented on a change in pull request #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels

2020-03-15 Thread GitBox
zhengruifeng commented on a change in pull request #27895: [SPARK-31138][ML] 
Add ANOVA Selector for continuous features and categorical labels
URL: https://github.com/apache/spark/pull/27895#discussion_r392760983
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/ml/stat/ANOVATest.scala
 ##
 @@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.stat
+
+import org.apache.commons.math3.distribution.FDistribution
+
+import org.apache.spark.annotation.Since
+import org.apache.spark.ml.feature.LabeledPoint
+import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT}
+import org.apache.spark.ml.util.SchemaUtils
+import org.apache.spark.sql._
+import org.apache.spark.sql.functions.col
+import org.apache.spark.util.collection.OpenHashMap
+
+
+/**
+ * ANOVA Test
+ */
+@Since("3.1.0")
+object ANOVATest {
+
+  /** Used to construct output schema of tests */
+  private case class ANOVAResult(
+  pValues: Vector,
+  degreesOfFreedom: Array[Long],
+  fValues: Vector)
+
+  /**
+   * @param dataset  DataFrame of categorical labels and continuous features.
+   * @param featuresCol  Name of features column in dataset, of type `Vector` (`VectorUDT`)
+   * @param labelCol  Name of label column in dataset, of any numerical type
+   * @return DataFrame containing the test result for every feature against the label.
+   * This DataFrame will contain a single Row with the following fields:
+   *  - `pValues: Vector`
+   *  - `degreesOfFreedom: Array[Long]`
+   *  - `fValues: Vector`
+   * Each of these fields has one value per feature.
+   */
+  @Since("3.1.0")
+  def test(dataset: DataFrame, featuresCol: String, labelCol: String): DataFrame = {
+val spark = dataset.sparkSession
+val testResults = testClassification(dataset, featuresCol, labelCol)
+val pValues: Vector = Vectors.dense(testResults.map(_.pValue))
+val degreesOfFreedom: Array[Long] = testResults.map(_.degreesOfFreedom)
+val fValues: Vector = Vectors.dense(testResults.map(_.statistic))
+spark.createDataFrame(
+  Seq(new ANOVAResult(pValues, degreesOfFreedom, fValues)))
+  }
+
+  /**
+   * @param dataset  DataFrame of categorical labels and continuous features.
+   * @param featuresCol  Name of features column in dataset, of type `Vector` (`VectorUDT`)
+   * @param labelCol  Name of label column in dataset, of any numerical type
+   * @return Array containing the ANOVATestResult for every feature against the
+   * label.
+   */
+  private[ml] def testClassification(
+  dataset: Dataset[_],
+  featuresCol: String,
+  labelCol: String): Array[SelectionTestResult] = {
+
+val spark = dataset.sparkSession
+import spark.implicits._
+
+SchemaUtils.checkColumnType(dataset.schema, featuresCol, new VectorUDT)
+SchemaUtils.checkNumericType(dataset.schema, labelCol)
+
+val labeledPointRdd = dataset.select(col("label").cast("double"), col("features"))
+  .as[(Double, Vector)]
+  .rdd.map { case (label, features) => LabeledPoint(label, features) }
+
+val numFeatures = labeledPointRdd.first().features.size
 
 Review comment:
   nit: we can compute `numSamples` and `numClasses` together:
   ```scala
   val numFeatures = MetadataUtils.getNumFeatures(dataset, featuresCol)
   val Row(numClasses: Long, numSamples: Long) = 
dataset.select(countDistinct(labelCol), count(labelCol)).head
   ```
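
As a rough illustration of the single-pass idea in the comment above, here is a plain Scala collections sketch with no Spark involved. The helper name `countSamplesAndClasses` and the sample labels are hypothetical, made up for this example. It walks the labels once and returns the total count together with the distinct count, which is the in-memory analogue of selecting `count(labelCol)` and `countDistinct(labelCol)` in one aggregation.

```scala
import scala.collection.mutable

// Hypothetical helper (not part of Spark): a single pass over the labels
// yields both the number of samples and the number of distinct classes.
def countSamplesAndClasses(labels: Iterator[Double]): (Long, Long) = {
  val seen = mutable.HashSet.empty[Double]
  var numSamples = 0L
  labels.foreach { label =>
    numSamples += 1   // every row counts as one sample
    seen += label     // the set keeps each class label once
  }
  (numSamples, seen.size.toLong)
}

// Made-up labels: five samples spread over three classes.
val (numSamples, numClasses) =
  countSamplesAndClasses(Iterator(0.0, 1.0, 1.0, 2.0, 0.0))

// Degrees of freedom as used in a standard one-way ANOVA.
val dfBetween = numClasses - 1
val dfWithin = numSamples - numClasses
```

In a standard one-way ANOVA both counts are needed for the degrees of freedom (`numClasses - 1` between groups, `numSamples - numClasses` within groups), so obtaining them in one pass can avoid a second scan of the data.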





[GitHub] [spark] zhengruifeng commented on a change in pull request #27895: [SPARK-31138][ML] Add ANOVA Selector for continuous features and categorical labels

2020-03-15 Thread GitBox
zhengruifeng commented on a change in pull request #27895: [SPARK-31138][ML] 
Add ANOVA Selector for continuous features and categorical labels
URL: https://github.com/apache/spark/pull/27895#discussion_r392761159
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/ml/stat/ANOVATest.scala
 ##
 @@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.stat
+
+import org.apache.commons.math3.distribution.FDistribution
+
+import org.apache.spark.annotation.Since
+import org.apache.spark.ml.feature.LabeledPoint
+import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT}
+import org.apache.spark.ml.util.SchemaUtils
+import org.apache.spark.sql._
+import org.apache.spark.sql.functions.col
+import org.apache.spark.util.collection.OpenHashMap
+
+
+/**
+ * ANOVA Test
+ */
+@Since("3.1.0")
+object ANOVATest {
+
+  /** Used to construct output schema of tests */
+  private case class ANOVAResult(
+  pValues: Vector,
+  degreesOfFreedom: Array[Long],
+  fValues: Vector)
+
+  /**
+   * @param dataset  DataFrame of categorical labels and continuous features.
+   * @param featuresCol  Name of features column in dataset, of type `Vector` (`VectorUDT`)
+   * @param labelCol  Name of label column in dataset, of any numerical type
+   * @return DataFrame containing the test result for every feature against the label.
+   * This DataFrame will contain a single Row with the following fields:
+   *  - `pValues: Vector`
+   *  - `degreesOfFreedom: Array[Long]`
+   *  - `fValues: Vector`
+   * Each of these fields has one value per feature.
+   */
+  @Since("3.1.0")
+  def test(dataset: DataFrame, featuresCol: String, labelCol: String): DataFrame = {
+val spark = dataset.sparkSession
+val testResults = testClassification(dataset, featuresCol, labelCol)
+val pValues: Vector = Vectors.dense(testResults.map(_.pValue))
+val degreesOfFreedom: Array[Long] = testResults.map(_.degreesOfFreedom)
+val fValues: Vector = Vectors.dense(testResults.map(_.statistic))
+spark.createDataFrame(
+  Seq(new ANOVAResult(pValues, degreesOfFreedom, fValues)))
+  }
+
+  /**
+   * @param dataset  DataFrame of categorical labels and continuous features.
+   * @param featuresCol  Name of features column in dataset, of type `Vector` (`VectorUDT`)
+   * @param labelCol  Name of label column in dataset, of any numerical type
+   * @return Array containing the ANOVATestResult for every feature against the
+   * label.
+   */
+  private[ml] def testClassification(
+  dataset: Dataset[_],
+  featuresCol: String,
+  labelCol: String): Array[SelectionTestResult] = {
+
+val spark = dataset.sparkSession
+import spark.implicits._
+
+SchemaUtils.checkColumnType(dataset.schema, featuresCol, new VectorUDT)
+SchemaUtils.checkNumericType(dataset.schema, labelCol)
+
+val labeledPointRdd = dataset.select(col("label").cast("double"), col("features"))
 
 Review comment:
   "label" -> `labelCol`
   "features" -> `featuresCol`





[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql 
fails to parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599314402
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql 
fails to parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599314405
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24555/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support 
codegen for aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#issuecomment-599314385
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24556/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27019: [SPARK-30027][SQL] Support 
codegen for aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#issuecomment-599314378
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to 
parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599314405
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24555/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for 
aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#issuecomment-599314385
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24556/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to 
parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599314402
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27019: [SPARK-30027][SQL] Support codegen for 
aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#issuecomment-599314378
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
SparkQA commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse 
when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599314095
 
 
   **[Test build #119825 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119825/testReport)**
 for PR 27920 at commit 
[`d69d271`](https://github.com/apache/spark/commit/d69d27126c06ea9782c5422a5435809a062b8ed7).





[GitHub] [spark] SparkQA commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-03-15 Thread GitBox
SparkQA commented on issue #27019: [SPARK-30027][SQL] Support codegen for 
aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#issuecomment-599314096
 
 
   **[Test build #119826 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119826/testReport)**
 for PR 27019 at commit 
[`d303b2f`](https://github.com/apache/spark/commit/d303b2fe2bc1ddc0b1e98c2028fb92322a0bb976).





[GitHub] [spark] maropu commented on issue #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-03-15 Thread GitBox
maropu commented on issue #27019: [SPARK-30027][SQL] Support codegen for 
aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#issuecomment-599313299
 
 
   cc: @cloud-fan 





[GitHub] [spark] AmplabJenkins removed a comment on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #25827: [SPARK-29128][SQL] Split 
predicate code in OR expressions
URL: https://github.com/apache/spark/pull/25827#issuecomment-599312956
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #25827: [SPARK-29128][SQL] Split predicate 
code in OR expressions
URL: https://github.com/apache/spark/pull/25827#issuecomment-599312961
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24554/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql 
fails to parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599307680
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins removed a comment on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #25827: [SPARK-29128][SQL] Split 
predicate code in OR expressions
URL: https://github.com/apache/spark/pull/25827#issuecomment-599312961
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24554/
   Test PASSed.





[GitHub] [spark] maropu commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
maropu commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse 
when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599312972
 
 
   ok to test





[GitHub] [spark] AmplabJenkins commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #25827: [SPARK-29128][SQL] Split predicate 
code in OR expressions
URL: https://github.com/apache/spark/pull/25827#issuecomment-599312956
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions

2020-03-15 Thread GitBox
SparkQA commented on issue #25827: [SPARK-29128][SQL] Split predicate code in 
OR expressions
URL: https://github.com/apache/spark/pull/25827#issuecomment-599312685
 
 
   **[Test build #119824 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119824/testReport)**
 for PR 25827 at commit 
[`543c016`](https://github.com/apache/spark/commit/543c0167dab23ece2e4db232c0fd7d4c9e5eeb8e).





[GitHub] [spark] maropu commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR expressions

2020-03-15 Thread GitBox
maropu commented on issue #25827: [SPARK-29128][SQL] Split predicate code in OR 
expressions
URL: https://github.com/apache/spark/pull/25827#issuecomment-599312055
 
 
   retest this please





[GitHub] [spark] AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27920: [SPARK-31102][SQL] Spark-sql 
fails to parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599307401
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to 
parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599307680
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27920: [SPARK-31102][SQL] Spark-sql fails to 
parse when contains comment.
URL: https://github.com/apache/spark/pull/27920#issuecomment-599307401
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] javierivanov opened a new pull request #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-03-15 Thread GitBox
javierivanov opened a new pull request #27920: [SPARK-31102][SQL] Spark-sql 
fails to parse when contains comment.
URL: https://github.com/apache/spark/pull/27920
 
 
   
   
   ### What changes were proposed in this pull request?
   
   This PR resets the insideComment flag to false on a newline, fixing the 
issue introduced by SPARK-30049.
   
   ### Why are the changes needed?
   
   Before SPARK-30049 was fixed, a comment containing an unclosed quote 
produced the following error:
   ```
   spark-sql> SELECT 1 -- someone's comment here
> ;
   Error in query: 
   extraneous input ';' expecting (line 2, pos 0)
   
   == SQL ==
   SELECT 1 -- someone's comment here
   ;
   ^^^
   ```
   
   This happened because the splitSemiColon method had no flag for comment 
sections, so quotes inside comments were not ignored. SPARK-30049 added that 
flag and fixed the issue, but introduced the following problem:
   ```
   spark-sql> select
>   1,
>   -- two
>   2;
   Error in query:
   mismatched input '' expecting {'(', 'ADD', 'AFTER', 'ALL', 'ALTER', 
...}(line 3, pos 2)
   == SQL ==
   select
 1,
   --^^^
   ```
   This issue occurs because the insideComment flag is not turned off at a 
newline.
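
The behaviour described above can be sketched as follows. This is a minimal, simplified illustration and not Spark's actual splitSemiColon: the object name `SemicolonSplitSketch` and the method `split` are hypothetical, and escape handling is deliberately left out. It only shows the two points from this description: quotes are ignored while inside a `--` comment, and a newline ends the comment.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical sketch; names are illustrative and not taken from Spark.
object SemicolonSplitSketch {

  /** Splits `input` on semicolons that are outside quotes and `--` comments. */
  def split(input: String): Seq[String] = {
    val statements = ArrayBuffer.empty[String]
    var insideSingleQuote = false
    var insideDoubleQuote = false
    var insideComment = false
    var begin = 0

    for (i <- input.indices) {
      val c = input.charAt(i)
      if (insideComment) {
        // A `--` comment runs to the end of the line, so a newline resets the flag.
        // Quotes and semicolons seen while the flag is set are ignored.
        if (c == '\n') insideComment = false
      } else if (insideSingleQuote) {
        if (c == '\'') insideSingleQuote = false
      } else if (insideDoubleQuote) {
        if (c == '"') insideDoubleQuote = false
      } else {
        c match {
          case '\'' => insideSingleQuote = true
          case '"' => insideDoubleQuote = true
          case '-' if i + 1 < input.length && input.charAt(i + 1) == '-' =>
            insideComment = true
          case ';' =>
            // Statement separator outside quotes and comments.
            statements += input.substring(begin, i)
            begin = i + 1
          case _ => // ordinary character, nothing to track
        }
      }
    }
    // Keep whatever follows the last separator (possibly an empty string).
    statements += input.substring(begin)
    statements.toSeq
  }
}
```

With this sketch, both inputs quoted in this description split at the trailing semicolon only: the quote in `someone's` no longer opens a string, and the `-- two` comment no longer swallows the following line. Without the newline reset, the second input would never be split, because the flag would stay set until the end of the input.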
   
   ### Does this PR introduce any user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Previous tests using line continuation (`\`) were removed and a test for 
inline comments within a query was added.





[GitHub] [spark] viirya commented on a change in pull request #27901: [SPARK-31146][SQL] Leverage the helper method for aliasing in built-in SQL expressions

2020-03-15 Thread GitBox
viirya commented on a change in pull request #27901: [SPARK-31146][SQL] 
Leverage the helper method for aliasing in built-in SQL expressions 
URL: https://github.com/apache/spark/pull/27901#discussion_r392753020
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala
 ##
 @@ -55,7 +55,7 @@ import org.apache.spark.util.Utils
 case class CallMethodViaReflection(children: Seq[Expression])
   extends Expression with CodegenFallback {
 
-  override def prettyName: String = "reflect"
+  override def prettyName: String = 
getTagValue(FunctionRegistry.FUNC_ALIAS).getOrElse("reflect")
 
 Review comment:
   Yeah, if code relies on the output column name of 
`selectExpr("java_method(...)")`, it could break, since the name is now 
`java_method(...)`. I think the new name is correct. If we don't guarantee the 
output column name, it is OK.
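
A rough sketch of the effect under discussion, using a hypothetical stand-in class rather than Spark's real `Expression` API: `ReflectCall`, `funcAlias`, and `columnName` are illustrative names only, and the column-name rendering is simplified. It shows how falling back to `"reflect"` only when no alias was recorded changes the generated name.

```scala
// Hypothetical stand-in; not Spark's CallMethodViaReflection.
object AliasSketch {

  final case class ReflectCall(args: Seq[String], funcAlias: Option[String] = None) {
    // Same shape as the diff: use the recorded alias if present, else the default name.
    def prettyName: String = funcAlias.getOrElse("reflect")

    // Simplified rendering of an auto-generated output column name.
    def columnName: String = s"$prettyName(${args.mkString(", ")})"
  }
}
```

In this sketch a call recorded with the alias `java_method` renders as `java_method(...)`, while a call without an alias still renders as `reflect(...)`, which is the behaviour change that could affect code depending on the old column name.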





[GitHub] [spark] huaxingao commented on issue #27919: [parkSPARK-30954][ML][SPARKR]Make class name the same as file name

2020-03-15 Thread GitBox
huaxingao commented on issue #27919: [parkSPARK-30954][ML][SPARKR]Make class 
name the same as file name
URL: https://github.com/apache/spark/pull/27919#issuecomment-599304708
 
 
   @kevinyu98 
   Hi Kevin, thanks for working on this. I think you missed 
```DecisionTreeREgressionWrapper.scala```. Also, the title of your PR is not 
right ```[parkSPARK-30954]```





[GitHub] [spark] beliefer commented on issue #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams

2020-03-15 Thread GitBox
beliefer commented on issue #27898: [SPARK-31141][DSTREAMS][DOC] Add version 
information to the configuration of Dstreams
URL: https://github.com/apache/spark/pull/27898#issuecomment-599304484
 
 
   @HyukjinKwon Could you take a look at this PR?





[GitHub] [spark] sarutak commented on issue #27849: [SPARK-31081][UI][SQL] Make the display of stageId/stageAttemptId/taskId of sql metrics configurable in UI

2020-03-15 Thread GitBox
sarutak commented on issue #27849: [SPARK-31081][UI][SQL] Make the display of 
stageId/stageAttemptId/taskId of sql metrics configurable in UI
URL: https://github.com/apache/spark/pull/27849#issuecomment-599303912
 
 
   @gengliangwang I'll try to add the checkbox.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add 
abstract Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599302162
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add 
abstract Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599302165
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119821/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract 
Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599302162
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
AmplabJenkins commented on issue #27882: [WIP][SPARK-31127][ML] Add abstract 
Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599302165
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119821/
   Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract Selector

2020-03-15 Thread GitBox
SparkQA removed a comment on issue #27882: [WIP][SPARK-31127][ML] Add abstract 
Selector
URL: https://github.com/apache/spark/pull/27882#issuecomment-599278543
 
 
   **[Test build #119821 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119821/testReport)**
 for PR 27882 at commit 
[`b4d30f9`](https://github.com/apache/spark/commit/b4d30f97a0d4bd40bc689f7d1fd961b8ce66d123).




