[GitHub] [spark] cloud-fan commented on a change in pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-13 Thread GitBox


cloud-fan commented on a change in pull request #29045:
URL: https://github.com/apache/spark/pull/29045#discussion_r454119852



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala
##
@@ -179,12 +179,17 @@ class OrcFileFormat
 
   val fs = filePath.getFileSystem(conf)
   val readerOptions = OrcFile.readerOptions(conf).filesystem(fs)
-  val requestedColIdsOrEmptyFile =
+  val (requestedColIdsOrEmptyFile, sendActualSchema) =
     Utils.tryWithResource(OrcFile.createReader(filePath, readerOptions)) { reader =>
       OrcUtils.requestedColumnIds(
         isCaseSensitive, dataSchema, requiredSchema, reader, conf)
     }
 
+  if (sendActualSchema) {
+    resultSchemaString = OrcUtils.orcTypeDescriptionString(actualSchema)

Review comment:
   Do you mean we can't do column pruning in this case?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhengruifeng commented on pull request #29018: [SPARK-32202][ML][WIP] tree models auto infer compact integer type

2020-07-13 Thread GitBox


zhengruifeng commented on pull request #29018:
URL: https://github.com/apache/spark/pull/29018#issuecomment-657984250


   @huaxingao @WeichenXu123 @viirya What do you think about saving ~70% of RAM
(Array[Int] -> Array[Byte]) at the cost of some regression (1% ~ 10%)?
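
The trade-off in the question above can be illustrated with a minimal sketch. It is not the PR's implementation: the object name `CompactIntArraySketch` and both helpers are hypothetical, and the size estimate counts element payload only (4 bytes per `Int`, 1 byte per `Byte`), ignoring JVM array headers and alignment.

```scala
object CompactIntArraySketch {
  // Narrow to Array[Byte] only when every value fits in a signed byte;
  // otherwise keep the original Array[Int] unchanged.
  def toCompact(values: Array[Int]): Either[Array[Int], Array[Byte]] =
    if (values.forall(v => v >= Byte.MinValue && v <= Byte.MaxValue)) {
      Right(values.map(_.toByte))
    } else {
      Left(values)
    }

  // Element payload in bytes for n elements (assumption: no header or padding).
  def payloadBytes(n: Int, bytesPerElement: Int): Long =
    n.toLong * bytesPerElement

  def main(args: Array[String]): Unit = {
    val small = Array(0, 3, 7, 127)
    val large = Array(0, 3, 7, 128)
    println(s"small narrows to bytes: ${toCompact(small).isRight}")
    println(s"large narrows to bytes: ${toCompact(large).isRight}")
    println(s"Int payload for 1000 elements: ${payloadBytes(1000, 4)} bytes")
    println(s"Byte payload for 1000 elements: ${payloadBytes(1000, 1)} bytes")
  }
}
```

The narrowing check is an extra pass over the data, and reading bytes back as integers adds a widening step; the sketch does not measure either cost.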






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-657984010










[GitHub] [spark] AmplabJenkins commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-657984010










[GitHub] [spark] SparkQA removed a comment on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


SparkQA removed a comment on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-657927290


   **[Test build #125799 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125799/testReport)** for PR 28708 at commit [`5a0cd2a`](https://github.com/apache/spark/commit/5a0cd2abd316aacc601b9e8fa6e1406b67c55fb7).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-657937025


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/30410/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


SparkQA commented on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-657983486


   **[Test build #125799 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125799/testReport)** for PR 28708 at commit [`5a0cd2a`](https://github.com/apache/spark/commit/5a0cd2abd316aacc601b9e8fa6e1406b67c55fb7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `public final class MapOutputCommitMessage `
 * `  case class IsExecutorAlive(executorId: String) extends CoarseGrainedClusterMessage`
 * `sealed trait LogisticRegressionSummary extends ClassificationSummary `
 * `sealed trait RandomForestClassificationSummary extends ClassificationSummary `
 * `class _ClassificationSummary(JavaWrapper):`
 * `class _TrainingSummary(JavaWrapper):`
 * `class _BinaryClassificationSummary(_ClassificationSummary):`
 * `class LinearSVCModel(_JavaClassificationModel, _LinearSVCParams, JavaMLWritable, JavaMLReadable,`
 * `class LinearSVCSummary(_BinaryClassificationSummary):`
 * `class LinearSVCTrainingSummary(LinearSVCSummary, _TrainingSummary):`
 * `class LogisticRegressionSummary(_ClassificationSummary):`
 * `class LogisticRegressionTrainingSummary(LogisticRegressionSummary, _TrainingSummary):`
 * `class BinaryLogisticRegressionSummary(_BinaryClassificationSummary,`
 * `class RandomForestClassificationSummary(_ClassificationSummary):`
 * `class RandomForestClassificationTrainingSummary(RandomForestClassificationSummary,`
 * `class BinaryRandomForestClassificationSummary(_BinaryClassificationSummary):`
 * `class BinaryRandomForestClassificationTrainingSummary(BinaryRandomForestClassificationSummary,`
 * `  class DisableHints(conf: SQLConf) extends RemoveAllHints(conf: SQLConf) `
 * `case class WithFields(`
 * `case class Hour(child: Expression, timeZoneId: Option[String] = None) extends GetTimeField `
 * `case class Minute(child: Expression, timeZoneId: Option[String] = None) extends GetTimeField `
 * `case class Second(child: Expression, timeZoneId: Option[String] = None) extends GetTimeField `
 * `trait GetDateField extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant `
 * `case class DayOfYear(child: Expression) extends GetDateField `
 * `case class SecondsToTimestamp(child: Expression) extends UnaryExpression`
 * `case class Year(child: Expression) extends GetDateField `
 * `case class YearOfWeek(child: Expression) extends GetDateField `
 * `case class Quarter(child: Expression) extends GetDateField `
 * `case class Month(child: Expression) extends GetDateField `
 * `case class DayOfMonth(child: Expression) extends GetDateField `
 * `case class DayOfWeek(child: Expression) extends GetDateField `
 * `case class WeekDay(child: Expression) extends GetDateField `
 * `case class WeekOfYear(child: Expression) extends GetDateField `
 * `sealed trait UTCTimestamp extends BinaryExpression with ImplicitCastInputTypes with NullIntolerant `
 * `case class FromUTCTimestamp(left: Expression, right: Expression) extends UTCTimestamp `
 * `case class ToUTCTimestamp(left: Expression, right: Expression) extends UTCTimestamp `
 * `sealed abstract class MergeAction extends Expression with Unevaluable `
 * `case class DeleteAction(condition: Option[Expression]) extends MergeAction`
 * `trait BaseScriptTransformationExec extends UnaryExecNode `
 * `abstract class BaseScriptTransformationWriterThread(`
 * `abstract class BaseScriptTransformIOSchema extends Serializable `
 * `case class CoalesceBucketsInSortMergeJoin(conf: SQLConf) extends Rule[SparkPlan] `
 * `class StateStoreConf(`
 * `case class HiveScriptTransformationExec(`






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29090: [WIP][SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29090:
URL: https://github.com/apache/spark/pull/29090#issuecomment-657982803










[GitHub] [spark] SaurabhChawla100 edited a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-13 Thread GitBox


SaurabhChawla100 edited a comment on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-657978027


   > Can you be more specific about the problem? Are you saying that the actual 
file schema doesn't match the table schema specified by the user?
   
   For ORC data created by Hive, the physical schema contains no field names.
Please see the code below for reference.
   
https://github.com/apache/spark/blob/24be81689cee76e03cd5136dfd089123bbff4595/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala#L133
   
   This code returns the index of the column in the data schema.
   
   In the code below, however, we pass in the result schema, and that result
schema does not contain the index returned from OrcUtils.scala.
   
https://github.com/apache/spark/blob/24be81689cee76e03cd5136dfd089123bbff4595/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala#L211
   
   For example - 
   
   ```
   val u = """select date_dim.d_year from date_dim limit 5"""
   
   spark.sql(u).collect
   ```
   
   Here the index of d_year returned by OrcUtils.scala#L133 is 6,
   
   whereas the resultSchema passed in OrcFileFormat.scala#L211 has only one
field: struct<`d_year`:int>.
   
   Using index 6 on a resultSchema of size 1 therefore throws this exception:
   
   ```
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 
2, 192.168.0.103, executor driver): java.lang.ArrayIndexOutOfBoundsException: 6
   at 
org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initBatch(OrcColumnarBatchReader.java:156)
   at 
org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$7(OrcFileFormat.scala:258)
   ```
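
The mismatch described above can be reproduced without Spark in a minimal sketch. The schemas here are plain vectors of field names rather than Spark's `StructType`, the placeholder columns `c0` to `c5` are assumptions chosen only so that `d_year` lands at index 6 as in the example, and the object and method names are hypothetical.

```scala
object OrcIndexMismatchSketch {
  // Hypothetical stand-in for the table's data schema: field names only, with
  // placeholder columns c0..c5 so that d_year sits at index 6.
  val dataSchema: Vector[String] =
    Vector("c0", "c1", "c2", "c3", "c4", "c5", "d_year")

  // Pruned result schema for `select d_year`: a single field.
  val resultSchema: Vector[String] = Vector("d_year")

  // Index of a column in the full data schema.
  def idInDataSchema(name: String): Int = dataSchema.indexOf(name)

  // Bounds-checked lookup: returns None where a raw array access would throw.
  def lookup(schema: Vector[String], id: Int): Option[String] = schema.lift(id)

  def main(args: Array[String]): Unit = {
    val id = idInDataSchema("d_year")
    println(s"index in data schema: $id")
    println(s"lookup in data schema: ${lookup(dataSchema, id)}")
    println(s"lookup in result schema: ${lookup(resultSchema, id)}")
  }
}
```

The index is valid against the seven-field data schema but falls outside the one-field result schema, which is the same shape of failure as the `ArrayIndexOutOfBoundsException: 6` in the stack trace.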






[GitHub] [spark] AmplabJenkins commented on pull request #29090: [WIP][SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29090:
URL: https://github.com/apache/spark/pull/29090#issuecomment-657982803










[GitHub] [spark] cloud-fan commented on pull request #29091: [SPARK-32258][SQL] Not duplicate normalization on children for float/double If/CaseWhen/Coalesce

2020-07-13 Thread GitBox


cloud-fan commented on pull request #29091:
URL: https://github.com/apache/spark/pull/29091#issuecomment-657982389


   thanks, merging to master!






[GitHub] [spark] cloud-fan closed pull request #29091: [SPARK-32258][SQL] Not duplicate normalization on children for float/double If/CaseWhen/Coalesce

2020-07-13 Thread GitBox


cloud-fan closed pull request #29091:
URL: https://github.com/apache/spark/pull/29091


   






[GitHub] [spark] SparkQA commented on pull request #29090: [WIP][SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-13 Thread GitBox


SparkQA commented on pull request #29090:
URL: https://github.com/apache/spark/pull/29090#issuecomment-657982312


   **[Test build #125801 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125801/testReport)** for PR 29090 at commit [`dfbce91`](https://github.com/apache/spark/commit/dfbce912c7371afae5e8f87bf18b5a3d7dbfca52).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29090: [WIP][SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-13 Thread GitBox


SparkQA removed a comment on pull request #29090:
URL: https://github.com/apache/spark/pull/29090#issuecomment-657931374


   **[Test build #125801 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125801/testReport)** for PR 29090 at commit [`dfbce91`](https://github.com/apache/spark/commit/dfbce912c7371afae5e8f87bf18b5a3d7dbfca52).






[GitHub] [spark] SaurabhChawla100 commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-13 Thread GitBox


SaurabhChawla100 commented on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-657978027


   > Can you be more specific about the problem? Are you saying that the actual 
file schema doesn't match the table schema specified by the user?
   
   For ORC data created by Hive, the physical schema contains no field names.
Please see the code below for reference.
   
https://github.com/apache/spark/blob/24be81689cee76e03cd5136dfd089123bbff4595/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala#L133
   
   This code returns the index of the column in the data schema.
   
   In the code below, however, we pass in the result schema, and that result
schema does not contain the index returned from OrcUtils.scala.
   
https://github.com/apache/spark/blob/24be81689cee76e03cd5136dfd089123bbff4595/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala#L211
   
   For example - 
   
   ```
   val u = """select date_dim.d_date_id from date_dim limit 5"""
   
   spark.sql(u).collect
   ```
   
   Here the index of d_date_id returned by OrcUtils.scala#L133 is 2,
   
   whereas the resultSchema passed in OrcFileFormat.scala#L211 has only one
field: struct<`d_date_id`:string>.
   






[GitHub] [spark] AmplabJenkins commented on pull request #29077: [SPARK-31985][SS] Remove incomplete/undocumented stateful aggregation in continuous mode

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29077:
URL: https://github.com/apache/spark/pull/29077#issuecomment-657976343










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29077: [SPARK-31985][SS] Remove incomplete/undocumented stateful aggregation in continuous mode

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29077:
URL: https://github.com/apache/spark/pull/29077#issuecomment-657976343










[GitHub] [spark] SparkQA removed a comment on pull request #29077: [SPARK-31985][SS] Remove incomplete/undocumented stateful aggregation in continuous mode

2020-07-13 Thread GitBox


SparkQA removed a comment on pull request #29077:
URL: https://github.com/apache/spark/pull/29077#issuecomment-657891341


   **[Test build #125792 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125792/testReport)** for PR 29077 at commit [`5459d58`](https://github.com/apache/spark/commit/5459d58d8ea68a8266f05366fc06eb3f6c062351).






[GitHub] [spark] SparkQA commented on pull request #29077: [SPARK-31985][SS] Remove incomplete/undocumented stateful aggregation in continuous mode

2020-07-13 Thread GitBox


SparkQA commented on pull request #29077:
URL: https://github.com/apache/spark/pull/29077#issuecomment-657975832


   **[Test build #125792 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125792/testReport)** for PR 29077 at commit [`5459d58`](https://github.com/apache/spark/commit/5459d58d8ea68a8266f05366fc06eb3f6c062351).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29088: [SPARK-32289][SQL] Some characters are garbled when opening csv files with Excel

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29088:
URL: https://github.com/apache/spark/pull/29088#issuecomment-657974472


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125794/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29088: [SPARK-32289][SQL] Some characters are garbled when opening csv files with Excel

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29088:
URL: https://github.com/apache/spark/pull/29088#issuecomment-657974464


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29088: [SPARK-32289][SQL] Some characters are garbled when opening csv files with Excel

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29088:
URL: https://github.com/apache/spark/pull/29088#issuecomment-657974464










[GitHub] [spark] SparkQA removed a comment on pull request #29088: [SPARK-32289][SQL] Some characters are garbled when opening csv files with Excel

2020-07-13 Thread GitBox


SparkQA removed a comment on pull request #29088:
URL: https://github.com/apache/spark/pull/29088#issuecomment-657904577


   **[Test build #125794 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125794/testReport)** for PR 29088 at commit [`6111a0a`](https://github.com/apache/spark/commit/6111a0a495fc1c0650a472d985ea221f8008f81f).






[GitHub] [spark] AmplabJenkins commented on pull request #28901: [SPARK-32064][SQL] Supporting create temporary table

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #28901:
URL: https://github.com/apache/spark/pull/28901#issuecomment-657974354










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28901: [SPARK-32064][SQL] Supporting create temporary table

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28901:
URL: https://github.com/apache/spark/pull/28901#issuecomment-657974354










[GitHub] [spark] SparkQA commented on pull request #29088: [SPARK-32289][SQL] Some characters are garbled when opening csv files with Excel

2020-07-13 Thread GitBox


SparkQA commented on pull request #29088:
URL: https://github.com/apache/spark/pull/29088#issuecomment-657974102


   **[Test build #125794 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125794/testReport)** for PR 29088 at commit [`6111a0a`](https://github.com/apache/spark/commit/6111a0a495fc1c0650a472d985ea221f8008f81f).
* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28901: [SPARK-32064][SQL] Supporting create temporary table

2020-07-13 Thread GitBox


SparkQA commented on pull request #28901:
URL: https://github.com/apache/spark/pull/28901#issuecomment-657973941


   **[Test build #125805 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125805/testReport)** for PR 28901 at commit [`9b11aac`](https://github.com/apache/spark/commit/9b11aace28be8169e8eff1ce61810bc8250fc37d).






[GitHub] [spark] LantaoJin commented on pull request #28901: [SPARK-32064][SQL] Supporting create temporary table

2020-07-13 Thread GitBox


LantaoJin commented on pull request #28901:
URL: https://github.com/apache/spark/pull/28901#issuecomment-657972128


   retest this please






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29095:
URL: https://github.com/apache/spark/pull/29095#issuecomment-657967907










[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2020-07-13 Thread GitBox


AngersZhuuuu commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r454102278



##
File path: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
##
@@ -496,7 +496,9 @@ fromStatementBody
 querySpecification
 : transformClause
   fromClause?
-  whereClause?  #transformQuerySpecification
+  whereClause?
+  aggregationClause?
+  havingClause?  #transformQuerySpecification

Review comment:
   > Could you update the SQL doc, too?
   
   Can we add this after everything else is done? We would also need to add a
new page, like the one for `Where clause`.








[GitHub] [spark] AmplabJenkins commented on pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29095:
URL: https://github.com/apache/spark/pull/29095#issuecomment-657967907










[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2020-07-13 Thread GitBox


AngersZhuuuu commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r454045113



##
File path: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
##
@@ -496,7 +496,9 @@ fromStatementBody
 querySpecification
 : transformClause
   fromClause?
-  whereClause?  #transformQuerySpecification
+  whereClause?
+  aggregationClause?
+  havingClause?   #transformQuerySpecification

Review comment:
   > Could you update the SQL doc, too?
   
   Yea.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2020-07-13 Thread GitBox


AngersZhuuuu commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r454101882



##
File path: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
##
@@ -2558,6 +2558,131 @@ abstract class SQLQuerySuiteBase extends QueryTest with SQLTestUtils with TestHi
   }
 }
   }
+
+  test("SPARK-28227: test script transform with aggregation") {

Review comment:
   > Could you move the tests into `SQLQueryTestSuite`?
   
   That should wait for https://github.com/apache/spark/pull/29085, since script transform currently can't be used in sql/core.








[GitHub] [spark] SparkQA commented on pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-13 Thread GitBox


SparkQA commented on pull request #29095:
URL: https://github.com/apache/spark/pull/29095#issuecomment-657967512


   **[Test build #125804 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125804/testReport)** for PR 29095 at commit [`50510dd`](https://github.com/apache/spark/commit/50510ddc30bb9da42dd7700a55bd5ecec7d3620b).






[GitHub] [spark] zhengruifeng commented on pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-13 Thread GitBox


zhengruifeng commented on pull request #29095:
URL: https://github.com/apache/spark/pull/29095#issuecomment-657966575


   test:
   ```
   import org.apache.spark.ml.linalg._
   import org.apache.spark.ml.classification._
   import org.apache.spark.storage.StorageLevel
   
   
   val df = spark.read.option("numFeatures", "2000").format("libsvm").load("/data1/Datasets/epsilon/epsilon_normalized.t").withColumn("label", (col("label")+1)/2)
   df.persist(StorageLevel.MEMORY_AND_DISK)
   df.count
   
   
   
   val rf = new RandomForestClassifier().setMaxDepth(10).setNumTrees(100)
   val model = rf.fit(df)
   model.save("/tmp/rf-model")
   
   
   val rf2 = new RandomForestClassifier().setMaxDepth(20).setNumTrees(100)
   val model2 = rf2.fit(df)
   model2.save("/tmp/rf-model-d20")
   
   
   
   val model = RandomForestClassificationModel.load("/tmp/rf-model")
   val model2 = RandomForestClassificationModel.load("/tmp/rf-model-d20")
   
   val vecs = df.select("features").rdd.map(row => row.getAs[Vector](0)).collect
   
   val start = System.currentTimeMillis; Seq.range(0, 20).foreach{_ => vecs.foreach(model.predict)}; val end = System.currentTimeMillis; end - start
   
   
   val start = System.currentTimeMillis; Seq.range(0, 20).foreach{_ => vecs.foreach(model2.predict)}; val end = System.currentTimeMillis; end - start
   ```
   
   
   Results (durations in ms, for 20 passes over the dataset; first the maxDepth=10 model, then the maxDepth=20 model):
   this PR: 167640, 404901
   Master: 187645, 416243
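
The reported durations can be turned into relative improvements with a little arithmetic. The sketch below is illustrative only: `SpeedupCheck` and `reduction` are hypothetical names, and it assumes the first number in each row belongs to the maxDepth=10 model and the second to the maxDepth=20 model, following the order of the timing statements in the script. Under that assumption the reductions come out to roughly 10.7% and 2.7%.

```scala
object SpeedupCheck {
  // Relative reduction in wall-clock time, as a fraction of the baseline.
  def reduction(baselineMs: Long, candidateMs: Long): Double =
    (baselineMs - candidateMs).toDouble / baselineMs

  def main(args: Array[String]): Unit = {
    // (label, Master duration in ms, this-PR duration in ms), copied from the
    // results above. The labels are an assumption based on the script order.
    val runs = Seq(
      ("maxDepth=10", 187645L, 167640L),
      ("maxDepth=20", 416243L, 404901L))

    runs.foreach { case (label, master, pr) =>
      println(f"$label: ${reduction(master, pr) * 100}%.1f%% less time")
    }
  }
}
```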






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28840:
URL: https://github.com/apache/spark/pull/28840#issuecomment-657965897


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125793/
   Test FAILed.






[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454099903



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+}
+if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+}
+
+val identifier = FunctionIdentifier(
+  functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+// we only refresh the permanent function.
+if (catalog.isPersistentFunction(identifier)) {
+  // register overwrite function.
+  val func = catalog.getFunctionMetadata(identifier)
+  catalog.registerFunction(func, true)
+} else {
+  // function is not exists, clear cached function.
+  catalog.unregisterFunction(identifier, true)
+  throw new NoSuchFunctionException(identifier.database.get, functionName)

Review comment:
   This just keeps the behavior consistent with `refresh table`; the latter also throws `NoSuchTableException`.








[GitHub] [spark] zhengruifeng opened a new pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-13 Thread GitBox


zhengruifeng opened a new pull request #29095:
URL: https://github.com/apache/spark/pull/29095


   ### What changes were proposed in this pull request?
   Use a while loop instead of recursion when traversing tree nodes during prediction.
   
   
   ### Why are the changes needed?
   Prediction is 3% ~ 10% faster.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Existing test suites.
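
To illustrate the kind of change described, here is a minimal sketch of replacing a recursive descent with a while loop over a binary decision tree. The `Node`, `Leaf`, and `Internal` types and both `predict*` methods are hypothetical stand-ins invented for this example; they are not Spark's tree classes and this is not the PR's actual code.

```scala
// Hypothetical toy tree, for illustration only.
sealed trait Node
final case class Leaf(prediction: Double) extends Node
final case class Internal(
    featureIndex: Int,
    threshold: Double,
    left: Node,
    right: Node) extends Node

object TreePredict {
  // Recursive descent: one method call per tree level.
  def predictRecursive(node: Node, features: Array[Double]): Double = node match {
    case Leaf(p) => p
    case Internal(i, t, l, r) =>
      if (features(i) <= t) predictRecursive(l, features)
      else predictRecursive(r, features)
  }

  // Iterative descent: walk down with a mutable cursor until a leaf is reached.
  def predictLoop(root: Node, features: Array[Double]): Double = {
    var node = root
    while (node.isInstanceOf[Internal]) {
      val n = node.asInstanceOf[Internal]
      node = if (features(n.featureIndex) <= n.threshold) n.left else n.right
    }
    node.asInstanceOf[Leaf].prediction
  }
}
```

Both methods visit the same nodes and return the same value; the loop only avoids the per-level call overhead, which is presumably where a gain of a few percent would come from.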
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28840:
URL: https://github.com/apache/spark/pull/28840#issuecomment-657965895


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #28840:
URL: https://github.com/apache/spark/pull/28840#issuecomment-657965895










[GitHub] [spark] SparkQA removed a comment on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


SparkQA removed a comment on pull request #28840:
URL: https://github.com/apache/spark/pull/28840#issuecomment-657893231


   **[Test build #125793 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125793/testReport)** for PR 28840 at commit [`c129a54`](https://github.com/apache/spark/commit/c129a545b6ec92117728439e83842fccb54a6a66).






[GitHub] [spark] SparkQA commented on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


SparkQA commented on pull request #28840:
URL: https://github.com/apache/spark/pull/28840#issuecomment-657965588


   **[Test build #125793 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125793/testReport)** for PR 28840 at commit [`c129a54`](https://github.com/apache/spark/commit/c129a545b6ec92117728439e83842fccb54a6a66).
* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] viirya commented on pull request #29091: [SPARK-32258][SQL] Not duplicate normalization on children for float/double If/CaseWhen/Coalesce

2020-07-13 Thread GitBox


viirya commented on pull request #29091:
URL: https://github.com/apache/spark/pull/29091#issuecomment-657965371


   cc @cloud-fan 






[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454098536



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+}
+if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+}
+
+val identifier = FunctionIdentifier(
+  functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+// we only refresh the permanent function.
+if (catalog.isPersistentFunction(identifier)) {
+  // register overwrite function.
+  val func = catalog.getFunctionMetadata(identifier)
+  catalog.registerFunction(func, true)
+} else {
+  // function is not exists, clear cached function.

Review comment:
   If the function is already in the cache, queries can still use it after we drop the function with the Hive client.
   
   I think `refresh` should invalidate the cache and keep it consistent with the Hive metastore.
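
The stale-cache scenario described here can be sketched with a toy model. Everything below is hypothetical: `ToyCatalog`, its string-valued "definitions", and the `NoSuchElementException` are stand-ins chosen for illustration, not Spark's `SessionCatalog` API or `NoSuchFunctionException`.

```scala
import scala.collection.mutable

object RefreshSketch {
  // Toy catalog: `metastore` plays the role of the external store (e.g. Hive),
  // `cache` plays the role of the session-local function registry.
  final class ToyCatalog(metastore: mutable.Map[String, String]) {
    private val cache = mutable.Map.empty[String, String]

    // Lookups hit the cache first, so an entry dropped from the metastore
    // by another client keeps resolving until it is refreshed.
    def lookup(name: String): Option[String] =
      cache.get(name).orElse {
        val loaded = metastore.get(name)
        loaded.foreach(cache.update(name, _))
        loaded
      }

    // Refresh re-reads the metastore: overwrite the cached entry if the
    // function still exists, otherwise evict it and report that it is gone.
    def refresh(name: String): Unit = metastore.get(name) match {
      case Some(definition) => cache.update(name, definition)
      case None =>
        cache.remove(name)
        throw new NoSuchElementException(s"Function not found: $name")
    }
  }
}
```

In this model, dropping a function directly in `metastore` leaves `lookup` returning the cached value; calling `refresh` evicts it and throws, after which `lookup` returns `None`.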








[GitHub] [spark] AmplabJenkins removed a comment on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-657964078


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125800/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28939: [SPARK-32119][CORE] ExecutorPlugin doesn't work with Standalone Cluster

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #28939:
URL: https://github.com/apache/spark/pull/28939#issuecomment-657964151










[GitHub] [spark] AmplabJenkins commented on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-657964073










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28939: [SPARK-32119][CORE] ExecutorPlugin doesn't work with Standalone Cluster

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28939:
URL: https://github.com/apache/spark/pull/28939#issuecomment-657964151










[GitHub] [spark] AmplabJenkins removed a comment on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-657964073


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28939: [SPARK-32119][CORE] ExecutorPlugin doesn't work with Standalone Cluster

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28939:
URL: https://github.com/apache/spark/pull/28939#issuecomment-657890028


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125778/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #28939: [SPARK-32119][CORE] ExecutorPlugin doesn't work with Standalone Cluster

2020-07-13 Thread GitBox


SparkQA commented on pull request #28939:
URL: https://github.com/apache/spark/pull/28939#issuecomment-657963838


   **[Test build #125803 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125803/testReport)** for PR 28939 at commit [`449df2b`](https://github.com/apache/spark/commit/449df2b92e5ad0dac6ea8dd83233450946a39df2).






[GitHub] [spark] SparkQA removed a comment on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-13 Thread GitBox


SparkQA removed a comment on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-657927315


   **[Test build #125800 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125800/testReport)** for PR 27428 at commit [`20ad143`](https://github.com/apache/spark/commit/20ad143c620ef75e8d446f8f1e595992a1959b4a).






[GitHub] [spark] SparkQA commented on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-13 Thread GitBox


SparkQA commented on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-657963736


   **[Test build #125800 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125800/testReport)** for PR 27428 at commit [`20ad143`](https://github.com/apache/spark/commit/20ad143c620ef75e8d446f8f1e595992a1959b4a).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29064:
URL: https://github.com/apache/spark/pull/29064#issuecomment-657962666










[GitHub] [spark] AmplabJenkins commented on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29064:
URL: https://github.com/apache/spark/pull/29064#issuecomment-657962666










[GitHub] [spark] SparkQA removed a comment on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-13 Thread GitBox


SparkQA removed a comment on pull request #29064:
URL: https://github.com/apache/spark/pull/29064#issuecomment-657876739


   **[Test build #125791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125791/testReport)** for PR 29064 at commit [`5501213`](https://github.com/apache/spark/commit/5501213c0525aa6c0556a9bdae90edd0facf8025).






[GitHub] [spark] SparkQA commented on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-13 Thread GitBox


SparkQA commented on pull request #29064:
URL: https://github.com/apache/spark/pull/29064#issuecomment-657962108


   **[Test build #125791 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125791/testReport)** for PR 29064 at commit [`5501213`](https://github.com/apache/spark/commit/5501213c0525aa6c0556a9bdae90edd0facf8025).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28968: [SPARK-32010][PYTHON][CORE] Add InheritableThread for local properties and fixing a thread leak issue in pinned thread mode

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28968:
URL: https://github.com/apache/spark/pull/28968#issuecomment-657959411










[GitHub] [spark] SparkQA removed a comment on pull request #28968: [SPARK-32010][PYTHON][CORE] Add InheritableThread for local properties and fixing a thread leak issue in pinned thread mode

2020-07-13 Thread GitBox


SparkQA removed a comment on pull request #28968:
URL: https://github.com/apache/spark/pull/28968#issuecomment-657953435


   **[Test build #125802 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125802/testReport)** for PR 28968 at commit [`a78fd43`](https://github.com/apache/spark/commit/a78fd4314ba39d1feb63ba1539ac9a2acf40de77).






[GitHub] [spark] AmplabJenkins commented on pull request #28968: [SPARK-32010][PYTHON][CORE] Add InheritableThread for local properties and fixing a thread leak issue in pinned thread mode

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #28968:
URL: https://github.com/apache/spark/pull/28968#issuecomment-657959411










[GitHub] [spark] SparkQA commented on pull request #28968: [SPARK-32010][PYTHON][CORE] Add InheritableThread for local properties and fixing a thread leak issue in pinned thread mode

2020-07-13 Thread GitBox


SparkQA commented on pull request #28968:
URL: https://github.com/apache/spark/pull/28968#issuecomment-657959340


   **[Test build #125802 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125802/testReport)** for PR 28968 at commit [`a78fd43`](https://github.com/apache/spark/commit/a78fd4314ba39d1feb63ba1539ac9a2acf40de77).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class InheritableThread(threading.Thread):`






[GitHub] [spark] cloud-fan closed pull request #29053: [SPARK-32241][SQL] Remove empty children of union

2020-07-13 Thread GitBox


cloud-fan closed pull request #29053:
URL: https://github.com/apache/spark/pull/29053


   






[GitHub] [spark] cloud-fan commented on pull request #29053: [SPARK-32241][SQL] Remove empty children of union

2020-07-13 Thread GitBox


cloud-fan commented on pull request #29053:
URL: https://github.com/apache/spark/pull/29053#issuecomment-657958705


   thanks, merging to master!






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29002:
URL: https://github.com/apache/spark/pull/29002#issuecomment-657958140


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125798/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29002:
URL: https://github.com/apache/spark/pull/29002#issuecomment-657958137










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29002:
URL: https://github.com/apache/spark/pull/29002#issuecomment-657958137


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-13 Thread GitBox


SparkQA removed a comment on pull request #29002:
URL: https://github.com/apache/spark/pull/29002#issuecomment-657916150


   **[Test build #125798 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125798/testReport)** for PR 29002 at commit [`d768385`](https://github.com/apache/spark/commit/d768385caac9c79c456de87a4afd72298dda46db).






[GitHub] [spark] SparkQA commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-13 Thread GitBox


SparkQA commented on pull request #29002:
URL: https://github.com/apache/spark/pull/29002#issuecomment-657957833


   **[Test build #125798 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125798/testReport)** for PR 29002 at commit [`d768385`](https://github.com/apache/spark/commit/d768385caac9c79c456de87a4afd72298dda46db).
* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] cloud-fan commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


cloud-fan commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454087942



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+}
+if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+}
+
+val identifier = FunctionIdentifier(
+  functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+// we only refresh the permanent function.
+if (catalog.isPersistentFunction(identifier)) {
+  // register overwrite function.
+  val func = catalog.getFunctionMetadata(identifier)
+  catalog.registerFunction(func, true)
+} else {
+  // function is not exists, clear cached function.

Review comment:
   BTW does the query fail if it tries to use such a function?








[GitHub] [spark] cloud-fan commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


cloud-fan commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454087814



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+}
+if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+}
+
+val identifier = FunctionIdentifier(
+  functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+// we only refresh the permanent function.
+if (catalog.isPersistentFunction(identifier)) {
+  // register overwrite function.
+  val func = catalog.getFunctionMetadata(identifier)
+  catalog.registerFunction(func, true)
+} else {
+  // function is not exists, clear cached function.
+  catalog.unregisterFunction(identifier, true)
+  throw new NoSuchFunctionException(identifier.database.get, functionName)

Review comment:
   If it's a valid use case, why do we throw an exception here?








[GitHub] [spark] cloud-fan commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


cloud-fan commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454087683



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+}
+if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+}
+
+val identifier = FunctionIdentifier(
+  functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+// we only refresh the permanent function.
+if (catalog.isPersistentFunction(identifier)) {
+  // register overwrite function.
+  val func = catalog.getFunctionMetadata(identifier)
+  catalog.registerFunction(func, true)
+} else {
+  // function is not exists, clear cached function.

Review comment:
   Do you mean the function does not exist in the metastore/catalog, and we need to clear the cache entry?








[GitHub] [spark] cloud-fan commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-13 Thread GitBox


cloud-fan commented on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-657954315


   > The reason behind this initBatch is not getting the schema that is needed to find out the column value in OrcFileFormat.scala
   
   Can you be more specific about the problem? Are you saying that the actual file schema doesn't match the table schema specified by the user?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28968: [SPARK-32010][PYTHON][CORE] Add InheritableThread for local properties and fixing a thread leak issue in pinned thread mode

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28968:
URL: https://github.com/apache/spark/pull/28968#issuecomment-657953861










[GitHub] [spark] AmplabJenkins commented on pull request #28968: [SPARK-32010][PYTHON][CORE] Add InheritableThread for local properties and fixing a thread leak issue in pinned thread mode

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #28968:
URL: https://github.com/apache/spark/pull/28968#issuecomment-657953861










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29094: [SPARK-24983][SQL] limit number of leaf expressions in a single project when collapse project to prevent driver oom

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29094:
URL: https://github.com/apache/spark/pull/29094#issuecomment-657953317


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #29094: [SPARK-24983][SQL] limit number of leaf expressions in a single project when collapse project to prevent driver oom

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29094:
URL: https://github.com/apache/spark/pull/29094#issuecomment-657953631


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #29094: [SPARK-24983][SQL] limit number of leaf expressions in a single project when collapse project to prevent driver oom

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29094:
URL: https://github.com/apache/spark/pull/29094#issuecomment-657953317


   Can one of the admins verify this patch?






[GitHub] [spark] SparkQA commented on pull request #28968: [SPARK-32010][PYTHON][CORE] Add InheritableThread for local properties and fixing a thread leak issue in pinned thread mode

2020-07-13 Thread GitBox


SparkQA commented on pull request #28968:
URL: https://github.com/apache/spark/pull/28968#issuecomment-657953435


   **[Test build #125802 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125802/testReport)** for PR 28968 at commit [`a78fd43`](https://github.com/apache/spark/commit/a78fd4314ba39d1feb63ba1539ac9a2acf40de77).






[GitHub] [spark] constzhou opened a new pull request #29094: [SPARK-24983][SQL] limit number of leaf expressions in a single project when collapse project to prevent driver oom

2020-07-13 Thread GitBox


constzhou opened a new pull request #29094:
URL: https://github.com/apache/spark/pull/29094


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   






[GitHub] [spark] GuoPhilipse commented on pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-13 Thread GitBox


GuoPhilipse commented on pull request #29056:
URL: https://github.com/apache/spark/pull/29056#issuecomment-657942650


   cc @maropu 






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-657937021


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-657937021










[GitHub] [spark] AmplabJenkins commented on pull request #29090: [WIP][SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29090:
URL: https://github.com/apache/spark/pull/29090#issuecomment-657931688










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29090: [WIP][SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29090:
URL: https://github.com/apache/spark/pull/29090#issuecomment-657931688










[GitHub] [spark] srowen commented on a change in pull request #28960: [SPARK-32140][ML][PySpark] Add training summary to FMClassificationModel

2020-07-13 Thread GitBox


srowen commented on a change in pull request #28960:
URL: https://github.com/apache/spark/pull/28960#discussion_r454062744



##
File path: 
mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala
##
@@ -226,45 +239,48 @@ object GradientDescent extends Logging {
 
 var converged = false // indicates whether converged based on convergenceTol
 var i = 1
-while (!converged && i <= numIterations) {
-  val bcWeights = data.context.broadcast(weights)
-  // Sample a subset (fraction miniBatchFraction) of the total data
-  // compute and sum up the subgradients on this subset (this is one map-reduce)
-  val (gradientSum, lossSum, miniBatchSize) = data.sample(false, miniBatchFraction, 42 + i)
-.treeAggregate((BDV.zeros[Double](n), 0.0, 0L))(
-  seqOp = (c, v) => {
-// c: (grad, loss, count), v: (label, features)
-val l = gradient.compute(v._2, v._1, bcWeights.value, Vectors.fromBreeze(c._1))
-(c._1, c._2 + l, c._3 + 1)
-  },
-  combOp = (c1, c2) => {
-// c: (grad, loss, count)
-(c1._1 += c2._1, c1._2 + c2._2, c1._3 + c2._3)
-  })
-  bcWeights.destroy()
-
-  if (miniBatchSize > 0) {
-/**
- * lossSum is computed using the weights from the previous iteration
- * and regVal is the regularization value computed in the previous iteration as well.
- */
-stochasticLossHistory += lossSum / miniBatchSize + regVal
-val update = updater.compute(
-  weights, Vectors.fromBreeze(gradientSum / miniBatchSize.toDouble),
-  stepSize, i, regParam)
-weights = update._1
-regVal = update._2
-
-previousWeights = currentWeights
-currentWeights = Some(weights)
-if (previousWeights != None && currentWeights != None) {
-  converged = isConverged(previousWeights.get,
-currentWeights.get, convergenceTol)
+breakable {
+  while (i <= numIterations + 1) {
+val bcWeights = data.context.broadcast(weights)
+// Sample a subset (fraction miniBatchFraction) of the total data
+// compute and sum up the subgradients on this subset (this is one map-reduce)
+val (gradientSum, lossSum, miniBatchSize) = data.sample(false, miniBatchFraction, 42 + i)
+  .treeAggregate((BDV.zeros[Double](n), 0.0, 0L))(
+seqOp = (c, v) => {

Review comment:
   Yeah, it's a little unusual unless it significantly simplifies the code. 
   Can `!converged` be added back to the while condition, and the `if (X) break` 
   condition below turned into `if (!X) { ... code that follows ... }`? It should 
   behave the same, since `i` will increment and end the loop right after anyway.
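
A minimal sketch of the restructuring suggested above, not the code from the PR. It assumes the break condition `X` is `i > numIterations` and uses a hypothetical `convergeAt` parameter as a stand-in for the real `isConverged` check; `LoopShapes`, `withBreak`, and `withFlag` are illustrative names. Each loop records the order of its "loss" and "update" steps so the two shapes can be compared.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.Breaks.{break, breakable}

object LoopShapes {
  // Shape used in the patch: leave the loop through break().
  // convergeAt is a hypothetical stand-in for the convergence test.
  def withBreak(numIterations: Int, convergeAt: Int): Vector[String] = {
    val trace = ArrayBuffer.empty[String]
    var i = 1
    breakable {
      while (i <= numIterations + 1) {
        trace += s"loss($i)"              // loss computed at the start of each pass
        if (i > numIterations) break()    // assumed meaning of `X`
        trace += s"update($i)"            // weight update for this pass
        if (i >= convergeAt) break()
        i += 1
      }
    }
    trace.toVector
  }

  // Suggested shape: `!converged` in the loop condition, `if (!X) { ... }` in the body.
  def withFlag(numIterations: Int, convergeAt: Int): Vector[String] = {
    val trace = ArrayBuffer.empty[String]
    var converged = false
    var i = 1
    while (!converged && i <= numIterations + 1) {
      trace += s"loss($i)"
      if (i <= numIterations) {           // negation of `X`
        trace += s"update($i)"
        converged = i >= convergeAt
      }
      i += 1                              // the extra pass ends the loop right after
    }
    trace.toVector
  }
}
```

Under these assumptions both functions perform the same steps in the same order; only the final value of `i` differs, which matters only if `i` is read after the loop.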








[GitHub] [spark] SparkQA commented on pull request #29090: [WIP][SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-13 Thread GitBox


SparkQA commented on pull request #29090:
URL: https://github.com/apache/spark/pull/29090#issuecomment-657931374


   **[Test build #125801 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125801/testReport)** for PR 29090 at commit [`dfbce91`](https://github.com/apache/spark/commit/dfbce912c7371afae5e8f87bf18b5a3d7dbfca52).






[GitHub] [spark] HyukjinKwon commented on pull request #29090: [WIP][SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-13 Thread GitBox


HyukjinKwon commented on pull request #29090:
URL: https://github.com/apache/spark/pull/29090#issuecomment-657929538


   retest this please






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-657927691










[GitHub] [spark] AmplabJenkins commented on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-657927691










[GitHub] [spark] SparkQA commented on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-13 Thread GitBox


SparkQA commented on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-657927315


   **[Test build #125800 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125800/testReport)** for PR 27428 at commit [`20ad143`](https://github.com/apache/spark/commit/20ad143c620ef75e8d446f8f1e595992a1959b4a).






[GitHub] [spark] holdenk commented on pull request #28957: [SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-13 Thread GitBox


holdenk commented on pull request #28957:
URL: https://github.com/apache/spark/pull/28957#issuecomment-657927412


   Thanks for doing this, awesome work :)






[GitHub] [spark] holdenk commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


holdenk commented on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-657927249


   Looking at this I believe all of the changes requested have been addressed. 
I'm going to get this PR up to date with the current development now that the 
SPIP has passed and if there are no other issues by the time that's done I 
intend to merge this PR.






[GitHub] [spark] SparkQA commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


SparkQA commented on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-657927290


   **[Test build #125799 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125799/testReport)** for PR 28708 at commit [`5a0cd2a`](https://github.com/apache/spark/commit/5a0cd2abd316aacc601b9e8fa6e1406b67c55fb7).






[GitHub] [spark] HyukjinKwon closed pull request #28957: [SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-13 Thread GitBox


HyukjinKwon closed pull request #28957:
URL: https://github.com/apache/spark/pull/28957


   






[GitHub] [spark] zhengruifeng commented on a change in pull request #28960: [SPARK-32140][ML][PySpark] Add training summary to FMClassificationModel

2020-07-13 Thread GitBox


zhengruifeng commented on a change in pull request #28960:
URL: https://github.com/apache/spark/pull/28960#discussion_r454058321



##
File path: 
mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala
##
@@ -226,45 +239,48 @@ object GradientDescent extends Logging {
 
 var converged = false // indicates whether converged based on convergenceTol
 var i = 1
-while (!converged && i <= numIterations) {
-  val bcWeights = data.context.broadcast(weights)
-  // Sample a subset (fraction miniBatchFraction) of the total data
-  // compute and sum up the subgradients on this subset (this is one map-reduce)
-  val (gradientSum, lossSum, miniBatchSize) = data.sample(false, miniBatchFraction, 42 + i)
-.treeAggregate((BDV.zeros[Double](n), 0.0, 0L))(
-  seqOp = (c, v) => {
-// c: (grad, loss, count), v: (label, features)
-val l = gradient.compute(v._2, v._1, bcWeights.value, Vectors.fromBreeze(c._1))
-(c._1, c._2 + l, c._3 + 1)
-  },
-  combOp = (c1, c2) => {
-// c: (grad, loss, count)
-(c1._1 += c2._1, c1._2 + c2._2, c1._3 + c2._3)
-  })
-  bcWeights.destroy()
-
-  if (miniBatchSize > 0) {
-/**
- * lossSum is computed using the weights from the previous iteration
- * and regVal is the regularization value computed in the previous iteration as well.
- */
-stochasticLossHistory += lossSum / miniBatchSize + regVal
-val update = updater.compute(
-  weights, Vectors.fromBreeze(gradientSum / miniBatchSize.toDouble),
-  stepSize, i, regParam)
-weights = update._1
-regVal = update._2
-
-previousWeights = currentWeights
-currentWeights = Some(weights)
-if (previousWeights != None && currentWeights != None) {
-  converged = isConverged(previousWeights.get,
-currentWeights.get, convergenceTol)
+breakable {
+  while (i <= numIterations + 1) {
+val bcWeights = data.context.broadcast(weights)
+// Sample a subset (fraction miniBatchFraction) of the total data
+// compute and sum up the subgradients on this subset (this is one map-reduce)
+val (gradientSum, lossSum, miniBatchSize) = data.sample(false, miniBatchFraction, 42 + i)
+  .treeAggregate((BDV.zeros[Double](n), 0.0, 0L))(
+seqOp = (c, v) => {

Review comment:
   nit: it seems that `breakable` is not used in Spark (except in two test suites):
   ```
   ➜  spark git:(master) ag --scala 'breakable' .

   mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala
   2941:  breakable {

   mllib/src/test/scala/org/apache/spark/mllib/classification/LogisticRegressionSuite.scala
   142:  breakable {
   ```

   I am not sure whether it is suitable here.
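
For context on the nit above, here is a minimal sketch of the two loop styles being compared: an early exit through `scala.util.control.Breaks`, and the same exit expressed as a flag in the `while` condition. The object and method names (`BreakableSketch`, `firstIndexWithBreakable`, `firstIndexWithFlag`) and the "first value below a tolerance" task are hypothetical stand-ins, not code from the PR.

```scala
import scala.util.control.Breaks.{break, breakable}

// Hypothetical illustration only; none of these names come from Spark.
object BreakableSketch {

  // Variant A: leave the loop with break(), which Breaks implements by
  // throwing a control-flow exception that breakable { ... } catches.
  def firstIndexWithBreakable(xs: Array[Double], tol: Double): Int = {
    var i = 0
    var found = -1
    breakable {
      while (i < xs.length) {
        if (math.abs(xs(i)) < tol) {
          found = i
          break()
        }
        i += 1
      }
    }
    found
  }

  // Variant B: same result, with the exit condition folded into the loop
  // header, so no Breaks import is needed.
  def firstIndexWithFlag(xs: Array[Double], tol: Double): Int = {
    var i = 0
    var found = -1
    while (found < 0 && i < xs.length) {
      if (math.abs(xs(i)) < tol) found = i
      i += 1
    }
    found
  }
}
```

Both variants return the same index. Variant B is the shape a loop takes when it is guarded by a boolean such as `converged`, which may be why `breakable` shows up so rarely in the code base.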





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on pull request #28957: [SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-13 Thread GitBox


HyukjinKwon commented on pull request #28957:
URL: https://github.com/apache/spark/pull/28957#issuecomment-657927040










[GitHub] [spark] holdenk commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-13 Thread GitBox


holdenk commented on a change in pull request #28708:
URL: https://github.com/apache/spark/pull/28708#discussion_r454058089



##
File path: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
##
@@ -242,8 +244,7 @@ private[spark] class BlockManager(
 
   private var blockReplicationPolicy: BlockReplicationPolicy = _
 
-  private var blockManagerDecommissioning: Boolean = false
-  private var decommissionManager: Option[BlockManagerDecommissionManager] = None
+  @volatile private var decommissioner: Option[BlockManagerDecommissioner] = None

Review comment:
   I think I'm going to leave it volatile for now. I'd like to avoid remote block puts once we're decommissioning, because we depend on not receiving new blocks, except from tasks, to figure out when it is safe to exit.
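
A rough sketch of the pattern described in this comment, under stated assumptions: `BlockStore`, `Decommissioner`, `startDecommission`, and `putRemote` are hypothetical names for illustration and are not the `BlockManager` API. It shows a `@volatile` `Option` field that one thread sets and other threads read in order to reject remote puts.

```scala
import scala.collection.mutable

// Hypothetical illustration only; these are not Spark's classes or methods.
object VolatileFlagSketch {

  // Stand-in for whatever object drives the decommissioning work.
  final class Decommissioner

  final class BlockStore {
    // @volatile ensures a write made by the decommissioning thread is
    // visible to other threads on their next read of this field.
    @volatile private var decommissioner: Option[Decommissioner] = None

    private val blocks = mutable.Set.empty[String]

    def startDecommission(): Unit = {
      decommissioner = Some(new Decommissioner)
    }

    // Rejects remote puts once decommissioning has started.
    // Returns true if the block was accepted.
    def putRemote(blockId: String): Boolean = {
      if (decommissioner.isDefined) {
        false
      } else {
        blocks.synchronized {
          blocks += blockId
          true
        }
      }
    }
  }
}
```

Note that `@volatile` gives visibility only, not atomicity: in this sketch a put that passes the check just before `startDecommission()` runs can still land afterwards, so a stricter guarantee would need a lock or a re-check.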








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29078: [SPARK-29292][STREAMING][SQL][BUILD] Get streaming, catalyst, sql compiling for Scala 2.13

2020-07-13 Thread GitBox


AmplabJenkins removed a comment on pull request #29078:
URL: https://github.com/apache/spark/pull/29078#issuecomment-657922148










[GitHub] [spark] AmplabJenkins commented on pull request #29078: [SPARK-29292][STREAMING][SQL][BUILD] Get streaming, catalyst, sql compiling for Scala 2.13

2020-07-13 Thread GitBox


AmplabJenkins commented on pull request #29078:
URL: https://github.com/apache/spark/pull/29078#issuecomment-657922148










[GitHub] [spark] SparkQA commented on pull request #29078: [SPARK-29292][STREAMING][SQL][BUILD] Get streaming, catalyst, sql compiling for Scala 2.13

2020-07-13 Thread GitBox


SparkQA commented on pull request #29078:
URL: https://github.com/apache/spark/pull/29078#issuecomment-657921783


   **[Test build #125790 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125790/testReport)** for PR 29078 at commit [`370dabe`](https://github.com/apache/spark/commit/370dabeca759f78237afe3c84c511f2b0904b228).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class ContinuousRecordEndpoint(buckets: Seq[mutable.Seq[UnsafeRow]], lock: Object)`





