[GitHub] [spark] AmplabJenkins removed a comment on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-634438431


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123157/
   Test FAILed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-634438426


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-05-26 Thread GitBox


SparkQA commented on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-634438282


   **[Test build #123157 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123157/testReport)** for PR 25965 at commit [`d15acef`](https://github.com/apache/spark/commit/d15acef9698528239dc8a5b92d55c950cdf602b2).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-634438426










[GitHub] [spark] SparkQA removed a comment on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-634387900


   **[Test build #123157 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123157/testReport)** for PR 25965 at commit [`d15acef`](https://github.com/apache/spark/commit/d15acef9698528239dc8a5b92d55c950cdf602b2).






[GitHub] [spark] kiszk commented on a change in pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-26 Thread GitBox


kiszk commented on a change in pull request #28593:
URL: https://github.com/apache/spark/pull/28593#discussion_r430862928



##
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
##
@@ -1311,6 +1311,27 @@ class CastSuite extends CastSuiteBase {
   checkEvaluation(cast(negativeTs, LongType), expectedSecs)
 }
   }
+
+  test("SPARK-31710:fail casting from integral to timestamp by default") {
+    withSQLConf(SQLConf.LEGACY_AllOW_CAST_NUMERIC_TO_TIMESTAMP.key -> "false") {
+      assert(!cast(2.toByte, TimestampType).resolved)
+      assert(!cast(10.toShort, TimestampType).resolved)
+      assert(!cast(3, TimestampType).resolved)
+      assert(!cast(10L, TimestampType).resolved)
+      assert(!cast(Decimal(1.2), TimestampType).resolved)
+      assert(!cast(1.7f, TimestampType).resolved)
+      assert(!cast(2.3d, TimestampType).resolved)
+    }
+    withSQLConf(SQLConf.LEGACY_AllOW_CAST_NUMERIC_TO_TIMESTAMP.key -> "true") {
+      assert(cast(2.toByte, TimestampType).resolved)
+      assert(cast(10.toShort, TimestampType).resolved)
+      assert(cast(3, TimestampType).resolved)
+      assert(cast(10L, TimestampType).resolved)
+      assert(cast(Decimal(1.2), TimestampType).resolved)
+      assert(cast(1.7f, TimestampType).resolved)
+      assert(cast(2.3d, TimestampType).resolved)
+    }
+  }

Review comment:
   How about the following style to make the test shorter?
   
   ```
   Seq(true, false).foreach { enabled =>
     withSQLConf(SQLConf.LEGACY_AllOW_CAST_NUMERIC_TO_TIMESTAMP.key -> enabled.toString) {
       assert(cast(2.toByte, TimestampType).resolved == enabled)
       ...
     }
   }
   ```
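
For illustration only, the loop-over-both-settings pattern can be tried outside the Spark test harness with a rough stand-alone sketch. Everything in it is an assumption made for the example: `withFlag` and `castResolves` are hypothetical stand-ins for `withSQLConf` and `cast(...).resolved`, `BigDecimal` stands in for Spark's `Decimal`, and the object name is invented. It is not the code from the PR.

```scala
// Hypothetical stand-ins so the pattern runs without Spark on the classpath.
object CastFlagSketch {
  // Plays the role of the legacy SQL config (an assumption, not Spark API).
  private var allowNumericToTimestamp: Boolean = false

  // Sets the flag while `body` runs, then restores the previous value.
  def withFlag[T](enabled: Boolean)(body: => T): T = {
    val previous = allowNumericToTimestamp
    allowNumericToTimestamp = enabled
    try body finally allowNumericToTimestamp = previous
  }

  // In this sketch a numeric value "resolves" only while the flag is on.
  def castResolves(value: Any): Boolean = value match {
    case _: Byte | _: Short | _: Int | _: Long | _: Float | _: Double | _: BigDecimal =>
      allowNumericToTimestamp
    case _ => false
  }

  // The same seven inputs the test in the diff uses.
  val inputs: Seq[Any] =
    Seq(2.toByte, 10.toShort, 3, 10L, BigDecimal("1.2"), 1.7f, 2.3d)

  def main(args: Array[String]): Unit = {
    // One loop covers both settings: the expected result equals the flag value.
    Seq(true, false).foreach { enabled =>
      withFlag(enabled) {
        inputs.foreach { v =>
          assert(castResolves(v) == enabled, s"$v with flag=$enabled")
        }
      }
    }
  }
}
```

Comparing against `enabled` is what removes the duplicated block: each input is listed once and checked under both settings.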








[GitHub] [spark] SparkQA commented on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting functions

2020-05-26 Thread GitBox


SparkQA commented on pull request #28650:
URL: https://github.com/apache/spark/pull/28650#issuecomment-634434966


   **[Test build #123165 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123165/testReport)** for PR 28650 at commit [`f39dd48`](https://github.com/apache/spark/commit/f39dd48ffea3ce39c6a2659d3c690a5116717a9a).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting functions

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28650:
URL: https://github.com/apache/spark/pull/28650#issuecomment-634433042










[GitHub] [spark] kiszk commented on a change in pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-26 Thread GitBox


kiszk commented on a change in pull request #28593:
URL: https://github.com/apache/spark/pull/28593#discussion_r430861312



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
##
@@ -266,7 +266,15 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit
       TypeCheckResult.TypeCheckSuccess
     } else {
       TypeCheckResult.TypeCheckFailure(
-        s"cannot cast ${child.dataType.catalogString} to ${dataType.catalogString}")
+        if (child.dataType.isInstanceOf[NumericType] && dataType.isInstanceOf[TimestampType]) {
+          s"cannot cast ${child.dataType.catalogString} to ${dataType.catalogString}," +
+            ", you can enable the casting by setting " +

Review comment:
   super nit: remove one `,`. Otherwise the message will read `...,, you can ...`
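
A small stand-alone sketch of why the doubled comma appears, using only the two string fragments visible in the diff. The type names `int` and `timestamp` and the object name are placeholders chosen for the example, and the rest of the message is left out because the diff is truncated at that point.

```scala
object DoubleCommaSketch {
  // Placeholder type names, for illustration only.
  val from = "int"
  val to = "timestamp"

  // As in the diff: the first literal ends with a comma and the second starts with one.
  val asWritten: String =
    s"cannot cast $from to $to," + ", you can enable the casting by setting "

  // With the trailing comma dropped from the first literal.
  val fixed: String =
    s"cannot cast $from to $to" + ", you can enable the casting by setting "

  def main(args: Array[String]): Unit = {
    println(asWritten)
    println(fixed)
  }
}
```

Dropping either comma works; removing the one at the end of the first literal keeps the continuation fragment unchanged.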








[GitHub] [spark] AmplabJenkins commented on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting functions

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #28650:
URL: https://github.com/apache/spark/pull/28650#issuecomment-634433042










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting functions

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28650:
URL: https://github.com/apache/spark/pull/28650#issuecomment-634414043


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/27789/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-634432819










[GitHub] [spark] AmplabJenkins removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-634432819










[GitHub] [spark] yaooqinn commented on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting functions

2020-05-26 Thread GitBox


yaooqinn commented on pull request #28650:
URL: https://github.com/apache/spark/pull/28650#issuecomment-634432602


   retest this please






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28425: [SPARK-31480][SQL] Improve the EXPLAIN FORMATTED's output for DSV2's Scan Node

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28425:
URL: https://github.com/apache/spark/pull/28425#issuecomment-634432338










[GitHub] [spark] AmplabJenkins commented on pull request #28425: [SPARK-31480][SQL] Improve the EXPLAIN FORMATTED's output for DSV2's Scan Node

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #28425:
URL: https://github.com/apache/spark/pull/28425#issuecomment-634432338










[GitHub] [spark] SparkQA removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-634380448


   **[Test build #123156 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123156/testReport)** for PR 27694 at commit [`3231611`](https://github.com/apache/spark/commit/323161192cff589ee9ae32ed25ca611f629cf309).






[GitHub] [spark] SparkQA commented on pull request #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression

2020-05-26 Thread GitBox


SparkQA commented on pull request #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-634432090


   **[Test build #123156 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123156/testReport)** for PR 27694 at commit [`3231611`](https://github.com/apache/spark/commit/323161192cff589ee9ae32ed25ca611f629cf309).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #28425: [SPARK-31480][SQL] Improve the EXPLAIN FORMATTED's output for DSV2's Scan Node

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28425:
URL: https://github.com/apache/spark/pull/28425#issuecomment-634338287


   **[Test build #123153 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123153/testReport)** for PR 28425 at commit [`6cc361a`](https://github.com/apache/spark/commit/6cc361a86f9373076067a6e9bc05b161b01da52e).






[GitHub] [spark] SparkQA commented on pull request #28425: [SPARK-31480][SQL] Improve the EXPLAIN FORMATTED's output for DSV2's Scan Node

2020-05-26 Thread GitBox


SparkQA commented on pull request #28425:
URL: https://github.com/apache/spark/pull/28425#issuecomment-634431700


   **[Test build #123153 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123153/testReport)** for PR 28425 at commit [`6cc361a`](https://github.com/apache/spark/commit/6cc361a86f9373076067a6e9bc05b161b01da52e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27620: [SPARK-30866][SS] FileStreamSource: Cache fetched list of files beyond maxFilesPerTrigger as unread files

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #27620:
URL: https://github.com/apache/spark/pull/27620#issuecomment-634421948










[GitHub] [spark] AmplabJenkins removed a comment on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #24173:
URL: https://github.com/apache/spark/pull/24173#issuecomment-634422030










[GitHub] [spark] AmplabJenkins commented on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-634422033










[GitHub] [spark] AmplabJenkins commented on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #24173:
URL: https://github.com/apache/spark/pull/24173#issuecomment-634422030










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-634421967










[GitHub] [spark] AmplabJenkins removed a comment on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-634422033










[GitHub] [spark] AmplabJenkins commented on pull request #27620: [SPARK-30866][SS] FileStreamSource: Cache fetched list of files beyond maxFilesPerTrigger as unread files

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #27620:
URL: https://github.com/apache/spark/pull/27620#issuecomment-634421948










[GitHub] [spark] AmplabJenkins commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-634421967










[GitHub] [spark] SparkQA commented on pull request #27620: [SPARK-30866][SS] FileStreamSource: Cache fetched list of files beyond maxFilesPerTrigger as unread files

2020-05-26 Thread GitBox


SparkQA commented on pull request #27620:
URL: https://github.com/apache/spark/pull/27620#issuecomment-634421710


   **[Test build #123162 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123162/testReport)** for PR 27620 at commit [`8251b74`](https://github.com/apache/spark/commit/8251b744d40f4f8744df53d68842894489808c2b).






[GitHub] [spark] SparkQA commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-05-26 Thread GitBox


SparkQA commented on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-634421709


   **[Test build #123161 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123161/testReport)** for PR 28422 at commit [`2af1df1`](https://github.com/apache/spark/commit/2af1df14a9efc58162402d6e76718fc226a7970d).






[GitHub] [spark] SparkQA commented on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-05-26 Thread GitBox


SparkQA commented on pull request #24173:
URL: https://github.com/apache/spark/pull/24173#issuecomment-634421696


   **[Test build #123164 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123164/testReport)** for PR 24173 at commit [`1fcfff5`](https://github.com/apache/spark/commit/1fcfff5c2ca78049eb38cf4ef7c041d0005ab9b3).






[GitHub] [spark] SparkQA commented on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-05-26 Thread GitBox


SparkQA commented on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-634421671


   **[Test build #123163 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123163/testReport)** for PR 27333 at commit [`466363e`](https://github.com/apache/spark/commit/466363edb22ea83a81e21a72f1b983dc7b5a733e).






[GitHub] [spark] HeartSaVioR commented on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-05-26 Thread GitBox


HeartSaVioR commented on pull request #24173:
URL: https://github.com/apache/spark/pull/24173#issuecomment-634421224


   retest this, please






[GitHub] [spark] HeartSaVioR commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-05-26 Thread GitBox


HeartSaVioR commented on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-634421190


   retest this, please






[GitHub] [spark] HeartSaVioR commented on pull request #27620: [SPARK-30866][SS] FileStreamSource: Cache fetched list of files beyond maxFilesPerTrigger as unread files

2020-05-26 Thread GitBox


HeartSaVioR commented on pull request #27620:
URL: https://github.com/apache/spark/pull/27620#issuecomment-634420718


   retest this, please






[GitHub] [spark] HeartSaVioR commented on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-05-26 Thread GitBox


HeartSaVioR commented on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-634420114


   retest this, please






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-634323950










[GitHub] [spark] AmplabJenkins commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-634417466










[GitHub] [spark] SparkQA commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-26 Thread GitBox


SparkQA commented on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-634318498










[GitHub] [spark] SparkQA commented on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting functions

2020-05-26 Thread GitBox


SparkQA commented on pull request #28650:
URL: https://github.com/apache/spark/pull/28650#issuecomment-634413806


   **[Test build #123160 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123160/testReport)**
 for PR 28650 at commit 
[`f39dd48`](https://github.com/apache/spark/commit/f39dd48ffea3ce39c6a2659d3c690a5116717a9a).






[GitHub] [spark] SparkQA removed a comment on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-634410085


   **[Test build #123159 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123159/testReport)**
 for PR 28593 at commit 
[`0a1a6a5`](https://github.com/apache/spark/commit/0a1a6a5bef84259bdf9716837a02d9e64c3e063b).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-634417466










[GitHub] [spark] AmplabJenkins commented on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting functions

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #28650:
URL: https://github.com/apache/spark/pull/28650#issuecomment-634414035










[GitHub] [spark] SparkQA commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-05-26 Thread GitBox


SparkQA commented on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-634416943


   **[Test build #123145 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123145/testReport)**
 for PR 26935 at commit 
[`895fe06`](https://github.com/apache/spark/commit/895fe068bd3b32ed70ef84cc68e3352306099214).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-634410315










[GitHub] [spark] SparkQA removed a comment on pull request #28648: [SPARK-31788][CORE][DSTREAM][PYTHON] Recover the support of union for different types of RDD and DStreams

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28648:
URL: https://github.com/apache/spark/pull/28648#issuecomment-634392776


   **[Test build #123158 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123158/testReport)**
 for PR 28648 at commit 
[`ce9c374`](https://github.com/apache/spark/commit/ce9c374a14bf1a27bee4a425763f90bb1ce449c8).






[GitHub] [spark] SparkQA removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28647:
URL: https://github.com/apache/spark/pull/28647#issuecomment-634369436


   **[Test build #123155 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123155/testReport)**
 for PR 28647 at commit 
[`37ba3b3`](https://github.com/apache/spark/commit/37ba3b3f16852badcb2c4a4290d9cb22d7ef2584).






[GitHub] [spark] yaooqinn opened a new pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting functions

2020-05-26 Thread GitBox


yaooqinn opened a new pull request #28650:
URL: https://github.com/apache/spark/pull/28650


   
   
   
   ### What changes were proposed in this pull request?
   Currently, `date_format` and `from_unixtime` have different exception handling behavior for formatting datetime values.
   
   In this PR, we apply the exception handling behavior of `date_format` to `from_unixtime`.
   
   
   ### Why are the changes needed?
   Consistency, and to avoid silently changing data to `null`.
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, invalid datetime patterns will now make `from_unixtime` fail instead of returning `NULL`.
   
   ### How was this patch tested?
   
   
   Added more tests.
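   
   A minimal sketch of the two behaviors described above, not taken from the PR. It uses plain `java.time` rather than Spark, and `formatStrict` and `formatLenient` are hypothetical names chosen for illustration: the strict variant fails on an invalid pattern (the `date_format` behavior this PR extends to `from_unixtime`), while the lenient variant silently yields `null`.
   
   ```scala
   import java.time.{Instant, ZoneOffset}
   import java.time.format.DateTimeFormatter
   import scala.util.Try

   object FormatErrorHandling {
     // Hypothetical helper: strict variant, an invalid pattern raises an exception.
     def formatStrict(epochSeconds: Long, pattern: String): String = {
       val formatter = DateTimeFormatter.ofPattern(pattern).withZone(ZoneOffset.UTC)
       formatter.format(Instant.ofEpochSecond(epochSeconds))
     }

     // Hypothetical helper: lenient variant, any failure is swallowed and becomes null.
     def formatLenient(epochSeconds: Long, pattern: String): String =
       Try(formatStrict(epochSeconds, pattern)).getOrElse(null)
   }
   ```
   
   With a valid pattern both helpers agree. With an invalid one such as `yyyy-MM-dd #` (`#` is a reserved pattern character in `DateTimeFormatter`), only the strict helper surfaces the mistake.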
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-634415789










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting functions

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28650:
URL: https://github.com/apache/spark/pull/28650#issuecomment-634414035


   Merged build finished. Test FAILed.






[GitHub] [spark] zhining-lu opened a new pull request #28644: Merge pull request #1 from apache/master

2020-05-26 Thread GitBox


zhining-lu opened a new pull request #28644:
URL: https://github.com/apache/spark/pull/28644


   pull request 001
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   






[GitHub] [spark] SparkQA removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-634306992


   **[Test build #123145 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123145/testReport)**
 for PR 26935 at commit 
[`895fe06`](https://github.com/apache/spark/commit/895fe068bd3b32ed70ef84cc68e3352306099214).






[GitHub] [spark] srowen commented on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator

2020-05-26 Thread GitBox


srowen commented on pull request #28553:
URL: https://github.com/apache/spark/pull/28553#issuecomment-633594334


   Merged to master. You may want to further change the nonnegativity check in 
your other PR to use the new method you introduced there.






[GitHub] [spark] cloud-fan commented on pull request #28639: [SPARK-31820][SQL][TESTS] Fix flaky JavaBeanDeserializationSuite

2020-05-26 Thread GitBox


cloud-fan commented on pull request #28639:
URL: https://github.com/apache/spark/pull/28639#issuecomment-633986761


   thanks, merging to master/3.0!






[GitHub] [spark] AmplabJenkins removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-634307568










[GitHub] [spark] SparkQA removed a comment on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-634306965


   **[Test build #123141 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123141/testReport)**
 for PR 28422 at commit 
[`2af1df1`](https://github.com/apache/spark/commit/2af1df14a9efc58162402d6e76718fc226a7970d).






[GitHub] [spark] SparkQA removed a comment on pull request #28641: [SPARK-31824][CORE][TESTS] DAGSchedulerSuite: Improve and reuse completeShuffleMapStageSuccessfully

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28641:
URL: https://github.com/apache/spark/pull/28641#issuecomment-633922057










[GitHub] [spark] SparkQA commented on pull request #28425: [SPARK-31480][SQL] Improve the EXPLAIN FORMATTED's output for DSV2's Scan Node

2020-05-26 Thread GitBox


SparkQA commented on pull request #28425:
URL: https://github.com/apache/spark/pull/28425#issuecomment-633787704










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28636:
URL: https://github.com/apache/spark/pull/28636#issuecomment-633588483










[GitHub] [spark] dongjoon-hyun commented on pull request #28638: [SPARK-31819][K8S][DOCS][TESTS] Add a workaround for Java 8u251+ and update integration test cases

2020-05-26 Thread GitBox


dongjoon-hyun commented on pull request #28638:
URL: https://github.com/apache/spark/pull/28638#issuecomment-633831799










[GitHub] [spark] SparkQA removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633419679










[GitHub] [spark] SparkQA removed a comment on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #24173:
URL: https://github.com/apache/spark/pull/24173#issuecomment-634307006


   **[Test build #123146 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123146/testReport)**
 for PR 24173 at commit 
[`1fcfff5`](https://github.com/apache/spark/commit/1fcfff5c2ca78049eb38cf4ef7c041d0005ab9b3).






[GitHub] [spark] keypointt commented on a change in pull request #28595: [SPARK-31781][ML][PySpark] Move param k (number of clusters) to shared params

2020-05-26 Thread GitBox


keypointt commented on a change in pull request #28595:
URL: https://github.com/apache/spark/pull/28595#discussion_r430155232



##
File path: 
mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
##
@@ -562,4 +562,20 @@ trait HasBlockSize extends Params {
   /** @group expertGetParam */
   final def getBlockSize: Int = $(blockSize)
 }
+
+/**
+ * Trait for shared param k. This trait may be changed or
+ * removed between minor versions.
+ */
+trait HasK extends Params {
+

Review comment:
   Could you add a simple test in https://github.com/apache/spark/blob/master/mllib/src/test/scala/org/apache/spark/ml/param/shared/SharedParamsSuite.scala ?
   
   Something like:
   ```
   class Obj(override val uid: String) extends Params with HasOutputCol with HasK {
   ...
   
   assert(obj.getOrDefault(obj.k) == 2)
   
   ```

##
File path: 
mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
##
@@ -562,4 +562,20 @@ trait HasBlockSize extends Params {
   /** @group expertGetParam */
   final def getBlockSize: Int = $(blockSize)
 }
+
+/**
+ * Trait for shared param k. This trait may be changed or
+ * removed between minor versions.
+ */
+trait HasK extends Params {
+

Review comment:
   cool, I'm all good then. thanks :)








[GitHub] [spark] AmplabJenkins commented on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-634034635










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28583: [SPARK-31764][CORE] JsonProtocol doesn't write RDDInfo#isBarrier

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28583:
URL: https://github.com/apache/spark/pull/28583#issuecomment-634274631










[GitHub] [spark] HeartSaVioR commented on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-05-26 Thread GitBox


HeartSaVioR commented on pull request #28363:
URL: https://github.com/apache/spark/pull/28363#issuecomment-633890605










[GitHub] [spark] AmplabJenkins commented on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-633772093










[GitHub] [spark] SparkQA removed a comment on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-634033907


   **[Test build #123125 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123125/testReport)**
 for PR 28645 at commit 
[`c1a2d1c`](https://github.com/apache/spark/commit/c1a2d1c93a6fa2f4e72d8c896b792054b9db1004).






[GitHub] [spark] AmplabJenkins commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #28331:
URL: https://github.com/apache/spark/pull/28331#issuecomment-634336369










[GitHub] [spark] seanli-rallyhealth commented on a change in pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-26 Thread GitBox


seanli-rallyhealth commented on a change in pull request #28635:
URL: https://github.com/apache/spark/pull/28635#discussion_r429982799



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/MSSQLConnectionProviderSuite.scala
##
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.jdbc.connection
+
+class MSSQLConnectionProviderSuite extends ConnectionProviderSuiteBase {
+  test("setAuthenticationConfigIfNeeded default parser must set authentication if not set") {
+    val driver = registerDriver(MSSQLConnectionProvider.driverClass)
+    val defaultProvider = new MSSQLConnectionProvider(
+      driver, options("jdbc:sqlserver://localhost/mssql"))
+    val customProvider = new MSSQLConnectionProvider(
+      driver, options(s"jdbc:sqlserver://localhost/mssql;jaasConfigurationName=custommssql"))
+
+    testProviders(defaultProvider, customProvider)
+  }
+
+  test("setAuthenticationConfigIfNeeded custom parser must set authentication if not set") {
+    val parserMethod = "IntentionallyNotExistingMethod"
+    val driver = registerDriver(MSSQLConnectionProvider.driverClass)
+    val defaultProvider = new MSSQLConnectionProvider(
+      driver, options("jdbc:sqlserver://localhost/mssql"), parserMethod)
+    val customProvider = new MSSQLConnectionProvider(
+      driver,
+      options(s"jdbc:sqlserver://localhost/mssql;jaasConfigurationName=custommssql"),

Review comment:
   No need for the `s` interpolator unless the string embeds a variable.
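   
   To illustrate the point with standard Scala only: the `s` prefix changes nothing when the literal contains no `$` placeholder. The `configName` value below is hypothetical and not part of the PR.
   
   ```scala
   object InterpolatorDemo {
     // Plain literal and interpolated literal without placeholders are the same string.
     val plain = "jdbc:sqlserver://localhost/mssql;jaasConfigurationName=custommssql"
     val interpolated = s"jdbc:sqlserver://localhost/mssql;jaasConfigurationName=custommssql"

     // The prefix only matters once a `$name` or `${expr}` placeholder appears.
     val configName = "custommssql" // hypothetical variable, for illustration only
     val withVariable = s"jdbc:sqlserver://localhost/mssql;jaasConfigurationName=$configName"
   }
   ```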








[GitHub] [spark] srowen commented on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-26 Thread GitBox


srowen commented on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633592952










[GitHub] [spark] seanli-rallyhealth commented on a change in pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-05-26 Thread GitBox


seanli-rallyhealth commented on a change in pull request #28616:
URL: https://github.com/apache/spark/pull/28616#discussion_r429991433



##
File path: 
core/src/main/java/org/apache/spark/shuffle/api/ShuffleMapOutputWriter.java
##
@@ -63,7 +64,7 @@
    * The returned array should contain, for each partition from (0) to (numPartitions - 1), the
    * number of bytes written by the partition writer for that partition id.
    */
-  long[] commitAllPartitions() throws IOException;
+  MapOutputCommitMessage commitAllPartitions() throws IOException;

Review comment:
   Does the comment above need to be rewritten?
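   
   One way to picture why the Javadoc might need updating, as a sketch only: `CommitMessageSketch` below is a hypothetical stand-in, not Spark's `MapOutputCommitMessage`, and its `metadata` field is an assumed placeholder. The per-partition byte counts no longer come back as a bare array but inside a message object.
   
   ```scala
   object CommitSketch {
     // Hypothetical stand-in for the new return type; not Spark's actual class.
     final case class CommitMessageSketch(
         partitionLengths: Array[Long],   // bytes written per partition id
         metadata: Option[String] = None) // assumed placeholder for custom metadata

     // Old shape: the writer returns only the lengths array.
     def commitOld(bytesPerPartition: Seq[Long]): Array[Long] =
       bytesPerPartition.toArray

     // New shape: the same lengths travel inside a message object.
     def commitNew(bytesPerPartition: Seq[Long]): CommitMessageSketch =
       CommitMessageSketch(bytesPerPartition.toArray)
   }
   ```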








[GitHub] [spark] AmplabJenkins removed a comment on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #27649:
URL: https://github.com/apache/spark/pull/27649#issuecomment-634307485










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28647:
URL: https://github.com/apache/spark/pull/28647#issuecomment-634369793










[GitHub] [spark] dongjoon-hyun closed pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-26 Thread GitBox


dongjoon-hyun closed pull request #28633:
URL: https://github.com/apache/spark/pull/28633


   






[GitHub] [spark] gatorsmile commented on pull request #28485: [SPARK-31642] Add Pagination Support for Structured Streaming Page

2020-05-26 Thread GitBox


gatorsmile commented on pull request #28485:
URL: https://github.com/apache/spark/pull/28485#issuecomment-634396656


   > . I will add snapshots soon.
   
   @iRakson could you update the PR description with the screenshot?






[GitHub] [spark] MaxGekk commented on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version

2020-05-26 Thread GitBox


MaxGekk commented on pull request #28630:
URL: https://github.com/apache/spark/pull/28630#issuecomment-633696780










[GitHub] [spark] HyukjinKwon commented on pull request #28523: [SPARK-31706][SQL] add back the support of streaming update mode

2020-05-26 Thread GitBox


HyukjinKwon commented on pull request #28523:
URL: https://github.com/apache/spark/pull/28523#issuecomment-633759255










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28584: [SPARK-31730][CORE][TEST] Fix flaky tests in BarrierTaskContextSuite

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28584:
URL: https://github.com/apache/spark/pull/28584#issuecomment-633803017










[GitHub] [spark] SparkQA removed a comment on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28619:
URL: https://github.com/apache/spark/pull/28619#issuecomment-633594567










[GitHub] [spark] HyukjinKwon edited a comment on pull request #28603: [SPARK-31788][CORE][PYTHON] Fix UnionRDD of PairRDDs

2020-05-26 Thread GitBox


HyukjinKwon edited a comment on pull request #28603:
URL: https://github.com/apache/spark/pull/28603#issuecomment-634365472


   Okay, I just noticed 
https://github.com/apache/spark/commit/f83fedc9f20869ab4c62bb07bac50113d921207f 
caused this problem, and this is a regression. I am going to revert this PR.






[GitHub] [spark] holdenk commented on a change in pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown

2020-05-26 Thread GitBox


holdenk commented on a change in pull request #28331:
URL: https://github.com/apache/spark/pull/28331#discussion_r430563960



##
File path: 
core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionSuite.scala
##
@@ -69,36 +84,64 @@ class BlockManagerDecommissionSuite extends SparkFunSuite 
with LocalSparkContext
 })
 
 // Cache the RDD lazily
-sleepyRdd.persist()
+if (persist) {
+  testRdd.persist()
+}
 
 // Start the computation of RDD - this step will also cache the RDD
-val asyncCount = sleepyRdd.countAsync()
+val asyncCount = testRdd.countAsync()
 
 // Wait for the job to have started
 sem.acquire(1)
 
+// Give Spark a tiny bit to start the tasks after the listener says hello
+Thread.sleep(100)

Review comment:
   So this wait is for the task to be properly scheduled, not for the number 
of execs. The assert for the number of execs is something I think we should 
keep because we want to make sure that decommissioning isn’t the same as exiting.

##
File path: 
core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala
##
@@ -148,6 +170,86 @@ private[spark] class IndexShuffleBlockResolver(
 }
   }
 
+  /**
+   * Write a provided shuffle block as a stream. Used for block migrations.
+   * ShuffleBlockBatchIds must contain the full range represented in the 
ShuffleIndexBlock.
+   * Requires the caller to delete any shuffle index blocks where the shuffle 
block fails to
+   * put.
+   */
+  def putShuffleBlockAsStream(blockId: BlockId, serializerManager: 
SerializerManager):
+  StreamCallbackWithID = {
+val file = blockId match {
+  case ShuffleIndexBlockId(shuffleId, mapId, _) =>
+getIndexFile(shuffleId, mapId)
+  case ShuffleBlockBatchId(shuffleId, mapId, _, _) =>
+getDataFile(shuffleId, mapId)
+  case _ =>
+throw new Exception(s"Unexpected shuffle block transfer ${blockId}")
+}
+val fileTmp = Utils.tempFileWith(file)
+val channel = Channels.newChannel(
+  serializerManager.wrapStream(blockId,
+new FileOutputStream(fileTmp)))
+
+new StreamCallbackWithID {
+
+  override def getID: String = blockId.name
+
+  override def onData(streamId: String, buf: ByteBuffer): Unit = {
+while (buf.hasRemaining) {
+  channel.write(buf)
+}
+  }
+
+  override def onComplete(streamId: String): Unit = {
+logTrace(s"Done receiving block $blockId, now putting into local 
shuffle service")
+channel.close()
+val diskSize = fileTmp.length()
+this.synchronized {
+  if (file.exists()) {
+file.delete()
+  }

Review comment:
   This mirrors the logic inside `writeIndexFileAndCommit`; the matching 
check there was introduced in SPARK-17547, which I believe covers the 
situation where an exception occurred during a previous write and the 
filesystem is in a dirty state. So I think we should keep it to be safe.
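
A minimal sketch of the delete-then-rename pattern this comment describes, assuming a plain local filesystem. The names `TempFileCommit` and `commit` are hypothetical, and this is not the code in `IndexShuffleBlockResolver`; it only illustrates why a file left over from an earlier failed write is removed before the temporary file is moved into place.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.UUID;

final class TempFileCommit {
    // A single lock so the delete and the rename are not interleaved
    // with another commit in the same JVM (an assumption of this sketch).
    private static final Object LOCK = new Object();

    // Write the data to a temporary sibling file first, then swap it in.
    static void commit(File target, byte[] data) throws IOException {
        File tmp = new File(target.getParentFile(),
            target.getName() + "." + UUID.randomUUID() + ".tmp");
        Files.write(tmp.toPath(), data);
        synchronized (LOCK) {
            // A previous failed write may have left a stale target behind.
            if (target.exists() && !target.delete()) {
                throw new IOException("could not delete stale file " + target);
            }
            if (!tmp.renameTo(target)) {
                throw new IOException("could not rename " + tmp + " to " + target);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("commit-demo").toFile();
        File target = new File(dir, "block.data");
        // Simulate a dirty state left by an earlier write.
        Files.write(target.toPath(), new byte[] {9, 9, 9});
        commit(target, new byte[] {1, 2});
        System.out.println("bytes on disk: " + target.length());
    }
}
```

Deleting first keeps the rename from depending on platform-specific behaviour when the destination already exists.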

##
File path: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
##
@@ -1777,7 +1799,7 @@ private[spark] class BlockManager(
 
   def decommissionBlockManager(): Unit = {
 if (!blockManagerDecommissioning) {
-  logInfo("Starting block manager decommissioning process")
+  logInfo("Starting block manager decommissioning process...")
   blockManagerDecommissioning = true
   decommissionManager = Some(new BlockManagerDecommissionManager(conf))
   decommissionManager.foreach(_.start())

Review comment:
   Yeah this was added in the cache block migration PR (now merged)








[GitHub] [spark] SparkQA removed a comment on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #27649:
URL: https://github.com/apache/spark/pull/27649#issuecomment-634306966


   **[Test build #123142 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123142/testReport)**
 for PR 27649 at commit 
[`6406e36`](https://github.com/apache/spark/commit/6406e36eb34377983aaf113495ca16b1553317a3).






[GitHub] [spark] igreenfield commented on a change in pull request #28629: [SPARK-31769] Add MDC support for driver threads

2020-05-26 Thread GitBox


igreenfield commented on a change in pull request #28629:
URL: https://github.com/apache/spark/pull/28629#discussion_r430656349



##
File path: core/src/main/scala/org/apache/spark/util/ThreadUtils.scala
##
@@ -32,6 +33,99 @@ import org.apache.spark.SparkException
 
 private[spark] object ThreadUtils {
 
+  object MDCAwareThreadPoolExecutor {
+def newCachedThreadPool(threadFactory: ThreadFactory): ThreadPoolExecutor 
= {
+  // The values needs to be synced with `Executors.newCachedThreadPool`
+  new MDCAwareThreadPoolExecutor(
+0,
+Integer.MAX_VALUE,
+60L,
+TimeUnit.SECONDS,
+new SynchronousQueue[Runnable],
+threadFactory)
+}
+
+def newFixedThreadPool(nThreads: Int, threadFactory: ThreadFactory): 
ThreadPoolExecutor = {
+  // The values needs to be synced with `Executors.newFixedThreadPool`
+  new MDCAwareThreadPoolExecutor(
+nThreads,
+nThreads,
+0L,
+TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue[Runnable],
+threadFactory)
+}
+
+/**
+ * This method differ from the 
`java.util.concurrent.Executors#newSingleThreadExecutor` in
+ * 2 ways:
+ *   1. It use 
`org.apache.spark.util.ThreadUtils.MDCAwareThreadPoolExecutor`
+ *   as underline `java.util.concurrent.ExecutorService`
+ *   2. It does not use the
+ *   `java.util.concurrent.Executors#FinalizableDelegatedExecutorService` 
from JDK
+ */
+def newSingleThreadExecutor(threadFactory: ThreadFactory): ExecutorService 
= {
+  // The values needs to be synced with `Executors.newSingleThreadExecutor`
+  Executors.unconfigurableExecutorService(
+new MDCAwareThreadPoolExecutor(
+  1,
+  1,
+  0L,
+  TimeUnit.MILLISECONDS,
+  new LinkedBlockingQueue[Runnable],
+  threadFactory)
+)
+}
+
+  }
+
+  class MDCAwareRunnable(proxy: Runnable) extends Runnable {
+val callerThreadMDC: util.Map[String, String] = getMDCMap
+
+@inline
+private def getMDCMap: util.Map[String, String] = {
+  org.slf4j.MDC.getCopyOfContextMap match {

Review comment:
   Done








[GitHub] [spark] jiangxb1987 commented on pull request #28583: [SPARK-31764][CORE] JsonProtocol doesn't write RDDInfo#isBarrier

2020-05-26 Thread GitBox


jiangxb1987 commented on pull request #28583:
URL: https://github.com/apache/spark/pull/28583#issuecomment-634271797


   retest this please






[GitHub] [spark] SparkQA removed a comment on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633594546










[GitHub] [spark] AmplabJenkins commented on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-634307498










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633587738










[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-05-26 Thread GitBox


dongjoon-hyun commented on a change in pull request #28528:
URL: https://github.com/apache/spark/pull/28528#discussion_r430176730



##
File path: core/src/main/scala/org/apache/spark/executor/Executor.scala
##
@@ -121,7 +121,7 @@ private[spark] class Executor(
   // create. The map key is a task id.
   private val taskReaperForTask: HashMap[Long, TaskReaper] = HashMap[Long, 
TaskReaper]()
 
-  val executorMetricsSource =
+  private val executorMetricsSource =

Review comment:
   This is irrelevant to this PR.

##
File path: core/src/main/scala/org/apache/spark/executor/Executor.scala
##
@@ -134,8 +134,11 @@ private[spark] class Executor(
 env.metricsSystem.registerSource(new JVMCPUSource())
 executorMetricsSource.foreach(_.register(env.metricsSystem))
 env.metricsSystem.registerSource(env.blockManager.shuffleMetricsSource)
+  } else {
+Executor.executorSource = executorSource
   }
 
+

Review comment:
   Please remove this.

##
File path: core/src/main/scala/org/apache/spark/executor/Executor.scala
##
@@ -121,7 +121,7 @@ private[spark] class Executor(
   // create. The map key is a task id.
   private val taskReaperForTask: HashMap[Long, TaskReaper] = HashMap[Long, 
TaskReaper]()
 
-  val executorMetricsSource =
+  private val executorMetricsSource =

Review comment:
   This seems to be irrelevant to this PR.

##
File path: core/src/main/scala/org/apache/spark/executor/Executor.scala
##
@@ -134,8 +134,11 @@ private[spark] class Executor(
 env.metricsSystem.registerSource(new JVMCPUSource())
 executorMetricsSource.foreach(_.register(env.metricsSystem))
 env.metricsSystem.registerSource(env.blockManager.shuffleMetricsSource)
+  } else {
+Executor.executorSource = executorSource

Review comment:
   What happens when we call 
`env.metricsSystem.registerSource(executorSource)` here? 

##
File path: core/src/main/scala/org/apache/spark/executor/Executor.scala
##
@@ -134,8 +134,11 @@ private[spark] class Executor(
 env.metricsSystem.registerSource(new JVMCPUSource())
 executorMetricsSource.foreach(_.register(env.metricsSystem))
 env.metricsSystem.registerSource(env.blockManager.shuffleMetricsSource)
+  } else {
+Executor.executorSource = executorSource

Review comment:
   What happens if we call 
`env.metricsSystem.registerSource(executorSource)` here? 

##
File path: 
core/src/test/scala/org/apache/spark/metrics/source/SourceConfigSuite.scala
##
@@ -80,4 +80,16 @@ class SourceConfigSuite extends SparkFunSuite with 
LocalSparkContext {
 }
   }
 
+  test("Test executor source registration in local mode") {

Review comment:
   Could you add a prefix `SPARK-31711: `?








[GitHub] [spark] HyukjinKwon closed pull request #28493: [SPARK-31673][SQL] QueryExection.debug.toFile() to take an addtional explain mode param

2020-05-26 Thread GitBox


HyukjinKwon closed pull request #28493:
URL: https://github.com/apache/spark/pull/28493


   






[GitHub] [spark] MaxGekk opened a new pull request #28639: [SPARK-31820][SQL][TESTS] Fix flaky JavaBeanDeserializationSuite

2020-05-26 Thread GitBox


MaxGekk opened a new pull request #28639:
URL: https://github.com/apache/spark/pull/28639


   ### What changes were proposed in this pull request?
   Modified the formatting of expected timestamp strings in the test 
`JavaBeanDeserializationSuite`.`testSpark22000` to correctly format timestamps 
with a **zero** seconds fraction. The current implementation outputs `.0`, but the 
fraction must be an empty string. From the SPARK-31820 failure:
   - the expected string should be `2020-05-25 12:39:17`
   - but the incorrect expected string is `2020-05-25 12:39:17.0`
   
   ### Why are the changes needed?
   To make `JavaBeanDeserializationSuite` stable, and avoid test failures like 
https://github.com/apache/spark/pull/28630#issuecomment-633695723
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   I changed 
https://github.com/apache/spark/blob/7dff3b125de23a4d6ce834217ee08973b259414c/sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanDeserializationSuite.java#L207
 to
   ```java
   new java.sql.Timestamp((System.currentTimeMillis() / 1000) * 1000),
   ```
   to force zero seconds fraction.
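
A minimal, self-contained sketch of the effect described above, assuming only the JDK: `java.sql.Timestamp.toString()` prints `.0` when the fractional seconds are zero. The class `ZeroFractionTimestamp` and its helpers are hypothetical names for illustration and are not part of the test suite or of this patch.

```java
import java.sql.Timestamp;

final class ZeroFractionTimestamp {
    // Drop the millisecond part so the timestamp has a zero seconds fraction.
    static Timestamp truncateToSeconds(long epochMillis) {
        return new Timestamp((epochMillis / 1000) * 1000);
    }

    // Timestamp.toString() ends with ".0" only when the fraction is zero;
    // strip that suffix so a zero fraction becomes an empty string.
    static String formatWithoutZeroFraction(Timestamp ts) {
        String s = ts.toString();
        return s.endsWith(".0") ? s.substring(0, s.length() - 2) : s;
    }

    public static void main(String[] args) {
        Timestamp ts = truncateToSeconds(System.currentTimeMillis());
        System.out.println("Timestamp.toString():  " + ts);
        System.out.println("without zero fraction: " + formatWithoutZeroFraction(ts));
    }
}
```

Truncating the clock value this way reproduces the flaky case on every run instead of only when the current time happens to land on a whole second.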






[GitHub] [spark] prakharjain09 edited a comment on pull request #28634: [SPARK-31810][TEST] Fix AlterTableRecoverPartitions test using incorrect api to modify RDD_PARALLEL_LISTING_THRESHOLD

2020-05-26 Thread GitBox


prakharjain09 edited a comment on pull request #28634:
URL: https://github.com/apache/spark/pull/28634#issuecomment-633481802


   cc - @srowen @cloud-fan @Dooyoung-Hwang @holdenk 






[GitHub] [spark] SparkQA removed a comment on pull request #28646: [SPARK-31827][SQL] better error message for the JDK8 bug of stand-alone form

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28646:
URL: https://github.com/apache/spark/pull/28646#issuecomment-634099573










[GitHub] [spark] beliefer opened a new pull request #28641: [SPARK-31824][CORE][TESTS] DAGSchedulerSuite: Improve and reuse completeShuffleMapStageSuccessfully

2020-05-26 Thread GitBox


beliefer opened a new pull request #28641:
URL: https://github.com/apache/spark/pull/28641


   ### What changes were proposed in this pull request?
   `DAGSchedulerSuite` provides `completeShuffleMapStageSuccessfully` to complete a 
`ShuffleMapStage` successfully.
   But many test cases call `complete` directly, as follows:
   `complete(taskSets(0), Seq((Success, makeMapStatus("hostA", 1`
   
   We need to improve `completeShuffleMapStageSuccessfully` and reuse it:
   `completeShuffleMapStageSuccessfully(0, 0, 1, Some(0), Seq("hostA"))`
   
   
   ### Why are the changes needed?
   Improve and reuse completeShuffleMapStageSuccessfully
   
   
   ### Does this PR introduce _any_ user-facing change?
'No'.
   
   
   ### How was this patch tested?
   Jenkins test
   






[GitHub] [spark] SparkQA commented on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-05-26 Thread GitBox


SparkQA commented on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-633771897










[GitHub] [spark] SparkQA removed a comment on pull request #28634: [SPARK-31810][TEST] Fix AlterTableRecoverPartitions test using incorrect api to modify RDD_PARALLEL_LISTING_THRESHOLD

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28634:
URL: https://github.com/apache/spark/pull/28634#issuecomment-633581100


   **[Test build #123085 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123085/testReport)**
 for PR 28634 at commit 
[`f6900b3`](https://github.com/apache/spark/commit/f6900b32b361777ec32e27ec6a8df33c068b1f4d).






[GitHub] [spark] SparkQA removed a comment on pull request #28607: [SPARK-24634][SS] Add a new metric regarding number of inputs later than watermark plus allowed delay

2020-05-26 Thread GitBox


SparkQA removed a comment on pull request #28607:
URL: https://github.com/apache/spark/pull/28607#issuecomment-634306964


   **[Test build #123140 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123140/testReport)**
 for PR 28607 at commit 
[`f09a623`](https://github.com/apache/spark/commit/f09a623064f8a1edf6908f653206d17db3f38d38).






[GitHub] [spark] AmplabJenkins commented on pull request #28609: [SPARK-31792][DOCS] Introduce the structured streaming UI in the Web UI doc

2020-05-26 Thread GitBox


AmplabJenkins commented on pull request #28609:
URL: https://github.com/apache/spark/pull/28609#issuecomment-633735386










[GitHub] [spark] dongjoon-hyun commented on pull request #28614: [SPARK-31791][CORE][TEST] Improve cache block migration test reliability

2020-05-26 Thread GitBox


dongjoon-hyun commented on pull request #28614:
URL: https://github.com/apache/spark/pull/28614#issuecomment-633791726


   +1, late LGTM.






[GitHub] [spark] HeartSaVioR commented on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-05-26 Thread GitBox


HeartSaVioR commented on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-634305431


   retest this, please





