[GitHub] [spark] HeartSaVioR commented on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-08-18 Thread GitBox


HeartSaVioR commented on pull request #24173:
URL: https://github.com/apache/spark/pull/24173#issuecomment-675299208


retest this, please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HeartSaVioR commented on a change in pull request #29256: [SPARK-32456][SS] Check the Distinct by assuming it as Aggregate for Structured Streaming

2020-08-18 Thread GitBox


HeartSaVioR commented on a change in pull request #29256:
URL: https://github.com/apache/spark/pull/29256#discussion_r471960294



##
File path: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
##
@@ -1106,6 +1107,54 @@ class StreamingQuerySuite extends StreamTest with BeforeAndAfter with Logging wi
     }
   }
 
+  test("union in streaming query of append mode without watermark") {
+    val inputData1 = MemoryStream[Int]
+    val inputData2 = MemoryStream[Int]
+    withTempView("s1", "s2") {
+      inputData1.toDF().createOrReplaceTempView("s1")
+      inputData2.toDF().createOrReplaceTempView("s2")
+      val unioned = spark.sql(
+        "select s1.value from s1 union select s2.value from s2")
+      checkExceptionMessage(unioned)
+    }
+  }
+
+  test("distinct in streaming query of append mode without watermark") {
+    val inputData = MemoryStream[Int]
+    withTempView("deduptest") {
+      inputData.toDF().toDF("value").createOrReplaceTempView("deduptest")
+      val distinct = spark.sql("select distinct value from deduptest")
+      checkExceptionMessage(distinct)
+    }
+  }
+
+  test("distinct in streaming query of complete mode") {
+    val inputData = MemoryStream[Int]
+    withTempView("deduptest") {
+      inputData.toDF().toDF("value").createOrReplaceTempView("deduptest")
+      val distinct = spark.sql("select distinct value from deduptest")
+
+      testStream(distinct, Complete)(
+        AddData(inputData, 1, 2, 3, 3, 4),
+        CheckAnswer(Row(1), Row(2), Row(3), Row(4))

Review comment:
   As an alternative, I added a note to the SS guide doc in #29461.
   
   I'm not sure that is enough to spare us complaints about improper 
usage, so I just marked the PR as a draft. I think it's better to collect 
opinions on this.
   








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29461: [DO-NOT-MERGE][SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29461:
URL: https://github.com/apache/spark/pull/29461#issuecomment-675298637










[GitHub] [spark] AmplabJenkins removed a comment on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-675297890


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127538/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29461: [DO-NOT-MERGE][SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29461:
URL: https://github.com/apache/spark/pull/29461#issuecomment-675298637










[GitHub] [spark] AmplabJenkins removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-675298265


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-675298132


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127530/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-675298123


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29360: [SPARK-32542][SQL] Add an optimizer rule to split an Expand into multiple Expands for aggregates

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29360:
URL: https://github.com/apache/spark/pull/29360#issuecomment-675297814


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #27649:
URL: https://github.com/apache/spark/pull/27649#issuecomment-675297409


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127535/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-675297631


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127536/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29360: [SPARK-32542][SQL] Add an optimizer rule to split an Expand into multiple Expands for aggregates

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29360:
URL: https://github.com/apache/spark/pull/29360#issuecomment-675297814










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29453: [SPARK-31999][SQL][FOLLOWUP] Adds negative test cases with typos

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29453:
URL: https://github.com/apache/spark/pull/29453#issuecomment-675297371


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127542/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-675297236


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127545/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-675297983


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-675297424


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127537/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-675297620


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #28363:
URL: https://github.com/apache/spark/pull/28363#issuecomment-675297618


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127532/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29460: [DO-NOT-MERGE][SPARK-32249][3.0] Run Github Actions builds in other branches as well

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29460:
URL: https://github.com/apache/spark/pull/29460#issuecomment-675297649


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127543/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29461: [DO-NOT-MERGE][SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-08-18 Thread GitBox


SparkQA commented on pull request #29461:
URL: https://github.com/apache/spark/pull/29461#issuecomment-675297860


   **[Test build #127546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127546/testReport)** for PR 29461 at commit [`f8d1416`](https://github.com/apache/spark/commit/f8d1416315cdfded655d860281b807e90f84c002).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-675297882


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-675297983










[GitHub] [spark] AmplabJenkins commented on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-675297882










[GitHub] [spark] SparkQA removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-675252986


   **[Test build #127534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127534/testReport)** for PR 27694 at commit [`2559928`](https://github.com/apache/spark/commit/2559928be2d7981c2c1c2d9b6111c4449e721310).






[GitHub] [spark] AmplabJenkins commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-675298123










[GitHub] [spark] SparkQA removed a comment on pull request #29453: [SPARK-31999][SQL][FOLLOWUP] Adds negative test cases with typos

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #29453:
URL: https://github.com/apache/spark/pull/29453#issuecomment-675275574


   **[Test build #127542 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127542/testReport)** for PR 29453 at commit [`69b45be`](https://github.com/apache/spark/commit/69b45bed5e12064d19c4edbac94c3cdbef63f5ff).






[GitHub] [spark] SparkQA removed a comment on pull request #29360: [SPARK-32542][SQL] Add an optimizer rule to split an Expand into multiple Expands for aggregates

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #29360:
URL: https://github.com/apache/spark/pull/29360#issuecomment-675233817


   **[Test build #127526 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127526/testReport)** for PR 29360 at commit [`87b9a82`](https://github.com/apache/spark/commit/87b9a825359168eb07fe5f9791e1dc26ce138046).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-675297540


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127531/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29460: [DO-NOT-MERGE][SPARK-32249][3.0] Run Github Actions builds in other branches as well

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29460:
URL: https://github.com/apache/spark/pull/29460#issuecomment-675297637










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29456: [SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29456:
URL: https://github.com/apache/spark/pull/29456#issuecomment-675297225


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127544/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29460: [DO-NOT-MERGE][SPARK-32249][3.0] Run Github Actions builds in other branches as well

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #29460:
URL: https://github.com/apache/spark/pull/29460#issuecomment-675273071










[GitHub] [spark] SparkQA removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-675250393


   **[Test build #127530 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127530/testReport)** for PR 28904 at commit [`e16ebe4`](https://github.com/apache/spark/commit/e16ebe4e530d3c44bb0ba39981c4ec2287c3589e).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29460: [DO-NOT-MERGE][SPARK-32249][3.0] Run Github Actions builds in other branches as well

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29460:
URL: https://github.com/apache/spark/pull/29460#issuecomment-675297293










[GitHub] [spark] SparkQA removed a comment on pull request #29456: [SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #29456:
URL: https://github.com/apache/spark/pull/29456#issuecomment-675291618


   **[Test build #127544 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127544/testReport)** for PR 29456 at commit [`603268e`](https://github.com/apache/spark/commit/603268e6598e538946102952aeb46b1874d54e38).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-675297527


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-675298265










[GitHub] [spark] AmplabJenkins removed a comment on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #27649:
URL: https://github.com/apache/spark/pull/27649#issuecomment-675297402


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-675253028


   **[Test build #127537 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127537/testReport)** for PR 26935 at commit [`cabd38f`](https://github.com/apache/spark/commit/cabd38f32622b61c73bb3f1ca6c6390df7e89c04).






[GitHub] [spark] SparkQA removed a comment on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-675253090


   **[Test build #127538 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127538/testReport)**
 for PR 25965 at commit 
[`d15acef`](https://github.com/apache/spark/commit/d15acef9698528239dc8a5b92d55c950cdf602b2).






[GitHub] [spark] SparkQA removed a comment on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-675227409


   **[Test build #127524 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127524/testReport)**
 for PR 28841 at commit 
[`263dd2a`](https://github.com/apache/spark/commit/263dd2a58ee990600aae3c40ea3eb56368a9c48d).






[GitHub] [spark] SparkQA removed a comment on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #27649:
URL: https://github.com/apache/spark/pull/27649#issuecomment-675252940


   **[Test build #127535 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127535/testReport)**
 for PR 27649 at commit 
[`6406e36`](https://github.com/apache/spark/commit/6406e36eb34377983aaf113495ca16b1553317a3).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29453: [SPARK-31999][SQL][FOLLOWUP] Adds negative test cases with typos

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29453:
URL: https://github.com/apache/spark/pull/29453#issuecomment-675297359


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29456: [SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29456:
URL: https://github.com/apache/spark/pull/29456#issuecomment-675297211


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #28363:
URL: https://github.com/apache/spark/pull/28363#issuecomment-675250435


   **[Test build #127532 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127532/testReport)**
 for PR 28363 at commit 
[`b648156`](https://github.com/apache/spark/commit/b64815622bb4e8cd8b474cb2983f2c9b78ed9342).






[GitHub] [spark] SparkQA removed a comment on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-675294654


   **[Test build #127545 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127545/testReport)**
 for PR 29457 at commit 
[`090747d`](https://github.com/apache/spark/commit/090747d4a9d1540c4b65e45f960c926a23d76b84).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-675297229


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-675297620










[GitHub] [spark] HeartSaVioR commented on pull request #29461: [DO-NOT-MERGE][SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-08-18 Thread GitBox


HeartSaVioR commented on pull request #29461:
URL: https://github.com/apache/spark/pull/29461#issuecomment-675297309


   I'm marking this as a draft because I'd like to see which approach is 
preferred: just documenting a warning for end users (this PR), or collecting 
and preventing some error-prone operations for streaming workloads (proposed in 
https://github.com/apache/spark/pull/29256#discussion_r471945148). If we don't 
mind covering this with the guide doc, this PR can be converted to 
"ready-to-review".






[GitHub] [spark] AmplabJenkins commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-675297527










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #28363:
URL: https://github.com/apache/spark/pull/28363#issuecomment-675297607


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #28363:
URL: https://github.com/apache/spark/pull/28363#issuecomment-675297607










[GitHub] [spark] AmplabJenkins commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-675297411










[GitHub] [spark] SparkQA removed a comment on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-675252970


   **[Test build #127536 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127536/testReport)**
 for PR 27333 at commit 
[`466363e`](https://github.com/apache/spark/commit/466363edb22ea83a81e21a72f1b983dc7b5a733e).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-675297411


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-675297229










[GitHub] [spark] SparkQA commented on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-08-18 Thread GitBox


SparkQA commented on pull request #28363:
URL: https://github.com/apache/spark/pull/28363#issuecomment-675297170


   **[Test build #127532 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127532/testReport)**
 for PR 28363 at commit 
[`b648156`](https://github.com/apache/spark/commit/b64815622bb4e8cd8b474cb2983f2c9b78ed9342).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29453: [SPARK-31999][SQL][FOLLOWUP] Adds negative test cases with typos

2020-08-18 Thread GitBox


SparkQA commented on pull request #29453:
URL: https://github.com/apache/spark/pull/29453#issuecomment-675297169


   **[Test build #127542 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127542/testReport)**
 for PR 29453 at commit 
[`69b45be`](https://github.com/apache/spark/commit/69b45bed5e12064d19c4edbac94c3cdbef63f5ff).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-675250414


   **[Test build #127531 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127531/testReport)**
 for PR 28422 at commit 
[`06ee53d`](https://github.com/apache/spark/commit/06ee53d9dee60756be8563d584d589e198d670f1).






[GitHub] [spark] SparkQA commented on pull request #29460: [DO-NOT-MERGE][SPARK-32249][3.0] Run Github Actions builds in other branches as well

2020-08-18 Thread GitBox


SparkQA commented on pull request #29460:
URL: https://github.com/apache/spark/pull/29460#issuecomment-675297152










[GitHub] [spark] AmplabJenkins commented on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #27649:
URL: https://github.com/apache/spark/pull/27649#issuecomment-675297402










[GitHub] [spark] SparkQA commented on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-08-18 Thread GitBox


SparkQA commented on pull request #27333:
URL: https://github.com/apache/spark/pull/27333#issuecomment-675297154


   **[Test build #127536 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127536/testReport)**
 for PR 27333 at commit 
[`466363e`](https://github.com/apache/spark/commit/466363edb22ea83a81e21a72f1b983dc7b5a733e).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-08-18 Thread GitBox


SparkQA commented on pull request #25965:
URL: https://github.com/apache/spark/pull/25965#issuecomment-675297180


   **[Test build #127538 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127538/testReport)**
 for PR 25965 at commit 
[`d15acef`](https://github.com/apache/spark/commit/d15acef9698528239dc8a5b92d55c950cdf602b2).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29453: [SPARK-31999][SQL][FOLLOWUP] Adds negative test cases with typos

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29453:
URL: https://github.com/apache/spark/pull/29453#issuecomment-675297359










[GitHub] [spark] SparkQA commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-08-18 Thread GitBox


SparkQA commented on pull request #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-675297182


   **[Test build #127534 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127534/testReport)**
 for PR 27694 at commit 
[`2559928`](https://github.com/apache/spark/commit/2559928be2d7981c2c1c2d9b6111c4449e721310).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29456: [SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29456:
URL: https://github.com/apache/spark/pull/29456#issuecomment-675297211










[GitHub] [spark] SparkQA commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-08-18 Thread GitBox


SparkQA commented on pull request #28422:
URL: https://github.com/apache/spark/pull/28422#issuecomment-675297171


   **[Test build #127531 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127531/testReport)**
 for PR 28422 at commit 
[`06ee53d`](https://github.com/apache/spark/commit/06ee53d9dee60756be8563d584d589e198d670f1).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-18 Thread GitBox


SparkQA commented on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-675297165


   **[Test build #127524 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127524/testReport)**
 for PR 28841 at commit 
[`263dd2a`](https://github.com/apache/spark/commit/263dd2a58ee990600aae3c40ea3eb56368a9c48d).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29360: [SPARK-32542][SQL] Add an optimizer rule to split an Expand into multiple Expands for aggregates

2020-08-18 Thread GitBox


SparkQA commented on pull request #29360:
URL: https://github.com/apache/spark/pull/29360#issuecomment-675297167


   **[Test build #127526 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127526/testReport)**
 for PR 29360 at commit 
[`87b9a82`](https://github.com/apache/spark/commit/87b9a825359168eb07fe5f9791e1dc26ce138046).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-18 Thread GitBox


SparkQA commented on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-675297166


   **[Test build #127545 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127545/testReport)**
 for PR 29457 at commit 
[`090747d`](https://github.com/apache/spark/commit/090747d4a9d1540c4b65e45f960c926a23d76b84).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29460: [DO-NOT-MERGE][SPARK-32249][3.0] Run Github Actions builds in other branches as well

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29460:
URL: https://github.com/apache/spark/pull/29460#issuecomment-675297293










[GitHub] [spark] SparkQA commented on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-08-18 Thread GitBox


SparkQA commented on pull request #27649:
URL: https://github.com/apache/spark/pull/27649#issuecomment-675297150


   **[Test build #127535 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127535/testReport)**
 for PR 27649 at commit 
[`6406e36`](https://github.com/apache/spark/commit/6406e36eb34377983aaf113495ca16b1553317a3).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-08-18 Thread GitBox


SparkQA commented on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-675297173


   **[Test build #127537 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127537/testReport)**
 for PR 26935 at commit 
[`cabd38f`](https://github.com/apache/spark/commit/cabd38f32622b61c73bb3f1ca6c6390df7e89c04).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class HDFSBackedReadOnlyStateStore(val version: Long, map: MapType)`
  * `abstract class ReadOnlyStateStore extends StateStore`
  * `class WrappedReadOnlyStateStore(store: StateStore) extends ReadOnlyStateStore`






[GitHub] [spark] SparkQA commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-18 Thread GitBox


SparkQA commented on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-675297184


   **[Test build #127530 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127530/testReport)**
 for PR 28904 at commit 
[`e16ebe4`](https://github.com/apache/spark/commit/e16ebe4e530d3c44bb0ba39981c4ec2287c3589e).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29456: [SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-18 Thread GitBox


SparkQA commented on pull request #29456:
URL: https://github.com/apache/spark/pull/29456#issuecomment-675297168


   **[Test build #127544 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127544/testReport)**
 for PR 29456 at commit 
[`603268e`](https://github.com/apache/spark/commit/603268e6598e538946102952aeb46b1874d54e38).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-675295113










[GitHub] [spark] HeartSaVioR opened a new pull request #29461: [DO-NOT-MERGE][SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

2020-08-18 Thread GitBox


HeartSaVioR opened a new pull request #29461:
URL: https://github.com/apache/spark/pull/29461


   ### What changes were proposed in this pull request?
   
   This patch proposes updating the docs (both the SS guide and the Dataset 
dropDuplicates method doc) to add a note about using SQL statements 
with a streaming Dataset.
   
   Once end users create a temp view on a streaming Dataset, they tend to stop 
thinking about "streaming" and write the same queries they would for a batch 
query. In many cases this works, but not smoothly when streaming aggregation 
is involved: they still need to be concerned about maintaining the 
state store.
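
As a rough illustration of why the state store matters here, the following plain-Scala sketch models streaming deduplication. It is a toy model under stated assumptions, not Spark's state store implementation; `DedupState`, `process`, and `evictOlderThan` are hypothetical names introduced only for this example.

```scala
// Toy model of streaming deduplication state; not Spark's implementation.
// All names here (DedupState, process, evictOlderThan) are hypothetical.
final class DedupState {
  // value -> event time (ms) at which the value was first seen
  private val seen = scala.collection.mutable.Map.empty[Int, Long]

  // Processes one micro-batch and returns only the values not seen before.
  def process(batch: Seq[(Int, Long)]): Seq[Int] =
    batch.flatMap { case (value, eventTimeMs) =>
      if (seen.contains(value)) None
      else { seen(value) = eventTimeMs; Some(value) }
    }

  // Models watermark-based cleanup: forget values first seen before the watermark.
  def evictOlderThan(watermarkMs: Long): Unit = {
    val expired = seen.collect { case (k, t) if t < watermarkMs => k }.toList
    expired.foreach(k => seen.remove(k))
  }

  // Number of values currently kept in state.
  def size: Int = seen.size
}
```

Without a call like `evictOlderThan`, `size` only ever grows with the number of distinct values, which is the situation a `select distinct` over a streaming temp view can end up in.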
   
   ### Why are the changes needed?
   
   Although SPARK-32456 fixed the weird error message, as a side effect some 
operations are now enabled on streaming workloads via SQL statements, which is 
error-prone if end users don't understand what they're doing.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Only doc change.






[GitHub] [spark] AmplabJenkins commented on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-675295113










[GitHub] [spark] SparkQA commented on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-18 Thread GitBox


SparkQA commented on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-675294654


   **[Test build #127545 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127545/testReport)**
 for PR 29457 at commit 
[`090747d`](https://github.com/apache/spark/commit/090747d4a9d1540c4b65e45f960c926a23d76b84).






[GitHub] [spark] gengliangwang commented on a change in pull request #29437: [SPARK-32621][SQL] 'path' option can cause issues while inferring schema in CSV/JSON datasources

2020-08-18 Thread GitBox


gengliangwang commented on a change in pull request #29437:
URL: https://github.com/apache/spark/pull/29437#discussion_r471954974



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
##
@@ -34,13 +35,21 @@ import org.apache.spark.sql.util.SchemaUtils
 
 abstract class FileTable(
 sparkSession: SparkSession,
-options: CaseInsensitiveStringMap,
+originalOptions: CaseInsensitiveStringMap,
 paths: Seq[String],
 userSpecifiedSchema: Option[StructType])
   extends Table with SupportsRead with SupportsWrite {
 
   import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
 
+  // Options without path-related options from `originalOptions`.
+  protected final lazy val options: CaseInsensitiveStringMap = {
+    val caseInsensitiveMap = CaseInsensitiveMap(originalOptions.asCaseSensitiveMap.asScala.toMap)
+    val caseInsensitiveMapWithoutPaths = caseInsensitiveMap - "paths" - "path"
+    new CaseInsensitiveStringMap(
+      caseInsensitiveMapWithoutPaths.asInstanceOf[CaseInsensitiveMap[String]].originalMap.asJava)
+  }

Review comment:
   There was a time when the `FileIndex` was created in `FileDataSourceV2`, 
so the `getPaths` method lived in `FileDataSourceV2`. With the current code, 
moving the method seems fine.








[GitHub] [spark] cloud-fan commented on a change in pull request #29452: [SPARK-32643][CORE] Consolidate state decommissioning in the TaskSchedulerImpl realm

2020-08-18 Thread GitBox


cloud-fan commented on a change in pull request #29452:
URL: https://github.com/apache/spark/pull/29452#discussion_r471954231



##
File path: 
core/src/main/scala/org/apache/spark/scheduler/ExecutorDecommissionInfo.scala
##
@@ -18,11 +18,21 @@
 package org.apache.spark.scheduler
 
 /**
- * Provides more detail when an executor is being decommissioned.
+ * Message providing more detail when an executor is being decommissioned.
  * @param message Human readable reason for why the decommissioning is happening.
  * @param isHostDecommissioned Whether the host (aka the `node` or `worker` in other places) is
  *                             being decommissioned too. Used to infer if the shuffle data might
  *                             be lost even if the external shuffle service is enabled.
  */
 private[spark]
 case class ExecutorDecommissionInfo(message: String, isHostDecommissioned: Boolean)
+
+/**
+ * State related to decommissioning that is kept by the TaskSchedulerImpl. This state is derived
+ * from the info message above but it is kept distinct to allow the state to evolve independently
+ * from the message.
+ */
+case class ExecutorDecommissionState(message: String,

Review comment:
   why not `(info: ExecutorDecommissionInfo, tsMillis: Long)`?
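
For reference, the suggested shape could look roughly like the sketch below. `ExecutorDecommissionInfo` mirrors the diff above; the reading of `tsMillis` as a timestamp in milliseconds and the `DecommissionSketch.toState` helper are assumptions made only for this illustration.

```scala
// Sketch of the suggested shape, not the code in the PR.
// ExecutorDecommissionInfo mirrors the diff; `tsMillis` is assumed to be a
// timestamp in milliseconds, and DecommissionSketch is a hypothetical helper.
case class ExecutorDecommissionInfo(message: String, isHostDecommissioned: Boolean)

// The state wraps the whole message instead of copying its fields one by one.
case class ExecutorDecommissionState(info: ExecutorDecommissionInfo, tsMillis: Long)

object DecommissionSketch {
  // Derives scheduler-side state from an incoming message and a clock reading.
  def toState(info: ExecutorDecommissionInfo, nowMillis: Long): ExecutorDecommissionState =
    ExecutorDecommissionState(info, nowMillis)
}
```

Wrapping the message keeps the two types in sync automatically, at the cost of coupling the state to the message's shape, which is the trade-off the Scaladoc in the diff is concerned with.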








[GitHub] [spark] AmplabJenkins commented on pull request #29456: [SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29456:
URL: https://github.com/apache/spark/pull/29456#issuecomment-675292135










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29456: [SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29456:
URL: https://github.com/apache/spark/pull/29456#issuecomment-675292135










[GitHub] [spark] SparkQA commented on pull request #29456: [SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-18 Thread GitBox


SparkQA commented on pull request #29456:
URL: https://github.com/apache/spark/pull/29456#issuecomment-675291618


   **[Test build #127544 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127544/testReport)**
 for PR 29456 at commit 
[`603268e`](https://github.com/apache/spark/commit/603268e6598e538946102952aeb46b1874d54e38).






[GitHub] [spark] cloud-fan edited a comment on pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-18 Thread GitBox


cloud-fan edited a comment on pull request #29395:
URL: https://github.com/apache/spark/pull/29395#issuecomment-675290468


   thanks, merging to 3.0!






[GitHub] [spark] cloud-fan closed pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-18 Thread GitBox


cloud-fan closed pull request #29395:
URL: https://github.com/apache/spark/pull/29395


   






[GitHub] [spark] cloud-fan commented on pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-18 Thread GitBox


cloud-fan commented on pull request #29395:
URL: https://github.com/apache/spark/pull/29395#issuecomment-675290468


   thanks, merging to master!






[GitHub] [spark] cloud-fan closed pull request #29422: [SPARK-32613][CORE] Fix regressions in DecommissionWorkerSuite

2020-08-18 Thread GitBox


cloud-fan closed pull request #29422:
URL: https://github.com/apache/spark/pull/29422


   






[GitHub] [spark] cloud-fan commented on pull request #29422: [SPARK-32613][CORE] Fix regressions in DecommissionWorkerSuite

2020-08-18 Thread GitBox


cloud-fan commented on pull request #29422:
URL: https://github.com/apache/spark/pull/29422#issuecomment-675289413


   thanks, merging to master!






[GitHub] [spark] cloud-fan commented on pull request #29322: [SPARK-32511][SQL] Add dropFields method to Column class

2020-08-18 Thread GitBox


cloud-fan commented on pull request #29322:
URL: https://github.com/apache/spark/pull/29322#issuecomment-675287699


   reopened






[GitHub] [spark] cloud-fan commented on a change in pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-18 Thread GitBox


cloud-fan commented on a change in pull request #29458:
URL: https://github.com/apache/spark/pull/29458#discussion_r471947639



##
File path: docs/sql-migration-guide.md
##
@@ -36,6 +36,10 @@ license: |
 
   - In Spark 3.1, NULL elements of structures, arrays and maps are converted 
to "null" in casting them to strings. In Spark 3.0 or earlier, NULL elements 
are converted to empty strings. To restore the behavior before Spark 3.1, you 
can set `spark.sql.legacy.castComplexTypesToString.enabled` to `true`.
 
+  - In Spark 3.1, when `spark.sql.ansi.enabled` is false, sum aggregation of 
decimal type column always returns `null` on decimal value overflow. In Spark 
3.0 or earlier, when `spark.sql.ansi.enabled` is false and decimal value 
overflow happens in sum aggregation of decimal type column:
+- If it is hash aggregation with `group by` clause, a runtime exception is 
thrown.

Review comment:
   We can use "default mode".
   
   I don't see a difference between "may fail at runtime" and "may return null". 
They are mutually exclusive.
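
To make the "returns `null` on overflow" behavior concrete, here is a small plain-Scala sketch. It is a simplified model, not Spark's aggregation code: `DecimalSumSketch` and `sumOrNull` are hypothetical names, the check looks only at total precision (Spark also accounts for scale and the result type), and `None` stands in for SQL `NULL`. The limit of 38 digits is Spark's maximum decimal precision.

```scala
import java.math.{BigDecimal => JBigDecimal}

// Simplified model of a non-ANSI decimal sum; names are hypothetical.
object DecimalSumSketch {
  // Spark's maximum decimal precision.
  val MaxPrecision = 38

  // Sums the values and returns None (standing in for NULL) when the total
  // needs more digits than `maxPrecision` allows.
  def sumOrNull(
      values: Seq[JBigDecimal],
      maxPrecision: Int = MaxPrecision): Option[JBigDecimal] = {
    val total = values.foldLeft(JBigDecimal.ZERO)((acc, v) => acc.add(v))
    if (total.precision() > maxPrecision) None else Some(total)
  }
}
```

For example, summing a 38-digit value of all nines with `1` yields a 39-digit total, so `sumOrNull` returns `None` rather than failing.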
   









[GitHub] [spark] HeartSaVioR commented on a change in pull request #29256: [SPARK-32456][SS] Check the Distinct by assuming it as Aggregate for Structured Streaming

2020-08-18 Thread GitBox


HeartSaVioR commented on a change in pull request #29256:
URL: https://github.com/apache/spark/pull/29256#discussion_r471945148



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
##
@@ -1106,6 +1107,54 @@ class StreamingQuerySuite extends StreamTest with 
BeforeAndAfter with Logging wi
 }
   }
 
+  test("union in streaming query of append mode without watermark") {
+val inputData1 = MemoryStream[Int]
+val inputData2 = MemoryStream[Int]
+withTempView("s1", "s2") {
+  inputData1.toDF().createOrReplaceTempView("s1")
+  inputData2.toDF().createOrReplaceTempView("s2")
+  val unioned = spark.sql(
+"select s1.value from s1 union select s2.value from s2")
+  checkExceptionMessage(unioned)
+}
+  }
+
+  test("distinct in streaming query of append mode without watermark") {
+val inputData = MemoryStream[Int]
+withTempView("deduptest") {
+  inputData.toDF().toDF("value").createOrReplaceTempView("deduptest")
+  val distinct = spark.sql("select distinct value from deduptest")
+  checkExceptionMessage(distinct)
+}
+  }
+
+  test("distinct in streaming query of complete mode") {
+val inputData = MemoryStream[Int]
+withTempView("deduptest") {
+  inputData.toDF().toDF("value").createOrReplaceTempView("deduptest")
+  val distinct = spark.sql("select distinct value from deduptest")
+
+  testStream(distinct, Complete)(
+AddData(inputData, 1, 2, 3, 3, 4),
+CheckAnswer(Row(1), Row(2), Row(3), Row(4))

Review comment:
   What I am suggesting is to wait and hear which operations have been 
restricted on SS and the reasons why. If the reasons make sense, then ban 
them even with SQL statements, not only distinct.








[GitHub] [spark] HyukjinKwon commented on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-18 Thread GitBox


HyukjinKwon commented on pull request #29459:
URL: https://github.com/apache/spark/pull/29459#issuecomment-675283818


   Thanks guys. Let me merge this after I cherry-pick GitHub Actions to other 
branches (at https://github.com/apache/spark/pull/29460)






[GitHub] [spark] HyukjinKwon commented on pull request #29456: [SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-18 Thread GitBox


HyukjinKwon commented on pull request #29456:
URL: https://github.com/apache/spark/pull/29456#issuecomment-675283772


   Thanks guys. Let me merge this after I cherry-pick GitHub Actions to other 
branches (at https://github.com/apache/spark/pull/29460)






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29459:
URL: https://github.com/apache/spark/pull/29459#issuecomment-675280417


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127529/
   Test FAILed.






[GitHub] [spark] viirya commented on a change in pull request #29437: [SPARK-32621][SQL] 'path' option can cause issues while inferring schema in CSV/JSON datasources

2020-08-18 Thread GitBox


viirya commented on a change in pull request #29437:
URL: https://github.com/apache/spark/pull/29437#discussion_r471933928



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
##
@@ -34,13 +35,21 @@ import org.apache.spark.sql.util.SchemaUtils
 
 abstract class FileTable(
 sparkSession: SparkSession,
-options: CaseInsensitiveStringMap,
+originalOptions: CaseInsensitiveStringMap,

Review comment:
   Is there any chance that we use `path`-related options in `FileTable`? If not, 
can we just remove them when creating `FileTable`? It feels a bit strange that we 
pass some options to it, but also ask it to remove a few of them.
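
A caller-side variant of this idea can be sketched in plain Scala as follows. This is only an illustration on an ordinary `Map[String, String]`: `PathOptions.withoutPathOptions` is a hypothetical helper, not a Spark API, and it does not use `CaseInsensitiveStringMap`.

```scala
// Plain-Scala sketch; `withoutPathOptions` is a hypothetical helper, not a Spark API.
object PathOptions {
  // Option keys that carry path information (compared case-insensitively).
  private val pathKeys = Set("path", "paths")

  // Drops "path"/"paths" regardless of key casing; other entries are kept as-is.
  def withoutPathOptions(options: Map[String, String]): Map[String, String] =
    options.filter { case (key, _) =>
      !pathKeys.contains(key.toLowerCase(java.util.Locale.ROOT))
    }
}
```

With a helper like this applied before construction, the table would only ever receive options it is meant to use.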
   








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-18 Thread GitBox


AmplabJenkins removed a comment on pull request #29459:
URL: https://github.com/apache/spark/pull/29459#issuecomment-675280413


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-18 Thread GitBox


AmplabJenkins commented on pull request #29459:
URL: https://github.com/apache/spark/pull/29459#issuecomment-675280413










[GitHub] [spark] SparkQA removed a comment on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-18 Thread GitBox


SparkQA removed a comment on pull request #29459:
URL: https://github.com/apache/spark/pull/29459#issuecomment-675238599


   **[Test build #127529 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127529/testReport)**
 for PR 29459 at commit 
[`3bd540f`](https://github.com/apache/spark/commit/3bd540f529970130ede596a78097b24375972841).






[GitHub] [spark] SparkQA commented on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-18 Thread GitBox


SparkQA commented on pull request #29459:
URL: https://github.com/apache/spark/pull/29459#issuecomment-675279846


   **[Test build #127529 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127529/testReport)**
 for PR 29459 at commit 
[`3bd540f`](https://github.com/apache/spark/commit/3bd540f529970130ede596a78097b24375972841).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] cloud-fan commented on a change in pull request #29256: [SPARK-32456][SS] Check the Distinct by assuming it as Aggregate for Structured Streaming

2020-08-18 Thread GitBox


cloud-fan commented on a change in pull request #29256:
URL: https://github.com/apache/spark/pull/29256#discussion_r471939568



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
##
@@ -1106,6 +1107,54 @@ class StreamingQuerySuite extends StreamTest with 
BeforeAndAfter with Logging wi
 }
   }
 
+  test("union in streaming query of append mode without watermark") {
+val inputData1 = MemoryStream[Int]
+val inputData2 = MemoryStream[Int]
+withTempView("s1", "s2") {
+  inputData1.toDF().createOrReplaceTempView("s1")
+  inputData2.toDF().createOrReplaceTempView("s2")
+  val unioned = spark.sql(
+"select s1.value from s1 union select s2.value from s2")
+  checkExceptionMessage(unioned)
+}
+  }
+
+  test("distinct in streaming query of append mode without watermark") {
+val inputData = MemoryStream[Int]
+withTempView("deduptest") {
+  inputData.toDF().toDF("value").createOrReplaceTempView("deduptest")
+  val distinct = spark.sql("select distinct value from deduptest")
+  checkExceptionMessage(distinct)
+}
+  }
+
+  test("distinct in streaming query of complete mode") {
+val inputData = MemoryStream[Int]
+withTempView("deduptest") {
+  inputData.toDF().toDF("value").createOrReplaceTempView("deduptest")
+  val distinct = spark.sql("select distinct value from deduptest")
+
+  testStream(distinct, Complete)(
+AddData(inputData, 1, 2, 3, 3, 4),
+CheckAnswer(Row(1), Row(2), Row(3), Row(4))

Review comment:
   Are you suggesting banning `Distinct` in SS completely? I think that's fine 
too, as long as we don't give confusing error messages.
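
The idea of checking `Distinct` as if it were an aggregate, so that both produce the same explicit error, can be modeled with a toy checker. Everything below is hypothetical: the plan and output-mode types are simplified stand-ins rather than Spark's classes, only the top-level operator is inspected, and the error text is illustrative.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
sealed trait Plan
case object StreamingSource extends Plan
case class Distinct(child: Plan) extends Plan
case class Aggregate(child: Plan) extends Plan
case class WithWatermark(child: Plan) extends Plan

sealed trait OutputMode
case object Append extends OutputMode
case object Complete extends OutputMode

object StreamingCheckSketch {
  // True if a watermark appears anywhere below the given operator.
  private def hasWatermark(plan: Plan): Boolean = plan match {
    case WithWatermark(_) => true
    case Distinct(c)      => hasWatermark(c)
    case Aggregate(c)     => hasWatermark(c)
    case StreamingSource  => false
  }

  // Treats Distinct like Aggregate so both get the same, explicit error message.
  def check(plan: Plan, mode: OutputMode): Either[String, Unit] = plan match {
    case Distinct(_) | Aggregate(_) if mode == Append && !hasWatermark(plan) =>
      Left("Append output mode is not supported for streaming aggregations without a watermark")
    case _ => Right(())
  }
}
```

In this model, `check(Distinct(StreamingSource), Append)` is rejected with a readable message, while the same plan in `Complete` mode passes, matching the three test cases in the diff above.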







