[GitHub] [spark] AngersZhuuuu commented on pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-22 Thread GitBox


AngersZh commented on pull request #29428:
URL: https://github.com/apache/spark/pull/29428#issuecomment-678732959


   > @AngersZh Thanks. BTW, my PR accidentally caused a compilation error 
for the hive-1.2 profile. I'm reverting it in #29519 first, so you can debug 
and fix the failed test.
   
   Can you show me a link to the UT that failed under hive-1.2?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on pull request #29519: Revert "[SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis"

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29519:
URL: https://github.com/apache/spark/pull/29519#issuecomment-678732841










[GitHub] [spark] viirya closed pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


viirya closed pull request #29518:
URL: https://github.com/apache/spark/pull/29518


   






[GitHub] [spark] AmplabJenkins commented on pull request #29519: Revert "[SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis"

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29519:
URL: https://github.com/apache/spark/pull/29519#issuecomment-678732841










[GitHub] [spark] SparkQA commented on pull request #29519: Revert "[SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis"

2020-08-22 Thread GitBox


SparkQA commented on pull request #29519:
URL: https://github.com/apache/spark/pull/29519#issuecomment-678732756


   **[Test build #127797 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127797/testReport)**
 for PR 29519 at commit 
[`cfccfa6`](https://github.com/apache/spark/commit/cfccfa645949011781ae77c5f3c93bade294599a).






[GitHub] [spark] viirya edited a comment on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


viirya edited a comment on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-678732316


   Both master and branch-3.0 have a few tests failing under the hive-1.2 
profile, and this diff missed a change in the hive-1.2 code, which causes a 
compilation error. That makes debugging the failed tests harder, so I'd like to 
revert this first in #29519. cc @cloud-fan @HyukjinKwon 






[GitHub] [spark] viirya commented on pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-22 Thread GitBox


viirya commented on pull request #29428:
URL: https://github.com/apache/spark/pull/29428#issuecomment-678732571


   @AngersZh Thanks. BTW, my PR accidentally caused a compilation error for 
the hive-1.2 profile. I'm reverting it in #29519 first, so you can debug and 
fix the failed test.






[GitHub] [spark] viirya opened a new pull request #29519: Revert "[SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis"

2020-08-22 Thread GitBox


viirya opened a new pull request #29519:
URL: https://github.com/apache/spark/pull/29519


   
   
   ### What changes were proposed in this pull request?
   
   
   This reverts commit e277ef1a83e37bc94e7817467ca882d660c83284.
   
   ### Why are the changes needed?
   
   
   Both master and branch-3.0 have a few tests failing under the hive-1.2 
profile, and the PR missed a change in the hive-1.2 code, which causes a 
compilation error. That makes debugging the failed tests harder, so I'd like to 
revert this first.
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   No
   
   ### How was this patch tested?
   
   
   Unit test






[GitHub] [spark] AngersZhuuuu commented on pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-22 Thread GitBox


AngersZh commented on pull request #29428:
URL: https://github.com/apache/spark/pull/29428#issuecomment-678732395


   > @AngersZh The test 
"org.apache.spark.sql.hive.execution.HiveScriptTransformationSuite.[SPARK-32608](https://issues.apache.org/jira/browse/SPARK-32608):
 Script Transform ROW FORMAT DELIMIT value should format value" fails under the 
hive-1.2 profile in the master and branch-3.0 branches. Can you look at it?
   
   Checking






[GitHub] [spark] viirya commented on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


viirya commented on pull request #29457:
URL: https://github.com/apache/spark/pull/29457#issuecomment-678732316


   Both master and branch-3.0 have a few tests failing under the hive-1.2 
profile, and this diff missed a change in the hive-1.2 code, which causes a 
compilation error. That makes debugging the failed tests harder, so I'd like to 
revert this first. cc @cloud-fan @HyukjinKwon 






[GitHub] [spark] viirya commented on pull request #29503: [SPARK-32678][SQL] Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated code

2020-08-22 Thread GitBox


viirya commented on pull request #29503:
URL: https://github.com/apache/spark/pull/29503#issuecomment-678731933


   Thanks! Merging to master.






[GitHub] [spark] viirya closed pull request #29503: [SPARK-32678][SQL] Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated code

2020-08-22 Thread GitBox


viirya closed pull request #29503:
URL: https://github.com/apache/spark/pull/29503


   






[GitHub] [spark] chanduhawk commented on a change in pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


chanduhawk commented on a change in pull request #29516:
URL: https://github.com/apache/spark/pull/29516#discussion_r475171964



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala
##
@@ -220,7 +220,9 @@ class CSVOptions(
 format.setQuote(quote)
 format.setQuoteEscape(escape)
 charToEscapeQuoteEscaping.foreach(format.setCharToEscapeQuoteEscaping)
-format.setComment(comment)
+if (isCommentSet) {

Review comment:
   If we change it that way, it might impact existing users for whom 
\u is the comment character by default. So I would say a separate optional 
config is a better solution. What I am saying here is that we need to wait for 
univocity 3.0.0, where the new changes will be available; then we can add the 
Spark changes in a proper manner.
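
   For reference, the guard proposed in the diff above can be sketched in isolation. This is only an illustrative sketch: `FormatSketch` and `CsvOptionsSketch` are hypothetical stand-ins, not the univocity or Spark classes, and the sketch assumes the comment character is modeled as an `Option[Char]` that stays empty unless the user sets it. It shows why the behaviour differs for users who never set the option, which is the compatibility concern raised here.

```scala
// Hypothetical stand-in for the parser's format object (NOT the univocity API).
final class FormatSketch {
  // None means "no comment character was ever applied to this format".
  var comment: Option[Char] = None
  def setComment(c: Char): Unit = { comment = Some(c) }
}

// Hypothetical option holder: `comment` is None unless the user passed the option.
final case class CsvOptionsSketch(comment: Option[Char]) {
  def isCommentSet: Boolean = comment.isDefined

  def configure(format: FormatSketch): FormatSketch = {
    // Forward the comment character only when it was set explicitly,
    // mirroring the `if (isCommentSet)` guard in the diff.
    if (isCommentSet) comment.foreach(format.setComment)
    format
  }
}

object CommentGuardDemo {
  def main(args: Array[String]): Unit = {
    val unset = CsvOptionsSketch(None).configure(new FormatSketch)
    val set = CsvOptionsSketch(Some('#')).configure(new FormatSketch)
    println(s"unset -> ${unset.comment}, set -> ${set.comment}")
  }
}
```

   With the guard in place, a format configured from unset options keeps whatever comment handling the parser applies on its own, so any previously applied default is no longer forwarded. A separate opt-in config, as suggested above, would avoid changing that default path.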








[GitHub] [spark] chanduhawk commented on a change in pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


chanduhawk commented on a change in pull request #29516:
URL: https://github.com/apache/spark/pull/29516#discussion_r475171964



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala
##
@@ -220,7 +220,9 @@ class CSVOptions(
 format.setQuote(quote)
 format.setQuoteEscape(escape)
 charToEscapeQuoteEscaping.foreach(format.setCharToEscapeQuoteEscaping)
-format.setComment(comment)
+if (isCommentSet) {

Review comment:
   If we change it that way, it might impact existing users for whom 
\u is the comment character by default. So I would say a separate optional 
config is a better solution.








[GitHub] [spark] viirya commented on pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-22 Thread GitBox


viirya commented on pull request #29428:
URL: https://github.com/apache/spark/pull/29428#issuecomment-678730505


   @AngersZh The test 
"org.apache.spark.sql.hive.execution.HiveScriptTransformationSuite.SPARK-32608: 
 Script Transform ROW FORMAT DELIMIT value should format value" fails under the 
hive-1.2 profile in the master and branch-3.0 branches. Can you look at it?






[GitHub] [spark] viirya commented on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


viirya commented on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678730061


   To clarify:
   
   These three tests also fail in branch-3.0:
   ```
   org.apache.spark.sql.hive.execution.HiveScriptTransformationSuite.SPARK-32608: Script Transform ROW FORMAT DELIMIT value should format value
   org.apache.spark.sql.hive.execution.HiveSerDeReadWriteSuite.Read/Write Hive PARQUET serde table
   org.apache.spark.sql.hive.execution.HiveSerDeReadWriteSuite.Read/Write Hive TEXTFILE serde table
   ```
   
   This test was already failing before #29457. I manually checked out 
bf221debd02b11003b092201d0326302196e4ba5 and ran the test locally to verify:
   
   ```
   org.apache.spark.sql.hive.orc.HiveOrcHadoopFsRelationSuite.save()/load() - partitioned table - simple queries - partition columns in data
   ```
   
   
   
   
   






[GitHub] [spark] viirya edited a comment on pull request #29513: [SPARK-32646][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


viirya edited a comment on pull request #29513:
URL: https://github.com/apache/spark/pull/29513#issuecomment-678719209


   Err.. I think these tests are already failing in the current branch-3.0 and 
master branches. Please see https://github.com/apache/spark/pull/29517. I 
created SPARK-32689 to track it.
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678726409


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127796/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678726408


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678726408










[GitHub] [spark] SparkQA commented on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


SparkQA commented on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678726374


   **[Test build #127796 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127796/testReport)**
 for PR 29518 at commit 
[`5ce7567`](https://github.com/apache/spark/commit/5ce756759b49ff977a8b49c893df30284f21ed96).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


SparkQA removed a comment on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678721328


   **[Test build #127796 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127796/testReport)**
 for PR 29518 at commit 
[`5ce7567`](https://github.com/apache/spark/commit/5ce756759b49ff977a8b49c893df30284f21ed96).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678725614


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127793/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678725613


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


SparkQA removed a comment on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678719205


   **[Test build #127793 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127793/testReport)**
 for PR 29516 at commit 
[`87e8b65`](https://github.com/apache/spark/commit/87e8b65c67fce1bef56e8071e42deca9954fcff8).






[GitHub] [spark] AmplabJenkins commented on pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678725613










[GitHub] [spark] SparkQA commented on pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


SparkQA commented on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678725602


   **[Test build #127793 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127793/testReport)**
 for PR 29516 at commit 
[`87e8b65`](https://github.com/apache/spark/commit/87e8b65c67fce1bef56e8071e42deca9954fcff8).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29328: [SPARK-32516][SQL] 'path' option cannot coexist with load()'s path parameters

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29328:
URL: https://github.com/apache/spark/pull/29328#issuecomment-678725229










[GitHub] [spark] AmplabJenkins commented on pull request #29328: [SPARK-32516][SQL] 'path' option cannot coexist with load()'s path parameters

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29328:
URL: https://github.com/apache/spark/pull/29328#issuecomment-678725229










[GitHub] [spark] SparkQA removed a comment on pull request #29328: [SPARK-32516][SQL] 'path' option cannot coexist with load()'s path parameters

2020-08-22 Thread GitBox


SparkQA removed a comment on pull request #29328:
URL: https://github.com/apache/spark/pull/29328#issuecomment-678707161


   **[Test build #127790 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127790/testReport)**
 for PR 29328 at commit 
[`e1decd4`](https://github.com/apache/spark/commit/e1decd4c7921f58446a46081b339d308d36529cc).






[GitHub] [spark] SparkQA commented on pull request #29328: [SPARK-32516][SQL] 'path' option cannot coexist with load()'s path parameters

2020-08-22 Thread GitBox


SparkQA commented on pull request #29328:
URL: https://github.com/apache/spark/pull/29328#issuecomment-678725113


   **[Test build #127790 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127790/testReport)**
 for PR 29328 at commit 
[`e1decd4`](https://github.com/apache/spark/commit/e1decd4c7921f58446a46081b339d308d36529cc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] LantaoJin commented on pull request #29378: [SPARK-30069][CORE][YARN] Clean up non-shuffle disk block manager files following executor exists on YARN

2020-08-22 Thread GitBox


LantaoJin commented on pull request #29378:
URL: https://github.com/apache/spark/pull/29378#issuecomment-678722916


   Gentle ping @tgravescs @dongjoon-hyun 






[GitHub] [spark] AmplabJenkins commented on pull request #29328: [SPARK-32516][SQL] 'path' option cannot coexist with load()'s path parameters

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29328:
URL: https://github.com/apache/spark/pull/29328#issuecomment-678722822










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29328: [SPARK-32516][SQL] 'path' option cannot coexist with load()'s path parameters

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29328:
URL: https://github.com/apache/spark/pull/29328#issuecomment-678722822










[GitHub] [spark] SparkQA removed a comment on pull request #29328: [SPARK-32516][SQL] 'path' option cannot coexist with load()'s path parameters

2020-08-22 Thread GitBox


SparkQA removed a comment on pull request #29328:
URL: https://github.com/apache/spark/pull/29328#issuecomment-678703766


   **[Test build #127789 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127789/testReport)**
 for PR 29328 at commit 
[`92bf5ef`](https://github.com/apache/spark/commit/92bf5efa57a011b8cc306812901348d88e7c1223).






[GitHub] [spark] SparkQA commented on pull request #29328: [SPARK-32516][SQL] 'path' option cannot coexist with load()'s path parameters

2020-08-22 Thread GitBox


SparkQA commented on pull request #29328:
URL: https://github.com/apache/spark/pull/29328#issuecomment-678722674


   **[Test build #127789 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127789/testReport)**
 for PR 29328 at commit 
[`92bf5ef`](https://github.com/apache/spark/commit/92bf5efa57a011b8cc306812901348d88e7c1223).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] jkleckner commented on pull request #29496: [SPARK-24266][k8s] Back port spark 28423 to 2.4 to restart watcher

2020-08-22 Thread GitBox


jkleckner commented on pull request #29496:
URL: https://github.com/apache/spark/pull/29496#issuecomment-678722230


   FWIW, we have yet to see a hang in our hourly job.






[GitHub] [spark] jkleckner commented on pull request #29496: [SPARK-24266][k8s] Back port spark 28423 to 2.4 to restart watcher

2020-08-22 Thread GitBox


jkleckner commented on pull request #29496:
URL: https://github.com/apache/spark/pull/29496#issuecomment-678722154


   Please retest this.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678721402










[GitHub] [spark] AmplabJenkins commented on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678721402










[GitHub] [spark] SparkQA commented on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


SparkQA commented on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678721328


   **[Test build #127796 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127796/testReport)**
 for PR 29518 at commit 
[`5ce7567`](https://github.com/apache/spark/commit/5ce756759b49ff977a8b49c893df30284f21ed96).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678720854


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127795/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


SparkQA removed a comment on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678720465


   **[Test build #127795 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127795/testReport)**
 for PR 29518 at commit 
[`960e695`](https://github.com/apache/spark/commit/960e6957e925d74e9ace8931275a952395d55165).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678720852


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


SparkQA commented on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678720850


   **[Test build #127795 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127795/testReport)**
 for PR 29518 at commit 
[`960e695`](https://github.com/apache/spark/commit/960e6957e925d74e9ace8931275a952395d55165).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678720852










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678720546










[GitHub] [spark] tanelk commented on a change in pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


tanelk commented on a change in pull request #29515:
URL: https://github.com/apache/spark/pull/29515#discussion_r475160410



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
##
@@ -70,14 +70,22 @@ object LiteralGenerator {
 
   lazy val floatLiteralGen: Gen[Literal] =
 for {
-  f <- Gen.chooseNum(Float.MinValue / 2, Float.MaxValue / 2,
-Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity)
+  f <- Gen.oneOf(
+Gen.oneOf(
+  Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity, 
Float.MinPositiveValue,
+  0.0f, -0.0f, 1.0f, -1.0f),
+Arbitrary.arbFloat.arbitrary
+  )

Review comment:
   Disregard this comment. With more than one generator, some of the 
interesting combinations, such as `0.0` and `-0.0`, would not be generated.
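
   As a side illustration of why signed zeros (and `NaN`) make interesting 
test inputs, here is a minimal sketch in plain Scala. It is not code from the 
pull request; the object `FloatSpecials` and its two helpers are hypothetical 
names. It only shows that primitive `==` and `java.lang.Float.compare` 
disagree on exactly these values.

```scala
// Hypothetical illustration; not code from the pull request.
object FloatSpecials {
  // Primitive comparison follows IEEE 754: signed zeros are equal, NaN equals nothing.
  def primitiveEq(a: Float, b: Float): Boolean = a == b

  // java.lang.Float.compare defines a total order: -0.0f sorts before 0.0f, NaN equals NaN.
  def totalOrderEq(a: Float, b: Float): Boolean =
    java.lang.Float.compare(a, b) == 0

  def main(args: Array[String]): Unit = {
    val pairs = Seq((0.0f, -0.0f), (Float.NaN, Float.NaN), (1.0f, 1.0f))
    for ((a, b) <- pairs)
      println(s"$a vs $b: == is ${primitiveEq(a, b)}, compare == 0 is ${totalOrderEq(a, b)}")
  }
}
```

   Code that mixes the two notions of equality can behave differently for 
`0.0f`/`-0.0f` and for `NaN`, which is presumably why generating them together 
matters.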








[GitHub] [spark] AmplabJenkins commented on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678720546










[GitHub] [spark] SparkQA commented on pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


SparkQA commented on pull request #29518:
URL: https://github.com/apache/spark/pull/29518#issuecomment-678720465


   **[Test build #127795 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127795/testReport)**
 for PR 29518 at commit 
[`960e695`](https://github.com/apache/spark/commit/960e6957e925d74e9ace8931275a952395d55165).






[GitHub] [spark] viirya opened a new pull request #29518: [SPARK-32646][SQL][FOLLOWUP][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


viirya opened a new pull request #29518:
URL: https://github.com/apache/spark/pull/29518


   
   
   ### What changes were proposed in this pull request?
   
   
   This is a followup of #29457 to fix a compilation error.
   
   ### Why are the changes needed?
   
   
   Fix a compilation error under hive1.2 profile.
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   No
   
   ### How was this patch tested?
   
   
   Unit test.






[GitHub] [spark] viirya closed pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


viirya closed pull request #29517:
URL: https://github.com/apache/spark/pull/29517


   






[GitHub] [spark] tanelk edited a comment on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


tanelk edited a comment on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678719262


   > I think we can't commit a change that causes tests to fail of course. The 
fix of the tests would have to go with the fix in underlying code as needed.
   
   I meant that it could be fixed in another PR before this PR is merged.






[GitHub] [spark] viirya edited a comment on pull request #29513: [SPARK-32646][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


viirya edited a comment on pull request #29513:
URL: https://github.com/apache/spark/pull/29513#issuecomment-678719209


   Err... I think these tests are already failing on the current branch-3.0 branch. 
Please see https://github.com/apache/spark/pull/29517. I created SPARK-32689 to 
track it.
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678719289










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678719270










[GitHub] [spark] AmplabJenkins commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678719289










[GitHub] [spark] tanelk commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


tanelk commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678719262


   > I think we can't commit a change that causes tests to fail of course. The 
fix of the tests would have to go with the fix in underlying code as needed.
   
   I meant that it could be fixed before this PR is merged.






[GitHub] [spark] AmplabJenkins commented on pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678719270










[GitHub] [spark] srowen commented on a change in pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


srowen commented on a change in pull request #29516:
URL: https://github.com/apache/spark/pull/29516#discussion_r475158855



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtils.scala
##
@@ -25,16 +25,21 @@ object CSVExprUtils {
* This is currently being used in CSV reading path and CSV schema inference.
*/
   def filterCommentAndEmpty(iter: Iterator[String], options: CSVOptions): 
Iterator[String] = {
-iter.filter { line =>
-  line.trim.nonEmpty && !line.startsWith(options.comment.toString)
+if (options.isCommentSet) {
+  val commentPrefix = options.comment.toString
+  iter.filter { line =>
+line.trim.nonEmpty && !line.startsWith(commentPrefix)
+  }
+} else {
+  iter.filter(_.trim.nonEmpty)
 }
   }
 
   def skipComments(iter: Iterator[String], options: CSVOptions): 
Iterator[String] = {
 if (options.isCommentSet) {
   val commentPrefix = options.comment.toString
   iter.dropWhile { line =>
-line.trim.isEmpty || line.trim.startsWith(commentPrefix)
+line.trim.isEmpty || line.startsWith(commentPrefix)

Review comment:
   I think it's correct to _not_ trim the string that's checked to see if 
it starts with a comment, which is a slightly separate issue. `\u0000` can't be 
used as a comment char, but other non-printable chars _could_.

##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
##
@@ -1902,25 +1902,26 @@ abstract class CSVSuite extends QueryTest with 
SharedSparkSession with TestCsvDa
 
   test("SPARK-25387: bad input should not cause NPE") {
 val schema = StructType(StructField("a", IntegerType) :: Nil)
-val input = spark.createDataset(Seq("\u0000\u0000\u0001234"))
+val input = spark.createDataset(Seq("\u0001\u0000\u0001234"))

Review comment:
   I think this test was wrong in 2 ways. First, it actually relied on 
ignoring lines starting with `\u0000`, which is the very bug we're fixing. You 
can see below it's asserting there is no result at all, when there should be 
_some_ result.

##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
##
@@ -1902,25 +1902,26 @@ abstract class CSVSuite extends QueryTest with 
SharedSparkSession with TestCsvDa
 
   test("SPARK-25387: bad input should not cause NPE") {
 val schema = StructType(StructField("a", IntegerType) :: Nil)
-val input = spark.createDataset(Seq("\u0000\u0000\u0001234"))
+val input = spark.createDataset(Seq("\u0001\u0000\u0001234"))
 
 checkAnswer(spark.read.schema(schema).csv(input), Row(null))
 checkAnswer(spark.read.option("multiLine", 
true).schema(schema).csv(input), Row(null))
-assert(spark.read.csv(input).collect().toSet == Set(Row()))
+assert(spark.read.schema(schema).csv(input).collect().toSet == 
Set(Row(null)))
   }
 
   test("SPARK-31261: bad csv input with `columnNameCorruptRecord` should not 
cause NPE") {
 val schema = StructType(
   StructField("a", IntegerType) :: StructField("_corrupt_record", 
StringType) :: Nil)
-val input = spark.createDataset(Seq("\u0000\u0000\u0001234"))
+val input = spark.createDataset(Seq("\u0001\u0000\u0001234"))
 
 checkAnswer(
   spark.read
 .option("columnNameOfCorruptRecord", "_corrupt_record")
 .schema(schema)
 .csv(input),
-  Row(null, null))
-assert(spark.read.csv(input).collect().toSet == Set(Row()))
+  Row(null, "\u0001\u0000\u0001234"))

Review comment:
   The other problem I think is that this was asserting there is no corrupt 
record -- no result at all -- when I think clearly the test should result in a 
single row with a corrupt record.
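
   The behaviour discussed in these review comments can be sketched without 
Spark. The following is a minimal, standalone approximation of the patched 
`filterCommentAndEmpty`, not the actual Spark code: `CommentOptions` is a 
hypothetical stand-in for `CSVOptions`, and treating `'\u0000'` as the 
"comment unset" sentinel is an assumption made for this sketch.

```scala
// Hypothetical stand-in for CSVOptions; only the two members used in the diff are modeled.
final case class CommentOptions(comment: Char) {
  // Assumption for this sketch: '\u0000' means no comment character is configured.
  def isCommentSet: Boolean = comment != '\u0000'
}

object CommentFilterSketch {
  // Same shape as the patched method: comment handling applies only when a comment char is set.
  def filterCommentAndEmpty(
      iter: Iterator[String],
      options: CommentOptions): Iterator[String] =
    if (options.isCommentSet) {
      val commentPrefix = options.comment.toString
      // The line is deliberately not trimmed before the prefix check.
      iter.filter(line => line.trim.nonEmpty && !line.startsWith(commentPrefix))
    } else {
      // No comment char configured: drop only blank lines.
      iter.filter(_.trim.nonEmpty)
    }
}
```

   With `CommentOptions('#')` a line such as `#note` is dropped; with the 
comment unset, the same line is kept, and so is a line that begins with 
`\u0000`, which is the case the updated tests exercise.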








[GitHub] [spark] SparkQA commented on pull request #29516: [WIP][SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


SparkQA commented on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678719205


   **[Test build #127793 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127793/testReport)**
 for PR 29516 at commit 
[`87e8b65`](https://github.com/apache/spark/commit/87e8b65c67fce1bef56e8071e42deca9954fcff8).






[GitHub] [spark] SparkQA commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


SparkQA commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678719210


   **[Test build #127794 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127794/testReport)**
 for PR 29515 at commit 
[`77d63ac`](https://github.com/apache/spark/commit/77d63ac4b9356f2d8db407e05b763f3b4107c80d).






[GitHub] [spark] viirya commented on pull request #29513: [SPARK-32646][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


viirya commented on pull request #29513:
URL: https://github.com/apache/spark/pull/29513#issuecomment-678719209


   Err... I think these tests are already failing on the current branch-3.0 branch. 
Please see https://github.com/apache/spark/pull/29517. 
   






[GitHub] [spark] tanelk commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


tanelk commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678719160


   retest this please






[GitHub] [spark] tanelk commented on a change in pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


tanelk commented on a change in pull request #29515:
URL: https://github.com/apache/spark/pull/29515#discussion_r475158533



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
##
@@ -70,14 +70,22 @@ object LiteralGenerator {
 
   lazy val floatLiteralGen: Gen[Literal] =
 for {
-  f <- Gen.chooseNum(Float.MinValue / 2, Float.MaxValue / 2,
-Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity)
+  f <- Gen.oneOf(
+Gen.oneOf(
+  Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity, 
Float.MinPositiveValue,

Review comment:
   Sure thing








[GitHub] [spark] srowen commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


srowen commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678718652


   I think we can't commit a change that causes tests to fail of course. The 
fix of the tests would have to go with the fix in underlying code as needed.






[GitHub] [spark] tanelk commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


tanelk commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678718410


   > **[Test build #127791 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127791/testReport)**
 for PR 29515 at commit 
[`8c8313c`](https://github.com/apache/spark/commit/8c8313c7689c04ce781011c420f193ac2a14d9d9).
   > 
   > * This patch **fails Spark unit tests**.
   > * This patch merges cleanly.
   > * This patch adds no public classes.
   
   This failure was discovered by this change, not caused by it. Should it be 
fixed in a separate pull request? I'm not sure which of the two behaviors is 
correct.
   There could be more failures like this, but because the generated values 
are random, they might not show up in every build.
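
   To make the point about randomness concrete, here is a minimal sketch in 
plain Scala of a generator that mixes special values with random floats and 
can be replayed from a seed. It is not the PR's ScalaCheck `Gen` code; 
`SeededFloatSketch`, `nextFloat` and `sample` are hypothetical names, and the 
roughly even split between special and random values is an arbitrary choice 
for illustration.

```scala
import scala.util.Random

// Hypothetical sketch; the PR itself uses ScalaCheck generators, not scala.util.Random.
object SeededFloatSketch {
  // The special values listed in the diff under review.
  val specials: Vector[Float] = Vector(
    Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity,
    Float.MinPositiveValue, 0.0f, -0.0f, 1.0f, -1.0f)

  // Picks a special value about half the time, otherwise a float built from random bits.
  def nextFloat(rng: Random): Float =
    if (rng.nextBoolean()) specials(rng.nextInt(specials.length))
    else java.lang.Float.intBitsToFloat(rng.nextInt())

  // The same seed yields the same sequence, so a failing run can be replayed.
  def sample(seed: Long, n: Int): Vector[Float] = {
    val rng = new Random(seed)
    Vector.fill(n)(nextFloat(rng))
  }
}
```

   Different seeds produce different samples, so a given problematic value 
may or may not appear in a particular run; fixing the seed makes a run 
repeatable.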






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29517:
URL: https://github.com/apache/spark/pull/29517#issuecomment-678718120


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127792/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29517:
URL: https://github.com/apache/spark/pull/29517#issuecomment-678718118


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


SparkQA removed a comment on pull request #29517:
URL: https://github.com/apache/spark/pull/29517#issuecomment-678710787


   **[Test build #127792 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127792/testReport)**
 for PR 29517 at commit 
[`0ef55d3`](https://github.com/apache/spark/commit/0ef55d35cf52b0e9d2fcc86a1e3530c7509ada93).






[GitHub] [spark] AmplabJenkins commented on pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29517:
URL: https://github.com/apache/spark/pull/29517#issuecomment-678718118










[GitHub] [spark] SparkQA commented on pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


SparkQA commented on pull request #29517:
URL: https://github.com/apache/spark/pull/29517#issuecomment-678718091


   **[Test build #127792 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127792/testReport)**
 for PR 29517 at commit 
[`0ef55d3`](https://github.com/apache/spark/commit/0ef55d35cf52b0e9d2fcc86a1e3530c7509ada93).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] tanelk commented on a change in pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


tanelk commented on a change in pull request #29515:
URL: https://github.com/apache/spark/pull/29515#discussion_r475157521



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
##
@@ -70,14 +70,22 @@ object LiteralGenerator {
 
   lazy val floatLiteralGen: Gen[Literal] =
 for {
-  f <- Gen.chooseNum(Float.MinValue / 2, Float.MaxValue / 2,
-Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity)
+  f <- Gen.oneOf(
+Gen.oneOf(
+  Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity, 
Float.MinPositiveValue,
+  0.0f, -0.0f, 1.0f, -1.0f),
+Arbitrary.arbFloat.arbitrary

Review comment:
   It generates every possible floating-point value with equal likelihood, 
except the special values `Float.NaN`, `Float.PositiveInfinity`, and 
`Float.NegativeInfinity`, which `Arbitrary.arbFloat.arbitrary` does not 
return.








[GitHub] [spark] tanelk commented on a change in pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


tanelk commented on a change in pull request #29515:
URL: https://github.com/apache/spark/pull/29515#discussion_r475157086



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
##
@@ -70,14 +70,22 @@ object LiteralGenerator {
 
   lazy val floatLiteralGen: Gen[Literal] =
 for {
-  f <- Gen.chooseNum(Float.MinValue / 2, Float.MaxValue / 2,
-Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity)
+  f <- Gen.oneOf(
+Gen.oneOf(
+  Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity, 
Float.MinPositiveValue,
+  0.0f, -0.0f, 1.0f, -1.0f),

Review comment:
   They aren't in the sense that `Arbitrary.arbFloat.arbitrary` can generate 
them, but they are in the sense that a function is more likely to behave 
oddly at these values, for example `log1p`.
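
   As a minimal sketch of why these values are worth generating explicitly, the 
snippet below evaluates `log1p` at the special values listed in the diff. It 
uses only `scala.math.log1p` from the standard library; the object and helper 
names (`Log1pSpecialValues`, `log1pOf`) are made up for illustration and are 
not part of the PR.

```scala
object Log1pSpecialValues {
  // Special inputs copied from the diff above; all are plain Float constants.
  val specials: Seq[Float] = Seq(
    Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity,
    Float.MinPositiveValue, 0.0f, -0.0f, 1.0f, -1.0f)

  // Hypothetical helper: widen to Double and apply log1p, i.e. ln(1 + x).
  def log1pOf(f: Float): Double = math.log1p(f.toDouble)

  def main(args: Array[String]): Unit = {
    // Print each input next to its result so the edge cases are visible.
    specials.foreach(f => println(s"log1p($f) = ${log1pOf(f)}"))

    // Edge cases documented for java.lang.Math.log1p:
    assert(log1pOf(-1.0f).isNegInfinity)              // ln(0) is -Infinity
    assert(log1pOf(Float.NegativeInfinity).isNaN)     // argument below -1
    assert(log1pOf(Float.NaN).isNaN)                  // NaN propagates
    assert(log1pOf(Float.PositiveInfinity).isPosInfinity)
  }
}
```

   A purely uniform generator would hit `-1.0f` or `-0.0f` only by chance, so 
listing them explicitly makes these branches far more likely to be exercised.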








[GitHub] [spark] sunchao commented on a change in pull request #29387: [SPARK-32481][CORE][SQL] Support truncate table to move data to trash

2020-08-22 Thread GitBox


sunchao commented on a change in pull request #29387:
URL: https://github.com/apache/spark/pull/29387#discussion_r475156812



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala
##
@@ -3101,6 +3101,28 @@ abstract class DDLSuite extends QueryTest with 
SQLTestUtils {
   assert(spark.sessionState.catalog.isRegisteredFunction(rand))
 }
   }
+
+  test("Move data to trash on truncate table if enabled") {
+withTable("tab1") {
+  withSQLConf(SQLConf.TRUNCATE_TRASH_ENABLED.key -> "true") {
+sql("CREATE TABLE tab1 (col INT) USING parquet")
+sql("INSERT INTO tab1 SELECT 1")
+
+val tablePath = new Path(spark.sessionState.catalog
+  .getTableMetadata(TableIdentifier("tab1")).storage.locationUri.get)
+val hadoopConf = spark.sessionState.newHadoopConf()
+val fs = tablePath.getFileSystem(hadoopConf)
+// trash interval should be configured from hadoop side
+hadoopConf.setInt("fs.trash.interval", 5)
+
+val trashRoot = fs.getTrashRoot(tablePath)

Review comment:
   Yes, a default implementation is defined in `FileSystem`, which calls 
`getHomeDirectory`, implemented in the same class.
   
   Even though it is supported, the trash mechanism seems less useful in 
cloud object stores like S3, where renaming doesn't exist and therefore moving 
to trash is much more expensive. However, users can disable it through the 
configs given here and in Hadoop itself.








[GitHub] [spark] c21 commented on a change in pull request #29074: [SPARK-32282][SQL] Improve EnsureRquirement.reorderJoinKeys to handle more scenarios such as PartitioningCollection

2020-08-22 Thread GitBox


c21 commented on a change in pull request #29074:
URL: https://github.com/apache/spark/pull/29074#discussion_r475155778



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala
##
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.exchange
+
+import org.apache.spark.sql.catalyst.expressions.Literal
+import org.apache.spark.sql.catalyst.plans.Inner
+import org.apache.spark.sql.catalyst.plans.physical.{HashPartitioning, 
PartitioningCollection}
+import org.apache.spark.sql.execution.{DummySparkPlan, SortExec}
+import org.apache.spark.sql.execution.joins.SortMergeJoinExec
+import org.apache.spark.sql.test.SharedSparkSession
+
+class EnsureRequirementsSuite extends SharedSparkSession {
+  private val exprA = Literal(1)
+  private val exprB = Literal(2)
+  private val exprC = Literal(3)
+
+  test("reorder should handle PartitioningCollection") {
+val plan1 = DummySparkPlan(
+  outputPartitioning = PartitioningCollection(Seq(
+HashPartitioning(exprA :: exprB :: Nil, 5),
+HashPartitioning(exprA :: Nil, 5))))
+val plan2 = DummySparkPlan()
+
+// Test PartitioningCollection on the left side of join.
+val smjExec1 = SortMergeJoinExec(
+  exprB :: exprA :: Nil, exprA :: exprB :: Nil, Inner, None, plan1, plan2)
+EnsureRequirements(spark.sessionState.conf).apply(smjExec1) match {
+  case SortMergeJoinExec(leftKeys, rightKeys, _, _,
+SortExec(_, _,
+  DummySparkPlan(_, _, PartitioningCollection(leftPartitionings), _, 
_), _),
+SortExec(_, _,
+  ShuffleExchangeExec(HashPartitioning(rightPartitioningExpressions, 
_), _, _), _), _) =>
+assert(leftKeys !== smjExec1.leftKeys)
+assert(rightKeys !== smjExec1.rightKeys)
+assert(leftKeys === 
leftPartitionings.head.asInstanceOf[HashPartitioning].expressions)
+assert(rightKeys === rightPartitioningExpressions)
+  case other => fail(other.toString)
+}
+
+// Test PartitioningCollection on the right side of join.
+val smjExec2 = SortMergeJoinExec(
+  exprA :: exprB :: Nil, exprB :: exprA :: Nil, Inner, None, plan2, plan1)
+EnsureRequirements(spark.sessionState.conf).apply(smjExec2) match {
+  case SortMergeJoinExec(leftKeys, rightKeys, _, _,
+SortExec(_, _,
+  ShuffleExchangeExec(HashPartitioning(leftPartitioningExpressions, 
_), _, _), _),
+SortExec(_, _,
+  DummySparkPlan(_, _, PartitioningCollection(rightPartitionings), _, 
_), _), _) =>
+assert(leftKeys !== smjExec2.leftKeys)
+assert(rightKeys !== smjExec2.rightKeys)
+assert(leftKeys === leftPartitioningExpressions)
+assert(rightKeys === 
rightPartitionings.head.asInstanceOf[HashPartitioning].expressions)
+  case other => fail(other.toString)
+}
+
+// Both sides are PartitioningCollection, but left side cannot be reordered 
to match
+// and it should fall back to the right side.
+val smjExec3 = SortMergeJoinExec(
+  exprA :: exprC :: Nil, exprB :: exprA :: Nil, Inner, None, plan1, plan1)
+EnsureRequirements(spark.sessionState.conf).apply(smjExec3) match {
+  case SortMergeJoinExec(leftKeys, rightKeys, _, _,
+SortExec(_, _,
+  ShuffleExchangeExec(HashPartitioning(leftPartitioningExpressions, 
_), _, _), _),
+SortExec(_, _,
+  DummySparkPlan(_, _, PartitioningCollection(rightPartitionings), _, 
_), _), _) =>
+assert(leftKeys !== smjExec3.leftKeys)
+assert(rightKeys !== smjExec3.rightKeys)
+assert(leftKeys === leftPartitioningExpressions)
+assert(rightKeys === 
rightPartitionings.head.asInstanceOf[HashPartitioning].expressions)
+  case other => fail(other.toString)
+}
+  }
+
+  test("reorder should fallback to the other side partitioning") {
+val plan1 = DummySparkPlan(
+  outputPartitioning = HashPartitioning(exprA :: exprB :: exprC :: Nil, 5))
+val plan2 = DummySparkPlan(
+  outputPartitioning = HashPartitioning(exprB :: exprC :: Nil, 5))
+
+// Test fallback to the right side, which has PartitioningCollection.

Review comment:
 

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678716609


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127791/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678716607


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678716607










[GitHub] [spark] SparkQA removed a comment on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


SparkQA removed a comment on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678707898


   **[Test build #127791 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127791/testReport)**
 for PR 29515 at commit 
[`8c8313c`](https://github.com/apache/spark/commit/8c8313c7689c04ce781011c420f193ac2a14d9d9).






[GitHub] [spark] SparkQA commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


SparkQA commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678716570


   **[Test build #127791 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127791/testReport)**
 for PR 29515 at commit 
[`8c8313c`](https://github.com/apache/spark/commit/8c8313c7689c04ce781011c420f193ac2a14d9d9).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] github-actions[bot] closed pull request #26343: [SPARK-29683][YARN] Job will fail due to executor failures all available nodes are blacklisted

2020-08-22 Thread GitBox


github-actions[bot] closed pull request #26343:
URL: https://github.com/apache/spark/pull/26343


   






[GitHub] [spark] github-actions[bot] commented on pull request #27968: [SPARK-31202][CORE]Improve SizeEstimator for AppendOnlyMap

2020-08-22 Thread GitBox


github-actions[bot] commented on pull request #27968:
URL: https://github.com/apache/spark/pull/27968#issuecomment-678714066


   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!






[GitHub] [spark] github-actions[bot] commented on pull request #27568: [SPARK-30821][K8S]Handle container failure in executor pods with multiple containers

2020-08-22 Thread GitBox


github-actions[bot] commented on pull request #27568:
URL: https://github.com/apache/spark/pull/27568#issuecomment-678714068


   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!






[GitHub] [spark] leanken commented on pull request #29503: [SPARK-32678][SQL] Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated code

2020-08-22 Thread GitBox


leanken commented on pull request #29503:
URL: https://github.com/apache/spark/pull/29503#issuecomment-678713927


   @viirya Tests passed. Is it OK to merge?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29516: [SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678712274


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127788/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29516: [SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


SparkQA removed a comment on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678701603


   **[Test build #127788 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127788/testReport)**
 for PR 29516 at commit 
[`6358727`](https://github.com/apache/spark/commit/6358727eb1c5bb92715a4448f7179727893937b3).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29516: [SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678712272


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29516: [SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678712272










[GitHub] [spark] SparkQA commented on pull request #29516: [SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV

2020-08-22 Thread GitBox


SparkQA commented on pull request #29516:
URL: https://github.com/apache/spark/pull/29516#issuecomment-678712244


   **[Test build #127788 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127788/testReport)**
 for PR 29516 at commit 
[`6358727`](https://github.com/apache/spark/commit/6358727eb1c5bb92715a4448f7179727893937b3).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


AmplabJenkins removed a comment on pull request #29517:
URL: https://github.com/apache/spark/pull/29517#issuecomment-678710897










[GitHub] [spark] AmplabJenkins commented on pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


AmplabJenkins commented on pull request #29517:
URL: https://github.com/apache/spark/pull/29517#issuecomment-678710897










[GitHub] [spark] SparkQA commented on pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


SparkQA commented on pull request #29517:
URL: https://github.com/apache/spark/pull/29517#issuecomment-678710787


   **[Test build #127792 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127792/testReport)**
 for PR 29517 at commit 
[`0ef55d3`](https://github.com/apache/spark/commit/0ef55d35cf52b0e9d2fcc86a1e3530c7509ada93).






[GitHub] [spark] c21 removed a comment on pull request #29097: [SPARK-32299] [SQL] Decide SMJ Join Orientation adaptively

2020-08-22 Thread GitBox


c21 removed a comment on pull request #29097:
URL: https://github.com/apache/spark/pull/29097#issuecomment-678710762


   I have a similar concern to @gatorsmile. I think this also depends on the 
run-time cardinality of the data.
   
   E.g., if the left side is smaller than the right side, but every row on the 
left side is the same and every row on the right side is unique, we should 
buffer the right side here even though the right side is larger, because if we 
buffer the left side, we essentially need to read the entire left side into the 
buffer.
   
   In addition, this PR swaps the left and right sides based on total size. 
However, at run time, each task/partition can have a different amount of data 
on the left and right sides. I think simply swapping the left and right sides 
here might cause some tasks to regress and others to improve.
   






[GitHub] [spark] c21 commented on pull request #29097: [SPARK-32299] [SQL] Decide SMJ Join Orientation adaptively

2020-08-22 Thread GitBox


c21 commented on pull request #29097:
URL: https://github.com/apache/spark/pull/29097#issuecomment-678710762


   I have a similar concern to @gatorsmile. I think this also depends on the 
run-time cardinality of the data.
   
   E.g., if the left side is smaller than the right side, but every row on the 
left side is the same and every row on the right side is unique, we should 
buffer the right side here even though the right side is larger, because if we 
buffer the left side, we essentially need to read the entire left side into the 
buffer.
   
   In addition, this PR swaps the left and right sides based on total size. 
However, at run time, each task/partition can have a different amount of data 
on the left and right sides. I think simply swapping the left and right sides 
here might cause some tasks to regress and others to improve.
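   
   A rough sketch of the cardinality argument, under a deliberately simplified 
model that is not Spark's actual implementation: assume a sort-merge join 
buffers, for the current join key, all rows of the buffered side that share 
that key. The peak buffer size is then the largest group of equal keys on the 
buffered side that also occurs on the streamed side. The names 
`SmjBufferSketch` and `peakBufferedRows` are hypothetical.

```scala
object SmjBufferSketch {
  // Peak number of rows held in the buffer when `bufferedKeys` is the buffered
  // side: the largest run of equal keys that also appears on the streamed side.
  def peakBufferedRows(streamedKeys: Seq[Int], bufferedKeys: Seq[Int]): Int = {
    val streamed = streamedKeys.toSet
    val groupSizes = bufferedKeys
      .filter(streamed.contains)   // keys without a match are never buffered
      .groupBy(identity)
      .values
      .map(_.size)
    if (groupSizes.isEmpty) 0 else groupSizes.max
  }

  def main(args: Array[String]): Unit = {
    val left = Seq.fill(3)(1)          // smaller side, every row has the same key
    val right = Seq(1, 2, 3, 4, 5, 6)  // larger side, every key is unique

    // Buffering the smaller left side holds all of its rows at once.
    println(peakBufferedRows(streamedKeys = right, bufferedKeys = left))  // 3
    // Buffering the larger right side holds a single row at a time.
    println(peakBufferedRows(streamedKeys = left, bufferedKeys = right))  // 1
  }
}
```

   In this toy case the side with fewer rows is the worse one to buffer, which 
is why total size alone may not be a sufficient criterion.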
   






[GitHub] [spark] viirya opened a new pull request #29517: [DO-NOT-MERGE][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] Test HiveSerDeReadWriteSuite

2020-08-22 Thread GitBox


viirya opened a new pull request #29517:
URL: https://github.com/apache/spark/pull/29517


   This is only used to run tests against hadoop2.7 + hive1.2 on branch-3.0. 
Will close it after the tests.






[GitHub] [spark] viirya commented on pull request #29513: [SPARK-32646][SQL][BRANCH-3.0][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis

2020-08-22 Thread GitBox


viirya commented on pull request #29513:
URL: https://github.com/apache/spark/pull/29513#issuecomment-678710381


   Not sure if these errors are related.
   
   E.g., for 
`org.apache.spark.sql.hive.execution.HiveSerDeReadWriteSuite.Read/Write Hive 
PARQUET serde table`, this is the query plan:
   
   ```
   == Parsed Logical Plan ==
   'UnresolvedRelation [hive_serde]
   
   == Analyzed Logical Plan ==
   c1: date
   SubqueryAlias spark_catalog.default.hive_serde
   +- HiveTableRelation `default`.`hive_serde`, 
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, [c1#40752]
   
   == Optimized Logical Plan ==
   HiveTableRelation `default`.`hive_serde`, 
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, [c1#40752]
   
   == Physical Plan ==
   Scan hive default.hive_serde [c1#40752], HiveTableRelation 
`default`.`hive_serde`, 
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, [c1#40752]
   ```
   
   This is unrelated to ORC and has no pushed-down predicate.
   
   Btw, I cannot reproduce the errors locally.






[GitHub] [spark] srowen commented on a change in pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


srowen commented on a change in pull request #29515:
URL: https://github.com/apache/spark/pull/29515#discussion_r475148456



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
##
@@ -70,14 +70,22 @@ object LiteralGenerator {
 
   lazy val floatLiteralGen: Gen[Literal] =
 for {
-  f <- Gen.chooseNum(Float.MinValue / 2, Float.MaxValue / 2,
-Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity)
+  f <- Gen.oneOf(
+Gen.oneOf(
+  Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity, Float.MinPositiveValue,

Review comment:
   Do you want `MaxValue` in here too, as the largest non-infinite float? Same for double.
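
   To illustrate the point about `MaxValue`, here is a minimal standard-library sketch. It is not the generator code from the PR: the object name `FloatSpecials` and its members are hypothetical, chosen only for this example. It lists the special values from the diff plus `Float.MaxValue`, and checks that `MaxValue` is the largest finite float, with the next representable value above it being positive infinity.

   ```scala
   object FloatSpecials {
     // Hypothetical list of edge-case floats; not the generator code from the PR.
     val specials: Seq[Float] = Seq(
       Float.NaN,
       Float.PositiveInfinity,
       Float.NegativeInfinity,
       Float.MinPositiveValue, // smallest positive (subnormal) float
       Float.MaxValue          // largest finite float
     )

     // True for values that are neither NaN nor infinite.
     def isFinite(f: Float): Boolean = !f.isNaN && !f.isInfinity

     // Largest finite value in the list, computed without an implicit Ordering.
     val largestFinite: Float =
       specials.filter(isFinite).reduce((a, b) => math.max(a, b))

     // The next representable float above MaxValue is +Infinity.
     val nextAfterMax: Float = java.lang.Math.nextUp(Float.MaxValue)

     def main(args: Array[String]): Unit = {
       println(s"largest finite: $largestFinite")
       println(s"next after max: $nextAfterMax")
     }
   }
   ```

   The same reasoning would apply to `Double.MaxValue` and `Double.MinPositiveValue`.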








[GitHub] [spark] SparkQA commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


SparkQA commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678707898


   **[Test build #127791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127791/testReport)** for PR 29515 at commit [`8c8313c`](https://github.com/apache/spark/commit/8c8313c7689c04ce781011c420f193ac2a14d9d9).






[GitHub] [spark] maropu commented on pull request #29515: [SPARK-32688][SQL][TESTS] Add special values to LiteralGenerator for float and double

2020-08-22 Thread GitBox


maropu commented on pull request #29515:
URL: https://github.com/apache/spark/pull/29515#issuecomment-678707533


   also cc: @srowen 





