[GitHub] [spark] HyukjinKwon commented on pull request #29992: [SPARK-32881][CORE] Catch some race condition errors and log them more clearly

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #29992:
URL: https://github.com/apache/spark/pull/29992#issuecomment-713227168


   I can give it a try. Do you have a stack trace or symptoms for this NPE issue? 
It seems SPARK-32881 did not throw an NPE.
   
   @dongjoon-hyun, should SPARK-32881 be resolved after this fix, or was it left 
open by mistake?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on pull request #30108: [SPARK-33197][SQL] Make changes to spark.sql.analyzer.maxIterations take effect at runtime

2020-10-20 Thread GitBox


SparkQA commented on pull request #30108:
URL: https://github.com/apache/spark/pull/30108#issuecomment-713226067


   **[Test build #130062 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130062/testReport)**
 for PR 30108 at commit 
[`27cb500`](https://github.com/apache/spark/commit/27cb5004711d21d4d8f33f0194207811897f6932).






[GitHub] [spark] HyukjinKwon commented on pull request #30112: [SPARK-33199] Mesos Task Failed when pyFiles and docker image option used together

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #30112:
URL: https://github.com/apache/spark/pull/30112#issuecomment-713225959


   ok to test






[GitHub] [spark] HyukjinKwon commented on pull request #30108: [SPARK-33197][SQL] Make changes to spark.sql.analyzer.maxIterations take effect at runtime

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #30108:
URL: https://github.com/apache/spark/pull/30108#issuecomment-713225634


   ok to test






[GitHub] [spark] HyukjinKwon commented on pull request #30107: [WIP][SPARK-33196][SQL] Expose filtered aggregations in spark.sql.functions

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #30107:
URL: https://github.com/apache/spark/pull/30107#issuecomment-713225028


   I agree with @zero323. It was added to Spark SQL mainly because of the ANSI 
standard; see SPARK-27986. The DSL doesn't necessarily need to have it.






[GitHub] [spark] holdenk commented on pull request #29992: [SPARK-32881][CORE] Catch some race condition errors and log them more clearly

2020-10-20 Thread GitBox


holdenk commented on pull request #29992:
URL: https://github.com/apache/spark/pull/29992#issuecomment-713224196


   Hey, sorry, I didn’t see your comments. I don’t think an if/else would avoid 
this; we’d have to introduce locking. Since this only happens under a race 
condition where we don’t care about the data, I think it’s OK. If you’d prefer, 
we could also take that part out, since the NPE will get eaten later. Happy to 
review a PR with whichever approach you want, or to work on a follow-up if you 
prefer.






[GitHub] [spark] github-actions[bot] commented on pull request #29013: [SPARK-32196][SQL] Extract In convertible part if it is not convertible

2020-10-20 Thread GitBox


github-actions[bot] commented on pull request #29013:
URL: https://github.com/apache/spark/pull/29013#issuecomment-713224018


   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!






[GitHub] [spark] github-actions[bot] closed pull request #29071: [SPARK-24985][SQL]Avoid full outer join OOM on skewed dataset

2020-10-20 Thread GitBox


github-actions[bot] closed pull request #29071:
URL: https://github.com/apache/spark/pull/29071


   






[GitHub] [spark] AmplabJenkins commented on pull request #30019: [SPARK-33135][CORE] Use listLocatedStatus from FileSystem implementations

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30019:
URL: https://github.com/apache/spark/pull/30019#issuecomment-713222113










[GitHub] [spark] tanelk commented on pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


tanelk commented on pull request #30093:
URL: https://github.com/apache/spark/pull/30093#issuecomment-713222032


   > With the proposed logic, the initial `global` value will be false, which 
will lead to the `Repartition` case to return `false`, and the 
`recursiveRemoveSort` will fail to eliminate the child local sort `Sort('b.asc, 
false, _)`
   > 
   > Can you provide an example when your logic can eliminate local child sorts 
inside a local sort while the current logic cannot?
   
   Ah yes, it seems we wouldn't have to change `canEliminateSort` at all.
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30019: [SPARK-33135][CORE] Use listLocatedStatus from FileSystem implementations

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30019:
URL: https://github.com/apache/spark/pull/30019#issuecomment-713222113










[GitHub] [spark] HyukjinKwon edited a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

2020-10-20 Thread GitBox


HyukjinKwon edited a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712627234


   There are a few things to note:
   - This is a temporary fix. Once PySpark supports PyArrow 2.0.0+ 
(SPARK-33189), we can remove this change.
   - We should port this back into other branches in case PyArrow 2.0.0+ 
support is not ported back, and in order to make the builds pass.
   - The Python 3.8 build will pass because the packages are pre-installed in the 
docker image (see SPARK-33162), and PyPy3 does not have PyArrow. The build fails 
with Python 3.6 because it freshly installs the latest PyArrow, 2.0.0.
   - I didn't update documentation and `setup.py` yet. This PR currently aims 
to make the build pass first.
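
The effect of an upper bound such as `pyarrow<2.0.0` can be illustrated with a small sketch. The helper names `parse_version` and `is_below_upper_bound` are hypothetical and the parsing only handles plain dotted release numbers; this is not the change made in the PR, which (per its title) sets the bound in GitHub Actions.

```python
def parse_version(version):
    """Turn a plain dotted release string such as '1.0.1' into (1, 0, 1)."""
    return tuple(int(part) for part in version.split("."))


def is_below_upper_bound(installed, upper_bound="2.0.0"):
    """Return True when `installed` is strictly older than `upper_bound`."""
    return parse_version(installed) < parse_version(upper_bound)


print(is_below_upper_bound("1.0.1"))  # a 1.x release, inside the bound
print(is_below_upper_bound("2.0.0"))  # the release the bound excludes
```

Pre-release or local version suffixes are deliberately out of scope here; a real check would rely on a proper version parser.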






[GitHub] [spark] SparkQA removed a comment on pull request #30019: [SPARK-33135][CORE] Use listLocatedStatus from FileSystem implementations

2020-10-20 Thread GitBox


SparkQA removed a comment on pull request #30019:
URL: https://github.com/apache/spark/pull/30019#issuecomment-713163024


   **[Test build #130059 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130059/testReport)**
 for PR 30019 at commit 
[`3c0ad25`](https://github.com/apache/spark/commit/3c0ad259bea09095da30863af9e614e23cfff0d6).






[GitHub] [spark] HyukjinKwon commented on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-713221535


   Thank you guys.






[GitHub] [spark] SparkQA commented on pull request #30019: [SPARK-33135][CORE] Use listLocatedStatus from FileSystem implementations

2020-10-20 Thread GitBox


SparkQA commented on pull request #30019:
URL: https://github.com/apache/spark/pull/30019#issuecomment-713221562


   **[Test build #130059 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130059/testReport)**
 for PR 30019 at commit 
[`3c0ad25`](https://github.com/apache/spark/commit/3c0ad259bea09095da30863af9e614e23cfff0d6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-10-20 Thread GitBox


SparkQA commented on pull request #29122:
URL: https://github.com/apache/spark/pull/29122#issuecomment-713221442


   **[Test build #130061 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130061/testReport)**
 for PR 29122 at commit 
[`e699c49`](https://github.com/apache/spark/commit/e699c49f171def64bf96b14386e610826c1413aa).






[GitHub] [spark] HyukjinKwon commented on pull request #30104: Fix tests failing with rounding errors

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #30104:
URL: https://github.com/apache/spark/pull/30104#issuecomment-713221210


   @AlessandroPatti, mind filing a JIRA?






[GitHub] [spark] HyukjinKwon commented on pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #29122:
URL: https://github.com/apache/spark/pull/29122#issuecomment-713220529


   cc @huaxingao and @zhengruifeng FYI






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713220262










[GitHub] [spark] AmplabJenkins commented on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713220262










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #29122:
URL: https://github.com/apache/spark/pull/29122#issuecomment-658859170


   Can one of the admins verify this patch?






[GitHub] [spark] HyukjinKwon commented on pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #29122:
URL: https://github.com/apache/spark/pull/29122#issuecomment-713220215


   ok to test






[GitHub] [spark] SparkQA removed a comment on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


SparkQA removed a comment on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713159211


   **[Test build #130058 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130058/testReport)**
 for PR 30109 at commit 
[`6655c88`](https://github.com/apache/spark/commit/6655c885a4091d413df9219d437b0e46f59ef88f).






[GitHub] [spark] SparkQA commented on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


SparkQA commented on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713219693


   **[Test build #130058 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130058/testReport)**
 for PR 30109 at commit 
[`6655c88`](https://github.com/apache/spark/commit/6655c885a4091d413df9219d437b0e46f59ef88f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-10-20 Thread GitBox


SparkQA removed a comment on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-713076071


   **[Test build #130050 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130050/testReport)**
 for PR 28026 at commit 
[`2de1f03`](https://github.com/apache/spark/commit/2de1f0348aac94f02726a592f2b9896fc62da41d).






[GitHub] [spark] viirya commented on pull request #29812: [SPARK-32941][SQL] Optimize UpdateFields expression chain and put the rule early in Analysis phase

2020-10-20 Thread GitBox


viirya commented on pull request #29812:
URL: https://github.com/apache/spark/pull/29812#issuecomment-713217464


   Thanks @maropu 






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30112: [SPARK-33199] Mesos Task Failed when pyFiles and docker image option used together

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30112:
URL: https://github.com/apache/spark/pull/30112#issuecomment-713211454


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-713211479










[GitHub] [spark] HyukjinKwon commented on pull request #29992: [SPARK-32881][CORE] Catch some race condition errors and log them more clearly

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #29992:
URL: https://github.com/apache/spark/pull/29992#issuecomment-713216544


   Gentle ping. Would you mind addressing my comments? I thought they 
wouldn't be difficult to address.






[GitHub] [spark] c21 commented on pull request #29342: [SPARK-32399][SQL] Full outer shuffled hash join

2020-10-20 Thread GitBox


c21 commented on pull request #29342:
URL: https://github.com/apache/spark/pull/29342#issuecomment-713216634


   > BHJ doesn't support full outer join currently. Would it be possible to 
improve BHJ similarly to support full outer join as it's done here for SHJ?
   
   @Tagar - the major blocker is figuring out the non-matched rows from the 
broadcast side, as @maropu said. We did some brainstorming internally, and one 
idea is to have a single task output all non-matched rows after 
collecting information from all other tasks.
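
The idea can be sketched on plain Python lists. Everything here is a simplification assumed for illustration: the function names are hypothetical, the "tasks" run sequentially in one process, and a linear scan stands in for the hashed lookup. It is not Spark's implementation.

```python
def probe_partition(stream_rows, build_rows, key):
    """One task: join a partition of the streamed side with the broadcast side.

    Returns the joined rows plus the indices of build rows matched by this task.
    """
    joined, matched = [], set()
    for s in stream_rows:
        hits = [i for i, b in enumerate(build_rows) if key(b) == key(s)]
        if hits:
            for i in hits:
                joined.append((s, build_rows[i]))
                matched.add(i)
        else:
            joined.append((s, None))  # streamed row without a match
    return joined, matched


def full_outer_broadcast_join(partitions, build_rows, key):
    """Run every probe task, then let one final step emit unmatched build rows."""
    output, matched_anywhere = [], set()
    for part in partitions:
        joined, matched = probe_partition(part, build_rows, key)
        output.extend(joined)
        matched_anywhere |= matched  # information collected from each task
    # The single final task: build rows that no probe task matched.
    for i, b in enumerate(build_rows):
        if i not in matched_anywhere:
            output.append((None, b))
    return output


partitions = [[(1, "a")], [(2, "b"), (4, "d")]]
build_rows = [(1, "x"), (3, "y")]
result = full_outer_broadcast_join(partitions, build_rows, key=lambda row: row[0])
print(result)
```

The final step cannot start until every probe task has reported its matched set, which is the coordination cost the comment alludes to.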






[GitHub] [spark] maropu commented on pull request #29812: [SPARK-32941][SQL] Optimize UpdateFields expression chain and put the rule early in Analysis phase

2020-10-20 Thread GitBox


maropu commented on pull request #29812:
URL: https://github.com/apache/spark/pull/29812#issuecomment-713215442


   (late LGTM)






[GitHub] [spark] maropu commented on a change in pull request #30101: [SPARK-33193][SQL][TEST] Hive ThriftServer JDBC Database MetaData API Behavior Auditing

2020-10-20 Thread GitBox


maropu commented on a change in pull request #30101:
URL: https://github.com/apache/spark/pull/30101#discussion_r508915667



##
File path: 
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/SparkMetadataOperationSuite.scala
##
@@ -396,4 +400,187 @@ class SparkMetadataOperationSuite extends 
HiveThriftJdbcTest {
   }
 }
   }
+
+  test("Hive ThriftServer JDBC Database MetaData API Auditing") {

Review comment:
   How about splitting this test into two parts?
   ```
   test("Hive ThriftServer JDBC Database MetaData API Auditing - supported") {
   test("Hive ThriftServer JDBC Database MetaData API Auditing - not supported") {
   ```

##
File path: 
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/SparkMetadataOperationSuite.scala
##
@@ -396,4 +400,187 @@ class SparkMetadataOperationSuite extends 
HiveThriftJdbcTest {
   }
 }
   }
+
+  test("Hive ThriftServer JDBC Database MetaData API Auditing") {
+withJdbcStatement() { statement =>
+  val metaData = statement.getConnection.getMetaData
+  Seq(
+() => metaData.allProceduresAreCallable(),
+() => metaData.getURL,
+() => metaData.getUserName,
+() => metaData.isReadOnly,
+() => metaData.nullsAreSortedHigh,
+() => metaData.nullsAreSortedLow,
+() => metaData.nullsAreSortedAtStart(),
+() => metaData.nullsAreSortedAtEnd(),
+() => metaData.usesLocalFiles(),
+() => metaData.usesLocalFilePerTable(),
+() => metaData.supportsMixedCaseIdentifiers(),

Review comment:
   nit: we don't need `()` here: 
https://github.com/databricks/scala-style-guide#parentheses

##
File path: 
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/SparkMetadataOperationSuite.scala
##
@@ -396,4 +400,187 @@ class SparkMetadataOperationSuite extends 
HiveThriftJdbcTest {
   }
 }
   }
+
+  test("Hive ThriftServer JDBC Database MetaData API Auditing") {
+withJdbcStatement() { statement =>
+  val metaData = statement.getConnection.getMetaData
+  Seq(
+() => metaData.allProceduresAreCallable(),
+() => metaData.getURL,
+() => metaData.getUserName,
+() => metaData.isReadOnly,
+() => metaData.nullsAreSortedHigh,
+() => metaData.nullsAreSortedLow,
+() => metaData.nullsAreSortedAtStart(),
+() => metaData.nullsAreSortedAtEnd(),
+() => metaData.usesLocalFiles(),
+() => metaData.usesLocalFilePerTable(),
+() => metaData.supportsMixedCaseIdentifiers(),
+() => metaData.supportsMixedCaseQuotedIdentifiers(),
+() => metaData.storesUpperCaseIdentifiers(),
+() => metaData.storesUpperCaseQuotedIdentifiers(),
+() => metaData.storesLowerCaseIdentifiers(),
+() => metaData.storesLowerCaseQuotedIdentifiers(),
+() => metaData.storesMixedCaseIdentifiers(),
+() => metaData.storesMixedCaseQuotedIdentifiers(),
+() => metaData.getSQLKeywords,
+() => metaData.nullPlusNonNullIsNull,
+() => metaData.supportsConvert,
+() => metaData.supportsTableCorrelationNames,
+() => metaData.supportsDifferentTableCorrelationNames,
+() => metaData.supportsExpressionsInOrderBy(),
+() => metaData.supportsOrderByUnrelated,
+() => metaData.supportsGroupByUnrelated,
+() => metaData.supportsGroupByBeyondSelect,
+() => metaData.supportsLikeEscapeClause,
+() => metaData.supportsMultipleTransactions,
+() => metaData.supportsMinimumSQLGrammar,
+() => metaData.supportsCoreSQLGrammar,
+() => metaData.supportsExtendedSQLGrammar,
+() => metaData.supportsANSI92EntryLevelSQL,
+() => metaData.supportsANSI92IntermediateSQL,
+() => metaData.supportsANSI92FullSQL,
+() => metaData.supportsIntegrityEnhancementFacility,
+() => metaData.isCatalogAtStart,
+() => metaData.supportsSubqueriesInComparisons,
+() => metaData.supportsSubqueriesInExists,
+() => metaData.supportsSubqueriesInIns,
+() => metaData.supportsSubqueriesInQuantifieds,
+// Spark support this, see 
https://issues.apache.org/jira/browse/SPARK-18455

Review comment:
   This comment looks a bit confusing. Btw, could we fix this in a follow-up? 
If we can, could you file a JIRA for it?








[GitHub] [spark] AmplabJenkins commented on pull request #30112: [SPARK-33199] Mesos Task Failed when pyFiles and docker image option used together

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30112:
URL: https://github.com/apache/spark/pull/30112#issuecomment-713213685


   Can one of the admins verify this patch?






[GitHub] [spark] HyukjinKwon closed pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


HyukjinKwon closed pull request #30111:
URL: https://github.com/apache/spark/pull/30111


   






[GitHub] [spark] AmplabJenkins commented on pull request #30112: [SPARK-33199] Mesos Task Failed when pyFiles and docker image option used together

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30112:
URL: https://github.com/apache/spark/pull/30112#issuecomment-713211454


   Can one of the admins verify this patch?






[GitHub] [spark] HyukjinKwon commented on pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


HyukjinKwon commented on pull request #30111:
URL: https://github.com/apache/spark/pull/30111#issuecomment-713211611


   Thanks @BryanCutler. Merged to master, branch-3.0 and branch-2.4.






[GitHub] [spark] AmplabJenkins commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-713211479










[GitHub] [spark] farhan5900 opened a new pull request #30112: [SPARK-33199] Mesos Task Failed when pyFiles and docker image option used together

2020-10-20 Thread GitBox


farhan5900 opened a new pull request #30112:
URL: https://github.com/apache/spark/pull/30112


   ### What changes were proposed in this pull request?
   This PR removes the generic `shellEscape` call and applies escaping only in 
specific places: appName, mainClass, default, and driverConf.
   
   ### Why are the changes needed?
   The changes are needed because PySpark jobs fail to launch when they 1) run 
with Docker and 2) include `--py-files`.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   - Unit Test
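
A minimal sketch of the difference between blanket and selective shell escaping, written in Python with the standard `shlex` module rather than Spark's Scala `shellEscape`. The helper names `escape_everything` and `escape_selected`, the sample values, and the idea that blanket escaping is what breaks the launch are assumptions for illustration only; the PR description does not state the exact failure mode.

```python
import shlex


def escape_everything(command: str) -> str:
    # Hypothetical "generic" approach: quote the whole, already-composed command.
    return shlex.quote(command)


def escape_selected(app_name: str, main_class: str, extra_args: list) -> str:
    # Hypothetical "specific places" approach: quote only free-form values
    # and leave the remaining arguments as separate, unquoted tokens.
    parts = ["--name", shlex.quote(app_name), "--class", shlex.quote(main_class)]
    return " ".join(parts + list(extra_args))


app_name = "my app"
main_class = "org.example.Main"
extra = ["--py-files", "deps.zip"]

selective = escape_selected(app_name, main_class, extra)
blanket = escape_everything(" ".join(["--name", app_name, "--class", main_class] + extra))

# A shell-style tokenizer still sees six arguments in the selective form,
# but only a single argument in the blanket form.
print(shlex.split(selective))
print(len(shlex.split(blanket)))
```

The point of the sketch is only that quoting a composed string collapses it into one token, whereas quoting individual values keeps the argument boundaries intact.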






[GitHub] [spark] SparkQA commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-10-20 Thread GitBox


SparkQA commented on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-713210640


   **[Test build #130050 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130050/testReport)**
 for PR 28026 at commit 
[`2de1f03`](https://github.com/apache/spark/commit/2de1f0348aac94f02726a592f2b9896fc62da41d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] HyukjinKwon edited a comment on pull request #30098: [SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in GitHub Actions

2020-10-20 Thread GitBox


HyukjinKwon edited a comment on pull request #30098:
URL: https://github.com/apache/spark/pull/30098#issuecomment-712627234


   There are a few things to note:
   - This is a temporary fix. Once PySpark supports PyArrow 2.0.0+ 
(SPARK-33189), we can remove this change.
   - We should port this back into other branches in case PyArrow 2.0.0+ 
support is not ported back, and in order to make the builds pass.
   - The PyPy3 and Python 3.8 builds will pass because the packages are 
pre-installed in the Docker image (see SPARK-33162). The build fails with 
Python 3.6 because it newly installs the latest PyArrow 2.0.0.
   - I didn't update documentation and `setup.py` yet. This PR currently aims 
to make the build pass first.
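
A rough sketch of what an upper bound on the PyArrow version means in practice, assuming the bound is "strictly below 2.0.0" (in pip syntax this would typically be written as `pyarrow<2.0.0`; the exact line used in the workflow file is not shown in this thread). The helpers `parse_version` and `below_upper_bound` are hypothetical and handle only plain numeric versions.

```python
def parse_version(text: str) -> tuple:
    # Turn "1.0.1" into (1, 0, 1); pre-release suffixes are not handled here.
    return tuple(int(part) for part in text.split("."))


def below_upper_bound(installed: str, upper_bound: str = "2.0.0") -> bool:
    # True when the installed version is strictly older than the bound.
    return parse_version(installed) < parse_version(upper_bound)


for candidate in ["0.15.1", "1.0.1", "2.0.0"]:
    print(candidate, below_upper_bound(candidate))
```

Tuple comparison is used so that "0.15.1" compares as older than "2.0.0", which a plain string comparison would not guarantee in general.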






[GitHub] [spark] maropu commented on pull request #29342: [SPARK-32399][SQL] Full outer shuffled hash join

2020-10-20 Thread GitBox


maropu commented on pull request #29342:
URL: https://github.com/apache/spark/pull/29342#issuecomment-713208202


   How do we know whether join keys on the build (broadcast) side don't exist 
on the stream side? If we can implement it cleanly, it looks okay to add it.
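
To make the question concrete, here is a single-partition sketch of a full outer hash join in Python. It is not Spark's implementation: the function `full_outer_hash_join` and the row layout `(key, value)` are assumptions. It tracks which build rows were matched while scanning the stream side and emits the unmatched ones at the end. With a broadcast build side, each task would presumably see only its own slice of the stream side, so a per-task "matched" set like this one would not be enough to decide that a build row is unmatched globally.

```python
from collections import defaultdict


def full_outer_hash_join(build_rows, stream_rows):
    # Build phase: index build rows by join key, remembering their position.
    table = defaultdict(list)
    for idx, (key, value) in enumerate(build_rows):
        table[key].append((idx, value))

    matched = set()  # positions of build rows that found a stream partner
    out = []

    # Probe phase: every stream row is emitted, matched or not.
    for key, stream_value in stream_rows:
        hits = table.get(key, [])
        if not hits:
            out.append((key, None, stream_value))
        for idx, build_value in hits:
            matched.add(idx)
            out.append((key, build_value, stream_value))

    # Final phase: emit build rows that no stream row matched.
    for idx, (key, build_value) in enumerate(build_rows):
        if idx not in matched:
            out.append((key, build_value, None))
    return out


result = full_outer_hash_join([(1, "a"), (2, "b")], [(2, "x"), (3, "y")])
print(result)
```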






[GitHub] [spark] maropu commented on a change in pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


maropu commented on a change in pull request #30093:
URL: https://github.com/apache/spark/pull/30093#discussion_r508902492



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -1052,11 +1052,11 @@ object CombineFilters extends Rule[LogicalPlan] with PredicateHelper {
  *function is order irrelevant
  */
 object EliminateSorts extends Rule[LogicalPlan] {
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {

Review comment:
   I remember this comment: 
https://github.com/apache/spark/pull/21072#discussion_r183392394
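
For readers unfamiliar with the distinction in the hunk above: in Catalyst, `transform` applies a rule top-down (it delegates to `transformDown`), while `transformUp` applies it bottom-up. The toy Python sketch below shows why the order can change the result. The `Node` class, the `drop_double_neg` rule, and the example tree are hypothetical and unrelated to `EliminateSorts` itself.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Node:
    name: str
    children: Tuple["Node", ...] = ()


Rule = Callable[[Node], Node]


def transform_down(node: Node, rule: Rule) -> Node:
    # Pre-order: apply the rule to this node, then recurse into the result's children.
    after = rule(node)
    return Node(after.name, tuple(transform_down(c, rule) for c in after.children))


def transform_up(node: Node, rule: Rule) -> Node:
    # Post-order: rewrite the children first, then apply the rule to the rebuilt node.
    rebuilt = Node(node.name, tuple(transform_up(c, rule) for c in node.children))
    return rule(rebuilt)


def drop_double_neg(node: Node) -> Node:
    # Toy rule: Neg(Neg(x)) becomes x.
    if node.name == "Neg" and node.children[0].name == "Neg":
        return node.children[0].children[0]
    return node


def neg(child: Node) -> Node:
    return Node("Neg", (child,))


plan = neg(neg(neg(neg(Node("Lit")))))

# Top-down removes only the outer pair in one pass; bottom-up removes both pairs.
print(transform_down(plan, drop_double_neg))
print(transform_up(plan, drop_double_neg))
```

In a single pass the top-down traversal never revisits the node it just produced, whereas the bottom-up traversal sees each parent only after its subtree has already been simplified.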








[GitHub] [spark] maropu commented on a change in pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


maropu commented on a change in pull request #30093:
URL: https://github.com/apache/spark/pull/30093#discussion_r508901147



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/EliminateSortsSuite.scala
##
@@ -97,12 +97,27 @@ class EliminateSortsSuite extends PlanTest {
 comparePlans(optimized, correctAnswer)
   }
 
-  test("remove redundant order by") {
+  test("SPARK-33183: remove redundant sortBy") {
 val orderedPlan = testRelation.select('a, 'b).orderBy('a.asc, 'b.desc_nullsFirst)
-val unnecessaryReordered = orderedPlan.limit(2).select('a).orderBy('a.asc, 'b.desc_nullsFirst)

Review comment:
   Hm, but can this update cause a performance regression? (Just an idea) 
Can't we add a new physical rewrite rule in the preparation phase for this case?








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713183978










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713196807










[GitHub] [spark] SparkQA removed a comment on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


SparkQA removed a comment on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713121308


   **[Test build #130055 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130055/testReport)**
 for PR 30110 at commit 
[`73864b0`](https://github.com/apache/spark/commit/73864b0c633ad5e9479c4c52ee8ac97b839495b2).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30111:
URL: https://github.com/apache/spark/pull/30111#issuecomment-713188776










[GitHub] [spark] AmplabJenkins commented on pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30111:
URL: https://github.com/apache/spark/pull/30111#issuecomment-713197239










[GitHub] [spark] SparkQA commented on pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


SparkQA commented on pull request #30111:
URL: https://github.com/apache/spark/pull/30111#issuecomment-713197230


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34669/
   






[GitHub] [spark] SparkQA removed a comment on pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


SparkQA removed a comment on pull request #30111:
URL: https://github.com/apache/spark/pull/30111#issuecomment-713173956


   **[Test build #130060 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130060/testReport)**
 for PR 30111 at commit 
[`a0c8ff6`](https://github.com/apache/spark/commit/a0c8ff6538b0ea3da8d99c23c1a003b959409975).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30019: [SPARK-33135][CORE] Use listLocatedStatus from FileSystem implementations

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30019:
URL: https://github.com/apache/spark/pull/30019#issuecomment-713189478










[GitHub] [spark] AmplabJenkins commented on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713196801










[GitHub] [spark] SparkQA commented on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


SparkQA commented on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713195964


   **[Test build #130055 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130055/testReport)**
 for PR 30110 at commit 
[`73864b0`](https://github.com/apache/spark/commit/73864b0c633ad5e9479c4c52ee8ac97b839495b2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] sunchao commented on pull request #29843: [SPARK-29250][BUILD] Move to shaded clients for Hadoop 3.2

2020-10-20 Thread GitBox


sunchao commented on pull request #29843:
URL: https://github.com/apache/spark/pull/29843#issuecomment-713194961


   > Should we create a new JIRA for moving to shaded client?
   
   I'm fine with a new JIRA - was going to use SPARK-29250 (which is titled 
"Upgrade to Hadoop 3.2.1 and move to shaded client") for both PRs.






[GitHub] [spark] dbtsai commented on pull request #29843: [SPARK-29250][BUILD] Move to shaded clients for Hadoop 3.2

2020-10-20 Thread GitBox


dbtsai commented on pull request #29843:
URL: https://github.com/apache/spark/pull/29843#issuecomment-713194268


   Should we create a new JIRA for moving to shaded client?






[GitHub] [spark] sunchao commented on pull request #29843: [SPARK-29250][BUILD] Move to shaded clients for Hadoop 3.2

2020-10-20 Thread GitBox


sunchao commented on pull request #29843:
URL: https://github.com/apache/spark/pull/29843#issuecomment-713193113


   The test failure is recorded in 
https://issues.apache.org/jira/browse/SPARK-33189






[GitHub] [spark] SparkQA commented on pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


SparkQA commented on pull request #30111:
URL: https://github.com/apache/spark/pull/30111#issuecomment-713189852


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34669/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30019: [SPARK-33135][CORE] Use listLocatedStatus from FileSystem implementations

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30019:
URL: https://github.com/apache/spark/pull/30019#issuecomment-713189469










[GitHub] [spark] SparkQA commented on pull request #30019: [SPARK-33135][CORE] Use listLocatedStatus from FileSystem implementations

2020-10-20 Thread GitBox


SparkQA commented on pull request #30019:
URL: https://github.com/apache/spark/pull/30019#issuecomment-713189444


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34668/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30111:
URL: https://github.com/apache/spark/pull/30111#issuecomment-713188760










[GitHub] [spark] SparkQA commented on pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


SparkQA commented on pull request #30111:
URL: https://github.com/apache/spark/pull/30111#issuecomment-713188421


   **[Test build #130060 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130060/testReport)**
 for PR 30111 at commit 
[`a0c8ff6`](https://github.com/apache/spark/commit/a0c8ff6538b0ea3da8d99c23c1a003b959409975).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713183978










[GitHub] [spark] SparkQA commented on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


SparkQA commented on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713183964


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34667/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713181606


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130054/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713181595










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713181595


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


SparkQA commented on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713181069


   **[Test build #130054 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130054/testReport)**
 for PR 30110 at commit 
[`69bf6c0`](https://github.com/apache/spark/commit/69bf6c0267b1314295b40e42a951652e12b657b5).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


SparkQA removed a comment on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713111907


   **[Test build #130054 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130054/testReport)**
 for PR 30110 at commit 
[`69bf6c0`](https://github.com/apache/spark/commit/69bf6c0267b1314295b40e42a951652e12b657b5).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-713179802










[GitHub] [spark] SparkQA commented on pull request #30019: [SPARK-33135][CORE] Use listLocatedStatus from FileSystem implementations

2020-10-20 Thread GitBox


SparkQA commented on pull request #30019:
URL: https://github.com/apache/spark/pull/30019#issuecomment-713179916


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34668/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30093:
URL: https://github.com/apache/spark/pull/30093#issuecomment-713179432










[GitHub] [spark] AmplabJenkins commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-713179802










[GitHub] [spark] AmplabJenkins commented on pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30093:
URL: https://github.com/apache/spark/pull/30093#issuecomment-713179432










[GitHub] [spark] SparkQA commented on pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


SparkQA commented on pull request #30093:
URL: https://github.com/apache/spark/pull/30093#issuecomment-713179418


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34666/
   






[GitHub] [spark] SparkQA removed a comment on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-10-20 Thread GitBox


SparkQA removed a comment on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-713036369


   **[Test build #130049 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130049/testReport)** for PR 28026 at commit [`81d75b5`](https://github.com/apache/spark/commit/81d75b5a38288f46ee647b98a811e5a37bacc415).






[GitHub] [spark] SparkQA commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-10-20 Thread GitBox


SparkQA commented on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-713178935


   **[Test build #130049 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130049/testReport)** for PR 28026 at commit [`81d75b5`](https://github.com/apache/spark/commit/81d75b5a38288f46ee647b98a811e5a37bacc415).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713177592










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713177592










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30093:
URL: https://github.com/apache/spark/pull/30093#issuecomment-713173583










[GitHub] [spark] BryanCutler commented on a change in pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


BryanCutler commented on a change in pull request #30111:
URL: https://github.com/apache/spark/pull/30111#discussion_r508878495



##
File path: .github/workflows/build_and_test.yml
##
@@ -136,7 +136,7 @@ jobs:
 - name: Install Python packages (Python 3.8)
   if: (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
   run: |
-python3.8 -m pip install numpy 'pyarrow<2.0.0' pandas scipy xmlrunner
+python3.8 -m pip install numpy 'pyarrow<3.0.0' pandas scipy xmlrunner

Review comment:
   This should be ok to bump since Arrow is following semantic versioning now








[GitHub] [spark] SparkQA removed a comment on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


SparkQA removed a comment on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713080664


   **[Test build #130051 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130051/testReport)** for PR 30109 at commit [`b156618`](https://github.com/apache/spark/commit/b156618a58cbebb4ff25ea13c2f5f5f02ba82a88).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


AmplabJenkins removed a comment on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713168720










[GitHub] [spark] SparkQA commented on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


SparkQA commented on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713176854


   **[Test build #130053 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130053/testReport)** for PR 30110 at commit [`824d868`](https://github.com/apache/spark/commit/824d86809c509bb21f77a6ffbbebdbaa37a24701).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


SparkQA removed a comment on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713102719


   **[Test build #130053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130053/testReport)** for PR 30110 at commit [`824d868`](https://github.com/apache/spark/commit/824d86809c509bb21f77a6ffbbebdbaa37a24701).






[GitHub] [spark] SparkQA commented on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


SparkQA commented on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713176477


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34667/
   






[GitHub] [spark] SparkQA commented on pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


SparkQA commented on pull request #30111:
URL: https://github.com/apache/spark/pull/30111#issuecomment-713173956


   **[Test build #130060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130060/testReport)** for PR 30111 at commit [`a0c8ff6`](https://github.com/apache/spark/commit/a0c8ff6538b0ea3da8d99c23c1a003b959409975).






[GitHub] [spark] SparkQA commented on pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


SparkQA commented on pull request #30093:
URL: https://github.com/apache/spark/pull/30093#issuecomment-713173560


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34665/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30093:
URL: https://github.com/apache/spark/pull/30093#issuecomment-713173583










[GitHub] [spark] BryanCutler opened a new pull request #30111: [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow

2020-10-20 Thread GitBox


BryanCutler opened a new pull request #30111:
URL: https://github.com/apache/spark/pull/30111


   
   
   ### What changes were proposed in this pull request?
   
   Add an environment variable `PYARROW_IGNORE_TIMEZONE` to pyspark tests in run-tests.py to use legacy nested timestamp behavior. This means that when converting arrow to pandas, nested timestamps with timezones will have the timezone localized during conversion.
   
   ### Why are the changes needed?
   
   The default behavior was changed in PyArrow 2.0.0 to propagate timezone information. Using the environment variable enables testing with newer versions of pyarrow until the issue can be fixed in SPARK-32285.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Existing tests
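
   The environment-variable approach described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from run-tests.py: the helper names `build_test_env` and `run_with_legacy_timestamps` are hypothetical, and the value `"1"` is an assumption about how the variable is enabled.

```python
import os
import subprocess
import sys


def build_test_env(base_env=None):
    """Return a copy of the environment with PYARROW_IGNORE_TIMEZONE set.

    Hypothetical helper; the value "1" is an assumed way to enable the flag.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PYARROW_IGNORE_TIMEZONE"] = "1"
    return env


def run_with_legacy_timestamps(args):
    """Run a command in a child process that inherits the modified environment."""
    return subprocess.run(
        args, env=build_test_env(), capture_output=True, text=True, check=True
    )
```

   Copying the environment instead of mutating `os.environ` keeps the setting scoped to the child test processes, so the parent process is unaffected.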






[GitHub] [spark] SparkQA commented on pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


SparkQA commented on pull request #30093:
URL: https://github.com/apache/spark/pull/30093#issuecomment-713171387


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34666/
   






[GitHub] [spark] Fokko commented on a change in pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-10-20 Thread GitBox


Fokko commented on a change in pull request #29122:
URL: https://github.com/apache/spark/pull/29122#discussion_r508872469



##
File path: python/pyspark/sql/avro/functions.py
##
@@ -26,7 +26,7 @@
 
 
 @since(3.0)
-def from_avro(data, jsonFormatSchema, options={}):
+def from_avro(data, jsonFormatSchema, options=None):

Review comment:
   Thanks! Yes, definitely a process that we have to go through. Let's break this up into smaller pieces.
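
   For readers unfamiliar with the issue behind the `options={}` to `options=None` change in the hunk above, here is a minimal sketch of the general Python pitfall. The function names are hypothetical and unrelated to the PySpark code; the sketch only illustrates why a mutable default is risky.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` mutates the same object.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Using None as a sentinel gives each call its own new list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

   Calling `append_shared(1)` and then `append_shared(2)` returns `[1]` and then `[1, 2]`, whereas the same two calls to `append_fresh` return `[1]` and `[2]`.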








[GitHub] [spark] Fokko commented on a change in pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-10-20 Thread GitBox


Fokko commented on a change in pull request #29122:
URL: https://github.com/apache/spark/pull/29122#discussion_r508871957



##
File path: python/pyspark/ml/regression.py
##
@@ -1654,7 +1656,7 @@ class _AFTSurvivalRegressionParams(_PredictorParams, HasMaxIter, HasTol, HasFitI
 def __init__(self, *args):
 super(_AFTSurvivalRegressionParams, self).__init__(*args)
 self._setDefault(censorCol="censor",
- quantileProbabilities=[0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99],
+ quantileProbabilities=DEFAULT_QUANTILE_PROBABILITIES,

Review comment:
   I've removed the default argument in the dead code.








[GitHub] [spark] AmplabJenkins commented on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


AmplabJenkins commented on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713168732










[GitHub] [spark] zero323 commented on a change in pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-10-20 Thread GitBox


zero323 commented on a change in pull request #29122:
URL: https://github.com/apache/spark/pull/29122#discussion_r508870060



##
File path: python/pyspark/sql/avro/functions.py
##
@@ -26,7 +26,7 @@
 
 
 @since(3.0)
-def from_avro(data, jsonFormatSchema, options={}):
+def from_avro(data, jsonFormatSchema, options=None):

Review comment:
   FYI I am trying to establish an optimal set of flags for the mypy config, but it is slowish work: unlike standalone stubs, we have to correct for internal modules, tests and such, and I'd prefer to avoid atomic options (ignoring all errors there).








[GitHub] [spark] SparkQA commented on pull request #30109: [MINOR][CORE] Improve log message during storage decommission

2020-10-20 Thread GitBox


SparkQA commented on pull request #30109:
URL: https://github.com/apache/spark/pull/30109#issuecomment-713167825


   **[Test build #130051 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130051/testReport)** for PR 30109 at commit [`b156618`](https://github.com/apache/spark/commit/b156618a58cbebb4ff25ea13c2f5f5f02ba82a88).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] zero323 commented on a change in pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-10-20 Thread GitBox


zero323 commented on a change in pull request #29122:
URL: https://github.com/apache/spark/pull/29122#discussion_r508868556



##
File path: python/pyspark/sql/avro/functions.py
##
@@ -26,7 +26,7 @@
 
 
 @since(3.0)
-def from_avro(data, jsonFormatSchema, options={}):
+def from_avro(data, jsonFormatSchema, options=None):

Review comment:
   >  Mypy has been added, and I would expect that this would be caught.
   
   As far as I recall this behavior depends on the MyPy flags. In particular see [Disabling strict optional checking](https://mypy.readthedocs.io/en/stable/kinds_of_types.html#no-strict-optional).
   
   So adding `no_implicit_optional` to `mypy.ini` should catch this.
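
   To illustrate what the `no_implicit_optional` flag mentioned above is about, here is a small sketch with hypothetical, annotated functions (the `from_avro` signature in the hunk is unannotated, so this is not the PySpark code). With the flag enabled, mypy should report the first signature, because its default is `None` while the annotation does not allow `None`; the second signature states the optionality explicitly.

```python
from typing import Dict, Optional


def options_implicit(options: Dict[str, str] = None) -> Dict[str, str]:
    # Annotation says Dict, but the default is None: this relies on
    # implicit Optional, which no_implicit_optional is meant to reject.
    return dict(options or {})


def options_explicit(options: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    # Optional[...] makes the None default part of the declared type.
    return dict(options or {})
```

   Both functions behave identically at runtime; the difference is visible only to the type checker.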











[GitHub] [spark] dongjoon-hyun closed pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


dongjoon-hyun closed pull request #30110:
URL: https://github.com/apache/spark/pull/30110


   






[GitHub] [spark] dongjoon-hyun commented on pull request #30110: [SPARK-33198][CORE] getMigrationBlocks should not fail at missing files

2020-10-20 Thread GitBox


dongjoon-hyun commented on pull request #30110:
URL: https://github.com/apache/spark/pull/30110#issuecomment-713165794


   GitHub Action and K8s IT passed. Merged to master.






[GitHub] [spark] SparkQA commented on pull request #30093: [SPARK-33183][SQL] Fix EliminateSorts bug when removing global sorts

2020-10-20 Thread GitBox


SparkQA commented on pull request #30093:
URL: https://github.com/apache/spark/pull/30093#issuecomment-713163872


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34665/
   





