[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866551970

**[Test build #140178 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140178/testReport)** for PR 32940 at commit [`6842958`](https://github.com/apache/spark/commit/684295860242099f71995e82713a2b6f6467dab1).

* This patch **fails Spark unit tests**.
* This patch **does not merge cleanly**.
* This patch adds no public classes.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk closed pull request #32970: [SPARK-35772][SQL][TESTS] Check all year-month interval types in `HiveInspectors` tests
MaxGekk closed pull request #32970:
URL: https://github.com/apache/spark/pull/32970
[GitHub] [spark] cloud-fan commented on a change in pull request #33006: [SPARK-35846][SQL] Introduce ParquetReadState to track various states while reading a Parquet column chunk
cloud-fan commented on a change in pull request #33006:
URL: https://github.com/apache/spark/pull/33006#discussion_r656777628

## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java

@@ -216,53 +195,49 @@ void readBatch(int total, WritableColumnVector column) throws IOException {
       boolean needTransform = castLongToInt || isUnsignedInt32 || isUnsignedInt64;
       column.setDictionary(new ParquetDictionary(dictionary, needTransform));
     } else {
-      updater.decodeDictionaryIds(num, rowId, column, dictionaryIds, dictionary);
+      updater.decodeDictionaryIds(readState.offset - startOffset, startOffset, column,
+        dictionaryIds, dictionary);
     }
   } else {
-    if (column.hasDictionary() && rowId != 0) {
+    if (column.hasDictionary() && readState.offset != 0) {
       // This batch already has dictionary encoded values but this new page is not. The batch
       // does not support a mix of dictionary and not so we will decode the dictionary.
-      updater.decodeDictionaryIds(rowId, 0, column, dictionaryIds, dictionary);
+      updater.decodeDictionaryIds(readState.offset, 0, column, dictionaryIds, dictionary);
     }
     column.setDictionary(null);
     VectorizedValuesReader valuesReader = (VectorizedValuesReader) dataColumn;
-    defColumn.readBatch(num, rowId, column, maxDefLevel, valuesReader, updater);
+    defColumn.readBatch(readState, column, valuesReader, updater);
   }
-
-  valuesRead += num;
-  rowId += num;
-  total -= num;
 }

- private void readPage() {
+ private int readPage() {
   DataPage page = pageReader.readPage();
-  // TODO: Why is this a visitor?
-  page.accept(new DataPage.Visitor<Void>() {
+  return page.accept(new DataPage.Visitor<Integer>() {
     @Override
-    public Void visit(DataPageV1 dataPageV1) {
+    public Integer visit(DataPageV1 dataPageV1) {

Review comment:
ah I see, let's leave it then.
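The refactor above replaces the loose `num` / `rowId` / `total` counters in `readBatch` with a single shared read-state object. As a rough illustration of that idea only (a hypothetical minimal class, not Spark's actual `ParquetReadState`), a batch read across several Parquet pages can advance one offset and one remaining-rows counter:

```java
// Hypothetical sketch of a read-state object for a columnar batch read.
// Each page consumes up to its own row count from the shared state, so the
// caller no longer juggles separate rowId / valuesRead / total variables.
class ReadState {
    int offset;        // next row index to write into the output column vector
    int rowsRemaining; // rows still to read for this batch

    ReadState(int startOffset, int totalRows) {
        this.offset = startOffset;
        this.rowsRemaining = totalRows;
    }

    /** Consume up to pageRows rows from the current page; returns rows consumed. */
    int advance(int pageRows) {
        int n = Math.min(pageRows, rowsRemaining);
        offset += n;
        rowsRemaining -= n;
        return n;
    }
}
```

For example, reading a 100-row batch from 60-row pages would consume 60 rows from the first page and 40 from the second, leaving `offset` at 100.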
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins removed a comment on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866545140

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44711/
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866545129

Kubernetes integration test unable to build dist. exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44711/
[GitHub] [spark] AmplabJenkins commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866545140

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44711/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins removed a comment on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866538615
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
AmplabJenkins removed a comment on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866542051

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44708/
[GitHub] [spark] SparkQA removed a comment on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA removed a comment on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866540521

**[Test build #140184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140184/testReport)** for PR 32940 at commit [`81ccf24`](https://github.com/apache/spark/commit/81ccf242571adc1ecbe83f347384f8c525e5b1e3).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins removed a comment on pull request #32970:
URL: https://github.com/apache/spark/pull/32970#issuecomment-866538613
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
AmplabJenkins removed a comment on pull request #33012:
URL: https://github.com/apache/spark/pull/33012#issuecomment-866538616

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140170/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
AmplabJenkins removed a comment on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866541492

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44707/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins removed a comment on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866538611
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866543462

**[Test build #140184 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140184/testReport)** for PR 32940 at commit [`81ccf24`](https://github.com/apache/spark/commit/81ccf242571adc1ecbe83f347384f8c525e5b1e3).

* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866543480

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140184/
[GitHub] [spark] AmplabJenkins commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins commented on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866542848

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44706/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866542834

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44706/
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866542035

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44708/
[GitHub] [spark] AmplabJenkins commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
AmplabJenkins commented on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866542051

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44708/
[GitHub] [spark] cloud-fan commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
cloud-fan commented on a change in pull request #32932:
URL: https://github.com/apache/spark/pull/32932#discussion_r656771644

## File path: docs/sql-performance-tuning.md

@@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is
 SELECT /*+ REPARTITION */ * FROM t
 SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t
 SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t
+SELECT /*+ REPARTITION_BY_AQE */ * FROM t

Review comment:
I like `REBALANCE_PARTITIONS` most, as this is a partition-level thing, not row-level.
[GitHub] [spark] SparkQA commented on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet
SparkQA commented on pull request #32049:
URL: https://github.com/apache/spark/pull/32049#issuecomment-866541781

**[Test build #140185 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140185/testReport)** for PR 32049 at commit [`564e6de`](https://github.com/apache/spark/commit/564e6de0718856f93588d0ed7350349de4269236).
[GitHub] [spark] SparkQA commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
SparkQA commented on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866541415

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44707/
[GitHub] [spark] AmplabJenkins commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
AmplabJenkins commented on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866541492

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44707/
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866540521

**[Test build #140184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140184/testReport)** for PR 32940 at commit [`81ccf24`](https://github.com/apache/spark/commit/81ccf242571adc1ecbe83f347384f8c525e5b1e3).
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866540234

**[Test build #140183 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140183/testReport)** for PR 32958 at commit [`1d08c71`](https://github.com/apache/spark/commit/1d08c71ebb2c6dc96ec02c637a7ba70a323c0eec).
[GitHub] [spark] SparkQA commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
SparkQA commented on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866539972

**[Test build #140182 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140182/testReport)** for PR 33027 at commit [`55891cf`](https://github.com/apache/spark/commit/55891cfc9df412ae34dda7d5f3f6c98c832f4c01).
[GitHub] [spark] AmplabJenkins commented on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
AmplabJenkins commented on pull request #33012:
URL: https://github.com/apache/spark/pull/33012#issuecomment-866538616

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140170/
[GitHub] [spark] AmplabJenkins commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins commented on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866538615

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140172/
[GitHub] [spark] AmplabJenkins commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866538611

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44705/
[GitHub] [spark] AmplabJenkins commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins commented on pull request #32970:
URL: https://github.com/apache/spark/pull/32970#issuecomment-866538613
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866532672

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44706/
[GitHub] [spark] MaxGekk commented on pull request #32985: [SPARK-35777][SQL][TEST] Check all year-month interval types in UDF
MaxGekk commented on pull request #32985:
URL: https://github.com/apache/spark/pull/32985#issuecomment-866532041

Let's merge https://github.com/apache/spark/pull/33035 before this, and come back to the PR later.
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866531973

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44708/
[GitHub] [spark] SparkQA commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
SparkQA commented on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866531552

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44707/
[GitHub] [spark] MaxGekk commented on a change in pull request #32985: [SPARK-35777][SQL][TEST] Check all year-month interval types in UDF
MaxGekk commented on a change in pull request #32985:
URL: https://github.com/apache/spark/pull/32985#discussion_r656766921

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala

@@ -175,6 +175,8 @@ object Cast {
     case (from: UserDefinedType[_], to: UserDefinedType[_]) if to.acceptsType(from) => true
+    case (_: YearMonthIntervalType, _: YearMonthIntervalType) => true

Review comment:
@AngersZh Thank you.
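The rule added above makes any year-month interval type castable to any other year-month interval type, whatever start/end fields each carries. A hypothetical Java sketch of that type-check shape (the real check is the Scala pattern match in `Cast.canCast`; the class below is illustrative, not Spark's type hierarchy):

```java
// Illustrative stand-in for Spark's YearMonthIntervalType: an interval type
// parameterized by its start and end fields (0 = YEAR, 1 = MONTH).
class YearMonthIntervalType {
    final byte startField;
    final byte endField;

    YearMonthIntervalType(byte startField, byte endField) {
        this.startField = startField;
        this.endField = endField;
    }
}

class CastCheck {
    // Mirrors the added rule: any year-month interval casts to any other,
    // regardless of field bounds; everything else falls through to other rules.
    static boolean canCast(Object from, Object to) {
        return from instanceof YearMonthIntervalType
            && to instanceof YearMonthIntervalType;
    }
}
```

So a cast from `INTERVAL YEAR` to `INTERVAL YEAR TO MONTH` is accepted by this rule, while a cast from a non-interval type is left to the remaining cases.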
[GitHub] [spark] otterc commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle
otterc commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r656761110

## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java

@@ -156,26 +157,31 @@ private AppShufflePartitionInfo getOrCreateAppShufflePartitionInfo(
 @VisibleForTesting
 AppShufflePartitionInfo newAppShufflePartitionInfo(
     AppShuffleId appShuffleId,
+    int shuffleSequenceId,
     int reduceId,
     File dataFile,
     File indexFile,
     File metaFile) throws IOException {
-  return new AppShufflePartitionInfo(appShuffleId, reduceId, dataFile,
+  return new AppShufflePartitionInfo(appShuffleId, shuffleSequenceId, reduceId, dataFile,
     new MergeShuffleFile(indexFile), new MergeShuffleFile(metaFile));
 }

 @Override
- public MergedBlockMeta getMergedBlockMeta(String appId, int shuffleId, int reduceId) {
+ public MergedBlockMeta getMergedBlockMeta(
+     String appId,
+     int shuffleId,
+     int shuffleSequenceId,
+     int reduceId) {
   AppShuffleId appShuffleId = new AppShuffleId(appId, shuffleId);
-  File indexFile = getMergedShuffleIndexFile(appShuffleId, reduceId);
+  File indexFile = getMergedShuffleIndexFile(appShuffleId, shuffleSequenceId, reduceId);

Review comment:
It seems you are changing the fetch-side protocols so that you can figure out the `shuffleSequenceId` here to find which files to use. I don't think we should change the fetch-side protocols if it's just for this reason. Is the request here ever going to be for an older shuffleSequenceId? If not, then you should try to figure out the latest shuffleSequenceId in `RemoteBlockPushResolver` rather than adding it to the protocol.
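The reviewer's alternative is for the resolver to remember the newest sequence id it has seen on the push path, so fetch requests need not carry one. A minimal hypothetical sketch of that bookkeeping (the names `SequenceTracker`, `recordPush`, and `latestSequenceId` are invented for illustration and are not the `RemoteBlockPushResolver` API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-(app, shuffle) tracker: pushes record the sequence id of the
// (possibly retried) stage attempt, and fetches resolve against the newest one,
// keeping shuffleSequenceId out of the fetch protocol entirely.
class SequenceTracker {
    private final Map<String, Integer> latest = new ConcurrentHashMap<>();

    private static String key(String appId, int shuffleId) {
        return appId + "_" + shuffleId;
    }

    /** Push path: keep the maximum sequence id seen for this shuffle. */
    void recordPush(String appId, int shuffleId, int shuffleSequenceId) {
        latest.merge(key(appId, shuffleId), shuffleSequenceId, Math::max);
    }

    /** Fetch path: no sequence id in the request; use the newest known (0 if none). */
    int latestSequenceId(String appId, int shuffleId) {
        return latest.getOrDefault(key(appId, shuffleId), 0);
    }
}
```

This works only under the reviewer's stated assumption that fetches never target an older `shuffleSequenceId`; if stale attempts could still be fetched, the id would have to travel in the protocol after all.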
[GitHub] [spark] viirya commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
viirya commented on a change in pull request #32932:
URL: https://github.com/apache/spark/pull/32932#discussion_r656765126

## File path: docs/sql-performance-tuning.md

@@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is
 SELECT /*+ REPARTITION */ * FROM t
 SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t
 SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t
+SELECT /*+ REPARTITION_BY_AQE */ * FROM t

Review comment:
`REBALANCE_OUTPUT` sounds good, or `REPARTITION_BY_AUTO`, `REBALANCE_PARTITION`, `REPARTITION_BY_REBALANCE`.
[GitHub] [spark] SparkQA removed a comment on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA removed a comment on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866473923 **[Test build #140172 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140172/testReport)** for PR 33028 at commit [`b8d48e2`](https://github.com/apache/spark/commit/b8d48e29c67ca30a866d2247dceee8618e16320c).
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866525959 **[Test build #140172 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140172/testReport)** for PR 33028 at commit [`b8d48e2`](https://github.com/apache/spark/commit/b8d48e29c67ca30a866d2247dceee8618e16320c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] otterc commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle
otterc commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r656760036 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java ## @@ -498,7 +501,7 @@ public boolean hasNext() { @Override public ManagedBuffer next() { ManagedBuffer block = Preconditions.checkNotNull(mergeManager.getMergedBlockData( -appId, shuffleId, reduceIds[reduceIdx], chunkIds[reduceIdx][chunkIdx])); +appId, shuffleId, shuffleSequenceId, reduceIds[reduceIdx], chunkIds[reduceIdx][chunkIdx])); Review comment: Same here. ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java ## @@ -476,12 +477,14 @@ public ManagedBuffer next() { private final String appId; private final int shuffleId; +private final int shuffleSequenceId; private final int[] reduceIds; private final int[][] chunkIds; ShuffleChunkManagedBufferIterator(FetchShuffleBlockChunks msg) { appId = msg.appId; shuffleId = msg.shuffleId; + shuffleSequenceId = msg.shuffleSequenceId; Review comment: Same here. ## File path: common/network-common/src/main/java/org/apache/spark/network/protocol/MergedBlockMetaRequest.java ## @@ -32,13 +32,20 @@ public final long requestId; public final String appId; public final int shuffleId; + public final int shuffleSequenceId; public final int reduceId; - public MergedBlockMetaRequest(long requestId, String appId, int shuffleId, int reduceId) { + public MergedBlockMetaRequest( + long requestId, + String appId, + int shuffleId, + int shuffleSequenceId, Review comment: Same here. Why do we need to modify the fetch side requests? 
## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -156,26 +157,31 @@ private AppShufflePartitionInfo getOrCreateAppShufflePartitionInfo( @VisibleForTesting AppShufflePartitionInfo newAppShufflePartitionInfo( AppShuffleId appShuffleId, + int shuffleSequenceId, int reduceId, File dataFile, File indexFile, File metaFile) throws IOException { -return new AppShufflePartitionInfo(appShuffleId, reduceId, dataFile, +return new AppShufflePartitionInfo(appShuffleId, shuffleSequenceId, reduceId, dataFile, new MergeShuffleFile(indexFile), new MergeShuffleFile(metaFile)); } @Override - public MergedBlockMeta getMergedBlockMeta(String appId, int shuffleId, int reduceId) { + public MergedBlockMeta getMergedBlockMeta( + String appId, + int shuffleId, + int shuffleSequenceId, + int reduceId) { AppShuffleId appShuffleId = new AppShuffleId(appId, shuffleId); -File indexFile = getMergedShuffleIndexFile(appShuffleId, reduceId); +File indexFile = getMergedShuffleIndexFile(appShuffleId, shuffleSequenceId, reduceId); Review comment: It seems you are changing the fetch side protocols so that you can figure out the `shuffleSequenceId` here to find which files to use. I don't think we should change the fetch side protocols if it's just for this reason. Instead there should be some logic in the `RemoteBlockPushResolver` to know which is the latest shuffle sequence Id.
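A minimal, hypothetical sketch of the alternative otterc suggests — the merge server tracking the latest `shuffleSequenceId` itself instead of receiving it over the fetch protocol. `LatestSequenceTracker` and its method names are illustrative, not part of `RemoteBlockPushResolver`:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch (not Spark source): the merge server records the highest
// shuffleSequenceId seen per (appId, shuffleId) as pushes arrive, so the
// fetch-side requests do not need to carry the sequence id themselves.
class LatestSequenceTracker {
  private final Map<String, Integer> latest = new ConcurrentHashMap<>();

  private static String key(String appId, int shuffleId) {
    return appId + "_" + shuffleId;
  }

  // Called on the push path; retains the maximum sequence id per shuffle.
  void recordPush(String appId, int shuffleId, int seqId) {
    latest.merge(key(appId, shuffleId), seqId, Math::max);
  }

  // Called on the fetch path to resolve which merged files to serve;
  // returns -1 if no push has been recorded for this shuffle yet.
  int latestFor(String appId, int shuffleId) {
    return latest.getOrDefault(key(appId, shuffleId), -1);
  }
}
```

With this shape, a stage retry that pushes with a higher sequence id automatically supersedes the older merged files, and the fetch-side messages can stay unchanged.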
[GitHub] [spark] SparkQA removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA removed a comment on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866475402 **[Test build #140175 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140175/testReport)** for PR 32970 at commit [`4c74dde`](https://github.com/apache/spark/commit/4c74dde8f5f1789d7387287121f9b8f745204491).
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866523295 **[Test build #140175 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140175/testReport)** for PR 32970 at commit [`4c74dde`](https://github.com/apache/spark/commit/4c74dde8f5f1789d7387287121f9b8f745204491). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] MaxGekk closed pull request #33031: [SPARK-35734][SQL][FOLLOWUP] IntervalUtils.toDayTimeIntervalString should consider the case a day-time type is casted as another day-time type
MaxGekk closed pull request #33031: URL: https://github.com/apache/spark/pull/33031
[GitHub] [spark] cloud-fan commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distinguish if AQE can optimize safely
cloud-fan commented on a change in pull request #32932: URL: https://github.com/apache/spark/pull/32932#discussion_r656760225 ## File path: docs/sql-performance-tuning.md ## @@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is SELECT /*+ REPARTITION */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t +SELECT /*+ REPARTITION_BY_AQE */ * FROM t Review comment: or just `REBALANCE_OUTPUT`?
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AngersZhuuuu commented on a change in pull request #32940: URL: https://github.com/apache/spark/pull/32940#discussion_r656759755 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ## @@ -654,6 +654,33 @@ class CastSuite extends CastSuiteBase { } } + test("SPARK-35768: Take into account year-month interval fields in cast") { +Seq(("1-1", YearMonthIntervalType(YEAR, YEAR), 12, 12, 12), + ("1-1", YearMonthIntervalType(YEAR, MONTH), 13, 12, 13), + ("1-1", YearMonthIntervalType(MONTH, MONTH), 13, 12, 13), + ("-1-1", YearMonthIntervalType(YEAR, YEAR), -12, -12, -12), + ("-1-1", YearMonthIntervalType(YEAR, MONTH), -13, -12, -13), + ("-1-1", YearMonthIntervalType(MONTH, MONTH), -13, -12, -13)) + .foreach { case (str, dataType, ym, year, month) => +checkEvaluation(cast(Literal.create(str), dataType), ym) +checkEvaluation(cast(Literal.create(s"INTERVAL '$str' YEAR TO MONTH"), dataType), ym) +checkEvaluation(cast(Literal.create(s"INTERVAL -'$str' YEAR TO MONTH"), dataType), -ym) +checkEvaluation(cast(Literal.create(s"INTERVAL '$str' YEAR"), dataType), year) Review comment: Done
[GitHub] [spark] otterc commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle
otterc commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r656759577 ## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java ## @@ -222,7 +223,7 @@ public void sendMergedBlockMetaReq( handler.addRpcRequest(requestId, callback); RpcChannelListener listener = new RpcChannelListener(requestId, callback); channel.writeAndFlush( - new MergedBlockMetaRequest(requestId, appId, shuffleId, reduceId)).addListener(listener); + new MergedBlockMetaRequest(requestId, appId, shuffleId, shuffleSequenceId, reduceId)).addListener(listener); Review comment: Why do we need to modify the fetch side requests? When would this ever request shuffle data for an older shuffleSequenceId?
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940: URL: https://github.com/apache/spark/pull/32940#issuecomment-866521057 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44705/
[GitHub] [spark] SparkQA removed a comment on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
SparkQA removed a comment on pull request #33012: URL: https://github.com/apache/spark/pull/33012#issuecomment-866457393 **[Test build #140170 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140170/testReport)** for PR 33012 at commit [`b24f40d`](https://github.com/apache/spark/commit/b24f40de3ab625c9b9058a5490be55c6ce26c392).
[GitHub] [spark] SparkQA removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA removed a comment on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866476891 **[Test build #140176 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140176/testReport)** for PR 32970 at commit [`79486b3`](https://github.com/apache/spark/commit/79486b309c69d30d244252015443cfeaac31a8a4).
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866518457 **[Test build #140176 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140176/testReport)** for PR 32970 at commit [`79486b3`](https://github.com/apache/spark/commit/79486b309c69d30d244252015443cfeaac31a8a4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
SparkQA commented on pull request #33012: URL: https://github.com/apache/spark/pull/33012#issuecomment-866518270 **[Test build #140170 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140170/testReport)** for PR 33012 at commit [`b24f40d`](https://github.com/apache/spark/commit/b24f40de3ab625c9b9058a5490be55c6ce26c392). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940: URL: https://github.com/apache/spark/pull/32940#issuecomment-866517460 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44705/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
AmplabJenkins removed a comment on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866517192 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140171/
[GitHub] [spark] AmplabJenkins commented on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
AmplabJenkins commented on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866517192 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140171/
[GitHub] [spark] ulysses-you commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distinguish if AQE can optimize safely
ulysses-you commented on a change in pull request #32932: URL: https://github.com/apache/spark/pull/32932#discussion_r656754576 ## File path: docs/sql-performance-tuning.md ## @@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is SELECT /*+ REPARTITION */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t +SELECT /*+ REPARTITION_BY_AQE */ * FROM t Review comment: Good point. It seems none of the existing user-facing SQL configs mention `output partitions`. How about `REBALANCE_SHUFFLE_PARTITIONS`?
[GitHub] [spark] SparkQA removed a comment on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
SparkQA removed a comment on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866457489 **[Test build #140171 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140171/testReport)** for PR 32933 at commit [`f8f9d20`](https://github.com/apache/spark/commit/f8f9d20690838c725d0f657832257b62f3caf19e).
[GitHub] [spark] SparkQA commented on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
SparkQA commented on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866516634 **[Test build #140171 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140171/testReport)** for PR 32933 at commit [`f8f9d20`](https://github.com/apache/spark/commit/f8f9d20690838c725d0f657832257b62f3caf19e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on a change in pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
cloud-fan commented on a change in pull request #33027: URL: https://github.com/apache/spark/pull/33027#discussion_r656753977 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -1942,16 +1942,18 @@ """, since = "1.0.0", group = "conversion_funcs") -case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String] = None) +case class Cast( +child: Expression, +dataType: DataType, +timeZoneId: Option[String] = None, +override val ansiEnabled: Boolean = SQLConf.get.ansiEnabled) extends CastBase { Review comment: I checked `Add`; I think we should also add a new `def this` that allows omitting the `ansiEnabled` parameter, for a bit more source compatibility with call sites such as `new Cast(...)`.
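cloud-fan's suggestion is a secondary constructor (`def this` in Scala) so existing three-argument `new Cast(...)` call sites keep compiling after the fourth parameter is added. A hedged Java analogue of the same pattern — a constructor overload delegating to the primary constructor; the names here are simplified stand-ins, and the real Scala class would default to `SQLConf.get.ansiEnabled` rather than a literal:

```java
// Illustrative analogue of Scala's `def this`: a constructor overload keeps
// three-argument `new Cast(...)` call sites source-compatible after a fourth
// parameter is added. Field types are simplified stand-ins, not Spark's.
class Cast {
  final String child;        // stand-in for Expression
  final String dataType;     // stand-in for DataType
  final String timeZoneId;   // may be null, like Option[String] = None
  final boolean ansiEnabled;

  // Primary constructor: all four parameters, including the new ANSI flag.
  Cast(String child, String dataType, String timeZoneId, boolean ansiEnabled) {
    this.child = child;
    this.dataType = dataType;
    this.timeZoneId = timeZoneId;
    this.ansiEnabled = ansiEnabled;
  }

  // The overload cloud-fan asks for: omits ansiEnabled and delegates to the
  // primary constructor (Spark would read the default from SQLConf instead).
  Cast(String child, String dataType, String timeZoneId) {
    this(child, dataType, timeZoneId, false);
  }
}
```

The point of the overload is that callers compiled against the old signature need no source changes, while new callers can pass the flag explicitly.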
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins removed a comment on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866513004 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44703/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
AmplabJenkins removed a comment on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866515261 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140181/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins removed a comment on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866513003 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44699/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33035: [SPARK-35860][SQL] Support UpCast between different field of YearMonthIntervalType/DayTimeIntervalType
AmplabJenkins removed a comment on pull request #33035: URL: https://github.com/apache/spark/pull/33035#issuecomment-866514425 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44704/
[GitHub] [spark] SparkQA removed a comment on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA removed a comment on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866514437 **[Test build #140181 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140181/testReport)** for PR 32958 at commit [`9b1dafc`](https://github.com/apache/spark/commit/9b1dafc5d47d51cd0f7524a24653ca19383b934a).
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866515250 **[Test build #140181 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140181/testReport)** for PR 32958 at commit [`9b1dafc`](https://github.com/apache/spark/commit/9b1dafc5d47d51cd0f7524a24653ca19383b934a). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
AmplabJenkins commented on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866515261 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140181/
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866514437 **[Test build #140181 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140181/testReport)** for PR 32958 at commit [`9b1dafc`](https://github.com/apache/spark/commit/9b1dafc5d47d51cd0f7524a24653ca19383b934a).
[GitHub] [spark] AmplabJenkins commented on pull request #33035: [SPARK-35860][SQL] Support UpCast between different field of YearMonthIntervalType/DayTimeIntervalType
AmplabJenkins commented on pull request #33035: URL: https://github.com/apache/spark/pull/33035#issuecomment-866514425 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44704/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866514366 **[Test build #140179 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140179/testReport)** for PR 33028 at commit [`6f68b48`](https://github.com/apache/spark/commit/6f68b48701e37b24fa3eb2a64cc8bcf3ee5444e9).
[GitHub] [spark] SparkQA commented on pull request #33035: [SPARK-35860][SQL] Support UpCast between different field of YearMonthIntervalType/DayTimeIntervalType
SparkQA commented on pull request #33035: URL: https://github.com/apache/spark/pull/33035#issuecomment-866514410 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44704/
[GitHub] [spark] SparkQA commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
SparkQA commented on pull request #33027: URL: https://github.com/apache/spark/pull/33027#issuecomment-866514350 **[Test build #140180 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140180/testReport)** for PR 33027 at commit [`07ed68a`](https://github.com/apache/spark/commit/07ed68a518f88d20ef67228fd1ad1f3e3b2f66d9).
[GitHub] [spark] AmplabJenkins commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866513004 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44703/
[GitHub] [spark] AmplabJenkins commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866513003 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44699/
[GitHub] [spark] SparkQA commented on pull request #33035: [SPARK-35860][SQL] Support UpCast between different field of YearMonthIntervalType/DayTimeIntervalType
SparkQA commented on pull request #33035: URL: https://github.com/apache/spark/pull/33035#issuecomment-866511319 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44704/
[GitHub] [spark] Ngone51 commented on pull request #32790: [SPARK-35543][CORE] Fix memory leak in BlockManagerMasterEndpoint removeRdd
Ngone51 commented on pull request #32790: URL: https://github.com/apache/spark/pull/32790#issuecomment-866510223 @mridulm Interesting example! I also investigated this a bit. It looks like all the elements of the underlying array are nulled out while the array reference itself is left unchanged, so the array's size doesn't change. The memory usage shouldn't be the same as before, but it isn't zero either, since the null-filled backing array still takes some memory.

```java
public void clear() {
    Node<K,V>[] tab;
    modCount++;
    if ((tab = table) != null && size > 0) {
        size = 0;
        for (int i = 0; i < tab.length; ++i)
            tab[i] = null;
    }
}
```
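The cleared-but-still-referenced pattern discussed above can be demonstrated with a minimal sketch. Python dicts stand in for the Java `HashMap`s here, and all names (`block_status_by_service`, `"exec-1"`, `"rdd_0_0"`) are illustrative rather than Spark's actual fields:

```python
# Sketch of the leak pattern: an inner map is cleared, but the outer map
# still holds a reference to it, so the entry is never reclaimed.

# outer map: shuffle-service/executor id -> (block id -> block status)
block_status_by_service = {}

# register a cached RDD block for a hypothetical executor
inner = block_status_by_service.setdefault("exec-1", {})
inner["rdd_0_0"] = "CACHED"

# removing the RDD empties the inner map ...
inner.clear()

# ... but the outer map still references the (now empty) inner map,
# so neither the key nor the inner map's backing storage is released.
assert "exec-1" in block_status_by_service
assert block_status_by_service["exec-1"] == {}
```

The same shape arises with `HashMap.clear()` in Java: clearing nulls the slots of the backing array but keeps the array (and the map object) alive as long as something else references it.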
[GitHub] [spark] cloud-fan commented on a change in pull request #33023: [SPARK-35812][PYTHON] Throw ValueError if version and timestamp are used together in to_delta
cloud-fan commented on a change in pull request #33023: URL: https://github.com/apache/spark/pull/33023#discussion_r656747652 ## File path: python/pyspark/pandas/namespace.py ## @@ -562,6 +562,8 @@ def read_delta( 3 13 4 14 """ +if version is not None and timestamp is not None: +raise ValueError("version and timestamp cannot be used together.") Review comment: How about updating the documentation as well?
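The guard in the quoted diff can be exercised on its own. The function below is a hypothetical stand-in for `pyspark.pandas.read_delta` showing only the mutual-exclusion check, not the real reader logic:

```python
def read_delta(path, version=None, timestamp=None):
    """Hypothetical stand-in for pyspark.pandas.read_delta, reduced to the
    mutual-exclusion guard added in the PR."""
    if version is not None and timestamp is not None:
        raise ValueError("version and timestamp cannot be used together.")
    # ... actual Delta read logic elided ...
    return {"path": path, "version": version, "timestamp": timestamp}

# Either option alone is accepted; both together is rejected.
read_delta("/data/t", version=3)
read_delta("/data/t", timestamp="2021-06-23")
try:
    read_delta("/data/t", version=3, timestamp="2021-06-23")
    raise AssertionError("expected ValueError")
except ValueError as e:
    assert "cannot be used together" in str(e)
```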
[GitHub] [spark] beliefer commented on a change in pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
beliefer commented on a change in pull request #32958: URL: https://github.com/apache/spark/pull/32958#discussion_r656745122 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala ## @@ -1422,4 +1426,251 @@ object QueryExecutionErrors { def invalidStreamingOutputModeError(outputMode: Option[OutputMode]): Throwable = { new UnsupportedOperationException(s"Invalid output mode: $outputMode") } + + def multiFailuresInStageMaterializationError(error: Throwable): Throwable = { +new SparkException("Multiple failures in stage materialization.", error) + } + + def unrecognizedCompressionSchemaTypeIDError(typeId: Int): Throwable = { +new UnsupportedOperationException(s"Unrecognized compression scheme type ID: $typeId") + } + + def getParentLoggerNotImplementedError(className: String): Throwable = { +new SQLFeatureNotSupportedException(s"$className.getParentLogger is not yet implemented.") + } + + def cannotCreateParquetConverterForTypeError(t: DecimalType, parquetType: String): Throwable = { +new RuntimeException( + s""" + |Unable to create Parquet converter for ${t.typeName} + |whose Parquet type is $parquetType without decimal metadata. Please read this + |column/field as Spark BINARY type. + """.stripMargin.replaceAll("\n", " ")) + } + + def cannotCreateParquetConverterForDecimalTypeError( + t: DecimalType, parquetType: String): Throwable = { +new RuntimeException( + s""" + |Unable to create Parquet converter for decimal type ${t.json} whose Parquet type is + |$parquetType. Parquet DECIMAL type can only be backed by INT32, INT64, + |FIXED_LEN_BYTE_ARRAY, or BINARY. 
+ """.stripMargin.replaceAll("\n", " ")) + } + + def cannotCreateParquetConverterForDataTypeError( + t: DataType, parquetType: String): Throwable = { +new RuntimeException(s"Unable to create Parquet converter for data type ${t.json} " + + s"whose Parquet type is $parquetType") + } + + def cannotAddMultiPartitionsOnNonatomicPartitionTableError(tableName: String): Throwable = { +new UnsupportedOperationException( + s"Nonatomic partition table $tableName can not add multiple partitions.") + } + + def userSpecifiedSchemaUnsupportedByDataSourceError(provider: TableProvider): Throwable = { +new UnsupportedOperationException( + s"${provider.getClass.getSimpleName} source does not support user-specified schema.") + } + + def cannotDropMultiPartitionsOnNonatomicPartitionTableError(tableName: String): Throwable = { +new UnsupportedOperationException( + s"Nonatomic partition table $tableName can not drop multiple partitions.") + } + + def truncateMultiPartitionUnsupportedError(tableName: String): Throwable = { +new UnsupportedOperationException( + s"The table $tableName does not support truncation of multiple partition.") + } + + def overwriteTableByUnsupportedExpressionError(table: Table): Throwable = { +new SparkException(s"Table does not support overwrite by expression: $table") + } + + def dynamicPartitionOverwriteUnsupportedByTableError(table: Table): Throwable = { +new SparkException(s"Table does not support dynamic partition overwrite: $table") + } + + def failedMergingSchemaError(schema: StructType, e: SparkException): Throwable = { +new SparkException(s"Failed merging schema:\n${schema.treeString}", e) + } + + def cannotBroadcastExceedMaxTableRowsError( + maxBroadcastTableRows: Long, numRows: Long): Throwable = { +new SparkException( + s"Cannot broadcast the table over $maxBroadcastTableRows rows: $numRows rows") + } + + def cannotBroadcastExceedMaxTableBytesError( + maxBroadcastTableBytes: Long, dataSize: Long): Throwable = { +new SparkException("Cannot broadcast 
the table that is larger than" + + s" ${maxBroadcastTableBytes >> 30}GB: ${dataSize >> 30} GB") + } + + def notEnoughMemoryToBuildAndBroadcastTableError(oe: OutOfMemoryError): Throwable = { +new OutOfMemoryError("Not enough memory to build and broadcast the table to all " + + "worker nodes. As a workaround, you can either disable broadcast by setting " + + s"${SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key} to -1 or increase the spark " + + s"driver memory by setting ${SparkLauncher.DRIVER_MEMORY} to a higher value.") + .initCause(oe.getCause) + } + + def executeUnsupportedByExecError(execName: String): Throwable = { +new UnsupportedOperationException(s"$execName does not support the execute() code path.") + } + + def cannotMergeClassWithOtherClassError(className: String, otherClass: String): Throwable = { +new UnsupportedOperationException( + s"Cannot merge $className with $otherClass") + } + + def continuousProcessingUnsupportedByDataSourceError(sourceName: String): Throwable = { +new UnsupportedOperationException( + s"Data source
[GitHub] [spark] Ngone51 commented on a change in pull request #33020: [SPARK-35543][CORE][FOLLOWUP] Fix memory leak in BlockManagerMasterEndpoint removeRdd
Ngone51 commented on a change in pull request #33020: URL: https://github.com/apache/spark/pull/33020#discussion_r656745082 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala ## @@ -570,7 +565,7 @@ class BlockManagerMasterEndpoint( val externalShuffleServiceBlockStatus = if (externalShuffleServiceRddFetchEnabled) { val externalShuffleServiceBlocks = blockStatusByShuffleService -.getOrElseUpdate(externalShuffleServiceIdOnHost(id), new JHashMap[BlockId, BlockStatus]) +.getOrElseUpdate(externalShuffleServiceIdOnHost(id), new BlockStatusPerBlockId) Review comment: Seems like we never clear the key after this change. Could you add some comments (maybe here) to explain?
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866504186 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44703/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866503738 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44699/
[GitHub] [spark] cloud-fan commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
cloud-fan commented on a change in pull request #32932: URL: https://github.com/apache/spark/pull/32932#discussion_r656741164 ## File path: docs/sql-performance-tuning.md ## @@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is SELECT /*+ REPARTITION */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t +SELECT /*+ REPARTITION_BY_AQE */ * FROM t Review comment: Other repartition hints can also be optimized by AQE, so I think this name is not precise enough. The key point here is the user's intention. To optimize for data writing, we don't need a specific number of partitions, and we don't need a strict output partitioning (like partitioning by a column). We only need the output to be evenly distributed and, as a best effort, partitioned by some columns as much as possible. How about `REBALANCE_OUTPUT_PARTITIONS`?
[GitHub] [spark] Peng-Lei commented on a change in pull request #32931: [SPARK-33898][SQL] Support SHOW CREATE TABLE In V2
Peng-Lei commented on a change in pull request #32931: URL: https://github.com/apache/spark/pull/32931#discussion_r656740205 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowCreateTableExec.scala ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.datasources.v2 + +import scala.collection.mutable + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.expressions.Attribute +import org.apache.spark.sql.catalyst.util.escapeSingleQuotedString +import org.apache.spark.sql.connector.catalog.{CatalogV2Util, Table, TableCatalog} +import org.apache.spark.sql.execution.LeafExecNode +import org.apache.spark.unsafe.types.UTF8String + +/** + * Physical plan node for show create table. + */ +case class ShowCreateTableExec( +output: Seq[Attribute], +table: Table) extends V2CommandExec with LeafExecNode { + override protected def run(): Seq[InternalRow] = { +val builder = StringBuilder.newBuilder +// it is used to generate Spark DDL for given table. 
include Hive Serde table +showCreateTable(table, builder) +Seq(InternalRow(UTF8String.fromString(builder.toString))) + } + + private def showCreateTable(table: Table, builder: StringBuilder): Unit = { +builder ++= s"CREATE TABLE ${table.name()} " + +showTableDataColumns(table, builder) +showTableUsing(table, builder) +showTableOptions(table, builder) +showTablePartitioning(table, builder) +showTableComment(table, builder) +showTableLocation(table, builder) +showTableProperties(table, builder) + } + + private def showTableDataColumns(table: Table, builder: StringBuilder): Unit = { +val columns = table.schema().fields.map(_.toDDL) +builder ++= concatByMultiLines(columns) + } + + private def showTableUsing(table: Table, builder: StringBuilder): Unit = { +Option(table.properties.get(TableCatalog.PROP_PROVIDER)) + .map("USING " + escapeSingleQuotedString(_) + "\n") + .foreach(builder.append) + } + + private def showTableOptions(table: Table, builder: StringBuilder): Unit = { +import scala.collection.JavaConverters._ +val dataSourceOptions = table.properties.asScala + .filterKeys(_.startsWith(TableCatalog.OPTION_PREFIX)) +if (dataSourceOptions.nonEmpty) { + val props = dataSourceOptions.map { case (key, value) => +s"'${escapeSingleQuotedString(key)}' = '${escapeSingleQuotedString(value)}'" + } + + builder ++= "OPTIONS" + builder ++= concatByMultiLines(props) +} + } + + private def showTablePartitioning(table: Table, builder: StringBuilder): Unit = { +if (!table.partitioning.isEmpty) { + val transforms = new mutable.ArrayBuffer[String] + table.partitioning.foreach(t => transforms += t.describe()) + if (transforms.nonEmpty) { Review comment: yes -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [spark] Peng-Lei commented on a change in pull request #32931: [SPARK-33898][SQL] Support SHOW CREATE TABLE In V2
Peng-Lei commented on a change in pull request #32931: URL: https://github.com/apache/spark/pull/32931#discussion_r656740158 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala ## @@ -377,8 +377,11 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat case LoadData(_: ResolvedTable, _, _, _, _) => throw QueryCompilationErrors.loadDataNotSupportedForV2TablesError() -case ShowCreateTable(_: ResolvedTable, _, _) => - throw QueryCompilationErrors.showCreateTableNotSupportedForV2TablesError() Review comment: yes
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AngersZh commented on a change in pull request #32940: URL: https://github.com/apache/spark/pull/32940#discussion_r656739982 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ## @@ -654,6 +654,33 @@ class CastSuite extends CastSuiteBase { } } + test("SPARK-35768: Take into account year-month interval fields in cast") { +Seq(("1-1", YearMonthIntervalType(YEAR, YEAR), 12, 12, 12), + ("1-1", YearMonthIntervalType(YEAR, MONTH), 13, 12, 13), + ("1-1", YearMonthIntervalType(MONTH, MONTH), 13, 12, 13), Review comment: Done
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866501347 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44703/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866500374 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44699/
[GitHub] [spark] SparkQA commented on pull request #32993: [SPARK-35776][SQL] Check all year-month interval types in arrow
SparkQA commented on pull request #32993: URL: https://github.com/apache/spark/pull/32993#issuecomment-866500161 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44701/
[GitHub] [spark] AmplabJenkins commented on pull request #32993: [SPARK-35776][SQL] Check all year-month interval types in arrow
AmplabJenkins commented on pull request #32993: URL: https://github.com/apache/spark/pull/32993#issuecomment-866500178 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44701/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins removed a comment on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866499627 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44702/
[GitHub] [spark] AmplabJenkins commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866499627 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44702/
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866499604 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44702/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
AmplabJenkins removed a comment on pull request #32932: URL: https://github.com/apache/spark/pull/32932#issuecomment-866498911 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44700/
[GitHub] [spark] SparkQA commented on pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
SparkQA commented on pull request #32932: URL: https://github.com/apache/spark/pull/32932#issuecomment-866498895 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44700/
[GitHub] [spark] AmplabJenkins commented on pull request #32932: [SPARK-35786][SQL] Add a new operator to distinguish if AQE can optimize safely
AmplabJenkins commented on pull request #32932: URL: https://github.com/apache/spark/pull/32932#issuecomment-866498911 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44700/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32932: [SPARK-35786][SQL] Add a new operator to distinguish if AQE can optimize safely
AmplabJenkins removed a comment on pull request #32932: URL: https://github.com/apache/spark/pull/32932#issuecomment-866494187 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44696/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
AmplabJenkins removed a comment on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866494184 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44698/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33008: [WIP][SPARK-35801][SQL] Support DELETE operations that require rewriting data
AmplabJenkins removed a comment on pull request #33008: URL: https://github.com/apache/spark/pull/33008#issuecomment-866494185 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140167/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
AmplabJenkins removed a comment on pull request #33012: URL: https://github.com/apache/spark/pull/33012#issuecomment-866494186 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44697/
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866496586 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44702/
[GitHub] [spark] gengliangwang commented on a change in pull request #33022: [SPARK-35856][SQL][TESTS] Move new interval type test cases from CastSuite to CastBaseSuite
gengliangwang commented on a change in pull request #33022: URL: https://github.com/apache/spark/pull/33022#discussion_r656733692

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala
##
@@ -73,7 +73,9 @@ abstract class CastSuiteBase extends SparkFunSuite with ExpressionEvalHelper {
     }
   }

-  protected def isAlwaysNullable: Boolean = false
+  // Whether the test suite is for TryCast. If yes, there is no exceptions and the result is
+  // always nullable.
+  protected def isTryCast: Boolean = false

Review comment: +1, PR description updated.
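The review above replaces an `isAlwaysNullable` flag with `isTryCast` in the shared cast test suite base class, so that the TryCast suite can expect nulls where the Cast suite expects exceptions. As a hedged sketch of that pattern (all names except `isTryCast` are hypothetical, and this is plain Java rather than Spark's actual Scala test code):

```java
// Sketch only, not Spark's actual test code: a shared suite base exposes an
// overridable isTryCast flag that switches the expected failure behavior.
public class CastSuiteSketch {

    static class CastSuiteBase {
        // Whether the suite tests TryCast: no exceptions, always-nullable results.
        protected boolean isTryCast() { return false; }

        // Shared helper: evaluate a cast that is known to be invalid.
        // Returns null under TryCast semantics, throws under Cast semantics.
        String evalInvalidCast() {
            if (isTryCast()) {
                return null; // TryCast turns invalid input into null
            }
            throw new IllegalArgumentException("invalid cast"); // Cast throws
        }
    }

    static class TryCastSuite extends CastSuiteBase {
        @Override
        protected boolean isTryCast() { return true; }
    }

    public static void main(String[] args) {
        // TryCast path: the invalid cast yields null, never throws.
        if (new TryCastSuite().evalInvalidCast() != null) {
            throw new AssertionError("TryCast should return null");
        }

        // Cast path: the same invalid cast raises an error.
        boolean threw = false;
        try {
            new CastSuiteBase().evalInvalidCast();
        } catch (IllegalArgumentException e) {
            threw = true;
        }
        if (!threw) {
            throw new AssertionError("Cast should throw");
        }
        System.out.println("ok");
    }
}
```

The design point is that the base suite owns every shared test case, and a single overridden flag lets the TryCast subclass reuse them while flipping only the expected error behavior.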