[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866551970

**[Test build #140178 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140178/testReport)** for PR 32940 at commit [`6842958`](https://github.com/apache/spark/commit/684295860242099f71995e82713a2b6f6467dab1).

* This patch **fails Spark unit tests**.
* This patch **does not merge cleanly**.
* This patch adds no public classes.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk closed pull request #32970: [SPARK-35772][SQL][TESTS] Check all year-month interval types in `HiveInspectors` tests
MaxGekk closed pull request #32970:
URL: https://github.com/apache/spark/pull/32970
[GitHub] [spark] cloud-fan commented on a change in pull request #33006: [SPARK-35846][SQL] Introduce ParquetReadState to track various states while reading a Parquet column chunk
cloud-fan commented on a change in pull request #33006:
URL: https://github.com/apache/spark/pull/33006#discussion_r656777628

## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java

@@ -216,53 +195,49 @@ void readBatch(int total, WritableColumnVector column) throws IOException {
       boolean needTransform = castLongToInt || isUnsignedInt32 || isUnsignedInt64;
       column.setDictionary(new ParquetDictionary(dictionary, needTransform));
     } else {
-      updater.decodeDictionaryIds(num, rowId, column, dictionaryIds, dictionary);
+      updater.decodeDictionaryIds(readState.offset - startOffset, startOffset, column,
+        dictionaryIds, dictionary);
     }
   } else {
-    if (column.hasDictionary() && rowId != 0) {
+    if (column.hasDictionary() && readState.offset != 0) {
       // This batch already has dictionary encoded values but this new page is not. The batch
       // does not support a mix of dictionary and not so we will decode the dictionary.
-      updater.decodeDictionaryIds(rowId, 0, column, dictionaryIds, dictionary);
+      updater.decodeDictionaryIds(readState.offset, 0, column, dictionaryIds, dictionary);
     }
     column.setDictionary(null);
     VectorizedValuesReader valuesReader = (VectorizedValuesReader) dataColumn;
-    defColumn.readBatch(num, rowId, column, maxDefLevel, valuesReader, updater);
+    defColumn.readBatch(readState, column, valuesReader, updater);
   }
-
-  valuesRead += num;
-  rowId += num;
-  total -= num;
 }

- private void readPage() {
+ private int readPage() {
   DataPage page = pageReader.readPage();
-  // TODO: Why is this a visitor?
-  page.accept(new DataPage.Visitor<Void>() {
+  return page.accept(new DataPage.Visitor<Integer>() {
     @Override
-    public Void visit(DataPageV1 dataPageV1) {
+    public Integer visit(DataPageV1 dataPageV1) {

Review comment:
ah I see, let's leave it then.
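The refactor above replaces the loose `num` / `rowId` / `total` counters in `readBatch` with a single shared read-state object. As a rough illustration of that idea only (a hypothetical minimal class, not Spark's actual `ParquetReadState`), a batch read across several Parquet pages can advance one offset and one remaining-rows counter:

```java
// Hypothetical sketch of a read-state object for a columnar batch read.
// Each page consumes up to its own row count from the shared state, so the
// caller no longer juggles separate rowId / valuesRead / total variables.
class ReadState {
    int offset;        // next row index to write into the output column vector
    int rowsRemaining; // rows still to read for this batch

    ReadState(int startOffset, int totalRows) {
        this.offset = startOffset;
        this.rowsRemaining = totalRows;
    }

    /** Consume up to pageRows rows from the current page; returns rows consumed. */
    int advance(int pageRows) {
        int n = Math.min(pageRows, rowsRemaining);
        offset += n;
        rowsRemaining -= n;
        return n;
    }
}
```

For example, reading a 100-row batch from 60-row pages would consume 60 rows from the first page and 40 from the second, leaving `offset` at 100.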
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins removed a comment on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866545140

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44711/
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866545129

Kubernetes integration test unable to build dist. exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44711/
[GitHub] [spark] AmplabJenkins commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866545140

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44711/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins removed a comment on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866538615
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
AmplabJenkins removed a comment on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866542051

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44708/
[GitHub] [spark] SparkQA removed a comment on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA removed a comment on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866540521

**[Test build #140184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140184/testReport)** for PR 32940 at commit [`81ccf24`](https://github.com/apache/spark/commit/81ccf242571adc1ecbe83f347384f8c525e5b1e3).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins removed a comment on pull request #32970:
URL: https://github.com/apache/spark/pull/32970#issuecomment-866538613
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
AmplabJenkins removed a comment on pull request #33012:
URL: https://github.com/apache/spark/pull/33012#issuecomment-866538616

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140170/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
AmplabJenkins removed a comment on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866541492

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44707/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins removed a comment on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866538611
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866543462

**[Test build #140184 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140184/testReport)** for PR 32940 at commit [`81ccf24`](https://github.com/apache/spark/commit/81ccf242571adc1ecbe83f347384f8c525e5b1e3).

* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866543480

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140184/
[GitHub] [spark] AmplabJenkins commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins commented on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866542848

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44706/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866542834

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44706/
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866542035

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44708/
[GitHub] [spark] AmplabJenkins commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
AmplabJenkins commented on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866542051

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44708/
[GitHub] [spark] cloud-fan commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
cloud-fan commented on a change in pull request #32932:
URL: https://github.com/apache/spark/pull/32932#discussion_r656771644

## File path: docs/sql-performance-tuning.md

@@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is
 SELECT /*+ REPARTITION */ * FROM t
 SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t
 SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t
+SELECT /*+ REPARTITION_BY_AQE */ * FROM t

Review comment:
I like `REBALANCE_PARTITIONS` most, as this is a partition-level thing, not row-level.
[GitHub] [spark] SparkQA commented on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet
SparkQA commented on pull request #32049:
URL: https://github.com/apache/spark/pull/32049#issuecomment-866541781

**[Test build #140185 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140185/testReport)** for PR 32049 at commit [`564e6de`](https://github.com/apache/spark/commit/564e6de0718856f93588d0ed7350349de4269236).
[GitHub] [spark] SparkQA commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
SparkQA commented on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866541415

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44707/
[GitHub] [spark] AmplabJenkins commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
AmplabJenkins commented on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866541492

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44707/
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866540521

**[Test build #140184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140184/testReport)** for PR 32940 at commit [`81ccf24`](https://github.com/apache/spark/commit/81ccf242571adc1ecbe83f347384f8c525e5b1e3).
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866540234

**[Test build #140183 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140183/testReport)** for PR 32958 at commit [`1d08c71`](https://github.com/apache/spark/commit/1d08c71ebb2c6dc96ec02c637a7ba70a323c0eec).
[GitHub] [spark] SparkQA commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
SparkQA commented on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866539972

**[Test build #140182 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140182/testReport)** for PR 33027 at commit [`55891cf`](https://github.com/apache/spark/commit/55891cfc9df412ae34dda7d5f3f6c98c832f4c01).
[GitHub] [spark] AmplabJenkins commented on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
AmplabJenkins commented on pull request #33012:
URL: https://github.com/apache/spark/pull/33012#issuecomment-866538616

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140170/
[GitHub] [spark] AmplabJenkins commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins commented on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866538615

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140172/
[GitHub] [spark] AmplabJenkins commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AmplabJenkins commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-866538611

Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44705/
[GitHub] [spark] AmplabJenkins commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins commented on pull request #32970:
URL: https://github.com/apache/spark/pull/32970#issuecomment-866538613
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028:
URL: https://github.com/apache/spark/pull/33028#issuecomment-866532672

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44706/
[GitHub] [spark] MaxGekk commented on pull request #32985: [SPARK-35777][SQL][TEST] Check all year-month interval types in UDF
MaxGekk commented on pull request #32985:
URL: https://github.com/apache/spark/pull/32985#issuecomment-866532041

Let's merge https://github.com/apache/spark/pull/33035 before this, and come back to the PR later.
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958:
URL: https://github.com/apache/spark/pull/32958#issuecomment-866531973

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44708/
[GitHub] [spark] SparkQA commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
SparkQA commented on pull request #33027:
URL: https://github.com/apache/spark/pull/33027#issuecomment-866531552

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44707/
[GitHub] [spark] MaxGekk commented on a change in pull request #32985: [SPARK-35777][SQL][TEST] Check all year-month interval types in UDF
MaxGekk commented on a change in pull request #32985:
URL: https://github.com/apache/spark/pull/32985#discussion_r656766921

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala

@@ -175,6 +175,8 @@ object Cast {
     case (from: UserDefinedType[_], to: UserDefinedType[_]) if to.acceptsType(from) => true
+    case (_: YearMonthIntervalType, _: YearMonthIntervalType) => true

Review comment:
@AngersZh Thank you.
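The rule added above makes any year-month interval type castable to any other year-month interval type, whatever start/end fields each carries. A hypothetical Java sketch of that type-check shape (the real check is the Scala pattern match in `Cast.canCast`; the class below is illustrative, not Spark's type hierarchy):

```java
// Illustrative stand-in for Spark's YearMonthIntervalType: an interval type
// parameterized by its start and end fields (0 = YEAR, 1 = MONTH).
class YearMonthIntervalType {
    final byte startField;
    final byte endField;

    YearMonthIntervalType(byte startField, byte endField) {
        this.startField = startField;
        this.endField = endField;
    }
}

class CastCheck {
    // Mirrors the added rule: any year-month interval casts to any other,
    // regardless of field bounds; everything else falls through to other rules.
    static boolean canCast(Object from, Object to) {
        return from instanceof YearMonthIntervalType
            && to instanceof YearMonthIntervalType;
    }
}
```

So a cast from `INTERVAL YEAR` to `INTERVAL YEAR TO MONTH` is accepted by this rule, while a cast from a non-interval type is left to the remaining cases.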
[GitHub] [spark] otterc commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle
otterc commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r656761110

## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java

@@ -156,26 +157,31 @@ private AppShufflePartitionInfo getOrCreateAppShufflePartitionInfo(
 @VisibleForTesting
 AppShufflePartitionInfo newAppShufflePartitionInfo(
     AppShuffleId appShuffleId,
+    int shuffleSequenceId,
     int reduceId,
     File dataFile,
     File indexFile,
     File metaFile) throws IOException {
-  return new AppShufflePartitionInfo(appShuffleId, reduceId, dataFile,
+  return new AppShufflePartitionInfo(appShuffleId, shuffleSequenceId, reduceId, dataFile,
     new MergeShuffleFile(indexFile), new MergeShuffleFile(metaFile));
 }

 @Override
- public MergedBlockMeta getMergedBlockMeta(String appId, int shuffleId, int reduceId) {
+ public MergedBlockMeta getMergedBlockMeta(
+     String appId,
+     int shuffleId,
+     int shuffleSequenceId,
+     int reduceId) {
   AppShuffleId appShuffleId = new AppShuffleId(appId, shuffleId);
-  File indexFile = getMergedShuffleIndexFile(appShuffleId, reduceId);
+  File indexFile = getMergedShuffleIndexFile(appShuffleId, shuffleSequenceId, reduceId);

Review comment:
It seems you are changing the fetch-side protocols so that you can figure out the `shuffleSequenceId` here to find which files to use. I don't think we should change the fetch-side protocols if it's just for this reason. Is the request here ever going to be for an older shuffleSequenceId? If not, then you should try to figure out the latest shuffleSequenceId in `RemoteBlockPushResolver` rather than adding it to the protocol.
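The reviewer's alternative is for the resolver to remember the newest sequence id it has seen on the push path, so fetch requests need not carry one. A minimal hypothetical sketch of that bookkeeping (the names `SequenceTracker`, `recordPush`, and `latestSequenceId` are invented for illustration and are not the `RemoteBlockPushResolver` API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-(app, shuffle) tracker: pushes record the sequence id of the
// (possibly retried) stage attempt, and fetches resolve against the newest one,
// keeping shuffleSequenceId out of the fetch protocol entirely.
class SequenceTracker {
    private final Map<String, Integer> latest = new ConcurrentHashMap<>();

    private static String key(String appId, int shuffleId) {
        return appId + "_" + shuffleId;
    }

    /** Push path: keep the maximum sequence id seen for this shuffle. */
    void recordPush(String appId, int shuffleId, int shuffleSequenceId) {
        latest.merge(key(appId, shuffleId), shuffleSequenceId, Math::max);
    }

    /** Fetch path: no sequence id in the request; use the newest known (0 if none). */
    int latestSequenceId(String appId, int shuffleId) {
        return latest.getOrDefault(key(appId, shuffleId), 0);
    }
}
```

This works only under the reviewer's stated assumption that fetches never target an older `shuffleSequenceId`; if stale attempts could still be fetched, the id would have to travel in the protocol after all.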
[GitHub] [spark] viirya commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
viirya commented on a change in pull request #32932:
URL: https://github.com/apache/spark/pull/32932#discussion_r656765126

## File path: docs/sql-performance-tuning.md

@@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is
 SELECT /*+ REPARTITION */ * FROM t
 SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t
 SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t
+SELECT /*+ REPARTITION_BY_AQE */ * FROM t

Review comment:
`REBALANCE_OUTPUT` sounds good, or `REPARTITION_BY_AUTO`, `REBALANCE_PARTITION`, `REPARTITION_BY_REBALANCE`.
[GitHub] [spark] SparkQA removed a comment on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA removed a comment on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866473923 **[Test build #140172 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140172/testReport)** for PR 33028 at commit [`b8d48e2`](https://github.com/apache/spark/commit/b8d48e29c67ca30a866d2247dceee8618e16320c).
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866525959 **[Test build #140172 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140172/testReport)** for PR 33028 at commit [`b8d48e2`](https://github.com/apache/spark/commit/b8d48e29c67ca30a866d2247dceee8618e16320c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] otterc commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle
otterc commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r656760036 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java ## @@ -498,7 +501,7 @@ public boolean hasNext() { @Override public ManagedBuffer next() { ManagedBuffer block = Preconditions.checkNotNull(mergeManager.getMergedBlockData( -appId, shuffleId, reduceIds[reduceIdx], chunkIds[reduceIdx][chunkIdx])); +appId, shuffleId, shuffleSequenceId, reduceIds[reduceIdx], chunkIds[reduceIdx][chunkIdx])); Review comment: Same here. ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java ## @@ -476,12 +477,14 @@ public ManagedBuffer next() { private final String appId; private final int shuffleId; +private final int shuffleSequenceId; private final int[] reduceIds; private final int[][] chunkIds; ShuffleChunkManagedBufferIterator(FetchShuffleBlockChunks msg) { appId = msg.appId; shuffleId = msg.shuffleId; + shuffleSequenceId = msg.shuffleSequenceId; Review comment: Same here. ## File path: common/network-common/src/main/java/org/apache/spark/network/protocol/MergedBlockMetaRequest.java ## @@ -32,13 +32,20 @@ public final long requestId; public final String appId; public final int shuffleId; + public final int shuffleSequenceId; public final int reduceId; - public MergedBlockMetaRequest(long requestId, String appId, int shuffleId, int reduceId) { + public MergedBlockMetaRequest( + long requestId, + String appId, + int shuffleId, + int shuffleSequenceId, Review comment: Same here. Why do we need to modify the fetch side requests? 
## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -156,26 +157,31 @@ private AppShufflePartitionInfo getOrCreateAppShufflePartitionInfo( @VisibleForTesting AppShufflePartitionInfo newAppShufflePartitionInfo( AppShuffleId appShuffleId, + int shuffleSequenceId, int reduceId, File dataFile, File indexFile, File metaFile) throws IOException { -return new AppShufflePartitionInfo(appShuffleId, reduceId, dataFile, +return new AppShufflePartitionInfo(appShuffleId, shuffleSequenceId, reduceId, dataFile, new MergeShuffleFile(indexFile), new MergeShuffleFile(metaFile)); } @Override - public MergedBlockMeta getMergedBlockMeta(String appId, int shuffleId, int reduceId) { + public MergedBlockMeta getMergedBlockMeta( + String appId, + int shuffleId, + int shuffleSequenceId, + int reduceId) { AppShuffleId appShuffleId = new AppShuffleId(appId, shuffleId); -File indexFile = getMergedShuffleIndexFile(appShuffleId, reduceId); +File indexFile = getMergedShuffleIndexFile(appShuffleId, shuffleSequenceId, reduceId); Review comment: It seems you are changing the fetch side protocols so that you can figure out the `shuffleSequenceId` here to find which files to use. I don't think we should change the fetch side protocols if it's just for this reason. Instead there should be some logic in the `RemoteBlockPushResolver` to know which is the latest shuffle sequence Id.
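A minimal, hypothetical sketch of the alternative otterc suggests — the merge server tracking the latest `shuffleSequenceId` itself instead of receiving it over the fetch protocol. `LatestSequenceTracker` and its method names are illustrative, not part of `RemoteBlockPushResolver`:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch (not Spark source): the merge server records the highest
// shuffleSequenceId seen per (appId, shuffleId) as pushes arrive, so the
// fetch-side requests do not need to carry the sequence id themselves.
class LatestSequenceTracker {
  private final Map<String, Integer> latest = new ConcurrentHashMap<>();

  private static String key(String appId, int shuffleId) {
    return appId + "_" + shuffleId;
  }

  // Called on the push path; retains the maximum sequence id per shuffle.
  void recordPush(String appId, int shuffleId, int seqId) {
    latest.merge(key(appId, shuffleId), seqId, Math::max);
  }

  // Called on the fetch path to resolve which merged files to serve;
  // returns -1 if no push has been recorded for this shuffle yet.
  int latestFor(String appId, int shuffleId) {
    return latest.getOrDefault(key(appId, shuffleId), -1);
  }
}
```

With this shape, a stage retry that pushes with a higher sequence id automatically supersedes the older merged files, and the fetch-side messages can stay unchanged.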
[GitHub] [spark] SparkQA removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA removed a comment on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866475402 **[Test build #140175 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140175/testReport)** for PR 32970 at commit [`4c74dde`](https://github.com/apache/spark/commit/4c74dde8f5f1789d7387287121f9b8f745204491).
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866523295 **[Test build #140175 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140175/testReport)** for PR 32970 at commit [`4c74dde`](https://github.com/apache/spark/commit/4c74dde8f5f1789d7387287121f9b8f745204491). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] MaxGekk closed pull request #33031: [SPARK-35734][SQL][FOLLOWUP] IntervalUtils.toDayTimeIntervalString should consider the case a day-time type is casted as another day-time type
MaxGekk closed pull request #33031: URL: https://github.com/apache/spark/pull/33031
[GitHub] [spark] cloud-fan commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distinguish if AQE can optimize safely
cloud-fan commented on a change in pull request #32932: URL: https://github.com/apache/spark/pull/32932#discussion_r656760225 ## File path: docs/sql-performance-tuning.md ## @@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is SELECT /*+ REPARTITION */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t +SELECT /*+ REPARTITION_BY_AQE */ * FROM t Review comment: or just `REBALANCE_OUTPUT`?
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AngersZhuuuu commented on a change in pull request #32940: URL: https://github.com/apache/spark/pull/32940#discussion_r656759755 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ## @@ -654,6 +654,33 @@ class CastSuite extends CastSuiteBase { } } + test("SPARK-35768: Take into account year-month interval fields in cast") { +Seq(("1-1", YearMonthIntervalType(YEAR, YEAR), 12, 12, 12), + ("1-1", YearMonthIntervalType(YEAR, MONTH), 13, 12, 13), + ("1-1", YearMonthIntervalType(MONTH, MONTH), 13, 12, 13), + ("-1-1", YearMonthIntervalType(YEAR, YEAR), -12, -12, -12), + ("-1-1", YearMonthIntervalType(YEAR, MONTH), -13, -12, -13), + ("-1-1", YearMonthIntervalType(MONTH, MONTH), -13, -12, -13)) + .foreach { case (str, dataType, ym, year, month) => +checkEvaluation(cast(Literal.create(str), dataType), ym) +checkEvaluation(cast(Literal.create(s"INTERVAL '$str' YEAR TO MONTH"), dataType), ym) +checkEvaluation(cast(Literal.create(s"INTERVAL -'$str' YEAR TO MONTH"), dataType), -ym) +checkEvaluation(cast(Literal.create(s"INTERVAL '$str' YEAR"), dataType), year) Review comment: Done
[GitHub] [spark] otterc commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle
otterc commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r656759577 ## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java ## @@ -222,7 +223,7 @@ public void sendMergedBlockMetaReq( handler.addRpcRequest(requestId, callback); RpcChannelListener listener = new RpcChannelListener(requestId, callback); channel.writeAndFlush( - new MergedBlockMetaRequest(requestId, appId, shuffleId, reduceId)).addListener(listener); + new MergedBlockMetaRequest(requestId, appId, shuffleId, shuffleSequenceId, reduceId)).addListener(listener); Review comment: Why do we need to modify the fetch side requests? When would this ever request shuffle data for an older shuffleSequenceId?
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940: URL: https://github.com/apache/spark/pull/32940#issuecomment-866521057 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44705/
[GitHub] [spark] SparkQA removed a comment on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
SparkQA removed a comment on pull request #33012: URL: https://github.com/apache/spark/pull/33012#issuecomment-866457393 **[Test build #140170 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140170/testReport)** for PR 33012 at commit [`b24f40d`](https://github.com/apache/spark/commit/b24f40de3ab625c9b9058a5490be55c6ce26c392).
[GitHub] [spark] SparkQA removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA removed a comment on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866476891 **[Test build #140176 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140176/testReport)** for PR 32970 at commit [`79486b3`](https://github.com/apache/spark/commit/79486b309c69d30d244252015443cfeaac31a8a4).
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866518457 **[Test build #140176 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140176/testReport)** for PR 32970 at commit [`79486b3`](https://github.com/apache/spark/commit/79486b309c69d30d244252015443cfeaac31a8a4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
SparkQA commented on pull request #33012: URL: https://github.com/apache/spark/pull/33012#issuecomment-866518270 **[Test build #140170 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140170/testReport)** for PR 33012 at commit [`b24f40d`](https://github.com/apache/spark/commit/b24f40de3ab625c9b9058a5490be55c6ce26c392). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
SparkQA commented on pull request #32940: URL: https://github.com/apache/spark/pull/32940#issuecomment-866517460 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44705/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
AmplabJenkins removed a comment on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866517192 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140171/
[GitHub] [spark] AmplabJenkins commented on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
AmplabJenkins commented on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866517192 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140171/
[GitHub] [spark] ulysses-you commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distinguish if AQE can optimize safely
ulysses-you commented on a change in pull request #32932: URL: https://github.com/apache/spark/pull/32932#discussion_r656754576 ## File path: docs/sql-performance-tuning.md ## @@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is SELECT /*+ REPARTITION */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t +SELECT /*+ REPARTITION_BY_AQE */ * FROM t Review comment: Good point. It seems none of the existing user-facing SQL configs mention `output partitions`. How about `REBALANCE_SHUFFLE_PARTITIONS`?
[GitHub] [spark] SparkQA removed a comment on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
SparkQA removed a comment on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866457489 **[Test build #140171 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140171/testReport)** for PR 32933 at commit [`f8f9d20`](https://github.com/apache/spark/commit/f8f9d20690838c725d0f657832257b62f3caf19e).
[GitHub] [spark] SparkQA commented on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
SparkQA commented on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866516634 **[Test build #140171 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140171/testReport)** for PR 32933 at commit [`f8f9d20`](https://github.com/apache/spark/commit/f8f9d20690838c725d0f657832257b62f3caf19e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on a change in pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
cloud-fan commented on a change in pull request #33027: URL: https://github.com/apache/spark/pull/33027#discussion_r656753977 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -1942,16 +1942,18 @@ """, since = "1.0.0", group = "conversion_funcs") -case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String] = None) +case class Cast( +child: Expression, +dataType: DataType, +timeZoneId: Option[String] = None, +override val ansiEnabled: Boolean = SQLConf.get.ansiEnabled) extends CastBase { Review comment: I checked `Add`; I think we should also add a new `def this` that allows omitting the `ansiEnabled` parameter, for a bit more source compatibility with call sites such as `new Cast(...)`.
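cloud-fan's suggestion is a secondary constructor (`def this` in Scala) so existing three-argument `new Cast(...)` call sites keep compiling after the fourth parameter is added. A hedged Java analogue of the same pattern — a constructor overload delegating to the primary constructor; the names here are simplified stand-ins, and the real Scala class would default to `SQLConf.get.ansiEnabled` rather than a literal:

```java
// Illustrative analogue of Scala's `def this`: a constructor overload keeps
// three-argument `new Cast(...)` call sites source-compatible after a fourth
// parameter is added. Field types are simplified stand-ins, not Spark's.
class Cast {
  final String child;        // stand-in for Expression
  final String dataType;     // stand-in for DataType
  final String timeZoneId;   // may be null, like Option[String] = None
  final boolean ansiEnabled;

  // Primary constructor: all four parameters, including the new ANSI flag.
  Cast(String child, String dataType, String timeZoneId, boolean ansiEnabled) {
    this.child = child;
    this.dataType = dataType;
    this.timeZoneId = timeZoneId;
    this.ansiEnabled = ansiEnabled;
  }

  // The overload cloud-fan asks for: omits ansiEnabled and delegates to the
  // primary constructor (Spark would read the default from SQLConf instead).
  Cast(String child, String dataType, String timeZoneId) {
    this(child, dataType, timeZoneId, false);
  }
}
```

The point of the overload is that callers compiled against the old signature need no source changes, while new callers can pass the flag explicitly.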
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins removed a comment on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866513004 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44703/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
AmplabJenkins removed a comment on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866515261 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140181/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins removed a comment on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866513003 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44699/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33035: [SPARK-35860][SQL] Support UpCast between different field of YearMonthIntervalType/DayTimeIntervalType
AmplabJenkins removed a comment on pull request #33035: URL: https://github.com/apache/spark/pull/33035#issuecomment-866514425 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44704/
[GitHub] [spark] SparkQA removed a comment on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA removed a comment on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866514437 **[Test build #140181 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140181/testReport)** for PR 32958 at commit [`9b1dafc`](https://github.com/apache/spark/commit/9b1dafc5d47d51cd0f7524a24653ca19383b934a).
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866515250 **[Test build #140181 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140181/testReport)** for PR 32958 at commit [`9b1dafc`](https://github.com/apache/spark/commit/9b1dafc5d47d51cd0f7524a24653ca19383b934a). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
AmplabJenkins commented on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866515261 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140181/
[GitHub] [spark] SparkQA commented on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
SparkQA commented on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-866514437 **[Test build #140181 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140181/testReport)** for PR 32958 at commit [`9b1dafc`](https://github.com/apache/spark/commit/9b1dafc5d47d51cd0f7524a24653ca19383b934a).
[GitHub] [spark] AmplabJenkins commented on pull request #33035: [SPARK-35860][SQL] Support UpCast between different field of YearMonthIntervalType/DayTimeIntervalType
AmplabJenkins commented on pull request #33035: URL: https://github.com/apache/spark/pull/33035#issuecomment-866514425 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44704/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866514366 **[Test build #140179 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140179/testReport)** for PR 33028 at commit [`6f68b48`](https://github.com/apache/spark/commit/6f68b48701e37b24fa3eb2a64cc8bcf3ee5444e9).
[GitHub] [spark] SparkQA commented on pull request #33035: [SPARK-35860][SQL] Support UpCast between different field of YearMonthIntervalType/DayTimeIntervalType
SparkQA commented on pull request #33035: URL: https://github.com/apache/spark/pull/33035#issuecomment-866514410 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44704/
[GitHub] [spark] SparkQA commented on pull request #33027: [SPARK-35857][SQL] The ANSI flag of Cast should be kept after being copied
SparkQA commented on pull request #33027: URL: https://github.com/apache/spark/pull/33027#issuecomment-866514350 **[Test build #140180 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140180/testReport)** for PR 33027 at commit [`07ed68a`](https://github.com/apache/spark/commit/07ed68a518f88d20ef67228fd1ad1f3e3b2f66d9).
[GitHub] [spark] AmplabJenkins commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866513004 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44703/
[GitHub] [spark] AmplabJenkins commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866513003 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44699/
[GitHub] [spark] SparkQA commented on pull request #33035: [SPARK-35860][SQL] Support UpCast between different field of YearMonthIntervalType/DayTimeIntervalType
SparkQA commented on pull request #33035: URL: https://github.com/apache/spark/pull/33035#issuecomment-866511319 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44704/
[GitHub] [spark] Ngone51 commented on pull request #32790: [SPARK-35543][CORE] Fix memory leak in BlockManagerMasterEndpoint removeRdd
Ngone51 commented on pull request #32790: URL: https://github.com/apache/spark/pull/32790#issuecomment-866510223 @mridulm Interesting example! I also investigated this a bit. It looks like all the elements of the underlying array are nulled out while the array reference itself is left unchanged, so the array's size doesn't change. The memory usage shouldn't be the same as before, but it isn't zero either, since the null-filled backing array still takes some memory.

```java
public void clear() {
    Node<K,V>[] tab;
    modCount++;
    if ((tab = table) != null && size > 0) {
        size = 0;
        for (int i = 0; i < tab.length; ++i)
            tab[i] = null;
    }
}
```
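The cleared-but-still-referenced pattern discussed above can be demonstrated with a minimal sketch. Python dicts stand in for the Java `HashMap`s here, and all names (`block_status_by_service`, `"exec-1"`, `"rdd_0_0"`) are illustrative rather than Spark's actual fields:

```python
# Sketch of the leak pattern: an inner map is cleared, but the outer map
# still holds a reference to it, so the entry is never reclaimed.

# outer map: shuffle-service/executor id -> (block id -> block status)
block_status_by_service = {}

# register a cached RDD block for a hypothetical executor
inner = block_status_by_service.setdefault("exec-1", {})
inner["rdd_0_0"] = "CACHED"

# removing the RDD empties the inner map ...
inner.clear()

# ... but the outer map still references the (now empty) inner map,
# so neither the key nor the inner map's backing storage is released.
assert "exec-1" in block_status_by_service
assert block_status_by_service["exec-1"] == {}
```

The same shape arises with `HashMap.clear()` in Java: clearing nulls the slots of the backing array but keeps the array (and the map object) alive as long as something else references it.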
[GitHub] [spark] cloud-fan commented on a change in pull request #33023: [SPARK-35812][PYTHON] Throw ValueError if version and timestamp are used together in to_delta
cloud-fan commented on a change in pull request #33023: URL: https://github.com/apache/spark/pull/33023#discussion_r656747652 ## File path: python/pyspark/pandas/namespace.py ## @@ -562,6 +562,8 @@ def read_delta( 3 13 4 14 """ +if version is not None and timestamp is not None: +raise ValueError("version and timestamp cannot be used together.") Review comment: How about updating the documentation as well?
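The guard in the quoted diff can be exercised on its own. The function below is a hypothetical stand-in for `pyspark.pandas.read_delta` showing only the mutual-exclusion check, not the real reader logic:

```python
def read_delta(path, version=None, timestamp=None):
    """Hypothetical stand-in for pyspark.pandas.read_delta, reduced to the
    mutual-exclusion guard added in the PR."""
    if version is not None and timestamp is not None:
        raise ValueError("version and timestamp cannot be used together.")
    # ... actual Delta read logic elided ...
    return {"path": path, "version": version, "timestamp": timestamp}

# Either option alone is accepted; both together is rejected.
read_delta("/data/t", version=3)
read_delta("/data/t", timestamp="2021-06-23")
try:
    read_delta("/data/t", version=3, timestamp="2021-06-23")
    raise AssertionError("expected ValueError")
except ValueError as e:
    assert "cannot be used together" in str(e)
```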
[GitHub] [spark] beliefer commented on a change in pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)
beliefer commented on a change in pull request #32958: URL: https://github.com/apache/spark/pull/32958#discussion_r656745122 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala ## @@ -1422,4 +1426,251 @@ object QueryExecutionErrors { def invalidStreamingOutputModeError(outputMode: Option[OutputMode]): Throwable = { new UnsupportedOperationException(s"Invalid output mode: $outputMode") } + + def multiFailuresInStageMaterializationError(error: Throwable): Throwable = { +new SparkException("Multiple failures in stage materialization.", error) + } + + def unrecognizedCompressionSchemaTypeIDError(typeId: Int): Throwable = { +new UnsupportedOperationException(s"Unrecognized compression scheme type ID: $typeId") + } + + def getParentLoggerNotImplementedError(className: String): Throwable = { +new SQLFeatureNotSupportedException(s"$className.getParentLogger is not yet implemented.") + } + + def cannotCreateParquetConverterForTypeError(t: DecimalType, parquetType: String): Throwable = { +new RuntimeException( + s""" + |Unable to create Parquet converter for ${t.typeName} + |whose Parquet type is $parquetType without decimal metadata. Please read this + |column/field as Spark BINARY type. + """.stripMargin.replaceAll("\n", " ")) + } + + def cannotCreateParquetConverterForDecimalTypeError( + t: DecimalType, parquetType: String): Throwable = { +new RuntimeException( + s""" + |Unable to create Parquet converter for decimal type ${t.json} whose Parquet type is + |$parquetType. Parquet DECIMAL type can only be backed by INT32, INT64, + |FIXED_LEN_BYTE_ARRAY, or BINARY. 
+ """.stripMargin.replaceAll("\n", " ")) + } + + def cannotCreateParquetConverterForDataTypeError( + t: DataType, parquetType: String): Throwable = { +new RuntimeException(s"Unable to create Parquet converter for data type ${t.json} " + + s"whose Parquet type is $parquetType") + } + + def cannotAddMultiPartitionsOnNonatomicPartitionTableError(tableName: String): Throwable = { +new UnsupportedOperationException( + s"Nonatomic partition table $tableName can not add multiple partitions.") + } + + def userSpecifiedSchemaUnsupportedByDataSourceError(provider: TableProvider): Throwable = { +new UnsupportedOperationException( + s"${provider.getClass.getSimpleName} source does not support user-specified schema.") + } + + def cannotDropMultiPartitionsOnNonatomicPartitionTableError(tableName: String): Throwable = { +new UnsupportedOperationException( + s"Nonatomic partition table $tableName can not drop multiple partitions.") + } + + def truncateMultiPartitionUnsupportedError(tableName: String): Throwable = { +new UnsupportedOperationException( + s"The table $tableName does not support truncation of multiple partition.") + } + + def overwriteTableByUnsupportedExpressionError(table: Table): Throwable = { +new SparkException(s"Table does not support overwrite by expression: $table") + } + + def dynamicPartitionOverwriteUnsupportedByTableError(table: Table): Throwable = { +new SparkException(s"Table does not support dynamic partition overwrite: $table") + } + + def failedMergingSchemaError(schema: StructType, e: SparkException): Throwable = { +new SparkException(s"Failed merging schema:\n${schema.treeString}", e) + } + + def cannotBroadcastExceedMaxTableRowsError( + maxBroadcastTableRows: Long, numRows: Long): Throwable = { +new SparkException( + s"Cannot broadcast the table over $maxBroadcastTableRows rows: $numRows rows") + } + + def cannotBroadcastExceedMaxTableBytesError( + maxBroadcastTableBytes: Long, dataSize: Long): Throwable = { +new SparkException("Cannot broadcast 
the table that is larger than" + + s" ${maxBroadcastTableBytes >> 30}GB: ${dataSize >> 30} GB") + } + + def notEnoughMemoryToBuildAndBroadcastTableError(oe: OutOfMemoryError): Throwable = { +new OutOfMemoryError("Not enough memory to build and broadcast the table to all " + + "worker nodes. As a workaround, you can either disable broadcast by setting " + + s"${SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key} to -1 or increase the spark " + + s"driver memory by setting ${SparkLauncher.DRIVER_MEMORY} to a higher value.") + .initCause(oe.getCause) + } + + def executeUnsupportedByExecError(execName: String): Throwable = { +new UnsupportedOperationException(s"$execName does not support the execute() code path.") + } + + def cannotMergeClassWithOtherClassError(className: String, otherClass: String): Throwable = { +new UnsupportedOperationException( + s"Cannot merge $className with $otherClass") + } + + def continuousProcessingUnsupportedByDataSourceError(sourceName: String): Throwable = { +new UnsupportedOperationException( + s"Data source
[GitHub] [spark] Ngone51 commented on a change in pull request #33020: [SPARK-35543][CORE][FOLLOWUP] Fix memory leak in BlockManagerMasterEndpoint removeRdd
Ngone51 commented on a change in pull request #33020: URL: https://github.com/apache/spark/pull/33020#discussion_r656745082 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala ## @@ -570,7 +565,7 @@ class BlockManagerMasterEndpoint( val externalShuffleServiceBlockStatus = if (externalShuffleServiceRddFetchEnabled) { val externalShuffleServiceBlocks = blockStatusByShuffleService -.getOrElseUpdate(externalShuffleServiceIdOnHost(id), new JHashMap[BlockId, BlockStatus]) +.getOrElseUpdate(externalShuffleServiceIdOnHost(id), new BlockStatusPerBlockId) Review comment: Seems like we never clear the key after this change. Could you add some comments (maybe here) to explain?
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866504186 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44703/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866503738 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44699/
[GitHub] [spark] cloud-fan commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
cloud-fan commented on a change in pull request #32932: URL: https://github.com/apache/spark/pull/32932#discussion_r656741164 ## File path: docs/sql-performance-tuning.md ## @@ -228,6 +228,8 @@ The "REPARTITION_BY_RANGE" hint must have column names and a partition number is SELECT /*+ REPARTITION */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t +SELECT /*+ REPARTITION_BY_AQE */ * FROM t Review comment: Other repartition hints can also be optimized by AQE, so I think this name is not precise enough. The key point here is the user's intention. To optimize for data writing, we don't need a specific number of partitions, and we don't need a strict output partitioning (like partitioning by a column). We only need the output to be evenly distributed and, as a best effort, partitioned by some columns as much as possible. How about `REBALANCE_OUTPUT_PARTITIONS`?
[GitHub] [spark] Peng-Lei commented on a change in pull request #32931: [SPARK-33898][SQL] Support SHOW CREATE TABLE In V2
Peng-Lei commented on a change in pull request #32931: URL: https://github.com/apache/spark/pull/32931#discussion_r656740205 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowCreateTableExec.scala ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.datasources.v2 + +import scala.collection.mutable + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.expressions.Attribute +import org.apache.spark.sql.catalyst.util.escapeSingleQuotedString +import org.apache.spark.sql.connector.catalog.{CatalogV2Util, Table, TableCatalog} +import org.apache.spark.sql.execution.LeafExecNode +import org.apache.spark.unsafe.types.UTF8String + +/** + * Physical plan node for show create table. + */ +case class ShowCreateTableExec( +output: Seq[Attribute], +table: Table) extends V2CommandExec with LeafExecNode { + override protected def run(): Seq[InternalRow] = { +val builder = StringBuilder.newBuilder +// it is used to generate Spark DDL for given table. 
include Hive Serde table +showCreateTable(table, builder) +Seq(InternalRow(UTF8String.fromString(builder.toString))) + } + + private def showCreateTable(table: Table, builder: StringBuilder): Unit = { +builder ++= s"CREATE TABLE ${table.name()} " + +showTableDataColumns(table, builder) +showTableUsing(table, builder) +showTableOptions(table, builder) +showTablePartitioning(table, builder) +showTableComment(table, builder) +showTableLocation(table, builder) +showTableProperties(table, builder) + } + + private def showTableDataColumns(table: Table, builder: StringBuilder): Unit = { +val columns = table.schema().fields.map(_.toDDL) +builder ++= concatByMultiLines(columns) + } + + private def showTableUsing(table: Table, builder: StringBuilder): Unit = { +Option(table.properties.get(TableCatalog.PROP_PROVIDER)) + .map("USING " + escapeSingleQuotedString(_) + "\n") + .foreach(builder.append) + } + + private def showTableOptions(table: Table, builder: StringBuilder): Unit = { +import scala.collection.JavaConverters._ +val dataSourceOptions = table.properties.asScala + .filterKeys(_.startsWith(TableCatalog.OPTION_PREFIX)) +if (dataSourceOptions.nonEmpty) { + val props = dataSourceOptions.map { case (key, value) => +s"'${escapeSingleQuotedString(key)}' = '${escapeSingleQuotedString(value)}'" + } + + builder ++= "OPTIONS" + builder ++= concatByMultiLines(props) +} + } + + private def showTablePartitioning(table: Table, builder: StringBuilder): Unit = { +if (!table.partitioning.isEmpty) { + val transforms = new mutable.ArrayBuffer[String] + table.partitioning.foreach(t => transforms += t.describe()) + if (transforms.nonEmpty) { Review comment: yes -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [spark] Peng-Lei commented on a change in pull request #32931: [SPARK-33898][SQL] Support SHOW CREATE TABLE In V2
Peng-Lei commented on a change in pull request #32931: URL: https://github.com/apache/spark/pull/32931#discussion_r656740158 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala ## @@ -377,8 +377,11 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat case LoadData(_: ResolvedTable, _, _, _, _) => throw QueryCompilationErrors.loadDataNotSupportedForV2TablesError() -case ShowCreateTable(_: ResolvedTable, _, _) => - throw QueryCompilationErrors.showCreateTableNotSupportedForV2TablesError() Review comment: yes
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast
AngersZh commented on a change in pull request #32940: URL: https://github.com/apache/spark/pull/32940#discussion_r656739982 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ## @@ -654,6 +654,33 @@ class CastSuite extends CastSuiteBase { } } + test("SPARK-35768: Take into account year-month interval fields in cast") { +Seq(("1-1", YearMonthIntervalType(YEAR, YEAR), 12, 12, 12), + ("1-1", YearMonthIntervalType(YEAR, MONTH), 13, 12, 13), + ("1-1", YearMonthIntervalType(MONTH, MONTH), 13, 12, 13), Review comment: Done
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866501347 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44703/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-866500374 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44699/
[GitHub] [spark] SparkQA commented on pull request #32993: [SPARK-35776][SQL] Check all year-month interval types in arrow
SparkQA commented on pull request #32993: URL: https://github.com/apache/spark/pull/32993#issuecomment-866500161 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44701/
[GitHub] [spark] AmplabJenkins commented on pull request #32993: [SPARK-35776][SQL] Check all year-month interval types in arrow
AmplabJenkins commented on pull request #32993: URL: https://github.com/apache/spark/pull/32993#issuecomment-866500178 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44701/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins removed a comment on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866499627 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44702/
[GitHub] [spark] AmplabJenkins commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
AmplabJenkins commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866499627 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44702/
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866499604 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44702/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
AmplabJenkins removed a comment on pull request #32932: URL: https://github.com/apache/spark/pull/32932#issuecomment-866498911 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44700/
[GitHub] [spark] SparkQA commented on pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely
SparkQA commented on pull request #32932: URL: https://github.com/apache/spark/pull/32932#issuecomment-866498895 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44700/
[GitHub] [spark] AmplabJenkins commented on pull request #32932: [SPARK-35786][SQL] Add a new operator to distinguish if AQE can optimize safely
AmplabJenkins commented on pull request #32932: URL: https://github.com/apache/spark/pull/32932#issuecomment-866498911 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44700/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32932: [SPARK-35786][SQL] Add a new operator to distinguish if AQE can optimize safely
AmplabJenkins removed a comment on pull request #32932: URL: https://github.com/apache/spark/pull/32932#issuecomment-866494187 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44696/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32933: [WIP][SPARK-35785][SS] Cleanup support for RocksDB instance
AmplabJenkins removed a comment on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-866494184 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44698/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33008: [WIP][SPARK-35801][SQL] Support DELETE operations that require rewriting data
AmplabJenkins removed a comment on pull request #33008: URL: https://github.com/apache/spark/pull/33008#issuecomment-866494185 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140167/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
AmplabJenkins removed a comment on pull request #33012: URL: https://github.com/apache/spark/pull/33012#issuecomment-866494186 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44697/
[GitHub] [spark] SparkQA commented on pull request #32970: [SPARK-35772][SQL] Check all year-month interval types in HiveInspectors tests
SparkQA commented on pull request #32970: URL: https://github.com/apache/spark/pull/32970#issuecomment-866496586 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44702/
[GitHub] [spark] gengliangwang commented on a change in pull request #33022: [SPARK-35856][SQL][TESTS] Move new interval type test cases from CastSuite to CastBaseSuite
gengliangwang commented on a change in pull request #33022: URL: https://github.com/apache/spark/pull/33022#discussion_r656733692

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala
##
@@ -73,7 +73,9 @@ abstract class CastSuiteBase extends SparkFunSuite with ExpressionEvalHelper {
     }
   }

-  protected def isAlwaysNullable: Boolean = false
+  // Whether the test suite is for TryCast. If yes, there is no exceptions and the result is
+  // always nullable.
+  protected def isTryCast: Boolean = false

Review comment: +1, PR description updated.
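The review above replaces an `isAlwaysNullable` flag with `isTryCast` in the shared cast test suite base class, so that the TryCast suite can expect nulls where the Cast suite expects exceptions. As a hedged sketch of that pattern (all names except `isTryCast` are hypothetical, and this is plain Java rather than Spark's actual Scala test code):

```java
// Sketch only, not Spark's actual test code: a shared suite base exposes an
// overridable isTryCast flag that switches the expected failure behavior.
public class CastSuiteSketch {

    static class CastSuiteBase {
        // Whether the suite tests TryCast: no exceptions, always-nullable results.
        protected boolean isTryCast() { return false; }

        // Shared helper: evaluate a cast that is known to be invalid.
        // Returns null under TryCast semantics, throws under Cast semantics.
        String evalInvalidCast() {
            if (isTryCast()) {
                return null; // TryCast turns invalid input into null
            }
            throw new IllegalArgumentException("invalid cast"); // Cast throws
        }
    }

    static class TryCastSuite extends CastSuiteBase {
        @Override
        protected boolean isTryCast() { return true; }
    }

    public static void main(String[] args) {
        // TryCast path: the invalid cast yields null, never throws.
        if (new TryCastSuite().evalInvalidCast() != null) {
            throw new AssertionError("TryCast should return null");
        }

        // Cast path: the same invalid cast raises an error.
        boolean threw = false;
        try {
            new CastSuiteBase().evalInvalidCast();
        } catch (IllegalArgumentException e) {
            threw = true;
        }
        if (!threw) {
            throw new AssertionError("Cast should throw");
        }
        System.out.println("ok");
    }
}
```

The design point is that the base suite owns every shared test case, and a single overridden flag lets the TryCast subclass reuse them while flipping only the expected error behavior.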