[GitHub] [spark] holdenk commented on pull request #32289: [SPARK-33357][K8S] Support Spark application managing with SparkAppHandle on Kubernetes

2021-06-25 Thread GitBox
holdenk commented on pull request #32289: URL: https://github.com/apache/spark/pull/32289#issuecomment-868734353 Jenkins ok to test -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

[GitHub] [spark] SparkQA commented on pull request #33091: [WIP][SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-25 Thread GitBox
SparkQA commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-868734376 **[Test build #140334 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140334/testReport)** for PR 33091 at commit [`2b1d758`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #33091: [WIP][SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-868734402 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140334/ -- This

[GitHub] [spark] holdenk commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service on Kubernetes

2021-06-25 Thread GitBox
holdenk commented on pull request #32031: URL: https://github.com/apache/spark/pull/32031#issuecomment-868733992 Thanks for working on this. The PR is super huge, though; is it possible to break it into smaller chunks? -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] SparkQA removed a comment on pull request #33091: [WIP][SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-868729240 **[Test build #140332 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140332/testReport)** for PR 33091 at commit [`2788de7`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33091: [WIP][SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-25 Thread GitBox
SparkQA commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-868733009 **[Test build #140332 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140332/testReport)** for PR 33091 at commit [`2788de7`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #33091: [WIP][SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-868733033 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140332/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33091: [WIP][SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-868733033 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140332/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868728275 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140327/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33086: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33086: URL: https://github.com/apache/spark/pull/33086#issuecomment-868728274 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140323/ -

[GitHub] [spark] SparkQA commented on pull request #33091: [WIP][SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-25 Thread GitBox
SparkQA commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-868731555 **[Test build #140334 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140334/testReport)** for PR 33091 at commit [`2b1d758`](https://github.com

[GitHub] [spark] rahulsmahadev commented on pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-06-25 Thread GitBox
rahulsmahadev commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-868731086 cc: @tdas can you enable CI on this -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] aokolnychyi commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
aokolnychyi commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658936118 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala ## @@ -66,10 +71,33 @@ object PartitionPru

[GitHub] [spark] AmplabJenkins commented on pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-868730030 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] rahulsmahadev commented on pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-06-25 Thread GitBox
rahulsmahadev commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-868729952 @tdas can you enable CI on this -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [spark] rahulsmahadev opened a new pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-06-25 Thread GitBox
rahulsmahadev opened a new pull request #33093: URL: https://github.com/apache/spark/pull/33093 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? #

[GitHub] [spark] SparkQA commented on pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle matched subQuery but not VariableValue

2021-06-25 Thread GitBox
SparkQA commented on pull request #33082: URL: https://github.com/apache/spark/pull/33082#issuecomment-868729292 **[Test build #140333 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140333/testReport)** for PR 33082 at commit [`01e5d38`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33091: [WIP][SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-25 Thread GitBox
SparkQA commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-868729240 **[Test build #140332 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140332/testReport)** for PR 33091 at commit [`2788de7`](https://github.com

[GitHub] [spark] aokolnychyi commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
aokolnychyi commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658934396 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala ## @@ -66,10 +71,33 @@ object PartitionPru

[GitHub] [spark] SparkQA commented on pull request #33092: [SPARK-33338][SQL][FOLLOWUP][TESTS] Fix UT mistake in SQLQuerySuite

2021-06-25 Thread GitBox
SparkQA commented on pull request #33092: URL: https://github.com/apache/spark/pull/33092#issuecomment-868729194 **[Test build #140331 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140331/testReport)** for PR 33092 at commit [`2a808be`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #33086: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33086: URL: https://github.com/apache/spark/pull/33086#issuecomment-868728274 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140323/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868728275 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140327/ -- This

[GitHub] [spark] AngersZhuuuu commented on pull request #33092: [SPARK-33338][SQL][FOLLOWUP][TESTS] Fix UT mistake in SQLQuerySuite

2021-06-25 Thread GitBox
AngersZhuuuu commented on pull request #33092: URL: https://github.com/apache/spark/pull/33092#issuecomment-868727184 FYI @dongjoon-hyun -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

[GitHub] [spark] AngersZhuuuu opened a new pull request #33092: [SPARK-33338][SQL][FOLLOWUP][TESTS] Fix UT mistake in SQLQuerySuite

2021-06-25 Thread GitBox
AngersZhuuuu opened a new pull request #33092: URL: https://github.com/apache/spark/pull/33092 ### What changes were proposed in this pull request? Fix UT mistake in SQLQuerySuite ### Why are the changes needed? Fix UT mistake in SQLQuerySuite ### Does this PR int

[GitHub] [spark] AngersZhuuuu commented on pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle matched subQuery but not VariableValue

2021-06-25 Thread GitBox
AngersZhuuuu commented on pull request #33082: URL: https://github.com/apache/spark/pull/33082#issuecomment-868725728 ping @cloud-fan @maropu @wangyum @Ngone51 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] holdenk commented on a change in pull request #33008: [WIP][SPARK-35801][SQL] Support DELETE operations that require rewriting data

2021-06-25 Thread GitBox
holdenk commented on a change in pull request #33008: URL: https://github.com/apache/spark/pull/33008#discussion_r658929106 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/SupportsDelta.java ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Softw

[GitHub] [spark] holdenk commented on a change in pull request #33008: [WIP][SPARK-35801][SQL] Support DELETE operations that require rewriting data

2021-06-25 Thread GitBox
holdenk commented on a change in pull request #33008: URL: https://github.com/apache/spark/pull/33008#discussion_r658928712 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/DeltaWriter.java ## @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Softwar

[GitHub] [spark] SparkQA commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
SparkQA commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868722374 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44862/ -- This is an automated message from the Apache

[GitHub] [spark] holdenk commented on a change in pull request #33008: [WIP][SPARK-35801][SQL] Support DELETE operations that require rewriting data

2021-06-25 Thread GitBox
holdenk commented on a change in pull request #33008: URL: https://github.com/apache/spark/pull/33008#discussion_r658927460 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/DeltaWriter.java ## @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Softwar

[GitHub] [spark] SparkQA commented on pull request #33090: [SPARK-35672][CORE][YARN][3.1] Pass user classpath entries to executors using config instead of command line

2021-06-25 Thread GitBox
SparkQA commented on pull request #33090: URL: https://github.com/apache/spark/pull/33090#issuecomment-868721814 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44861/ -- This is an automated message from the Apache

[GitHub] [spark] allisonwang-db commented on pull request #33070: [SPARK-35551][SQL] Handle the COUNT bug for lateral subqueries

2021-06-25 Thread GitBox
allisonwang-db commented on pull request #33070: URL: https://github.com/apache/spark/pull/33070#issuecomment-868719722 cc @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] sunchao commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-25 Thread GitBox
sunchao commented on a change in pull request #32753: URL: https://github.com/apache/spark/pull/32753#discussion_r658923673 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java ## @@ -17,13 +17,31 @@ package org.apach

[GitHub] [spark] holdenk commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
holdenk commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658922064 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala ## @@ -66,10 +71,33 @@ object PartitionPruning

[GitHub] [spark] holdenk commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
holdenk commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658921482 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/BatchScanExec.scala ## @@ -17,38 +17,91 @@ package org.apache.spar

[GitHub] [spark] holdenk commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
holdenk commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658920527 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -631,6 +631,19 @@ object DataSourceStrate

[GitHub] [spark] vkorukanti opened a new pull request #33091: [WIP][SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-25 Thread GitBox
vkorukanti opened a new pull request #33091: URL: https://github.com/apache/spark/pull/33091 ### What changes were proposed in this pull request? Currently, the `StateOperatorProgress` in `StreamingQueryProgress` is missing a few metrics. ### Why are the changes need

[GitHub] [spark] holdenk commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
holdenk commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658919292 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Ap

[GitHub] [spark] holdenk commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
holdenk commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658918342 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Ap

[GitHub] [spark] karenfeng commented on a change in pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE

2021-06-25 Thread GitBox
karenfeng commented on a change in pull request #32850: URL: https://github.com/apache/spark/pull/32850#discussion_r658917625 ## File path: core/src/main/scala/org/apache/spark/SparkError.scala ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] [spark] SparkQA removed a comment on pull request #33086: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33086: URL: https://github.com/apache/spark/pull/33086#issuecomment-868471836 **[Test build #140323 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140323/testReport)** for PR 33086 at commit [`55ce2b6`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33086: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ

2021-06-25 Thread GitBox
SparkQA commented on pull request #33086: URL: https://github.com/apache/spark/pull/33086#issuecomment-868707728 **[Test build #140323 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140323/testReport)** for PR 33086 at commit [`55ce2b6`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868550414 **[Test build #140327 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140327/testReport)** for PR 33085 at commit [`fc29c1a`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
SparkQA commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868706912 **[Test build #140327 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140327/testReport)** for PR 33085 at commit [`fc29c1a`](https://github.co

[GitHub] [spark] rdblue commented on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
rdblue commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868700945 > The current logic will never introduce new shuffles and is very close to the existing approach for v1 tables. I think this looks good. -- This is an automated message

[GitHub] [spark] rdblue commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
rdblue commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658907547 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/BatchScanExec.scala ## @@ -17,38 +17,91 @@ package org.apache.spark

[GitHub] [spark] rdblue commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
rdblue commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658906047 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apa

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-868691634 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44860/

[GitHub] [spark] Kimahriman commented on a change in pull request #33040: [SPARK-35290][SQL] Append new nested struct fields rather than sort for unionByName with null filling

2021-06-25 Thread GitBox
Kimahriman commented on a change in pull request #33040: URL: https://github.com/apache/spark/pull/33040#discussion_r658903460 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveUnion.scala ## @@ -175,11 +102,9 @@ object ResolveUnion extend

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868691631 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44859/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868691629 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140326/ -

[GitHub] [spark] SparkQA commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
SparkQA commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868692253 **[Test build #140330 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140330/testReport)** for PR 33063 at commit [`dd78ab8`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33090: [SPARK-35672][CORE][YARN][3.1] Pass user classpath entries to executors using config instead of command line

2021-06-25 Thread GitBox
SparkQA commented on pull request #33090: URL: https://github.com/apache/spark/pull/33090#issuecomment-868692174 **[Test build #140329 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140329/testReport)** for PR 33090 at commit [`5963a21`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-868691634 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44860/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868691629 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140326/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868691631 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44859/ -- T

[GitHub] [spark] aokolnychyi commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
aokolnychyi commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r658897093 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ## @@ -227,3 +228,14 @@ object ReuseSubquery extends Rule[SparkPla

[GitHub] [spark] xkrogen commented on a change in pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-25 Thread GitBox
xkrogen commented on a change in pull request #31490: URL: https://github.com/apache/spark/pull/31490#discussion_r658895264 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala ## @@ -335,36 +336,32 @@ private[sql] class AvroDeserializer(

[GitHub] [spark] xkrogen commented on a change in pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-25 Thread GitBox
xkrogen commented on a change in pull request #31490: URL: https://github.com/apache/spark/pull/31490#discussion_r658893000 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala ## @@ -240,29 +240,30 @@ private[sql] class AvroSerializer(

[GitHub] [spark] SparkQA commented on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-25 Thread GitBox
SparkQA commented on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-868673159 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44860/ -- This is an automated message from the A

[GitHub] [spark] SparkQA removed a comment on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868509060 **[Test build #140326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140326/testReport)** for PR 33084 at commit [`25c1583`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
SparkQA commented on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868672210 **[Test build #140326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140326/testReport)** for PR 33084 at commit [`25c1583`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
SparkQA commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868667868 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44859/ -- This is an automated message from the A

[GitHub] [spark] xkrogen commented on pull request #32810: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-06-25 Thread GitBox
xkrogen commented on pull request #32810: URL: https://github.com/apache/spark/pull/32810#issuecomment-868667554 Thank you @tgravescs ! Your comments along the way were much appreciated. I put up a `branch-3.1` backport at #33090. -- This is an automated message from the Apache Git

[GitHub] [spark] xkrogen opened a new pull request #33090: [SPARK-35672][CORE][YARN][3.1] Pass user classpath entries to executors using config instead of command line

2021-06-25 Thread GitBox
xkrogen opened a new pull request #33090: URL: https://github.com/apache/spark/pull/33090 ### What changes were proposed in this pull request? Refactor the logic for constructing the user classpath from `yarn.ApplicationMaster` into `yarn.Client` so that it can be leveraged on the execu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868634343 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44858/

[GitHub] [spark] AmplabJenkins commented on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868634343 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44858/ -- T

[GitHub] [spark] SparkQA commented on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-25 Thread GitBox
SparkQA commented on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-868622014 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44860/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
SparkQA commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868602824 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44859/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
SparkQA commented on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868561409 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44858/ -- This is an automated message from the A

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868547682 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140319/ -

[GitHub] [spark] xuanyuanking commented on a change in pull request #32928: [WIP][SPARK-35784] Implementation for RocksDB instance

2021-06-25 Thread GitBox
xuanyuanking commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r658824970 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,455 @@ +/* + * Licensed to the Apac

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32928: [WIP][SPARK-35784] Implementation for RocksDB instance

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #32928: URL: https://github.com/apache/spark/pull/32928#issuecomment-868547677 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140322/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868547680 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44857/

[GitHub] [spark] SparkQA removed a comment on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868438508 **[Test build #140321 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140321/testReport)** for PR 33084 at commit [`447d8cd`](https://gi

[GitHub] [spark] SparkQA removed a comment on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868378836 **[Test build #140319 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140319/testReport)** for PR 33076 at commit [`8a4a244`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868547678 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140321/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33086: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33086: URL: https://github.com/apache/spark/pull/33086#issuecomment-868547679 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44855/

[GitHub] [spark] SparkQA commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
SparkQA commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868550414 **[Test build #140327 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140327/testReport)** for PR 33085 at commit [`fc29c1a`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-25 Thread GitBox
SparkQA commented on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-868550467 **[Test build #140328 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140328/testReport)** for PR 33065 at commit [`fe080f8`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #33089: [SPARK-35892]Modify function saveTable in JdbcUtils to support config numPartitions bigger than RDD's partition number

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33089: URL: https://github.com/apache/spark/pull/33089#issuecomment-868548404 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] AmplabJenkins commented on pull request #33086: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33086: URL: https://github.com/apache/spark/pull/33086#issuecomment-868547679 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44855/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868547682 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140319/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868547680 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44857/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868547678 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140321/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32928: [WIP][SPARK-35784] Implementation for RocksDB instance

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #32928: URL: https://github.com/apache/spark/pull/32928#issuecomment-868547677 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140322/ -- This

[GitHub] [spark] dongjoon-hyun closed pull request #33088: [SPARK-35863][BUILD] Update Ivy to 2.5.0

2021-06-25 Thread GitBox
dongjoon-hyun closed pull request #33088: URL: https://github.com/apache/spark/pull/33088 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-

[GitHub] [spark] SparkQA commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
SparkQA commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868546494 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44857/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #33076: [SPARK-35889][SQL] Support adding TimestampWithoutTZ with Interval types

2021-06-25 Thread GitBox
SparkQA commented on pull request #33076: URL: https://github.com/apache/spark/pull/33076#issuecomment-868544383 **[Test build #140319 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140319/testReport)** for PR 33076 at commit [`8a4a244`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
SparkQA commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868541264 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44857/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
SparkQA commented on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868538597 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44858/ -- This is an automated message from the Apache

[GitHub] [spark] arghya18 commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-25 Thread GitBox
arghya18 commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-868536148 @steveloughran Thanks for the suggestion. Can anyone help me with steps (or Dockerfile) to change hadoop version in prebuilt spark docker image with Hadoop if available han

[GitHub] [spark] vkorukanti commented on a change in pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-25 Thread GitBox
vkorukanti commented on a change in pull request #33065: URL: https://github.com/apache/spark/pull/33065#discussion_r658800635 ## File path: sql/core/src/test/scala/org/apache/spark/sql/streaming/StateStoreMetricsTest.scala ## @@ -76,6 +76,41 @@ trait StateStoreMetricsTest ext

[GitHub] [spark] HeartSaVioR commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
HeartSaVioR commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868533548 Rebased to contain #33084, and fixed one missed spot in this PR. Style check should pass now. -- This is an automated message from the Apache Git Service. To respond to t

[GitHub] [spark] SparkQA commented on pull request #33084: [SPARK-35628][SS][FOLLOW-UP] Fix the consistent break on Scala 2.13 build

2021-06-25 Thread GitBox
SparkQA commented on pull request #33084: URL: https://github.com/apache/spark/pull/33084#issuecomment-868531378 **[Test build #140321 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140321/testReport)** for PR 33084 at commit [`447d8cd`](https://github.co

[GitHub] [spark] steveloughran commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-25 Thread GitBox
steveloughran commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-868530293 the parquet EOF fix is also in hadoop-3.2.2, so you could try that. However, testing with 3.3.1 is better because 1. we can do workarounds in spark before the releas

[GitHub] [spark] tgravescs commented on pull request #32810: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-06-25 Thread GitBox
tgravescs commented on pull request #32810: URL: https://github.com/apache/spark/pull/32810#issuecomment-868518131 Merged to master. It wasn't a clean merge to branch-3.1; did you want to get it into that branch? If so, could you put up a separate PR? -- This is an automated message from the A

[GitHub] [spark] asfgit closed pull request #32810: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-06-25 Thread GitBox
asfgit closed pull request #32810: URL: https://github.com/apache/spark/pull/32810 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubsc

[GitHub] [spark] tgravescs commented on pull request #32810: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-06-25 Thread GitBox
tgravescs commented on pull request #32810: URL: https://github.com/apache/spark/pull/32810#issuecomment-868515983 looks like the tests passed in the normal QA build and different tests failed this run, so I'll merge -- This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] steveloughran commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-25 Thread GitBox
steveloughran commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-868515080 @arghya18 1. we haven't seen HADOOP-17755 in any of our testing, unless it is HADOOP-16109 2. still waiting on that JIRA for you to provide config details. Like I've
