[GitHub] [spark] SparkQA commented on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone

2021-11-29 Thread GitBox
SparkQA commented on pull request #34741: URL: https://github.com/apache/spark/pull/34741#issuecomment-981888169 **[Test build #145721 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145721/testReport)** for PR 34741 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34747: [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type coercion

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34747: URL: https://github.com/apache/spark/pull/34747#issuecomment-981949353 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145728/

[GitHub] [spark] SparkQA removed a comment on pull request #34747: [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type coercion

2021-11-29 Thread GitBox
SparkQA removed a comment on pull request #34747: URL: https://github.com/apache/spark/pull/34747#issuecomment-981849118 **[Test build #145728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145728/testReport)** for PR 34747 at commit

[GitHub] [spark] SparkQA commented on pull request #34747: [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type coercion

2021-11-29 Thread GitBox
SparkQA commented on pull request #34747: URL: https://github.com/apache/spark/pull/34747#issuecomment-981949047 **[Test build #145728 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145728/testReport)** for PR 34747 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #34747: [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type coercion

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34747: URL: https://github.com/apache/spark/pull/34747#issuecomment-981949353 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145728/ -- This

[GitHub] [spark] c21 commented on pull request #34640: [SPARK-31585][SQL] Introduce Z-order expression

2021-11-29 Thread GitBox
c21 commented on pull request #34640: URL: https://github.com/apache/spark/pull/34640#issuecomment-981978009 > Is there any query performance improvement report from your internal production usage? How about storage efficiency(any improvement on compression ratio)? @advancedxy - we

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34737: [SPARK-37482][PYTHON] Skip check monotonic increasing for Series.asof with 'compute.eager_check'

2021-11-29 Thread GitBox
HyukjinKwon commented on a change in pull request #34737: URL: https://github.com/apache/spark/pull/34737#discussion_r758723396 ## File path: python/pyspark/pandas/series.py ## @@ -5179,7 +5179,9 @@ def asof(self, where: Union[Any, List]) -> Union[Scalar, "Series"]:

[GitHub] [spark] huaxingao commented on a change in pull request #34060: [SPARK-36850][SQL] Migrate CreateTableStatement to v2 command framework

2021-11-29 Thread GitBox
huaxingao commented on a change in pull request #34060: URL: https://github.com/apache/spark/pull/34060#discussion_r758769208 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala ## @@ -819,6 +820,9 @@ abstract class TreeNode[BaseType

[GitHub] [spark] AmplabJenkins commented on pull request #34747: [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type coercion

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34747: URL: https://github.com/apache/spark/pull/34747#issuecomment-981939756 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50198/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34747: [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type coercion

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34747: URL: https://github.com/apache/spark/pull/34747#issuecomment-981939756 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50198/

[GitHub] [spark] ueshin commented on a change in pull request #34717: [SPARK-37465][PYTHON] Bump minimum pandas version to 1.0.5

2021-11-29 Thread GitBox
ueshin commented on a change in pull request #34717: URL: https://github.com/apache/spark/pull/34717#discussion_r758685142 ## File path: python/pyspark/pandas/tests/test_series.py ## @@ -2209,12 +2209,12 @@ def test_mad(self): pser = pd.Series([1, 2, 3, 4],

[GitHub] [spark] SparkQA commented on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fiels from Generate under count-only Aggregate

2021-11-29 Thread GitBox
SparkQA commented on pull request #34701: URL: https://github.com/apache/spark/pull/34701#issuecomment-981961706 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50199/ -- This is an automated message from the

[GitHub] [spark] c21 commented on a change in pull request #34640: [SPARK-31585][SQL] Introduce Z-order expression

2021-11-29 Thread GitBox
c21 commented on a change in pull request #34640: URL: https://github.com/apache/spark/pull/34640#discussion_r758702199 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ZOrder.scala ## @@ -0,0 +1,222 @@ +/* + * Licensed to the Apache

[GitHub] [spark] sunchao commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-29 Thread GitBox
sunchao commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r758708422 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java ## @@ -201,6 +202,28 @@ public

[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758587271 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala ## @@ -48,4 +48,12 @@

[GitHub] [spark] entong commented on a change in pull request #34747: [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type coercion

2021-11-29 Thread GitBox
entong commented on a change in pull request #34747: URL: https://github.com/apache/spark/pull/34747#discussion_r758595236 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -198,21 +205,39 @@ class Analyzer(override val

[GitHub] [spark] SparkQA commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox
SparkQA commented on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-981884692 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50195/ -- This is an automated message from the

[GitHub] [spark] SparkQA removed a comment on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone

2021-11-29 Thread GitBox
SparkQA removed a comment on pull request #34741: URL: https://github.com/apache/spark/pull/34741#issuecomment-981627865 **[Test build #145721 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145721/testReport)** for PR 34741 at commit

[GitHub] [spark] SparkQA commented on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fiels from Generate under count-only Aggregate

2021-11-29 Thread GitBox
SparkQA commented on pull request #34701: URL: https://github.com/apache/spark/pull/34701#issuecomment-981896899 **[Test build #145729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145729/testReport)** for PR 34701 at commit

[GitHub] [spark] SparkQA commented on pull request #34716: [SPARK-37468][SQL] Support ANSI intervals and TimestampNTZ for UnionEstimation

2021-11-29 Thread GitBox
SparkQA commented on pull request #34716: URL: https://github.com/apache/spark/pull/34716#issuecomment-981978503 **[Test build #145724 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145724/testReport)** for PR 34716 at commit

[GitHub] [spark] c21 commented on a change in pull request #34640: [SPARK-31585][SQL] Introduce Z-order expression

2021-11-29 Thread GitBox
c21 commented on a change in pull request #34640: URL: https://github.com/apache/spark/pull/34640#discussion_r758700614 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ZOrder.scala ## @@ -0,0 +1,222 @@ +/* + * Licensed to the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #34716: [SPARK-37468][SQL] Support ANSI intervals and TimestampNTZ for UnionEstimation

2021-11-29 Thread GitBox
SparkQA removed a comment on pull request #34716: URL: https://github.com/apache/spark/pull/34716#issuecomment-981747137 **[Test build #145724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145724/testReport)** for PR 34716 at commit

[GitHub] [spark] viirya commented on a change in pull request #34701: [SPARK-37450][SQL] Prune unnecessary fiels from Generate under count-only Aggregate

2021-11-29 Thread GitBox
viirya commented on a change in pull request #34701: URL: https://github.com/apache/spark/pull/34701#discussion_r758616810 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -2162,6 +2163,74 @@ object

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34744: [SPARK-37454][SQL][FOLLOWUP] Time travel timestamp expression should support RuntimeReplaceable

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34744: URL: https://github.com/apache/spark/pull/34744#issuecomment-981900030 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145726/

[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758510360 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala ## @@ -48,4 +48,12 @@

[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758551435 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala ## @@ -48,4 +48,12 @@

[GitHub] [spark] SparkQA commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox
SparkQA commented on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-981895815 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50197/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-981895865 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50197/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-981895865 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50197/

[GitHub] [spark] MaxGekk commented on pull request #34716: [SPARK-37468][SQL] Support ANSI intervals and TimestampNTZ for UnionEstimation

2021-11-29 Thread GitBox
MaxGekk commented on pull request #34716: URL: https://github.com/apache/spark/pull/34716#issuecomment-981918086 +1, LGTM. Merging to master. Thank you, @sarutak . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] MaxGekk closed pull request #34716: [SPARK-37468][SQL] Support ANSI intervals and TimestampNTZ for UnionEstimation

2021-11-29 Thread GitBox
MaxGekk closed pull request #34716: URL: https://github.com/apache/spark/pull/34716 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] AmplabJenkins commented on pull request #34716: [SPARK-37468][SQL] Support ANSI intervals and TimestampNTZ for UnionEstimation

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34716: URL: https://github.com/apache/spark/pull/34716#issuecomment-981987226 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145724/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fiels from Generate under count-only Aggregate

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34701: URL: https://github.com/apache/spark/pull/34701#issuecomment-981987228 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50199/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34716: [SPARK-37468][SQL] Support ANSI intervals and TimestampNTZ for UnionEstimation

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34716: URL: https://github.com/apache/spark/pull/34716#issuecomment-981987226 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145724/

[GitHub] [spark] AmplabJenkins commented on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fiels from Generate under count-only Aggregate

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34701: URL: https://github.com/apache/spark/pull/34701#issuecomment-981987228 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50199/ --

[GitHub] [spark] MaxGekk commented on a change in pull request #34568: [SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter

2021-11-29 Thread GitBox
MaxGekk commented on a change in pull request #34568: URL: https://github.com/apache/spark/pull/34568#discussion_r758654693 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala ## @@ -37,7 +36,8 @@ class SparkOptimizer( override def

[GitHub] [spark] c21 commented on pull request #34738: [SPARK-37483][SQL] Support pushdown down top N to JDBC data source V2

2021-11-29 Thread GitBox
c21 commented on pull request #34738: URL: https://github.com/apache/spark/pull/34738#issuecomment-981953879 cc @huaxingao FYI. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] viirya commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-29 Thread GitBox
viirya commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r758704028 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java ## @@ -201,6 +202,28 @@ public

[GitHub] [spark] viirya commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-29 Thread GitBox
viirya commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-981988277 Although I'm not sure if this would be general case, since @attilapiros and @HeartSaVioR both think it is good idea to have a general flag. I can change this to add one. cc

[GitHub] [spark] MaxGekk commented on a change in pull request #34675: [SPARK-37433][SQL] Uses TimeZone.getDefault when timeZoneId is None for ZoneAwareExpression

2021-11-29 Thread GitBox
MaxGekk commented on a change in pull request #34675: URL: https://github.com/apache/spark/pull/34675#discussion_r758719068 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ## @@ -59,7 +59,10 @@ trait

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34509: [SPARK-34521][PYTHON][SQL] Fix spark.createDataFrame when using pandas with StringDtype

2021-11-29 Thread GitBox
HyukjinKwon commented on a change in pull request #34509: URL: https://github.com/apache/spark/pull/34509#discussion_r758725621 ## File path: python/pyspark/sql/pandas/serializers.py ## @@ -169,6 +169,8 @@ def create_array(s, t): elif

[GitHub] [spark] kazuyukitanimura commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-29 Thread GitBox
kazuyukitanimura commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r758739538 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java ## @@ -201,6 +202,28 @@ public

[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-29 Thread GitBox
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-982068155 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50200/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-29 Thread GitBox
SparkQA commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-982067919 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50201/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34060: [SPARK-36850][SQL] Migrate CreateTableStatement to v2 command framework

2021-11-29 Thread GitBox
SparkQA commented on pull request #34060: URL: https://github.com/apache/spark/pull/34060#issuecomment-982078391 **[Test build #145733 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145733/testReport)** for PR 34060 at commit

[GitHub] [spark] SparkQA commented on pull request #34747: [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type coercion

2021-11-29 Thread GitBox
SparkQA commented on pull request #34747: URL: https://github.com/apache/spark/pull/34747#issuecomment-981882408 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50198/ -- This is an automated message from the Apache

[GitHub] [spark] ChenMichael commented on a change in pull request #34684: [SPARK-37442][SQL] InMemoryRelation statistics bug causing broadcast join failures with AQE enabled

2021-11-29 Thread GitBox
ChenMichael commented on a change in pull request #34684: URL: https://github.com/apache/spark/pull/34684#discussion_r758668913 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala ## @@ -259,6 +259,12 @@ case class

[GitHub] [spark] c21 commented on pull request #34640: [SPARK-31585][SQL] Introduce Z-order expression

2021-11-29 Thread GitBox
c21 commented on pull request #34640: URL: https://github.com/apache/spark/pull/34640#issuecomment-981975058 > Just FYI some faster interleave bits approach from stanford @ulysses-you - thanks for pointing it out. I think this is good to add later, once community agrees we should

[GitHub] [spark] SparkQA commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-29 Thread GitBox
SparkQA commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-982034660 **[Test build #145731 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145731/testReport)** for PR 34596 at commit

[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-29 Thread GitBox
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-982034664 **[Test build #145730 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145730/testReport)** for PR 34611 at commit

[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-29 Thread GitBox
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-982077845 **[Test build #145732 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145732/testReport)** for PR 34611 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-981893851 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50195/

[GitHub] [spark] AmplabJenkins commented on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34741: URL: https://github.com/apache/spark/pull/34741#issuecomment-981893850 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145721/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-981893851 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50195/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34741: URL: https://github.com/apache/spark/pull/34741#issuecomment-981893850 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145721/

[GitHub] [spark] AmplabJenkins commented on pull request #34744: [SPARK-37454][SQL][FOLLOWUP] Time travel timestamp expression should support RuntimeReplaceable

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34744: URL: https://github.com/apache/spark/pull/34744#issuecomment-981900030 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145726/ -- This

[GitHub] [spark] SparkQA commented on pull request #34744: [SPARK-37454][SQL][FOLLOWUP] Time travel timestamp expression should support RuntimeReplaceable

2021-11-29 Thread GitBox
SparkQA commented on pull request #34744: URL: https://github.com/apache/spark/pull/34744#issuecomment-981899733 **[Test build #145726 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145726/testReport)** for PR 34744 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34744: [SPARK-37454][SQL][FOLLOWUP] Time travel timestamp expression should support RuntimeReplaceable

2021-11-29 Thread GitBox
SparkQA removed a comment on pull request #34744: URL: https://github.com/apache/spark/pull/34744#issuecomment-981795354 **[Test build #145726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145726/testReport)** for PR 34744 at commit

[GitHub] [spark] SparkQA commented on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fiels from Generate under count-only Aggregate

2021-11-29 Thread GitBox
SparkQA commented on pull request #34701: URL: https://github.com/apache/spark/pull/34701#issuecomment-981927319 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50199/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34747: [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type coercion

2021-11-29 Thread GitBox
SparkQA commented on pull request #34747: URL: https://github.com/apache/spark/pull/34747#issuecomment-981927343 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50198/ -- This is an automated message from the

[GitHub] [spark] c21 commented on a change in pull request #34640: [SPARK-31585][SQL] Introduce Z-order expression

2021-11-29 Thread GitBox
c21 commented on a change in pull request #34640: URL: https://github.com/apache/spark/pull/34640#discussion_r758701226 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ZOrder.scala ## @@ -0,0 +1,222 @@ +/* + * Licensed to the Apache

[GitHub] [spark] sunchao commented on pull request #34659: [SPARK-34863][SQL] Support complex types for Parquet vectorized reader

2021-11-29 Thread GitBox
sunchao commented on pull request #34659: URL: https://github.com/apache/spark/pull/34659#issuecomment-982016495 Thanks @agrawaldevesh . Will address your comments soon. > How do we confirm that this hasn't slowed down the non nested code path ? I've run the

[GitHub] [spark] Yikun edited a comment on pull request #34646: [SPARK-37372][K8S] Removing redundant label addition and refactoring related test case

2021-11-29 Thread GitBox
Yikun edited a comment on pull request #34646: URL: https://github.com/apache/spark/pull/34646#issuecomment-981371597 @dongjoon-hyun Would you mind taking a look again? Or I misunderstood your suggestion, it's not enough to update the PR message, I should split this PR to 2 PRs: 1.

[GitHub] [spark] SparkQA commented on pull request #34733: [SPARK-36346][SQL][FOLLOWUP] Rename `withAllOrcReaders` to `withAllNativeOrcReaders`

2021-11-29 Thread GitBox
SparkQA commented on pull request #34733: URL: https://github.com/apache/spark/pull/34733#issuecomment-981382106 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50168/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34732: [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should pass initialSessionOptions

2021-11-29 Thread GitBox
SparkQA commented on pull request #34732: URL: https://github.com/apache/spark/pull/34732#issuecomment-981381807 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50169/ -- This is an automated message from the Apache

[GitHub] [spark] HeartSaVioR commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-29 Thread GitBox
HeartSaVioR commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-981386950 The rationalization seems to make sense; supporting row read and columnar read are not mutually exclusive, and it might not be always true that `columnar read +

[GitHub] [spark] SparkQA commented on pull request #34732: [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should pass initialSessionOptions

2021-11-29 Thread GitBox
SparkQA commented on pull request #34732: URL: https://github.com/apache/spark/pull/34732#issuecomment-981375537 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50164/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #34715: [SPARK-37445][BUILD] Rename the maven hadoop profile to hadoop-3 and hadoop-2

2021-11-29 Thread GitBox
SparkQA commented on pull request #34715: URL: https://github.com/apache/spark/pull/34715#issuecomment-981397441 **[Test build #145695 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145695/testReport)** for PR 34715 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34715: [SPARK-37445][BUILD] Rename the maven hadoop profile to hadoop-3 and hadoop-2

2021-11-29 Thread GitBox
SparkQA removed a comment on pull request #34715: URL: https://github.com/apache/spark/pull/34715#issuecomment-981324203 **[Test build #145695 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145695/testReport)** for PR 34715 at commit

[GitHub] [spark] dchvn commented on a change in pull request #34673: [SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)

2021-11-29 Thread GitBox
dchvn commented on a change in pull request #34673: URL: https://github.com/apache/spark/pull/34673#discussion_r758113045 ## File path: sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala ## @@ -164,4 +172,78 @@ private object PostgresDialect extends

[GitHub] [spark] cloud-fan commented on a change in pull request #34726: [SPARK-33875][SQL][FOLLOWUP] Handle the char/varchar column for `Describe column` command

2021-11-29 Thread GitBox
cloud-fan commented on a change in pull request #34726: URL: https://github.com/apache/spark/pull/34726#discussion_r758113230 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ## @@ -158,7 +158,8 @@ class DataSourceV2SQLSuite

[GitHub] [spark] SparkQA commented on pull request #34732: [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should pass initialSessionOptions

2021-11-29 Thread GitBox
SparkQA commented on pull request #34732: URL: https://github.com/apache/spark/pull/34732#issuecomment-981383959 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50170/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34732: [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should pass initialSessionOptions

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34732: URL: https://github.com/apache/spark/pull/34732#issuecomment-981387662 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50164/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34715: [SPARK-37445][BUILD] Rename the maven hadoop profile to hadoop-3 and hadoop-2

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34715: URL: https://github.com/apache/spark/pull/34715#issuecomment-981387660 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50165/

[GitHub] [spark] sunchao commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-29 Thread GitBox
sunchao commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r758120451 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ## @@ -53,19 +53,50 @@ public

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34734: [SPARK-37480][K8S][DOC] Sync Kubernetes configuration to latest in running-on-k8s.md

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34734: URL: https://github.com/apache/spark/pull/34734#issuecomment-981387657 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145700/

[GitHub] [spark] AmplabJenkins commented on pull request #34734: [SPARK-37480][K8S][DOC] Sync Kubernetes configuration to latest in running-on-k8s.md

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34734: URL: https://github.com/apache/spark/pull/34734#issuecomment-981387657 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145700/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34715: [SPARK-37445][BUILD] Rename the maven hadoop profile to hadoop-3 and hadoop-2

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34715: URL: https://github.com/apache/spark/pull/34715#issuecomment-981387660 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50165/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34732: [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should pass initialSessionOptions

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34732: URL: https://github.com/apache/spark/pull/34732#issuecomment-981387662 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50164/ --

[GitHub] [spark] HeartSaVioR edited a comment on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-29 Thread GitBox
HeartSaVioR edited a comment on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-981386950 The rationalization seems to make sense; supporting row read and columnar read are not mutually exclusive, and it might not be always true that `columnar read +

[GitHub] [spark] MaxGekk commented on pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests

2021-11-29 Thread GitBox
MaxGekk commented on pull request #34719: URL: https://github.com/apache/spark/pull/34719#issuecomment-981394909 @Peng-Lei Thank you for the ping. Will review this PR today. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] Yikun commented on a change in pull request #34646: [SPARK-37372][K8S] Removing redundant label addition and refactoring related test case

2021-11-29 Thread GitBox
Yikun commented on a change in pull request #34646: URL: https://github.com/apache/spark/pull/34646#discussion_r758112591 ## File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala ## @@ -34,7 +34,7 @@

[GitHub] [spark] SparkQA commented on pull request #34734: [SPARK-37480][K8S][DOC] Sync Kubernetes configuration to latest in running-on-k8s.md

2021-11-29 Thread GitBox
SparkQA commented on pull request #34734: URL: https://github.com/apache/spark/pull/34734#issuecomment-981388429 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50171/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34735: [SPARK-37481][Core] Fix disappearance of skipped stages after they retry

2021-11-29 Thread GitBox
SparkQA commented on pull request #34735: URL: https://github.com/apache/spark/pull/34735#issuecomment-981388161 **[Test build #145702 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145702/testReport)** for PR 34735 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #34715: [SPARK-37445][BUILD] Rename the maven hadoop profile to hadoop-3 and hadoop-2

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34715: URL: https://github.com/apache/spark/pull/34715#issuecomment-981398398 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145695/ -- This

[GitHub] [spark] Peng-Lei commented on a change in pull request #34726: [SPARK-33875][SQL][FOLLOWUP] Handle the char/varchar column for `Describe column` command

2021-11-29 Thread GitBox
Peng-Lei commented on a change in pull request #34726: URL: https://github.com/apache/spark/pull/34726#discussion_r758132225 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ## @@ -158,7 +158,8 @@ class DataSourceV2SQLSuite

[GitHub] [spark] cloud-fan commented on pull request #34691: [SPARK-37447][SQL] Cache LogicalPlan.isStreaming() result in a lazy val

2021-11-29 Thread GitBox
cloud-fan commented on pull request #34691: URL: https://github.com/apache/spark/pull/34691#issuecomment-981407034 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] cloud-fan closed pull request #34691: [SPARK-37447][SQL] Cache LogicalPlan.isStreaming() result in a lazy val

2021-11-29 Thread GitBox
cloud-fan closed pull request #34691: URL: https://github.com/apache/spark/pull/34691 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] seayoun commented on pull request #34725: [SPARK-37473][CORE] BypassMergeSortShuffleWriter may loss data when disk is missing however catagory is present

2021-11-29 Thread GitBox
seayoun commented on pull request #34725: URL: https://github.com/apache/spark/pull/34725#issuecomment-981414935 > Mind filling the PR description for: > > * Does this PR introduce any user-facing change? > * How was this patch tested? done -- This is an automated

[GitHub] [spark] cloud-fan commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests

2021-11-29 Thread GitBox
cloud-fan commented on a change in pull request #34719: URL: https://github.com/apache/spark/pull/34719#discussion_r758153605 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowCreateTableSuiteBase.scala ## @@ -0,0 +1,145 @@ +/* + * Licensed to

[GitHub] [spark] cloud-fan commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests

2021-11-29 Thread GitBox
cloud-fan commented on a change in pull request #34719: URL: https://github.com/apache/spark/pull/34719#discussion_r758156350 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowCreateTableSuite.scala ## @@ -0,0 +1,140 @@ +/* + * Licensed to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34733: [SPARK-36346][SQL][FOLLOWUP] Rename `withAllOrcReaders` to `withAllNativeOrcReaders`

2021-11-29 Thread GitBox
AmplabJenkins removed a comment on pull request #34733: URL: https://github.com/apache/spark/pull/34733#issuecomment-981431729 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50168/

[GitHub] [spark] AmplabJenkins commented on pull request #34733: [SPARK-36346][SQL][FOLLOWUP] Rename `withAllOrcReaders` to `withAllNativeOrcReaders`

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34733: URL: https://github.com/apache/spark/pull/34733#issuecomment-981431729 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50168/ --

[GitHub] [spark] SparkQA commented on pull request #34734: [SPARK-37480][K8S][DOC] Sync Kubernetes configuration to latest in running-on-k8s.md

2021-11-29 Thread GitBox
SparkQA commented on pull request #34734: URL: https://github.com/apache/spark/pull/34734#issuecomment-981432046 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50171/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins commented on pull request #34734: [SPARK-37480][K8S][DOC] Sync Kubernetes configuration to latest in running-on-k8s.md

2021-11-29 Thread GitBox
AmplabJenkins commented on pull request #34734: URL: https://github.com/apache/spark/pull/34734#issuecomment-981432091 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50171/ --

[GitHub] [spark] Peng-Lei commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests

2021-11-29 Thread GitBox
Peng-Lei commented on a change in pull request #34719: URL: https://github.com/apache/spark/pull/34719#discussion_r758174307 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowCreateTableSuiteBase.scala ## @@ -0,0 +1,145 @@ +/* + * Licensed to
