[GitHub] [spark] beliefer commented on pull request #34554: [SPARK-37286][SQL] Move compileFilter and compileAggregates from JDBCRDD to JdbcDialect

2021-12-01 Thread GitBox


beliefer commented on pull request #34554:
URL: https://github.com/apache/spark/pull/34554#issuecomment-984376002


   Based on discussion offline, reopen this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer opened a new pull request #34554: [SPARK-37286][SQL] Move compileFilter and compileAggregates from JDBCRDD to JdbcDialect

2021-12-01 Thread GitBox


beliefer opened a new pull request #34554:
URL: https://github.com/apache/spark/pull/34554


   ### What changes were proposed in this pull request?
   Currently, the methods `compileFilter` and `compileAggregates` are members of `JDBCRDD`. This is not reasonable, because the JDBC source itself knows best how to compile filter and aggregate expressions into its own dialect. This PR moves both methods to `JdbcDialect`.
   
   
   ### Why are the changes needed?
   The JDBC source knows best how to compile filter and aggregate expressions into its own dialect.
   After this PR, we can extend pushdown (e.g. filter, aggregate) per dialect, to account for the differences between JDBC databases.
   
   There are two situations:
   First, databases A and B implement a different number of aggregate functions that conform to the SQL standard.
   Second, some databases implement aggregate functions that do not conform to the SQL standard.
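
   The idea of letting each dialect own its compilation logic can be sketched as follows. This is a minimal, self-contained illustration and not the code in this PR: `Filter`, `EqualTo`, `GreaterThan`, `StringStartsWith`, `Dialect`, `GenericDialect` and `BacktickDialect` are hypothetical stand-ins defined here, not Spark's classes. A base dialect compiles the filters it understands and returns `None` for anything it cannot push down; a specific dialect overrides the method to add support or change quoting.

```scala
// Hypothetical, simplified stand-ins for illustration; these are NOT Spark's classes.
sealed trait Filter
final case class EqualTo(attribute: String, value: Any) extends Filter
final case class GreaterThan(attribute: String, value: Any) extends Filter
final case class StringStartsWith(attribute: String, prefix: String) extends Filter

trait Dialect {
  // Default identifier quoting: double quotes.
  def quoteIdentifier(name: String): String = "\"" + name + "\""

  // Render a literal; strings are single-quoted with embedded quotes doubled.
  def compileValue(value: Any): String = value match {
    case s: String => "'" + s.replace("'", "''") + "'"
    case other => other.toString
  }

  // Returns None when the filter cannot be pushed down by this dialect.
  def compileFilter(filter: Filter): Option[String] = filter match {
    case EqualTo(attr, v) => Some(quoteIdentifier(attr) + " = " + compileValue(v))
    case GreaterThan(attr, v) => Some(quoteIdentifier(attr) + " > " + compileValue(v))
    case _ => None
  }
}

// A dialect that relies entirely on the defaults.
object GenericDialect extends Dialect

// A dialect that quotes with backticks and supports one extra filter.
object BacktickDialect extends Dialect {
  override def quoteIdentifier(name: String): String = "`" + name + "`"

  override def compileFilter(filter: Filter): Option[String] = filter match {
    case StringStartsWith(attr, prefix) =>
      // Sketch only: LIKE wildcards inside `prefix` are not escaped here.
      Some(quoteIdentifier(attr) + " LIKE " + compileValue(prefix + "%"))
    case other => super.compileFilter(other)
  }
}
```

   With this shape, the scan code only asks the dialect for a compiled string and falls back to evaluating the filter itself when it gets `None`, so adding a database-specific function does not touch the shared code path.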
   
   
   ### Does this PR introduce _any_ user-facing change?
   'No'. This only changes the internal implementation.
   
   
   ### How was this patch tested?
   Jenkins tests.
   





[GitHub] [spark] SparkQA commented on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


SparkQA commented on pull request #34701:
URL: https://github.com/apache/spark/pull/34701#issuecomment-984366546


   **[Test build #145849 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145849/testReport)** for PR 34701 at commit [`e479e5c`](https://github.com/apache/spark/commit/e479e5c2088510f88f62e6baa89ebbd18ff135af).





[GitHub] [spark] viirya commented on a change in pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


viirya commented on a change in pull request #34701:
URL: https://github.com/apache/spark/pull/34701#discussion_r760827034



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -2162,6 +2163,48 @@ object RemoveLiteralFromGroupExpressions extends Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Prunes unnecessary fields from a [[Generate]] if it is under a project which does not refer to
+ * any generated attributes, e.g., count-like aggregation on an exploded array.
+ */
+object GenerateOptimization extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+      _.containsAllPatterns(PROJECT, GENERATE), ruleId) {
+      case p @ Project(_, g: Generate) if p.references.isEmpty
+          && g.generator.isInstanceOf[ExplodeBase] =>
+        g.generator.children.head.dataType match {
+          case ArrayType(StructType(fields), _) =>
+            val atomicFields = fields.collect {
+              case f: StructField if f.dataType.isInstanceOf[AtomicType] => f
+            }
+            val extractor = if (atomicFields.size > 0) {
+              // Pick an arbitrary atomic field, if any
+              ExtractValue(g.generator.children.head,

Review comment:
   okay, updated to use `GetArrayStructFields`.
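
   To illustrate why the pruning in the rule quoted above is safe, here is a minimal plain-Scala sketch with no Spark dependency. `Item`, `PruneSketch`, `explodeCount` and the sample data are hypothetical names used only for illustration; the sketch models the idea, not the rule's implementation. A count over an exploded array depends only on the number of elements, so keeping a single atomic field per struct gives the same result as keeping the whole struct.

```scala
// Hypothetical model: each row holds an array of structs.
final case class Item(id: Int, name: String, tags: List[String])

object PruneSketch {
  // Models `explode` followed by a count: only the number of elements matters.
  def explodeCount[A](rows: Seq[Seq[A]]): Int = rows.flatten.size

  val rows: Seq[Seq[Item]] = Seq(
    Seq(Item(1, "a", List("x")), Item(2, "b", Nil)),
    Seq.empty,
    Seq(Item(3, "c", List("y", "z")))
  )

  // Pruned variant: keep one atomic field (`id`) instead of the whole struct.
  val prunedRows: Seq[Seq[Int]] = rows.map(_.map(_.id))
}
```

   Both `PruneSketch.explodeCount(PruneSketch.rows)` and `PruneSketch.explodeCount(PruneSketch.prunedRows)` see the same number of elements, which is the property the optimizer rule relies on when the project references no generated attributes.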







[GitHub] [spark] viirya commented on a change in pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


viirya commented on a change in pull request #34701:
URL: https://github.com/apache/spark/pull/34701#discussion_r760826896



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -2162,6 +2163,48 @@ object RemoveLiteralFromGroupExpressions extends Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Prunes unnecessary fields from a [[Generate]] if it is under a project which does not refer to
+ * any generated attributes, e.g., count-like aggregation on an exploded array.
+ */
+object GenerateOptimization extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+      _.containsAllPatterns(PROJECT, GENERATE), ruleId) {
+      case p @ Project(_, g: Generate) if p.references.isEmpty
+          && g.generator.isInstanceOf[ExplodeBase] =>
+        g.generator.children.head.dataType match {
+          case ArrayType(StructType(fields), _) =>
+            val atomicFields = fields.collect {
+              case f: StructField if f.dataType.isInstanceOf[AtomicType] => f
+            }
+            val extractor = if (atomicFields.size > 0) {
+              // Pick an arbitrary atomic field, if any

Review comment:
   good idea. updated.







[GitHub] [spark] SparkQA commented on pull request #34764: [SPARK-37330][SQL] Migrate ReplaceTableStatement to v2 command

2021-12-01 Thread GitBox


SparkQA commented on pull request #34764:
URL: https://github.com/apache/spark/pull/34764#issuecomment-984361799


   **[Test build #145848 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145848/testReport)** for PR 34764 at commit [`ec966bf`](https://github.com/apache/spark/commit/ec966bfc901290d129ab7f6f076b6708eb71463f).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34701:
URL: https://github.com/apache/spark/pull/34701#issuecomment-984360337


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50320/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34757: [SPARK-37504][PYTHON] Pyspark create SparkSession with existed session should not pass static conf

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34757:
URL: https://github.com/apache/spark/pull/34757#issuecomment-984360338


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50319/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34775: [SPARK-37511][DOCS][FOLLOW-UP] Fix documentation build warning from TimedeltaIndex

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34775:
URL: https://github.com/apache/spark/pull/34775#issuecomment-984360344


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50317/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34777: [SPARK-37326][SQL][FOLLOW-UP] Update code and tests for TimestampNTZ support in CSV data source

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34777:
URL: https://github.com/apache/spark/pull/34777#issuecomment-984360339


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50315/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34774: [SPARK-37516][PYTHON][SQL] Uses Python's standard string formatter for SQL API in PySpark

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34774:
URL: https://github.com/apache/spark/pull/34774#issuecomment-984360335


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50318/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984360334









[GitHub] [spark] AmplabJenkins removed a comment on pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34778:
URL: https://github.com/apache/spark/pull/34778#issuecomment-984360340


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145846/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34778:
URL: https://github.com/apache/spark/pull/34778#issuecomment-984360340


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145846/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984360336









[GitHub] [spark] AmplabJenkins commented on pull request #34775: [SPARK-37511][DOCS][FOLLOW-UP] Fix documentation build warning from TimedeltaIndex

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34775:
URL: https://github.com/apache/spark/pull/34775#issuecomment-984360344


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50317/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34701:
URL: https://github.com/apache/spark/pull/34701#issuecomment-984360337


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50320/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34757: [SPARK-37504][PYTHON] Pyspark create SparkSession with existed session should not pass static conf

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34757:
URL: https://github.com/apache/spark/pull/34757#issuecomment-984360338


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50319/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34774: [SPARK-37516][PYTHON][SQL] Uses Python's standard string formatter for SQL API in PySpark

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34774:
URL: https://github.com/apache/spark/pull/34774#issuecomment-984360335


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50318/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34777: [SPARK-37326][SQL][FOLLOW-UP] Update code and tests for TimestampNTZ support in CSV data source

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34777:
URL: https://github.com/apache/spark/pull/34777#issuecomment-984360339


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50315/
   





[GitHub] [spark] SparkQA commented on pull request #34777: [SPARK-37326][SQL][FOLLOW-UP] Update code and tests for TimestampNTZ support in CSV data source

2021-12-01 Thread GitBox


SparkQA commented on pull request #34777:
URL: https://github.com/apache/spark/pull/34777#issuecomment-984358043


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50315/
   





[GitHub] [spark] SparkQA commented on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype

2021-12-01 Thread GitBox


SparkQA commented on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984354840


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50322/
   





[GitHub] [spark] SparkQA commented on pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


SparkQA commented on pull request #34778:
URL: https://github.com/apache/spark/pull/34778#issuecomment-984353579


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50321/
   





[GitHub] [spark] SparkQA commented on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype

2021-12-01 Thread GitBox


SparkQA commented on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984353353


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50316/
   





[GitHub] [spark] SparkQA commented on pull request #34774: [SPARK-37516][PYTHON][SQL] Uses Python's standard string formatter for SQL API in PySpark

2021-12-01 Thread GitBox


SparkQA commented on pull request #34774:
URL: https://github.com/apache/spark/pull/34774#issuecomment-984351389


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50318/
   





[GitHub] [spark] itholic commented on pull request #34772: [SPARK-37514][PYTHON] Remove workarounds due to older pandas

2021-12-01 Thread GitBox


itholic commented on pull request #34772:
URL: https://github.com/apache/spark/pull/34772#issuecomment-984350857


   Clean!





[GitHub] [spark] SparkQA commented on pull request #34775: [SPARK-37511][DOCS][FOLLOW-UP] Fix documentation build warning from TimedeltaIndex

2021-12-01 Thread GitBox


SparkQA commented on pull request #34775:
URL: https://github.com/apache/spark/pull/34775#issuecomment-984350247


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50317/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984332768


   **[Test build #145847 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145847/testReport)** for PR 34776 at commit [`acb7c55`](https://github.com/apache/spark/commit/acb7c55efbccab5bc3225d83b7a65c6d45ef8926).





[GitHub] [spark] SparkQA removed a comment on pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #34778:
URL: https://github.com/apache/spark/pull/34778#issuecomment-984332719


   **[Test build #145846 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145846/testReport)** for PR 34778 at commit [`73a84fc`](https://github.com/apache/spark/commit/73a84fccdcfcc84cf68e8a73ee830dd0c4522627).





[GitHub] [spark] SparkQA commented on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


SparkQA commented on pull request #34701:
URL: https://github.com/apache/spark/pull/34701#issuecomment-984348042


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50320/
   





[GitHub] [spark] SparkQA commented on pull request #34757: [SPARK-37504][PYTHON] Pyspark create SparkSession with existed session should not pass static conf

2021-12-01 Thread GitBox


SparkQA commented on pull request #34757:
URL: https://github.com/apache/spark/pull/34757#issuecomment-984347565


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50319/
   





[GitHub] [spark] SparkQA commented on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype

2021-12-01 Thread GitBox


SparkQA commented on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984345366


   **[Test build #145847 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145847/testReport)** for PR 34776 at commit [`acb7c55`](https://github.com/apache/spark/commit/acb7c55efbccab5bc3225d83b7a65c6d45ef8926).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


SparkQA commented on pull request #34778:
URL: https://github.com/apache/spark/pull/34778#issuecomment-984345256


   **[Test build #145846 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145846/testReport)** for PR 34778 at commit [`73a84fc`](https://github.com/apache/spark/commit/73a84fccdcfcc84cf68e8a73ee830dd0c4522627).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] dchvn commented on a change in pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


dchvn commented on a change in pull request #34667:
URL: https://github.com/apache/spark/pull/34667#discussion_r760805798



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
##
@@ -323,18 +323,25 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
   provider match {
 case supportsExtract: SupportsCatalogOptions =>
   val ident = supportsExtract.extractIdentifier(dsOptions)
-  val catalog = CatalogV2Util.getTableProviderCatalog(

Review comment:
   `CatalogV2Util.getTableProviderCatalog.name` returns `spark_catalog` when we save with the default catalog, which causes this test in `SupportsCatalogOptionsSuite` to fail:
   ```scala
   test(s"save works with ErrorIfExists - no table, no partitioning, session catalog") {
     testCreateAndRead(SaveMode.ErrorIfExists, None, Nil)
   }
   ```
   What we need here is the `default` namespace obtained from `ident.namespaces`, which satisfies the test.
   







[GitHub] [spark] dchvn commented on a change in pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


dchvn commented on a change in pull request #34667:
URL: https://github.com/apache/spark/pull/34667#discussion_r760805798



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
##
@@ -323,18 +323,25 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
   provider match {
 case supportsExtract: SupportsCatalogOptions =>
   val ident = supportsExtract.extractIdentifier(dsOptions)
-  val catalog = CatalogV2Util.getTableProviderCatalog(

Review comment:
   `CatalogV2Util.getTableProviderCatalog.name` returns `spark_catalog` when we save with the default catalog, which causes this test in `SupportsCatalogOptionsSuite` to fail:
   ```scala
   test(s"save works with ErrorIfExists - no table, no partitioning, session catalog") {
     testCreateAndRead(SaveMode.ErrorIfExists, None, Nil)
   }
   ```
   What we need here is the `default` namespace obtained from `ident.namespaces`, which satisfies the test.
   







[GitHub] [spark] dchvn commented on a change in pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


dchvn commented on a change in pull request #34667:
URL: https://github.com/apache/spark/pull/34667#discussion_r760805798



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
##
@@ -323,18 +323,25 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
   provider match {
 case supportsExtract: SupportsCatalogOptions =>
   val ident = supportsExtract.extractIdentifier(dsOptions)
-  val catalog = CatalogV2Util.getTableProviderCatalog(

Review comment:
   `CatalogV2Util.getTableProviderCatalog.name` returns `spark_catalog` when we save with the default catalog, which causes this test in `SupportsCatalogOptionsSuite` to fail:
   ```scala
   test(s"save works with ErrorIfExists - no table, no partitioning, session catalog") {
     testCreateAndRead(SaveMode.ErrorIfExists, None, Nil)
   }
   ```
   What we need here is the `default` namespace obtained from `ident.namespaces`.
   







[GitHub] [spark] dchvn commented on a change in pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


dchvn commented on a change in pull request #34667:
URL: https://github.com/apache/spark/pull/34667#discussion_r760805798



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
##
@@ -323,18 +323,25 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
   provider match {
 case supportsExtract: SupportsCatalogOptions =>
   val ident = supportsExtract.extractIdentifier(dsOptions)
-  val catalog = CatalogV2Util.getTableProviderCatalog(

Review comment:
   `CatalogV2Util.getTableProviderCatalog.name` returns `spark_catalog` when we save with the default catalog, which causes `SupportsCatalogOptionsSuite` to fail. What we need here is the `default` namespace obtained from `ident.namespaces`.
   







[GitHub] [spark] dchvn commented on a change in pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


dchvn commented on a change in pull request #34667:
URL: https://github.com/apache/spark/pull/34667#discussion_r760805798



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
##
@@ -323,18 +323,25 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
   provider match {
 case supportsExtract: SupportsCatalogOptions =>
   val ident = supportsExtract.extractIdentifier(dsOptions)
-  val catalog = CatalogV2Util.getTableProviderCatalog(

Review comment:
   `CatalogV2Util.getTableProviderCatalog.name` returns `spark_catalog` when we save with the default catalog, which causes `SupportsCatalogOptionsSuite` to fail. What we need here is the `default` namespace obtained from `ident.namespaces`.
   







[GitHub] [spark] HyukjinKwon commented on pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


HyukjinKwon commented on pull request #34778:
URL: https://github.com/apache/spark/pull/34778#issuecomment-984333910


   Thanks @dchvn !





[GitHub] [spark] dchvn commented on a change in pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


dchvn commented on a change in pull request #34778:
URL: https://github.com/apache/spark/pull/34778#discussion_r760795723



##
File path: python/pyspark/pandas/tests/test_dataframe.py
##
@@ -5870,8 +5870,12 @@ def test_cov(self):
 self.assert_eq(pdf.cov(min_periods=5), psdf.cov(min_periods=5))
 
 # extension dtype
-numeric_dtypes = ["Int8", "Int16", "Int32", "Int64", "Float32", "Float64", "float"]
-boolean_dtypes = ["boolean", "bool"]
+if LooseVersion(pd.__version__) >= LooseVersion("1.2"):

Review comment:
   With pandas < 1.2, `pd.DataFrame.cov` cannot handle extension dtypes (it fails on `NAType`), so I am only fixing the test:
   ```python
   >>> pd.__version__
   '1.0.5'
   >>> pdf = pd.DataFrame([[1,2],[None, 3]], dtype="Int64")
   >>> pdf
         0  1
   0     1  2
   1  <NA>  3
   >>> pdf.cov()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/u02/venv/python3.8/lib/python3.8/site-packages/pandas/core/frame.py", line 7608, in cov
       baseCov = libalgos.nancorr(ensure_float64(mat), cov=True, minp=min_periods)
     File "pandas/_libs/algos_common_helper.pxi", line 41, in pandas._libs.algos.ensure_float64
   TypeError: float() argument must be a string or a number, not 'NAType'
   ```
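
The version gate in the diff above can be sketched in isolation. This is a minimal illustration, not the code from the PR: the helper names `version_tuple` and `cov_test_dtypes` are hypothetical, the contents of the `else` branch are an assumption (the quoted hunk is cut off before it), and a plain tuple comparison stands in for `LooseVersion` so that the snippet needs only the standard library.

```python
import re


def version_tuple(version):
    """Return the leading numeric components of a version string.

    For example '1.0.5' becomes (1, 0, 5) and '1.2.0rc1' becomes (1, 2, 0).
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component that has no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def cov_test_dtypes(pandas_version):
    """Pick the dtypes a cov test would exercise for a given pandas version.

    Assumption: below 1.2 only the plain NumPy-backed dtypes are used,
    because DataFrame.cov fails on extension dtypes there.
    """
    if version_tuple(pandas_version) >= (1, 2):
        numeric_dtypes = ["Int8", "Int16", "Int32", "Int64", "Float32", "Float64", "float"]
        boolean_dtypes = ["boolean", "bool"]
    else:
        numeric_dtypes = ["float"]
        boolean_dtypes = ["bool"]
    return numeric_dtypes, boolean_dtypes


if __name__ == "__main__":
    for candidate in ("1.0.5", "1.2.0", "1.10.1"):
        print(candidate, cov_test_dtypes(candidate))
```

Comparing integer tuples rather than raw strings matters for versions such as `1.10.1`, which would sort before `1.2` in a string comparison.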







[GitHub] [spark] SparkQA commented on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype

2021-12-01 Thread GitBox


SparkQA commented on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984332768


   **[Test build #145847 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145847/testReport)** for PR 34776 at commit [`acb7c55`](https://github.com/apache/spark/commit/acb7c55efbccab5bc3225d83b7a65c6d45ef8926).





[GitHub] [spark] SparkQA commented on pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


SparkQA commented on pull request #34778:
URL: https://github.com/apache/spark/pull/34778#issuecomment-984332719


   **[Test build #145846 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145846/testReport)** for PR 34778 at commit [`73a84fc`](https://github.com/apache/spark/commit/73a84fccdcfcc84cf68e8a73ee830dd0c4522627).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34774: [SPARK-37516][PYTHON][SQL] Uses Python's standard string formatter for SQL API in PySpark

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34774:
URL: https://github.com/apache/spark/pull/34774#issuecomment-984331468


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145843/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34673: [SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34673:
URL: https://github.com/apache/spark/pull/34673#issuecomment-984331466


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50314/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34753: [SPARK-37494][SQL] Unify v1 and v2 options output of `SHOW CREATE TABLE` command

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34753:
URL: https://github.com/apache/spark/pull/34753#issuecomment-984331465


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145833/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34775: [SPARK-37511][DOCS][FOLLOW-UP] Fix documentation build warning from TimedeltaIndex

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34775:
URL: https://github.com/apache/spark/pull/34775#issuecomment-984331467


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145842/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32875:
URL: https://github.com/apache/spark/pull/32875#issuecomment-984331423


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145834/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #32875:
URL: https://github.com/apache/spark/pull/32875#issuecomment-984331423


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145834/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34775: [SPARK-37511][DOCS][FOLLOW-UP] Fix documentation build warning from TimedeltaIndex

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34775:
URL: https://github.com/apache/spark/pull/34775#issuecomment-984331467


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145842/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34774: [SPARK-37516][PYTHON][SQL] Uses Python's standard string formatter for SQL API in PySpark

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34774:
URL: https://github.com/apache/spark/pull/34774#issuecomment-984331468


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145843/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34673: [SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34673:
URL: https://github.com/apache/spark/pull/34673#issuecomment-984331466


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50314/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34753: [SPARK-37494][SQL] Unify v1 and v2 options output of `SHOW CREATE TABLE` command

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34753:
URL: https://github.com/apache/spark/pull/34753#issuecomment-984331465


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145833/
   





[GitHub] [spark] SparkQA removed a comment on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #32875:
URL: https://github.com/apache/spark/pull/32875#issuecomment-984210804


   **[Test build #145834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145834/testReport)** for PR 32875 at commit [`0345612`](https://github.com/apache/spark/commit/0345612b5ab9347e8b14eec243ec332f98765ab5).





[GitHub] [spark] SparkQA commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-01 Thread GitBox


SparkQA commented on pull request #32875:
URL: https://github.com/apache/spark/pull/32875#issuecomment-984330466


   **[Test build #145834 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145834/testReport)** for PR 32875 at commit [`0345612`](https://github.com/apache/spark/commit/0345612b5ab9347e8b14eec243ec332f98765ab5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #34673: [SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)

2021-12-01 Thread GitBox


SparkQA commented on pull request #34673:
URL: https://github.com/apache/spark/pull/34673#issuecomment-984329538


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50314/
   





[GitHub] [spark] SparkQA commented on pull request #34777: [SPARK-37326][SQL][FOLLOW-UP] Update code and tests for TimestampNTZ support in CSV data source

2021-12-01 Thread GitBox


SparkQA commented on pull request #34777:
URL: https://github.com/apache/spark/pull/34777#issuecomment-984329506


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50315/
   





[GitHub] [spark] SparkQA commented on pull request #34757: [SPARK-37504][PYTHON] Pyspark create SparkSession with existed session should not pass static conf

2021-12-01 Thread GitBox


SparkQA commented on pull request #34757:
URL: https://github.com/apache/spark/pull/34757#issuecomment-984328332


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50319/
   





[GitHub] [spark] SparkQA commented on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


SparkQA commented on pull request #34701:
URL: https://github.com/apache/spark/pull/34701#issuecomment-984328016


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50320/
   





[GitHub] [spark] HyukjinKwon commented on a change in pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


HyukjinKwon commented on a change in pull request #34778:
URL: https://github.com/apache/spark/pull/34778#discussion_r760789523



##
File path: python/pyspark/pandas/tests/test_dataframe.py
##
@@ -5870,8 +5870,12 @@ def test_cov(self):
 self.assert_eq(pdf.cov(min_periods=5), psdf.cov(min_periods=5))
 
 # extension dtype
-numeric_dtypes = ["Int8", "Int16", "Int32", "Int64", "Float32", "Float64", "float"]
-boolean_dtypes = ["boolean", "bool"]
+if LooseVersion(pd.__version__) >= LooseVersion("1.2"):

Review comment:
   @dchvn just to clarify: does the `cov` API only work correctly with pandas 1.2+, or are you only fixing the tests?







[GitHub] [spark] SparkQA commented on pull request #34775: [SPARK-37511][DOCS][FOLLOW-UP] Fix documentation build warning from TimedeltaIndex

2021-12-01 Thread GitBox


SparkQA commented on pull request #34775:
URL: https://github.com/apache/spark/pull/34775#issuecomment-984325655


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50317/
   





[GitHub] [spark] SparkQA commented on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation given a timedelta Series/Index

2021-12-01 Thread GitBox


SparkQA commented on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984325276


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50316/
   





[GitHub] [spark] SparkQA commented on pull request #34774: [SPARK-37516][PYTHON][SQL] Uses Python's standard string formatter for SQL API in PySpark

2021-12-01 Thread GitBox


SparkQA commented on pull request #34774:
URL: https://github.com/apache/spark/pull/34774#issuecomment-984324189


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50318/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34774: [SPARK-37516][PYTHON][SQL] Uses Python's standard string formatter for SQL API in PySpark

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #34774:
URL: https://github.com/apache/spark/pull/34774#issuecomment-984307383


   **[Test build #145843 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145843/testReport)** for PR 34774 at commit [`b14db3d`](https://github.com/apache/spark/commit/b14db3d31491cdb85401046371613912b99b84dd).





[GitHub] [spark] SparkQA removed a comment on pull request #34775: [SPARK-37511][DOCS][FOLLOW-UP] Fix documentation build warning from TimedeltaIndex

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #34775:
URL: https://github.com/apache/spark/pull/34775#issuecomment-984307321


   **[Test build #145842 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145842/testReport)**
 for PR 34775 at commit 
[`4682604`](https://github.com/apache/spark/commit/4682604d7628afc5ab855a1cefb8d5bb8e64004d).





[GitHub] [spark] SparkQA removed a comment on pull request #34753: [SPARK-37494][SQL] Unify v1 and v2 options output of `SHOW CREATE TABLE` command

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #34753:
URL: https://github.com/apache/spark/pull/34753#issuecomment-984210101


   **[Test build #145833 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145833/testReport)**
 for PR 34753 at commit 
[`7771b14`](https://github.com/apache/spark/commit/7771b14e20a374bf638dd2abd3dc8ba74e14c3c2).





[GitHub] [spark] SparkQA commented on pull request #34753: [SPARK-37494][SQL] Unify v1 and v2 options output of `SHOW CREATE TABLE` command

2021-12-01 Thread GitBox


SparkQA commented on pull request #34753:
URL: https://github.com/apache/spark/pull/34753#issuecomment-984321198


   **[Test build #145833 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145833/testReport)**
 for PR 34753 at commit 
[`7771b14`](https://github.com/apache/spark/commit/7771b14e20a374bf638dd2abd3dc8ba74e14c3c2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #34774: [SPARK-37516][PYTHON][SQL] Uses Python's standard string formatter for SQL API in PySpark

2021-12-01 Thread GitBox


SparkQA commented on pull request #34774:
URL: https://github.com/apache/spark/pull/34774#issuecomment-984320572


   **[Test build #145843 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145843/testReport)**
 for PR 34774 at commit 
[`b14db3d`](https://github.com/apache/spark/commit/b14db3d31491cdb85401046371613912b99b84dd).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class PandasSQLStringFormatter(string.Formatter):`
 * `class SQLStringFormatter(string.Formatter):`





[GitHub] [spark] dchvn opened a new pull request #34778: [SPARK-36396][PYTHON][FOLLOWUP] Fix test with extensions dtype when pandas version < 1.2

2021-12-01 Thread GitBox


dchvn opened a new pull request #34778:
URL: https://github.com/apache/spark/pull/34778


   ### What changes were proposed in this pull request?
   Fix test of `pd.Dataframe.cov` with extensions dtype when pandas version < 
1.2
   
   ### Why are the changes needed?
   Pass test of `pd.Dataframe.cov` with pandas version < 1.2
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Existing tests
   





[GitHub] [spark] SparkQA commented on pull request #34775: [SPARK-37511][DOCS][FOLLOW-UP] Fix documentation build warning from TimedeltaIndex

2021-12-01 Thread GitBox


SparkQA commented on pull request #34775:
URL: https://github.com/apache/spark/pull/34775#issuecomment-984320063


   **[Test build #145842 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145842/testReport)**
 for PR 34775 at commit 
[`4682604`](https://github.com/apache/spark/commit/4682604d7628afc5ab855a1cefb8d5bb8e64004d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34667:
URL: https://github.com/apache/spark/pull/34667#issuecomment-984312341


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145837/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #34667:
URL: https://github.com/apache/spark/pull/34667#issuecomment-984260258


   **[Test build #145837 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145837/testReport)**
 for PR 34667 at commit 
[`729cc22`](https://github.com/apache/spark/commit/729cc2272f77216eb825e6ec81e0254560972e01).





[GitHub] [spark] AmplabJenkins commented on pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34667:
URL: https://github.com/apache/spark/pull/34667#issuecomment-984312341


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145837/
   





[GitHub] [spark] SparkQA commented on pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


SparkQA commented on pull request #34667:
URL: https://github.com/apache/spark/pull/34667#issuecomment-984312176


   **[Test build #145837 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145837/testReport)**
 for PR 34667 at commit 
[`729cc22`](https://github.com/apache/spark/commit/729cc2272f77216eb825e6ec81e0254560972e01).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34760:
URL: https://github.com/apache/spark/pull/34760#issuecomment-984310479


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145835/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34760:
URL: https://github.com/apache/spark/pull/34760#issuecomment-984310479


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145835/
   





[GitHub] [spark] cloud-fan commented on a change in pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


cloud-fan commented on a change in pull request #34701:
URL: https://github.com/apache/spark/pull/34701#discussion_r760776756



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -2162,6 +2163,48 @@ object RemoveLiteralFromGroupExpressions extends 
Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Prunes unnecessary fields from a [[Generate]] if it is under a project 
which does not refer
+ * any generated attributes, .e.g., count-like aggregation on an exploded 
array.
+ */
+object GenerateOptimization extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+  _.containsAllPatterns(PROJECT, GENERATE), ruleId) {
+  case p @ Project(_, g: Generate) if p.references.isEmpty
+  && g.generator.isInstanceOf[ExplodeBase] =>
+g.generator.children.head.dataType match {
+  case ArrayType(StructType(fields), _) =>
+val atomicFields = fields.collect {
+  case f: StructField if f.dataType.isInstanceOf[AtomicType] => f
+}
+val extractor = if (atomicFields.size > 0) {
+  // Pick an arbitrary atomic field, if any

Review comment:
   shall we pick the smallest one? e.g. prefer int over string
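
To make the suggestion concrete, here is a minimal sketch in plain Scala, not the PR's code. `FieldType`, `Field`, `SmallestField`, and the sizes are hypothetical stand-ins for Catalyst's types. Spark's `DataType` has a `defaultSize` estimate that could play this role, though nothing in this thread says the PR uses it.

```scala
// Hypothetical stand-ins for Catalyst's StructField and atomic types.
// The sizes are illustrative estimates only.
sealed trait FieldType { def defaultSize: Int }
case object IntField extends FieldType { val defaultSize = 4 }
case object LongField extends FieldType { val defaultSize = 8 }
case object StringField extends FieldType { val defaultSize = 20 }

final case class Field(name: String, dataType: FieldType)

object SmallestField {
  // Returns the field with the smallest estimated width,
  // or None when the struct has no atomic fields at all.
  def pick(fields: Seq[Field]): Option[Field] =
    if (fields.isEmpty) None
    else Some(fields.minBy(_.dataType.defaultSize))
}
```

With this ordering an `int` field wins over a `string` field, which is the preference described in the comment.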







[GitHub] [spark] cloud-fan commented on a change in pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


cloud-fan commented on a change in pull request #34701:
URL: https://github.com/apache/spark/pull/34701#discussion_r760776756



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -2162,6 +2163,48 @@ object RemoveLiteralFromGroupExpressions extends 
Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Prunes unnecessary fields from a [[Generate]] if it is under a project 
which does not refer
+ * any generated attributes, .e.g., count-like aggregation on an exploded 
array.
+ */
+object GenerateOptimization extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+  _.containsAllPatterns(PROJECT, GENERATE), ruleId) {
+  case p @ Project(_, g: Generate) if p.references.isEmpty
+  && g.generator.isInstanceOf[ExplodeBase] =>
+g.generator.children.head.dataType match {
+  case ArrayType(StructType(fields), _) =>
+val atomicFields = fields.collect {
+  case f: StructField if f.dataType.isInstanceOf[AtomicType] => f
+}
+val extractor = if (atomicFields.size > 0) {
+  // Pick an arbitrary atomic field, if any

Review comment:
   shall we pick the smallest one? e.g. prefer int over string







[GitHub] [spark] cloud-fan commented on a change in pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


cloud-fan commented on a change in pull request #34701:
URL: https://github.com/apache/spark/pull/34701#discussion_r760776370



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -2162,6 +2163,48 @@ object RemoveLiteralFromGroupExpressions extends 
Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Prunes unnecessary fields from a [[Generate]] if it is under a project 
which does not refer
+ * any generated attributes, .e.g., count-like aggregation on an exploded 
array.
+ */
+object GenerateOptimization extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+  _.containsAllPatterns(PROJECT, GENERATE), ruleId) {
+  case p @ Project(_, g: Generate) if p.references.isEmpty
+  && g.generator.isInstanceOf[ExplodeBase] =>
+g.generator.children.head.dataType match {
+  case ArrayType(StructType(fields), _) =>
+val atomicFields = fields.collect {
+  case f: StructField if f.dataType.isInstanceOf[AtomicType] => f
+}
+val extractor = if (atomicFields.size > 0) {
+  // Pick an arbitrary atomic field, if any
+  ExtractValue(g.generator.children.head,

Review comment:
   nit: I feel it's safer to create `GetStructField` instead of doing name 
lookup again. It's possible that some dataframe-generated query plan has name 
conflicts in the struct, and `GetStructField` allows us to put the ordinal 
directly to avoid a name lookup.
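
The difference between the two lookups can be sketched in plain Scala. `Col` and `StructAccess` are hypothetical names used only for illustration; this models the idea behind the comment and is not Catalyst's `ExtractValue` or `GetStructField`.

```scala
// Hypothetical model of a struct value: an ordered list of named columns.
final case class Col(name: String, value: Any)

object StructAccess {
  // Name-based lookup: breaks down when two fields share a name.
  def byName(row: Seq[Col], name: String): Either[String, Any] =
    row.filter(_.name == name) match {
      case Seq(only) => Right(only.value)
      case Seq()     => Left(s"no field named $name")
      case _         => Left(s"ambiguous field name $name")
    }

  // Ordinal-based lookup: the position identifies exactly one field,
  // so duplicate names do not matter.
  def byOrdinal(row: Seq[Col], ordinal: Int): Any = row(ordinal).value
}
```

For a struct such as `Seq(Col("a", 1), Col("a", "x"))`, the name lookup reports an ambiguity while the ordinal lookup still resolves each field.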







[GitHub] [spark] SparkQA removed a comment on pull request #34764: [SPARK-37330][SQL] Migrate ReplaceTableStatement to v2 command

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #34764:
URL: https://github.com/apache/spark/pull/34764#issuecomment-984260141


   **[Test build #145836 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145836/testReport)**
 for PR 34764 at commit 
[`66a2aaf`](https://github.com/apache/spark/commit/66a2aafc18c653e30c4c8d0442da810307bf9376).





[GitHub] [spark] AmplabJenkins commented on pull request #34764: [SPARK-37330][SQL] Migrate ReplaceTableStatement to v2 command

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34764:
URL: https://github.com/apache/spark/pull/34764#issuecomment-984309930


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145836/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34764: [SPARK-37330][SQL] Migrate ReplaceTableStatement to v2 command

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34764:
URL: https://github.com/apache/spark/pull/34764#issuecomment-984309930


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145836/
   





[GitHub] [spark] SparkQA commented on pull request #34764: [SPARK-37330][SQL] Migrate ReplaceTableStatement to v2 command

2021-12-01 Thread GitBox


SparkQA commented on pull request #34764:
URL: https://github.com/apache/spark/pull/34764#issuecomment-984309768


   **[Test build #145836 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145836/testReport)**
 for PR 34764 at commit 
[`66a2aaf`](https://github.com/apache/spark/commit/66a2aafc18c653e30c4c8d0442da810307bf9376).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #34760:
URL: https://github.com/apache/spark/pull/34760#issuecomment-984238599


   **[Test build #145835 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145835/testReport)**
 for PR 34760 at commit 
[`7f57e4a`](https://github.com/apache/spark/commit/7f57e4aa01f57c0cf9bb91c353b5c46d51e8128a).





[GitHub] [spark] SparkQA commented on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-01 Thread GitBox


SparkQA commented on pull request #34760:
URL: https://github.com/apache/spark/pull/34760#issuecomment-984309442


   **[Test build #145835 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145835/testReport)**
 for PR 34760 at commit 
[`7f57e4a`](https://github.com/apache/spark/commit/7f57e4aa01f57c0cf9bb91c353b5c46d51e8128a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] venkata91 commented on a change in pull request #33896: [SPARK-33701][SHUFFLE] Adaptive shuffle merge finalization for push-based shuffle

2021-12-01 Thread GitBox


venkata91 commented on a change in pull request #33896:
URL: https://github.com/apache/spark/pull/33896#discussion_r760774868



##
File path: 
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
##
@@ -3847,6 +3887,76 @@ class DAGSchedulerSuite extends SparkFunSuite with 
TempLocalSparkContext with Ti
 
 // Job successful ended.
 assert(results === Map(0 -> 11, 1 -> 12))
+  }
+
+  test("SPARK-33701: shuffle adaptive merge finalization") {
+initPushBasedShuffleConfs(conf)
+conf.set(config.PUSH_BASED_SHUFFLE_SIZE_MIN_SHUFFLE_SIZE_TO_WAIT, 10L)
+conf.set(config.SHUFFLE_MERGER_LOCATIONS_MIN_STATIC_THRESHOLD, 3)
+DAGSchedulerSuite.clearMergerLocs
+DAGSchedulerSuite.addMergerLocs(Seq("host1", "host2", "host3", "host4", 
"host5"))
+val parts = 2
+
+val shuffleMapRdd1 = new MyRDD(sc, parts, Nil)
+val shuffleDep1 = new ShuffleDependency(shuffleMapRdd1, new 
HashPartitioner(parts))
+val shuffleMapRdd2 = new MyRDD(sc, parts, Nil)
+val shuffleDep2 = new ShuffleDependency(shuffleMapRdd2, new 
HashPartitioner(parts))
+val reduceRdd = new MyRDD(sc, parts, List(shuffleDep1, shuffleDep2),
+  tracker = mapOutputTracker)
+
+// Submit a reduce job that depends which will create a map stage
+submit(reduceRdd, (0 until parts).toArray)
+
+val taskResults = taskSets(0).tasks.zipWithIndex.map {
+  case (_, idx) =>
+(Success, makeMapStatus("host" + ('A' + idx).toChar, parts))
+}.toSeq
+// Remove MapStatus on one of the host before the stage ends to trigger
+// a scenario where stage 0 needs to be resubmitted upon finishing all 
tasks.
+// Merge finalization should not be scheduled in this case.
+for ((result, i) <- taskResults.zipWithIndex) {
+  if (i == taskSets(0).tasks.size - 1) {
+mapOutputTracker.removeOutputsOnHost("hostA")
+  }
+  if (i < taskSets(0).tasks.size) {
+runEvent(makeCompletionEvent(taskSets(0).tasks(i), result._1, 
result._2))
+  }
+}
+val shuffleStage1 = 
scheduler.stageIdToStage(0).asInstanceOf[ShuffleMapStage]
+// Successfully completing the retry of stage 0. Merge finalization should 
be
+// disabled
+ complete(taskSets(2), taskSets(2).tasks.zipWithIndex.map {
+  case (_, idx) =>
+(Success, makeMapStatus("host" + ('A' + idx).toChar, parts))
+}.toSeq)
+assert(!shuffleStage1.shuffleDep.shuffleMergeEnabled)
+// Verify finalize task is set with 0 delay and merge results not marked
+// for registration due to shuffle size smaller than threshold
+assert(shuffleStage1.shuffleDep.getFinalizeTask.nonEmpty)
+val finalizeTask1 = shuffleStage1.shuffleDep.getFinalizeTask.get
+  .asInstanceOf[DummyScheduledFuture]
+assert(finalizeTask1.delay == 0 && !finalizeTask1.registerMergeResults)
+
+complete(taskSets(1), taskSets(1).tasks.zipWithIndex.map {
+  case (_, idx) =>
+(Success, makeMapStatus("host" + ('A' + idx).toChar, parts, 10))
+}.toSeq)
+val shuffleStage2 = 
scheduler.stageIdToStage(1).asInstanceOf[ShuffleMapStage]
+// Verify finalize task is set with default delay of 10s and merge results 
are marked
+// for registration
+assert(shuffleStage2.shuffleDep.getFinalizeTask.nonEmpty)
+val finalizeTask2 = shuffleStage2.shuffleDep.getFinalizeTask.get
+  .asInstanceOf[DummyScheduledFuture]
+assert(finalizeTask2.delay == 10 && finalizeTask2.registerMergeResults)
+
+pushComplete(shuffleStage2.shuffleDep.shuffleId, 0, 0)
+pushComplete(shuffleStage2.shuffleDep.shuffleId, 0, 1)
+
+assert(mapOutputTracker.getNumAvailableMergeResults(shuffleDep1.shuffleId) 
== parts)
+assert(mapOutputTracker.getNumAvailableMergeResults(shuffleDep2.shuffleId) 
== parts)

Review comment:
   @mridulm I just realized we cannot set 
`PUSH_BASED_SHUFFLE_MIN_PUSH_RATIO` < 1.0. Some tasks can still be running 
when the minimum number of pushes has completed, and that triggers stage 
completion (`processShuffleMapStageCompletion`). The stage would then be 
retried, because not all map statuses are available while tasks are still 
running. So even when enough pushes have completed, we still need to wait 
until all the tasks finish successfully.
   We could add a check for `mapStage.isAvailable` in 
`handleShufflePushCompleted` to prevent it from scheduling shuffle merge 
finalization, but finalization would ultimately be scheduled at stage 
completion anyway. In that case, at least we would not need to wait for 
10 secs or the default timeout, since the minimum pushes have already 
completed. Thoughts?
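
A rough sketch of the guard discussed in this comment, in plain Scala. `StageState`, `Action`, and `onPushCompleted` are hypothetical names, and this is not the `DAGScheduler` implementation. It only models the decision: finalize early when the push ratio is reached and every map output is available, otherwise keep waiting.

```scala
object FinalizeDecision {
  // Hypothetical snapshot of a shuffle map stage.
  final case class StageState(numTasks: Int, pushesCompleted: Int, mapOutputsAvailable: Int) {
    // Mirrors the idea of `mapStage.isAvailable`: all map outputs are registered.
    def isAvailable: Boolean = mapOutputsAvailable == numTasks
  }

  sealed trait Action
  case object Wait extends Action
  case object FinalizeNow extends Action

  // Called when a push-completed notification arrives (assumes numTasks > 0).
  def onPushCompleted(s: StageState, minPushRatio: Double): Action = {
    val ratioReached = s.pushesCompleted.toDouble / s.numTasks >= minPushRatio
    // Enough pushes alone is not sufficient: tasks that are still running
    // have not registered their map status yet.
    if (ratioReached && s.isAvailable) FinalizeNow else Wait
  }
}
```

Under this model a ratio below 1.0 never finalizes before all tasks succeed. It only lets finalization skip the timeout once the stage is available.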

##
File path: 
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
##
@@ -3847,6 +3887,76 @@ class DAGSchedulerSuite extends SparkFunSuite with 
TempLocalSparkContext with Ti
 
 // Job successful ended.
 assert(results === Map(0 -> 11, 1 -> 12))
+  }
+
+  test("SPARK-33701: shuffle adaptive merge 

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation given a timedelta Series/Index

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984307640


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145841/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation given a timedelta Series/Index

2021-12-01 Thread GitBox


SparkQA removed a comment on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984307312


   **[Test build #145841 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145841/testReport)**
 for PR 34776 at commit 
[`ab9a2d6`](https://github.com/apache/spark/commit/ab9a2d636294dea1f4e19ae4057d19a156269b6d).





[GitHub] [spark] AmplabJenkins commented on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation given a timedelta Series/Index

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984307640


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145841/
   





[GitHub] [spark] SparkQA commented on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation given a timedelta Series/Index

2021-12-01 Thread GitBox


SparkQA commented on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984307627


   **[Test build #145841 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145841/testReport)** for PR 34776 at commit [`ab9a2d6`](https://github.com/apache/spark/commit/ab9a2d636294dea1f4e19ae4057d19a156269b6d).
* This patch **fails Python style tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #34774: [SPARK-37516][PYTHON][SQL] Uses Python's standard string formatter for SQL API in PySpark

2021-12-01 Thread GitBox


SparkQA commented on pull request #34774:
URL: https://github.com/apache/spark/pull/34774#issuecomment-984307383


   **[Test build #145843 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145843/testReport)** for PR 34774 at commit [`b14db3d`](https://github.com/apache/spark/commit/b14db3d31491cdb85401046371613912b99b84dd).





[GitHub] [spark] SparkQA commented on pull request #34701: [SPARK-37450][SQL] Prune unnecessary fields from Generate

2021-12-01 Thread GitBox


SparkQA commented on pull request #34701:
URL: https://github.com/apache/spark/pull/34701#issuecomment-984307445


   **[Test build #145845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145845/testReport)** for PR 34701 at commit [`099d3a8`](https://github.com/apache/spark/commit/099d3a8919189fa0b6f0d10079e327d114b657b8).





[GitHub] [spark] SparkQA commented on pull request #34757: [SPARK-37504][PYTHON] Pyspark create SparkSession with existed session should not pass static conf

2021-12-01 Thread GitBox


SparkQA commented on pull request #34757:
URL: https://github.com/apache/spark/pull/34757#issuecomment-984307362


   **[Test build #145844 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145844/testReport)** for PR 34757 at commit [`f6f06cc`](https://github.com/apache/spark/commit/f6f06cc45413e1633975c09de20f9011608b841a).





[GitHub] [spark] SparkQA commented on pull request #34775: [SPARK-37511][DOCS][FOLLOW-UP] Fix documentation build warning from TimedeltaIndex

2021-12-01 Thread GitBox


SparkQA commented on pull request #34775:
URL: https://github.com/apache/spark/pull/34775#issuecomment-984307321


   **[Test build #145842 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145842/testReport)** for PR 34775 at commit [`4682604`](https://github.com/apache/spark/commit/4682604d7628afc5ab855a1cefb8d5bb8e64004d).





[GitHub] [spark] SparkQA commented on pull request #34777: [SPARK-37326][SQL][FOLLOW-UP] Update code and tests for TimestampNTZ support in CSV data source

2021-12-01 Thread GitBox


SparkQA commented on pull request #34777:
URL: https://github.com/apache/spark/pull/34777#issuecomment-984307292


   **[Test build #145840 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145840/testReport)** for PR 34777 at commit [`65cdaa5`](https://github.com/apache/spark/commit/65cdaa56167da11389701880c949386c6902e826).





[GitHub] [spark] SparkQA commented on pull request #34776: [SPARK-37512][PYTHON] Support TimedeltaIndex creation given a timedelta Series/Index

2021-12-01 Thread GitBox


SparkQA commented on pull request #34776:
URL: https://github.com/apache/spark/pull/34776#issuecomment-984307312


   **[Test build #145841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145841/testReport)** for PR 34776 at commit [`ab9a2d6`](https://github.com/apache/spark/commit/ab9a2d636294dea1f4e19ae4057d19a156269b6d).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34659: [SPARK-34863][SQL] Support complex types for Parquet vectorized reader

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34659:
URL: https://github.com/apache/spark/pull/34659#issuecomment-984306042


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145831/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


AmplabJenkins removed a comment on pull request #34667:
URL: https://github.com/apache/spark/pull/34667#issuecomment-984306043


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50312/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34659: [SPARK-34863][SQL] Support complex types for Parquet vectorized reader

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34659:
URL: https://github.com/apache/spark/pull/34659#issuecomment-984306042


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145831/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34667: [SPARK-36902][SQL] Migrate CreateTableAsSelectStatement to v2 command

2021-12-01 Thread GitBox


AmplabJenkins commented on pull request #34667:
URL: https://github.com/apache/spark/pull/34667#issuecomment-984306043


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50312/
   





[GitHub] [spark] sadikovi commented on pull request #34777: [SPARK-37326][SQL][FOLLOW-UP] Update code and tests for TimestampNTZ support in CSV data source

2021-12-01 Thread GitBox


sadikovi commented on pull request #34777:
URL: https://github.com/apache/spark/pull/34777#issuecomment-984304694


   cc @MaxGekk @cloud-fan @gengliangwang for review. Thanks!




