[GitHub] [spark] SparkQA commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox


SparkQA commented on pull request #32018:
URL: https://github.com/apache/spark/pull/32018#issuecomment-812334201


   **[Test build #136841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136841/testReport)** for PR 32018 at commit [`992001b`](https://github.com/apache/spark/commit/992001bcf3ea7569a492659d97fbde25a5f0c406).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis r

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812333617


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41418/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812333616









[GitHub] [spark] AmplabJenkins removed a comment on pull request #32034: [SPARK-34940][SQL][TEST] Fix test of BasicWriteTaskStatsTrackerSuite

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32034:
URL: https://github.com/apache/spark/pull/32034#issuecomment-812333621


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41416/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32033: [SPARK-34939][CORE] Throw fetch failure exception when unable to deserialize map statuses

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32033:
URL: https://github.com/apache/spark/pull/32033#issuecomment-812333614


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41417/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32034: [SPARK-34940][SQL][TEST] Fix test of BasicWriteTaskStatsTrackerSuite

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32034:
URL: https://github.com/apache/spark/pull/32034#issuecomment-812333621


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41416/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32033: [SPARK-34939][CORE] Throw fetch failure exception when unable to deserialize map statuses

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32033:
URL: https://github.com/apache/spark/pull/32033#issuecomment-812333614


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41417/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812333618









[GitHub] [spark] AmplabJenkins commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812333617


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41418/
   





[GitHub] [spark] SparkQA commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run.

2021-04-01 Thread GitBox


SparkQA commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812332967









[GitHub] [spark] SparkQA removed a comment on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


SparkQA removed a comment on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812287685


   **[Test build #136835 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136835/testReport)** for PR 32015 at commit [`dc7b70d`](https://github.com/apache/spark/commit/dc7b70daad9bd8f99952023110578b40a2233732).





[GitHub] [spark] SparkQA commented on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


SparkQA commented on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812331657


   **[Test build #136835 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136835/testReport)** for PR 32015 at commit [`dc7b70d`](https://github.com/apache/spark/commit/dc7b70daad9bd8f99952023110578b40a2233732).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #32034: [SPARK-34940][SQL][TEST] Fix test of BasicWriteTaskStatsTrackerSuite

2021-04-01 Thread GitBox


SparkQA commented on pull request #32034:
URL: https://github.com/apache/spark/pull/32034#issuecomment-812330792









[GitHub] [spark] SparkQA removed a comment on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


SparkQA removed a comment on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812288459


   **[Test build #136836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136836/testReport)** for PR 32015 at commit [`d9f9aae`](https://github.com/apache/spark/commit/d9f9aaec23e8f94bfa357264dda7376f6c615333).





[GitHub] [spark] SparkQA commented on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


SparkQA commented on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812330387


   **[Test build #136836 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136836/testReport)** for PR 32015 at commit [`d9f9aae`](https://github.com/apache/spark/commit/d9f9aaec23e8f94bfa357264dda7376f6c615333).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #32033: [SPARK-34939][CORE] Throw fetch failure exception when unable to deserialize map statuses

2021-04-01 Thread GitBox


SparkQA commented on pull request #32033:
URL: https://github.com/apache/spark/pull/32033#issuecomment-812329902


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41417/
   





[GitHub] [spark] WangGuangxin commented on a change in pull request #31967: [SPARK-34819][SQL] MapType supports orderable semantics

2021-04-01 Thread GitBox


WangGuangxin commented on a change in pull request #31967:
URL: https://github.com/apache/spark/pull/31967#discussion_r606078034



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeMapType.scala
##
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import scala.math.Ordering
+
+import org.apache.spark.sql.catalyst.expressions.{And, EqualTo, ExpectsInputTypes, Expression, UnaryExpression}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext
+import org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.{getValue, javaType}
+import org.apache.spark.sql.catalyst.expressions.codegen.ExprCode
+import org.apache.spark.sql.catalyst.planning.ExtractEquiJoinKeys
+import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Window}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.util.{ArrayBasedMapBuilder, MapData, TypeUtils}
+import org.apache.spark.sql.types.{AbstractDataType, DataType, MapType}
+
+/**
+ * When comparing two maps, we have to make sure two maps have the same key value pairs but
+ * with different key ordering are equal.
+ * For example, Map('a' -> 1, 'b' -> 2) equals to Map('b' -> 2, 'a' -> 1).
+ *
+ * We have to specially handle this in grouping/join/window because Spark SQL turns
+ * grouping/join/window partition keys into binary `UnsafeRow` and compare the
+ * binary data directly instead of using MapType's ordering. So in these cases, we have
+ * to insert an expression to sort map entries by key.
+ *
+ * Note that, this rule must be executed at the end of optimizer, because the optimizer may create
+ * new joins(the subquery rewrite) and new join conditions(the join reorder).
+ */
+object NormalizeMapType extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case w: Window if w.partitionSpec.exists(p => needNormalize(p)) =>
+  w.copy(partitionSpec = w.partitionSpec.map(normalize))
+
+case j @ ExtractEquiJoinKeys(_, leftKeys, rightKeys, condition, _, _, _)
+  // The analyzer guarantees left and right joins keys are of the same data type.
+  if leftKeys.exists(k => needNormalize(k)) =>
+  val newLeftJoinKeys = leftKeys.map(normalize)
+  val newRightJoinKeys = rightKeys.map(normalize)
+  val newConditions = newLeftJoinKeys.zip(newRightJoinKeys).map {
+case (l, r) => EqualTo(l, r)
+  } ++ condition
+  j.copy(condition = Some(newConditions.reduce(And)))
+  }
+
+  private def needNormalize(expr: Expression): Boolean = expr match {
+case SortMapKey(_) => false
+case e if e.dataType.isInstanceOf[MapType] => true
+case _ => false
+  }
+
+  private[sql] def normalize(expr: Expression): Expression = expr match {
+case _ if !needNormalize(expr) => expr
+case e if e.dataType.isInstanceOf[MapType] =>
+  SortMapKey(e)
+  }
+}
+
+case class SortMapKey(child: Expression) extends UnaryExpression with ExpectsInputTypes {
+  private lazy val MapType(keyType, valueType, valueContainsNull) = dataType.asInstanceOf[MapType]
+  private lazy val keyOrdering: Ordering[Any] = TypeUtils.getInterpretedOrdering(keyType)
+  private lazy val mapBuilder = new ArrayBasedMapBuilder(keyType, valueType)
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(MapType)
+
+  override def dataType: DataType = child.dataType
+
+  override def nullSafeEval(input: Any): Any = {
+val childMap = input.asInstanceOf[MapData]
+val keys = childMap.keyArray()

Review comment:
   It seems that I missed this case. I'll fix it.
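
The normalization described in the quoted Scaladoc can be illustrated outside Spark. The following is a minimal Python sketch of the idea, not the PR's implementation: `encode_entries` and `sort_map_key` are hypothetical helpers, and the byte layout is an assumption that only loosely stands in for a binary `UnsafeRow`.

```python
import struct


def encode_entries(entries):
    """Serialize (key, value) pairs in the order given (assumed layout, not UnsafeRow)."""
    out = b""
    for key, value in entries:
        key_bytes = key.encode("utf-8")
        # Length-prefixed key followed by a 64-bit signed value.
        out += struct.pack("<I", len(key_bytes)) + key_bytes + struct.pack("<q", value)
    return out


def sort_map_key(entries):
    """Order the entries by key, mirroring what a sort-by-key expression would do."""
    return sorted(entries, key=lambda kv: kv[0])


# Two maps with the same key/value pairs but different entry order.
a = [("a", 1), ("b", 2)]
b = [("b", 2), ("a", 1)]

# Comparing the raw bytes treats the maps as different.
raw_equal = encode_entries(a) == encode_entries(b)

# Sorting entries by key first makes the byte-wise comparison agree.
normalized_equal = encode_entries(sort_map_key(a)) == encode_entries(sort_map_key(b))
```

In this sketch the byte-wise comparison only matches after the entries are sorted, which is the reason the rule wraps map-typed grouping, join, and window partition keys before they are compared as binary data.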







[GitHub] [spark] MaxGekk commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox


MaxGekk commented on pull request #32018:
URL: https://github.com/apache/spark/pull/32018#issuecomment-812325437


   jenkins, retest this, please





[GitHub] [spark] MaxGekk commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox


MaxGekk commented on pull request #32018:
URL: https://github.com/apache/spark/pull/32018#issuecomment-812325204


   GitHub Actions (GA) is failing on the Avro tests, for instance, and the Jenkins
build failed on the latest commit. @AngersZh, to continue with the fix, let's
re-trigger the tests. Also, @cloud-fan, could you look at this PR, since you
reviewed the previous changes related to null partition values?





[GitHub] [spark] attilapiros commented on pull request #31871: [SPARK-34779][CORE] ExecutorMetricsPoller should keep stage entry in stageTCMP until a heartbeat occurs

2021-04-01 Thread GitBox


attilapiros commented on pull request #31871:
URL: https://github.com/apache/spark/pull/31871#issuecomment-812323640


   Merged to master





[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


sadhen commented on a change in pull request #32026:
URL: https://github.com/apache/spark/pull/32026#discussion_r606075743



##
File path: python/pyspark/sql/tests/test_arrow.py
##
@@ -196,6 +197,33 @@ def test_pandas_round_trip(self):
 pdf_arrow = df.toPandas()
 assert_frame_equal(pdf_arrow, pdf)
 
+def test_udt_roundtrip(self):
+pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1.0, 1.0), ExamplePoint(2.0, 2.0)])})
+schema = StructType([StructField('point', ExamplePointUDT(), False)])
+with self.sql_conf({"spark.sql.execution.arrow.pyspark.fallback.enabled": True}):
+df = self.spark.createDataFrame(pdf, schema)
+pdf_arrow = df.toPandas()
+assert_frame_equal(pdf_arrow, pdf)
+with self.sql_conf({"spark.sql.execution.arrow.pyspark.fallback.enabled": False}):
+df = self.spark.createDataFrame(pdf, schema)
+pdf_arrow = df.toPandas()
+assert_frame_equal(pdf_arrow, pdf)
+
+def test_array_udt_roundtrip(self):
+pdf = pd.DataFrame({'points': pd.Series([
+[ExamplePoint(1.0, 1.0), ExamplePoint(1.0, 2.0), ExamplePoint(1.0, 3.0)],

Review comment:
   For a primitive data type, wrapping it in a UDT is not good practice. As
a result, I do not think we should spend much time on supporting a UDT that is
actually a primitive data type. This part can be postponed.







[GitHub] [spark] attilapiros closed pull request #31871: [SPARK-34779][CORE] ExecutorMetricsPoller should keep stage entry in stageTCMP until a heartbeat occurs

2021-04-01 Thread GitBox


attilapiros closed pull request #31871:
URL: https://github.com/apache/spark/pull/31871


   





[GitHub] [spark] SparkQA commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run.

2021-04-01 Thread GitBox


SparkQA commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812315631


   **[Test build #136840 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136840/testReport)** for PR 32032 at commit [`12fdbe9`](https://github.com/apache/spark/commit/12fdbe9aea3775bd57b8fe04ecf9a944eadc7c8b).





[GitHub] [spark] SparkQA commented on pull request #32033: [SPARK-34939][CORE] Throw fetch failure exception when unable to deserialize map statuses

2021-04-01 Thread GitBox


SparkQA commented on pull request #32033:
URL: https://github.com/apache/spark/pull/32033#issuecomment-812315600


   **[Test build #136839 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136839/testReport)** for PR 32033 at commit [`12e93fa`](https://github.com/apache/spark/commit/12e93fa035d6126927ce54403c9c9983ce90968f).





[GitHub] [spark] SparkQA commented on pull request #32034: [SPARK-34940][SQL][TEST] Fix test of BasicWriteTaskStatsTrackerSuite

2021-04-01 Thread GitBox


SparkQA commented on pull request #32034:
URL: https://github.com/apache/spark/pull/32034#issuecomment-812315579


   **[Test build #136838 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136838/testReport)** for PR 32034 at commit [`f1037c7`](https://github.com/apache/spark/commit/f1037c7efd471ed438871b7c47057fcea73f8592).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #30145:
URL: https://github.com/apache/spark/pull/30145#issuecomment-812315218


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136829/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #31968: [SPARK-34873][SQL] Avoid wrapped in withNewExecutionId twice when run SQL with side effects

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #31968:
URL: https://github.com/apache/spark/pull/31968#issuecomment-812315217


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41415/
   





[GitHub] [spark] c21 commented on pull request #32034: [SPARK-34940][SQL][TEST] Fix test of BasicWriteTaskStatsTrackerSuite

2021-04-01 Thread GitBox


c21 commented on pull request #32034:
URL: https://github.com/apache/spark/pull/32034#issuecomment-812315290


   @cloud-fan, could you take a look when you have time? Thanks.





[GitHub] [spark] AmplabJenkins commented on pull request #31968: [SPARK-34873][SQL] Avoid wrapped in withNewExecutionId twice when run SQL with side effects

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #31968:
URL: https://github.com/apache/spark/pull/31968#issuecomment-812315217


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41415/
   





[GitHub] [spark] AmplabJenkins commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #30145:
URL: https://github.com/apache/spark/pull/30145#issuecomment-812315218


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136829/
   





[GitHub] [spark] c21 opened a new pull request #32034: [SPARK-34940][SQL][TEST] Fix test of BasicWriteTaskStatsTrackerSuite

2021-04-01 Thread GitBox


c21 opened a new pull request #32034:
URL: https://github.com/apache/spark/pull/32034


   
   
   ### What changes were proposed in this pull request?
   
   This fixes a minor typo in the unit test of `BasicWriteTaskStatsTrackerSuite`
   (https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/BasicWriteTaskStatsTrackerSuite.scala#L152),
   where the file should get a new name, e.g. `f-3-3`, because the unit test
   expects 3 files in the statistics
   (https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/BasicWriteTaskStatsTrackerSuite.scala#L160).
   
   ### Why are the changes needed?
   
   Fix minor bug.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Changed unit test `"Three files, last one empty"` itself.
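
To illustrate why the file name matters, here is a minimal Python sketch under stated assumptions. `FileStatsTracker` is a hypothetical stand-in that keys its statistics by path; it is not Spark's `BasicWriteTaskStatsTracker`, and the names `f-3-1` and `f-3-2` are assumed for illustration (only `f-3-3` is named in this description).

```python
class FileStatsTracker:
    """Hypothetical tracker that keys per-file statistics by path."""

    def __init__(self):
        self._bytes_by_path = {}

    def new_file(self, path, num_bytes):
        # Registering an already-known path overwrites its entry instead of adding one.
        self._bytes_by_path[path] = num_bytes

    @property
    def num_files(self):
        return len(self._bytes_by_path)


def count_files(paths):
    """Register every path with a fresh tracker and return the file count."""
    tracker = FileStatsTracker()
    for path in paths:
        tracker.new_file(path, 0)
    return tracker.num_files


# The third file reuses the second file's name (the kind of typo described above).
with_typo = count_files(["f-3-1", "f-3-2", "f-3-2"])

# Each file gets its own name, so three files are counted.
fixed = count_files(["f-3-1", "f-3-2", "f-3-3"])
```

Under that assumption, a repeated name can make a test that expects 3 files depend on how the tracker handles duplicates, which is why giving the last file a distinct name keeps the test unambiguous.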





[GitHub] [spark] SparkQA removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox


SparkQA removed a comment on pull request #30145:
URL: https://github.com/apache/spark/pull/30145#issuecomment-812236536


   **[Test build #136829 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136829/testReport)** for PR 30145 at commit [`ff7971b`](https://github.com/apache/spark/commit/ff7971b2817b46c45b0584dfdfdda999bfd2b96d).





[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox


SparkQA commented on pull request #30145:
URL: https://github.com/apache/spark/pull/30145#issuecomment-812314277


   **[Test build #136829 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136829/testReport)** for PR 30145 at commit [`ff7971b`](https://github.com/apache/spark/commit/ff7971b2817b46c45b0584dfdfdda999bfd2b96d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #31968: [SPARK-34873][SQL] Avoid wrapped in withNewExecutionId twice when run SQL with side effects

2021-04-01 Thread GitBox


SparkQA commented on pull request #31968:
URL: https://github.com/apache/spark/pull/31968#issuecomment-812313719









[GitHub] [spark] viirya commented on a change in pull request #32033: [SPARK-34939][CORE] Throw fetch failure exception when unable to deserialize map statuses

2021-04-01 Thread GitBox


viirya commented on a change in pull request #32033:
URL: https://github.com/apache/spark/pull/32033#discussion_r606068095



##
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##
@@ -100,7 +100,7 @@ private class ShuffleStatus(numPartitions: Int) extends Logging {
    * broadcast variable in order to keep it from being garbage collected and to allow for it to be
    * explicitly destroyed later on when the ShuffleMapStage is garbage-collected.
    */
-  private[this] var cachedSerializedBroadcast: Broadcast[Array[Byte]] = _
+  private[spark] var cachedSerializedBroadcast: Broadcast[Array[Byte]] = _

Review comment:
   Expose this for test.







[GitHub] [spark] viirya opened a new pull request #32033: [SPARK-34939][CORE] Throw fetch failure exception when unable to deserialize map statuses

2021-04-01 Thread GitBox


viirya opened a new pull request #32033:
URL: https://github.com/apache/spark/pull/32033


   
   
   ### What changes were proposed in this pull request?
   
   
   This patch catches the `IOException` that may be thrown while deserializing
   map statuses when they cannot be deserialized (e.g., because the broadcasted
   value has been destroyed). Once the `IOException` is caught, a
   `MetadataFetchFailedException` is thrown so that Spark can handle it.
   
   ### Why are the changes needed?
   
   
   One customer encountered an application error. According to the log, it was
   caused by accessing a non-existent broadcasted value. The broadcasted value
   is the map statuses, and there is a race condition.
   
   After the map statuses are broadcasted, the executors obtain the serialized
   broadcasted map statuses. If any fetch failure happens afterwards, the Spark
   scheduler invalidates the cached map statuses and destroys the broadcasted
   value of the map statuses. Any executor that then tries to deserialize the
   serialized broadcasted map statuses and access the broadcasted value will get
   an `IOException`. Currently we don't catch it in `MapOutputTrackerWorker`, so
   the exception fails the application.
   
   Normally we should throw a fetch failure exception in such a case, which the
   Spark scheduler will handle.
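
The race described above can be modelled in a few lines. This is only a rough, language-agnostic sketch written in Python, not the actual Scala change in the patch: `FakeBroadcast`, `MetadataFetchFailedError` and `deserialize_map_statuses` are hypothetical stand-ins for Spark's broadcast variable, `MetadataFetchFailedException` and the worker-side deserialization step, and `pickle` is assumed here merely as a placeholder serializer.

```python
import pickle


class MetadataFetchFailedError(Exception):
    """Hypothetical stand-in for Spark's MetadataFetchFailedException."""


class FakeBroadcast:
    """Hypothetical stand-in for a broadcast holding serialized map statuses."""

    def __init__(self, payload: bytes):
        self._payload = payload
        self._destroyed = False

    def destroy(self) -> None:
        # Models the scheduler destroying the broadcast after a fetch failure.
        self._destroyed = True

    def value(self) -> bytes:
        if self._destroyed:
            # Models the IOException raised when the value no longer exists.
            raise IOError("broadcast value was destroyed")
        return self._payload


def deserialize_map_statuses(broadcast: FakeBroadcast, shuffle_id: int):
    """Read map statuses, turning an I/O failure into a fetch failure."""
    try:
        return pickle.loads(broadcast.value())
    except IOError as e:
        # Re-raise as a fetch failure so the caller can retry the stage
        # instead of failing the whole application.
        raise MetadataFetchFailedError(
            f"Unable to deserialize map statuses for shuffle {shuffle_id}"
        ) from e
```

In this model, calling `deserialize_map_statuses` before `destroy()` returns the statuses, while calling it afterwards raises `MetadataFetchFailedError` with the original `IOError` attached as its cause, which mirrors the behaviour the description asks for.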
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   No
   
   ### How was this patch tested?
   
   
   Unit test. Wait for customer verification too.





[GitHub] [spark] SparkQA commented on pull request #31968: [SPARK-34873][SQL] Avoid wrapped in withNewExecutionId twice when run SQL with side effects

2021-04-01 Thread GitBox


SparkQA commented on pull request #31968:
URL: https://github.com/apache/spark/pull/31968#issuecomment-812301835


   **[Test build #136837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136837/testReport)** for PR 31968 at commit [`37d64d5`](https://github.com/apache/spark/commit/37d64d53a8a59de3617b1a8114cd28d25f30c900).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812301609


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41414/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis r

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812301610


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136833/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-812301608


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136832/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32026:
URL: https://github.com/apache/spark/pull/32026#issuecomment-812301612


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136834/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32026:
URL: https://github.com/apache/spark/pull/32026#issuecomment-812301612


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136834/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812301610


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136833/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812301609


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41414/
   





[GitHub] [spark] AmplabJenkins commented on pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-812301608


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136832/
   





[GitHub] [spark] SparkQA removed a comment on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


SparkQA removed a comment on pull request #32026:
URL: https://github.com/apache/spark/pull/32026#issuecomment-812287668


   **[Test build #136834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136834/testReport)** for PR 32026 at commit [`92f3829`](https://github.com/apache/spark/commit/92f382957c038d34b4344261e86fa1bc6956369b).





[GitHub] [spark] SparkQA commented on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


SparkQA commented on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812299706









[GitHub] [spark] SparkQA commented on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


SparkQA commented on pull request #32026:
URL: https://github.com/apache/spark/pull/32026#issuecomment-812299694


   **[Test build #136834 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136834/testReport)** for PR 32026 at commit [`92f3829`](https://github.com/apache/spark/commit/92f382957c038d34b4344261e86fa1bc6956369b).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules r

2021-04-01 Thread GitBox


SparkQA removed a comment on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812274055


   **[Test build #136833 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136833/testReport)** for PR 32032 at commit [`43f70b2`](https://github.com/apache/spark/commit/43f70b2319790e6746c53c6ab5255971468cc2b7).





[GitHub] [spark] SparkQA commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run.

2021-04-01 Thread GitBox


SparkQA commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812298245


   **[Test build #136833 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136833/testReport)** for PR 32032 at commit [`43f70b2`](https://github.com/apache/spark/commit/43f70b2319790e6746c53c6ab5255971468cc2b7).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-01 Thread GitBox


SparkQA removed a comment on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-812256558


   **[Test build #136832 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136832/testReport)** for PR 30480 at commit [`a10eba1`](https://github.com/apache/spark/commit/a10eba1a558c81335fe69928904a1a2f4b4f85d9).





[GitHub] [spark] SparkQA commented on pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-01 Thread GitBox


SparkQA commented on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-812291998


   **[Test build #136832 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136832/testReport)** for PR 30480 at commit [`a10eba1`](https://github.com/apache/spark/commit/a10eba1a558c81335fe69928904a1a2f4b4f85d9).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis r

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812290641


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41413/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812290641


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41413/
   





[GitHub] [spark] SparkQA commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run.

2021-04-01 Thread GitBox


SparkQA commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812290627









[GitHub] [spark] SparkQA commented on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


SparkQA commented on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812288459


   **[Test build #136836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136836/testReport)** for PR 32015 at commit [`d9f9aae`](https://github.com/apache/spark/commit/d9f9aaec23e8f94bfa357264dda7376f6c615333).





[GitHub] [spark] HyukjinKwon commented on a change in pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


HyukjinKwon commented on a change in pull request #32015:
URL: https://github.com/apache/spark/pull/32015#discussion_r606045845



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/ExtractBenchmark.scala
##
@@ -92,8 +92,9 @@ object ExtractBenchmark extends SqlBasedBenchmark {
 val intervalFields = Seq("YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND")
 val settings = Map(
   "timestamp" -> datetimeFields,
-  "date" -> datetimeFields,
-  "interval" -> intervalFields)
+  "date" -> datetimeFields)
+  // TODO(SPARK-34938): Recover the benchmark of interval case

Review comment:
   cc @MaxGekk FYI







[GitHub] [spark] AmplabJenkins removed a comment on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32026:
URL: https://github.com/apache/spark/pull/32026#issuecomment-812074839


   Can one of the admins verify this patch?





[GitHub] [spark] SparkQA commented on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


SparkQA commented on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-812287685


   **[Test build #136835 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136835/testReport)** for PR 32015 at commit [`dc7b70d`](https://github.com/apache/spark/commit/dc7b70daad9bd8f99952023110578b40a2233732).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-812287428


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41410/
   





[GitHub] [spark] SparkQA commented on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


SparkQA commented on pull request #32026:
URL: https://github.com/apache/spark/pull/32026#issuecomment-812287668


   **[Test build #136834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136834/testReport)** for PR 32026 at commit [`92f3829`](https://github.com/apache/spark/commit/92f382957c038d34b4344261e86fa1bc6956369b).





[GitHub] [spark] AmplabJenkins commented on pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-812287428


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41410/
   





[GitHub] [spark] HyukjinKwon commented on a change in pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


HyukjinKwon commented on a change in pull request #32015:
URL: https://github.com/apache/spark/pull/32015#discussion_r606045223



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/ExtractBenchmark.scala
##
@@ -92,8 +92,9 @@ object ExtractBenchmark extends SqlBasedBenchmark {
 val intervalFields = Seq("YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND")
 val settings = Map(
   "timestamp" -> datetimeFields,
-  "date" -> datetimeFields,
-  "interval" -> intervalFields)
+  "date" -> datetimeFields)
+  // TODO(SPARK-34938): Recover the benchmark of internal case

Review comment:
   internal -> interval ..







[GitHub] [spark] HyukjinKwon commented on a change in pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


HyukjinKwon commented on a change in pull request #32015:
URL: https://github.com/apache/spark/pull/32015#discussion_r606044927



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/ExtractBenchmark.scala
##
@@ -92,8 +92,9 @@ object ExtractBenchmark extends SqlBasedBenchmark {
 val intervalFields = Seq("YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND")
 val settings = Map(
   "timestamp" -> datetimeFields,
-  "date" -> datetimeFields,
-  "interval" -> intervalFields)
+  "date" -> datetimeFields)
+  // TODO(SPARK-34938): Recover the benchmark of internal case

Review comment:
   cc @MaxGekk FYI







[GitHub] [spark] HyukjinKwon edited a comment on pull request #32015: [SPARK-34821][INFRA] Set up a workflow for developers to run benchmark in their fork

2021-04-01 Thread GitBox


HyukjinKwon edited a comment on pull request #32015:
URL: https://github.com/apache/spark/pull/32015#issuecomment-811064987


   Note that I tested a subset of the benchmarks and verified that the workflow 
works; I am now waiting for the final results of running all benchmarks:
   - [Run benchmarks: * (JDK 
11)](https://github.com/HyukjinKwon/spark/actions/runs/710425382)
   - [Run benchmarks: * (JDK 
8)](https://github.com/HyukjinKwon/spark/actions/runs/710425286)
   





[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


sadhen commented on a change in pull request #32026:
URL: https://github.com/apache/spark/pull/32026#discussion_r606039240



##
File path: python/pyspark/sql/tests/test_arrow.py
##
@@ -196,6 +197,33 @@ def test_pandas_round_trip(self):
 pdf_arrow = df.toPandas()
 assert_frame_equal(pdf_arrow, pdf)
 
+def test_udt_roundtrip(self):
+pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1.0, 1.0), 
ExamplePoint(2.0, 2.0)])})
+schema = StructType([StructField('point', ExamplePointUDT(), False)])
+with 
self.sql_conf({"spark.sql.execution.arrow.pyspark.fallback.enabled": True}):
+df = self.spark.createDataFrame(pdf, schema)
+pdf_arrow = df.toPandas()
+assert_frame_equal(pdf_arrow, pdf)
+with 
self.sql_conf({"spark.sql.execution.arrow.pyspark.fallback.enabled": False}):
+df = self.spark.createDataFrame(pdf, schema)
+pdf_arrow = df.toPandas()
+assert_frame_equal(pdf_arrow, pdf)
+
+def test_array_udt_roundtrip(self):
+pdf = pd.DataFrame({'points': pd.Series([
+[ExamplePoint(1.0, 1.0), ExamplePoint(1.0, 2.0), ExamplePoint(1.0, 
3.0)],

Review comment:
   I thought UDTs were only for complex data types. For a UDT that is actually 
backed by a primitive type, let me add unit tests.







[GitHub] [spark] SparkQA commented on pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-01 Thread GitBox


SparkQA commented on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-812278108









[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


sadhen commented on a change in pull request #32026:
URL: https://github.com/apache/spark/pull/32026#discussion_r606036073



##
File path: python/pyspark/sql/tests/test_arrow.py
##
@@ -196,6 +197,33 @@ def test_pandas_round_trip(self):
 pdf_arrow = df.toPandas()
 assert_frame_equal(pdf_arrow, pdf)
 
+def test_udt_roundtrip(self):
+pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1.0, 1.0), 
ExamplePoint(2.0, 2.0)])})
+schema = StructType([StructField('point', ExamplePointUDT(), False)])
+with 
self.sql_conf({"spark.sql.execution.arrow.pyspark.fallback.enabled": True}):
+df = self.spark.createDataFrame(pdf, schema)
+pdf_arrow = df.toPandas()
+assert_frame_equal(pdf_arrow, pdf)
+with 
self.sql_conf({"spark.sql.execution.arrow.pyspark.fallback.enabled": False}):
+df = self.spark.createDataFrame(pdf, schema)
+pdf_arrow = df.toPandas()
+assert_frame_equal(pdf_arrow, pdf)
+
+def test_array_udt_roundtrip(self):
+pdf = pd.DataFrame({'points': pd.Series([
+[ExamplePoint(1.0, 1.0), ExamplePoint(1.0, 2.0), ExamplePoint(1.0, 
3.0)],

Review comment:
   See `_deserialize_pandas_with_udt`; support for StructType is postponed 
to later PRs.







[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


sadhen commented on a change in pull request #32026:
URL: https://github.com/apache/spark/pull/32026#discussion_r606035670



##
File path: python/pyspark/sql/types.py
##
@@ -764,6 +764,21 @@ def __eq__(self, other):
 return type(self) == type(other)
 
 
+def _is_datatype_with_udt(dt):

Review comment:
   fixed







[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


sadhen commented on a change in pull request #32026:
URL: https://github.com/apache/spark/pull/32026#discussion_r606035644



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala
##
@@ -89,9 +89,57 @@ case class ArrowEvalPythonExec(udfs: Seq[PythonUDF], 
resultAttrs: Seq[Attribute]
 
 columnarBatchIter.flatMap { batch =>
   val actualDataTypes = (0 until batch.numCols()).map(i => 
batch.column(i).dataType())
-  assert(outputTypes == actualDataTypes, "Invalid schema from pandas_udf: 
" +
-s"expected ${outputTypes.mkString(", ")}, got 
${actualDataTypes.mkString(", ")}")
+  assert(plainSchemaSeq(outputTypes) == actualDataTypes,
+"Incompatible schema from pandas_udf: " +
+  s"expected ${outputTypes.mkString(", ")}, got 
${actualDataTypes.mkString(", ")}")
   batch.rowIterator.asScala
 }
   }
+
+  private def plainSchemaSeq(schema: Seq[DataType]): Seq[DataType] = {
+schema.map(v => ArrowEvalPythonExec.plainSchema(v)).toList
+  }
+
+}
+
+private[sql] object ArrowEvalPythonExec {
+  /**
+   * Erase User-Defined Types and returns the plain Spark StructType instead.
+   *
+   * UserDefinedType:
+   * - will be erased as dt.sqlType

Review comment:
   Fixed.
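
The doc comment in the hunk above describes erasing a user-defined type to its underlying `sqlType`. A minimal Python sketch of that idea follows, using toy stand-in classes rather than Spark's real `DataType` hierarchy. The names `Primitive`, `Array`, `Struct`, `UserDefined`, and `plain_schema` are hypothetical, and recursing into array and struct children is an assumption, since the quoted comment is cut off after the first bullet.

```python
from dataclasses import dataclass
from typing import Tuple


# Toy stand-ins for a type hierarchy; these are NOT Spark classes.
@dataclass(frozen=True)
class Primitive:
    name: str


@dataclass(frozen=True)
class Array:
    element: object


@dataclass(frozen=True)
class Struct:
    fields: Tuple[Tuple[str, object], ...]


@dataclass(frozen=True)
class UserDefined:
    name: str
    sql_type: object  # the plain type this UDT is stored as


def plain_schema(dt):
    """Return dt with every UserDefined node replaced by its sql_type."""
    if isinstance(dt, UserDefined):
        # The underlying type may itself contain UDTs, so keep erasing.
        return plain_schema(dt.sql_type)
    if isinstance(dt, Array):
        return Array(plain_schema(dt.element))
    if isinstance(dt, Struct):
        return Struct(tuple((name, plain_schema(t)) for name, t in dt.fields))
    # Primitive types carry no nested types and are returned unchanged.
    return dt
```

Comparing the erased expected types with the actual ones, as the modified assertion does, then reduces to plain equality on the erased structures.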







[GitHub] [spark] AmplabJenkins removed a comment on pull request #32031: [WIP] Initial work of Remote Shuffle Service to support dynamic allocation on Kubernetes

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-812275412


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41412/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service to support dynamic allocation on Kubernetes

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-812275412


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41412/
   





[GitHub] [spark] SparkQA commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service to support dynamic allocation on Kubernetes

2021-04-01 Thread GitBox


SparkQA commented on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-812275259


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41412/
   





[GitHub] [spark] SparkQA commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service to support dynamic allocation on Kubernetes

2021-04-01 Thread GitBox


SparkQA commented on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-812274710


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41412/
   





[GitHub] [spark] imback82 commented on a change in pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis ru

2021-04-01 Thread GitBox


imback82 commented on a change in pull request #32032:
URL: https://github.com/apache/spark/pull/32032#discussion_r606034352



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
##
@@ -830,7 +830,7 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] 
extends Product {
 }
 
 trait LeafLike[T <: TreeNode[T]] { self: TreeNode[T] =>
-  override final def children: Seq[T] = Nil
+  override def children: Seq[T] = Nil

Review comment:
   @cloud-fan I am removing this `final` temporarily. If the approach of 
this PR is OK, I will add this back and refactor.







[GitHub] [spark] AmplabJenkins removed a comment on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis r

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812261585


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41411/
   





[GitHub] [spark] SparkQA commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run.

2021-04-01 Thread GitBox


SparkQA commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812274055


   **[Test build #136833 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136833/testReport)**
 for PR 32032 at commit 
[`43f70b2`](https://github.com/apache/spark/commit/43f70b2319790e6746c53c6ab5255971468cc2b7).





[GitHub] [spark] LuciferYang commented on pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code

2021-04-01 Thread GitBox


LuciferYang commented on pull request #31776:
URL: https://github.com/apache/spark/pull/31776#issuecomment-812272546


   Gentle ping, @wangyum @HyukjinKwon @dongjoon-hyun @maropu 





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32030: [WIP] Improve map children

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32030:
URL: https://github.com/apache/spark/pull/32030#issuecomment-812271782









[GitHub] [spark] AmplabJenkins removed a comment on pull request #32031: [WIP] Initial work of Remote Shuffle Service to support dynamic allocation on Kubernetes

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-812271785


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41407/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32030: [WIP] Improve map children

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32030:
URL: https://github.com/apache/spark/pull/32030#issuecomment-812271783









[GitHub] [spark] AmplabJenkins commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service to support dynamic allocation on Kubernetes

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-812271785


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41407/
   





[GitHub] [spark] SparkQA removed a comment on pull request #32030: [WIP] Improve map children

2021-04-01 Thread GitBox


SparkQA removed a comment on pull request #32030:
URL: https://github.com/apache/spark/pull/32030#issuecomment-812236261


   **[Test build #136828 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136828/testReport)**
 for PR 32030 at commit 
[`7045e7a`](https://github.com/apache/spark/commit/7045e7a8bb844e6a5d48fda0ab06926f67c9f4ca).





[GitHub] [spark] SparkQA commented on pull request #32030: [WIP] Improve map children

2021-04-01 Thread GitBox


SparkQA commented on pull request #32030:
URL: https://github.com/apache/spark/pull/32030#issuecomment-812270113


   **[Test build #136828 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136828/testReport)**
 for PR 32030 at commit 
[`7045e7a`](https://github.com/apache/spark/commit/7045e7a8bb844e6a5d48fda0ab06926f67c9f4ca).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service to support dynamic allocation on Kubernetes

2021-04-01 Thread GitBox


SparkQA commented on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-812266720









[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


sadhen commented on a change in pull request #32026:
URL: https://github.com/apache/spark/pull/32026#discussion_r606026938



##
File path: python/pyspark/sql/pandas/conversion.py
##
@@ -452,24 +457,27 @@ def _create_from_pandas_with_arrow(self, pdf, schema, 
timezone):
 struct.add(name, from_arrow_type(field.type), 
nullable=field.nullable)
 schema = struct
 
-# Determine arrow types to coerce data when creating batches
+# Determine data types to coerce data when creating batches
 if isinstance(schema, StructType):
-arrow_types = [to_arrow_type(f.dataType) for f in schema.fields]
+data_types = [f.dataType for f in schema.fields]
 elif isinstance(schema, DataType):
 raise ValueError("Single data type %s is not supported with Arrow" 
% str(schema))
 else:
 # Any timestamps must be coerced to be compatible with Spark
-arrow_types = [to_arrow_type(TimestampType())
-   if is_datetime64_dtype(t) or 
is_datetime64tz_dtype(t) else None
-   for t in pdf.dtypes]
+data_types = [to_arrow_type(TimestampType())
+  if is_datetime64_dtype(t) or 
is_datetime64tz_dtype(t) else None
+  for t in pdf.dtypes]
 
 # Slice the DataFrame to be batched
 step = -(-len(pdf) // self.sparkContext.defaultParallelism)  # round 
int up
 pdf_slices = (pdf.iloc[start:start + step] for start in range(0, 
len(pdf), step))
 
 # Create list of Arrow (columns, type) for serializer dump_stream
-arrow_data = [[(c, t) for (_, c), t in zip(pdf_slice.iteritems(), 
arrow_types)]
-  for pdf_slice in pdf_slices]
+# Type can be Spark SQL Data Type or Arrow Data Type
+arrow_data_with_t = [

Review comment:
   Well, I should use `adt` or `padt` for PyArrow Data Type and `pdt` for 
Pandas DataType.







[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


sadhen commented on a change in pull request #32026:
URL: https://github.com/apache/spark/pull/32026#discussion_r606026544



##
File path: python/pyspark/sql/pandas/conversion.py
##
@@ -20,9 +20,10 @@
 
 from pyspark.rdd import _load_from_socket
 from pyspark.sql.pandas.serializers import ArrowCollectSerializer
-from pyspark.sql.types import IntegralType
 from pyspark.sql.types import ByteType, ShortType, IntegerType, LongType, 
FloatType, \
-DoubleType, BooleanType, MapType, TimestampType, StructType, DataType
+DoubleType, BooleanType, MapType, TimestampType, StructType, DataType, \
+IntegralType, _is_datatype_with_udt
+from pyspark.sql.pandas.types import _deserialize_pandas_with_udt

Review comment:
   See `git grep _make_type_verifier`: there are other cases where a 
function whose name starts with `_` is used outside of where it is defined.







[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox


sadhen commented on a change in pull request #32026:
URL: https://github.com/apache/spark/pull/32026#discussion_r606025117



##
File path: python/pyspark/sql/pandas/conversion.py
##
@@ -452,24 +457,27 @@ def _create_from_pandas_with_arrow(self, pdf, schema, 
timezone):
 struct.add(name, from_arrow_type(field.type), 
nullable=field.nullable)
 schema = struct
 
-# Determine arrow types to coerce data when creating batches
+# Determine data types to coerce data when creating batches
 if isinstance(schema, StructType):
-arrow_types = [to_arrow_type(f.dataType) for f in schema.fields]
+data_types = [f.dataType for f in schema.fields]
 elif isinstance(schema, DataType):
 raise ValueError("Single data type %s is not supported with Arrow" 
% str(schema))
 else:
 # Any timestamps must be coerced to be compatible with Spark
-arrow_types = [to_arrow_type(TimestampType())
-   if is_datetime64_dtype(t) or 
is_datetime64tz_dtype(t) else None
-   for t in pdf.dtypes]
+data_types = [to_arrow_type(TimestampType())
+  if is_datetime64_dtype(t) or 
is_datetime64tz_dtype(t) else None
+  for t in pdf.dtypes]
 
 # Slice the DataFrame to be batched
 step = -(-len(pdf) // self.sparkContext.defaultParallelism)  # round 
int up
 pdf_slices = (pdf.iloc[start:start + step] for start in range(0, 
len(pdf), step))
 
 # Create list of Arrow (columns, type) for serializer dump_stream
-arrow_data = [[(c, t) for (_, c), t in zip(pdf_slice.iteritems(), 
arrow_types)]
-  for pdf_slice in pdf_slices]
+# Type can be Spark SQL Data Type or Arrow Data Type
+arrow_data_with_t = [

Review comment:
   Yes. It is renamed to indicate that it is Arrow data paired with a data type 
(either a Spark SQL DataType or an Arrow DataType).
   
   In `serializers.py`, `dt` is for Spark SQL DataType, `pdt` is for pyarrow 
DataType.
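
For reference, the slicing step in the hunk above rounds up with a negated floor division (`-(-len(pdf) // defaultParallelism)`). A minimal sketch of the same arithmetic, using a plain list in place of the pandas DataFrame; `slice_rows` and `parallelism` are hypothetical names, and the `max(step, 1)` guard for empty input is an addition of this sketch, not something taken from the PR:

```python
def slice_rows(rows, parallelism):
    """Split rows into at most `parallelism` contiguous slices."""
    # -(-a // b) is ceiling division: negate, floor-divide, negate again.
    step = -(-len(rows) // parallelism)
    # range() rejects a zero step, so fall back to 1 when rows is empty.
    return [rows[start:start + step]
            for start in range(0, len(rows), max(step, 1))]
```

With ten rows and a parallelism of four, the step is three, so the last slice holds the single leftover row.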







[GitHub] [spark] SparkQA commented on pull request #32030: [WIP] Improve map children

2021-04-01 Thread GitBox


SparkQA commented on pull request #32030:
URL: https://github.com/apache/spark/pull/32030#issuecomment-812262654


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41408/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812261585


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41411/
   





[GitHub] [spark] SparkQA commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run.

2021-04-01 Thread GitBox


SparkQA commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812261574


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41411/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis r

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812259237


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136830/
   





[GitHub] [spark] SparkQA removed a comment on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules r

2021-04-01 Thread GitBox


SparkQA removed a comment on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812256299


   **[Test build #136830 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136830/testReport)**
 for PR 32032 at commit 
[`b98c15c`](https://github.com/apache/spark/commit/b98c15c3862d5e42fc62cd5d393ad6fbf861b143).





[GitHub] [spark] AmplabJenkins commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812259237


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136830/
   





[GitHub] [spark] SparkQA commented on pull request #32032: [SPARK-34701][SQL] Introduce TransformaAfterAnalysis rule that allows a logical plan to be transformed after all the analysis rules run.

2021-04-01 Thread GitBox


SparkQA commented on pull request #32032:
URL: https://github.com/apache/spark/pull/32032#issuecomment-812259220


   **[Test build #136830 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136830/testReport)** for PR 32032 at commit [`b98c15c`](https://github.com/apache/spark/commit/b98c15c3862d5e42fc62cd5d393ad6fbf861b143).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #30145:
URL: https://github.com/apache/spark/pull/30145#issuecomment-812258143


   
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41409/
   





[GitHub] [spark] AmplabJenkins commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #30145:
URL: https://github.com/apache/spark/pull/30145#issuecomment-812258143


   
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41409/
   





[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox


SparkQA commented on pull request #30145:
URL: https://github.com/apache/spark/pull/30145#issuecomment-812258134


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41409/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32031: [WIP] Initial work of Remote Shuffle Service to support dynamic allocation on Kubernetes

2021-04-01 Thread GitBox


AmplabJenkins removed a comment on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-812256859


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136831/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service to support dynamic allocation on Kubernetes

2021-04-01 Thread GitBox


AmplabJenkins commented on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-812256859


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136831/
   




