[GitHub] [spark] AmplabJenkins commented on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656499183







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on pull request #29059: [SPARK-32256][SQL][test-hadoop2.7] Force to initialize Hadoop VersionInfo in HiveExternalCatalog

2020-07-09 Thread GitBox


SparkQA commented on pull request #29059:
URL: https://github.com/apache/spark/pull/29059#issuecomment-656498937


   **[Test build #125558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125558/testReport)** for PR 29059 at commit [`8f94ed6`](https://github.com/apache/spark/commit/8f94ed688ac39a761e043192b07821b2ff15a48d).






[GitHub] [spark] zsxwing commented on pull request #29059: [SPARK-32256][SQL][test-hadoop2.7] Force to initialize Hadoop VersionInfo in HiveExternalCatalog

2020-07-09 Thread GitBox


zsxwing commented on pull request #29059:
URL: https://github.com/apache/spark/pull/29059#issuecomment-656498285


   Although the temp ivy directory adds 1 minute to the test on my laptop, I 
think it should be okay since SBT tests run in parallel. It would be pretty 
complicated to coordinate two tests in different JVMs, and that's not worth it 
for a one-line fix.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28898: [SPARK-32059][SQL] Allow nested schema pruning thru window/sort/filter plans

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28898:
URL: https://github.com/apache/spark/pull/28898#issuecomment-656497640


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125536/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28898: [SPARK-32059][SQL] Allow nested schema pruning thru window/sort/filter plans

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28898:
URL: https://github.com/apache/spark/pull/28898#issuecomment-656497633


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before reading DataFrame and before/after writing DataFrame over JDBC

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-656496927


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125535/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28898: [SPARK-32059][SQL] Allow nested schema pruning thru window/sort/filter plans

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #28898:
URL: https://github.com/apache/spark/pull/28898#issuecomment-656497633










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before reading DataFrame and before/after writing DataFrame over JDBC

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-656496921


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28953: [SPARK-32013][SQL] Support query execution before reading DataFrame and before/after writing DataFrame over JDBC

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-656496921










[GitHub] [spark] huaxingao commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS][WIP]Add missing keywords

2020-07-09 Thread GitBox


huaxingao commented on a change in pull request #29056:
URL: https://github.com/apache/spark/pull/29056#discussion_r452635205



##
File path: docs/sql-ref-syntax-qry-select-case.md
##
@@ -0,0 +1,115 @@
+---
+layout: global
+title: CASE Clause
+displayTitle: CASE Clause
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+`CASE` clause uses rule to return specific result based on the specified condition.
+
+### Syntax
+
+```sql
+CASE { WHEN boolean_expression THEN then_expression }{ WHEN boolean_expression THEN then_expression } [ , ... ] [ ELSE else_expression ] END
+
+CASE expression { WHEN boolean_expression THEN then_expression }{ WHEN boolean_expression THEN then_expression } [ , ... ] [ ELSE else_expression ] END
+```
+
+### Parameters
+
+* **WHEN**
+
+Specific a boolean condition ,under which to return the `THEN` result, `WHEN` must exist in `CASE` clause.
+
+* **THEN**
+
+Specific a result base the `WHEN` condition, `THEN` must exist in `CASE` clause.
+
+* **ELSE**
+
+Specific a default result for the `CASE` rules, it is optional, if user don't use else then the `CASE` will not have default result.
+
+* **END**
+
+Key words to finish a case clause, `END` must exist in `CASE` clause.
+
+* **boolean_expression**
+
+Specific specified condition, it should be boolean type.
+
+* **then_expression**
+
+Specific the then expression based on the `boolean_expression` condition, `then_expression` and `else_expression` should all be same type or coercible to a common type.
+
+* **else_expression**
+
+Specific the default expression, `then_expression` and `else_expression` should all be same type or coercible to a common type.
+
+### Examples
+
+```sql
+CREATE TABLE person (id INT, name STRING, age INT);
+INSERT INTO person VALUES
+(100, 'John', 30),
+(200, 'Mary', NULL),
+(300, 'Mike', 80),
+(400, 'Dan',  50),
+(500, 'Evan_w', 16);

Review comment:
   Seems you didn't include this row in the following SELECT results?
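
   For readers following the `CASE` description quoted in the diff, the evaluation order of a searched `CASE` can be modelled in plain Scala. This is only an illustrative sketch and not Spark code: `Person`, `label`, and the `'senior'`/`'adult'` labels are hypothetical, and `Option` stands in for SQL `NULL`. It assumes the usual SQL behaviour that the first true `WHEN` wins and that a `CASE` without `ELSE` yields `NULL` when nothing matches.

```scala
object CaseSketch {
  // Hypothetical row type mirroring the `person` table in the quoted example.
  final case class Person(id: Int, name: String, age: Option[Int])

  val people: Seq[Person] = Seq(
    Person(100, "John", Some(30)),
    Person(200, "Mary", None),
    Person(300, "Mike", Some(80)),
    Person(400, "Dan", Some(50)),
    Person(500, "Evan_w", Some(16)))

  // Models: CASE WHEN age > 60 THEN 'senior' WHEN age >= 18 THEN 'adult' END
  // Branches are tried in order. A comparison against NULL is not true, so a
  // NULL age matches no branch, and with no ELSE the result is NULL (None).
  def label(p: Person): Option[String] =
    if (p.age.exists(_ > 60)) Some("senior")
    else if (p.age.exists(_ >= 18)) Some("adult")
    else None
}
```

   Mapping `label` over `people` covers every inserted row, including id 500, which is the row the comment above says is missing from the documented results.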








[GitHub] [spark] SparkQA removed a comment on pull request #28898: [SPARK-32059][SQL] Allow nested schema pruning thru window/sort/filter plans

2020-07-09 Thread GitBox


SparkQA removed a comment on pull request #28898:
URL: https://github.com/apache/spark/pull/28898#issuecomment-656434653


   **[Test build #125536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125536/testReport)** for PR 28898 at commit [`a7e885a`](https://github.com/apache/spark/commit/a7e885a3c5f09f9ca623777bdabcd05e664f3774).






[GitHub] [spark] SparkQA commented on pull request #28898: [SPARK-32059][SQL] Allow nested schema pruning thru window/sort/filter plans

2020-07-09 Thread GitBox


SparkQA commented on pull request #28898:
URL: https://github.com/apache/spark/pull/28898#issuecomment-656496831


   **[Test build #125536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125536/testReport)** for PR 28898 at commit [`a7e885a`](https://github.com/apache/spark/commit/a7e885a3c5f09f9ca623777bdabcd05e664f3774).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before reading DataFrame and before/after writing DataFrame over JDBC

2020-07-09 Thread GitBox


SparkQA removed a comment on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-656434657


   **[Test build #125535 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125535/testReport)** for PR 28953 at commit [`9fb8383`](https://github.com/apache/spark/commit/9fb83833de0fcebe88572f3276ac31f2d3713657).






[GitHub] [spark] SparkQA commented on pull request #28953: [SPARK-32013][SQL] Support query execution before reading DataFrame and before/after writing DataFrame over JDBC

2020-07-09 Thread GitBox


SparkQA commented on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-656495958


   **[Test build #125535 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125535/testReport)** for PR 28953 at commit [`9fb8383`](https://github.com/apache/spark/commit/9fb83833de0fcebe88572f3276ac31f2d3713657).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] MaxGekk commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-09 Thread GitBox


MaxGekk commented on pull request #27366:
URL: https://github.com/apache/spark/pull/27366#issuecomment-656495713


   @HyukjinKwon Please, review this PR.






[GitHub] [spark] HyukjinKwon commented on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-09 Thread GitBox


HyukjinKwon commented on pull request #28957:
URL: https://github.com/apache/spark/pull/28957#issuecomment-656494756


   cc @BryanCutler, @ueshin, @holdenk, @dongjoon-hyun, @viirya, @srowen, this 
should be ready for a review.






[GitHub] [spark] wankunde commented on pull request #28850: [SPARK-32015][Core]Remote inheritable thread local variables after spark context is stopped

2020-07-09 Thread GitBox


wankunde commented on pull request #28850:
URL: https://github.com/apache/spark/pull/28850#issuecomment-656494493


   @Ngone51 @srowen I updated the PR. Could we remove only the thread reference 
created by Hive?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29061: [SPARK-32258][SQL] NormalizeFloatingNumbers can directly normalize on IF and CaseWhen children expressions

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29061:
URL: https://github.com/apache/spark/pull/29061#issuecomment-656493719


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29061: [SPARK-32258][SQL] NormalizeFloatingNumbers can directly normalize on IF and CaseWhen children expressions

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29061:
URL: https://github.com/apache/spark/pull/29061#issuecomment-656493719










[GitHub] [spark] zhengruifeng commented on pull request #29060: [SPARK-32232][ML][PySpark] Make sure ML has the same default solver values between Scala and Python

2020-07-09 Thread GitBox


zhengruifeng commented on pull request #29060:
URL: https://github.com/apache/spark/pull/29060#issuecomment-656490664


   LGTM! 
   There are still some inconsistencies between the Scala side and the Python 
side after py.ml was refactored in 3.0.






[GitHub] [spark] maropu commented on pull request #29054: [SPARK-32243][SQL]HiveSessionCatalog call super.makeFunctionExpression should show error message

2020-07-09 Thread GitBox


maropu commented on pull request #29054:
URL: https://github.com/apache/spark/pull/29054#issuecomment-656490867


   Please describe exception messages before/after this PR in the description.






[GitHub] [spark] maropu commented on a change in pull request #29054: [SPARK-32243][SQL]HiveSessionCatalog call super.makeFunctionExpression should show error message

2020-07-09 Thread GitBox


maropu commented on a change in pull request #29054:
URL: https://github.com/apache/spark/pull/29054#discussion_r452627861



##
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDAFSuite.scala
##
@@ -161,6 +161,21 @@ class HiveUDAFSuite extends QueryTest
   checkAnswer(sql("select histogram_numeric(a,2) from abc where a=3"), Row(null))
 }
   }
+
+  test("Hive mode use spark udaf should show error") {
+val functionName = "longProductSum"
+val functionClass = "org.apache.spark.sql.hive.execution.LongProductSum"
+withUserDefinedFunction(functionName -> true) {
+  sql(s"CREATE TEMPORARY FUNCTION $functionName AS '$functionClass'")
+  val e1 = intercept[AnalysisException] {

Review comment:
   nit: `e1` -> `e`








[GitHub] [spark] zhengruifeng commented on a change in pull request #29018: [SPARK-32202][ML][WIP] tree models auto infer compact integer type

2020-07-09 Thread GitBox


zhengruifeng commented on a change in pull request #29018:
URL: https://github.com/apache/spark/pull/29018#discussion_r452627763



##
File path: mllib/src/main/scala/org/apache/spark/ml/util/MLUtils.scala
##
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.util
+
+import org.apache.spark.SparkConf
+import org.apache.spark.ml.tree.impl._
+import org.apache.spark.util.Utils
+
+private[spark] object MLUtils {
+
+  private[this] var kryoRegistered: Boolean = false
+
+  def registerKryoClasses(conf: SparkConf): Unit = {

Review comment:
   I think marking it `synchronized` is good. The code here follows the same 
implementation as `GraphXUtils` and `SquaredEuclideanSilhouette`; I think we 
can make those `synchronized` in the future.
   
   > Do these really have to get serialized to make it work?
   
   The current `TreePoint` is registered in `object KryoSerializer`, so I 
tried to also register `TreePoint[B]`; otherwise, I may need to switch to a new 
treepoint class.
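
   The `synchronized` guard discussed above can be sketched without Spark on the classpath. This is a minimal illustration under stated assumptions, not the code from the PR: `FakeConf` is a hypothetical stand-in for `SparkConf`, and `RegistrationSketch`, `registerOnce`, and `registrationCalls` are invented names. It mirrors the quoted diff in that a single flag on the object decides whether registration has already happened.

```scala
// Hypothetical stand-in for SparkConf; it only records registered class names.
final class FakeConf {
  private val classes = scala.collection.mutable.LinkedHashSet.empty[String]
  def register(names: Seq[String]): Unit = classes ++= names
  def registered: Set[String] = classes.toSet
}

object RegistrationSketch {
  // Both fields are only read or written while holding this object's monitor.
  private var registered: Boolean = false
  private var calls: Int = 0

  /** Registers the class names at most once, even with concurrent callers. */
  def registerOnce(conf: FakeConf): Unit = synchronized {
    if (!registered) {
      conf.register(Seq("TreePoint"))
      calls += 1
      registered = true
    }
  }

  /** Number of times the registration body actually ran. */
  def registrationCalls: Int = synchronized(calls)
}
```

   Without the `synchronized` block, two threads could both observe `registered == false` and run the registration twice. One consequence of keeping the flag on the object is that only the first configuration passed in gets the registration.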
   
   








[GitHub] [spark] maropu commented on a change in pull request #29054: [SPARK-32243][SQL]HiveSessionCatalog call super.makeFunctionExpression should show error message

2020-07-09 Thread GitBox


maropu commented on a change in pull request #29054:
URL: https://github.com/apache/spark/pull/29054#discussion_r452627431



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala
##
@@ -69,49 +69,56 @@ private[sql] class HiveSessionCatalog(
 // Current thread context classloader may not be the one loaded the class. Need to switch
 // context classloader to initialize instance properly.
 Utils.withContextClassLoader(clazz.getClassLoader) {
-  Try(super.makeFunctionExpression(name, clazz, input)).getOrElse {
-var udfExpr: Option[Expression] = None
-try {
-  // When we instantiate hive UDF wrapper class, we may throw exception if the input
-  // expressions don't satisfy the hive UDF, such as type mismatch, input number
-  // mismatch, etc. Here we catch the exception and throw AnalysisException instead.
-  if (classOf[UDF].isAssignableFrom(clazz)) {
-udfExpr = Some(HiveSimpleUDF(name, new HiveFunctionWrapper(clazz.getName), input))
-udfExpr.get.dataType // Force it to check input data types.
-  } else if (classOf[GenericUDF].isAssignableFrom(clazz)) {
-udfExpr = Some(HiveGenericUDF(name, new HiveFunctionWrapper(clazz.getName), input))
-udfExpr.get.dataType // Force it to check input data types.
-  } else if (classOf[AbstractGenericUDAFResolver].isAssignableFrom(clazz)) {
-udfExpr = Some(HiveUDAFFunction(name, new HiveFunctionWrapper(clazz.getName), input))
-udfExpr.get.dataType // Force it to check input data types.
-  } else if (classOf[UDAF].isAssignableFrom(clazz)) {
-udfExpr = Some(HiveUDAFFunction(
-  name,
-  new HiveFunctionWrapper(clazz.getName),
-  input,
-  isUDAFBridgeRequired = true))
-udfExpr.get.dataType // Force it to check input data types.
-  } else if (classOf[GenericUDTF].isAssignableFrom(clazz)) {
-udfExpr = Some(HiveGenericUDTF(name, new HiveFunctionWrapper(clazz.getName), input))
-udfExpr.get.asInstanceOf[HiveGenericUDTF].elementSchema // Force it to check data types.
+  Try(super.makeFunctionExpression(name, clazz, input)) match {

Review comment:
   +1
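
   The change from `Try(...).getOrElse` to `Try(...) match` in the diff above can be illustrated with a small standalone sketch. Everything here is hypothetical (`primary`, `fallback`, and both wrapper methods are invented for illustration and are not the PR's code). It only shows the general difference: `getOrElse` discards the first exception, while pattern matching on `Success`/`Failure` keeps it in scope so that it can be reported.

```scala
import scala.util.{Failure, Success, Try}

object TryFallbackSketch {
  // Hypothetical primary path that always fails, standing in for the super call.
  def primary(name: String): String =
    throw new IllegalArgumentException(s"primary lookup failed for $name")

  // Hypothetical fallback that also fails, standing in for the Hive-specific path.
  def fallback(name: String): String =
    throw new IllegalStateException(s"no handler for $name")

  // getOrElse drops the first exception; only the fallback's error survives.
  def withGetOrElse(name: String): Try[String] =
    Try(Try(primary(name)).getOrElse(fallback(name)))

  // Matching keeps the first exception available when building the final error.
  def withMatch(name: String): Try[String] = Try {
    Try(primary(name)) match {
      case Success(v) => v
      case Failure(first) =>
        Try(fallback(name)) match {
          case Success(v) => v
          case Failure(second) =>
            throw new RuntimeException(
              s"${second.getMessage}; earlier error: ${first.getMessage}", second)
        }
    }
  }
}
```

   With `withGetOrElse`, a caller sees only the fallback's message; with `withMatch`, the message from the first failure is preserved alongside it.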








[GitHub] [spark] LantaoJin commented on a change in pull request #29062: [SPARK-32237][SQL] Resolve hint in CTE

2020-07-09 Thread GitBox


LantaoJin commented on a change in pull request #29062:
URL: https://github.com/apache/spark/pull/29062#discussion_r452626883



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##
@@ -756,6 +756,19 @@ class DataSourceV2SQLSuite
 }
   }
 
+  test("SPARK-32237: Hint in CTE") {

Review comment:
   Oh, it’s not only for v2. I just adapted a similar test. I will move the 
test to another suite.








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28957:
URL: https://github.com/apache/spark/pull/28957#issuecomment-656487892










[GitHub] [spark] AmplabJenkins commented on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #28957:
URL: https://github.com/apache/spark/pull/28957#issuecomment-656487892










[GitHub] [spark] zhengruifeng commented on pull request #29018: [SPARK-32202][ML][WIP] tree models auto infer compact integer type

2020-07-09 Thread GitBox


zhengruifeng commented on pull request #29018:
URL: https://github.com/apache/spark/pull/29018#issuecomment-656487627


   I have removed the specialization in those methods, except in `TreePoint`, 
but there seems to be no significant improvement. I need more time to figure 
out why it is slower.
   
   > I honestly don't know how often the training is limited by memory vs CPU 
here?
   
   It is a trade-off between RAM and CPU.
   I think CPU is a bit more important, since the training dataset tends to 
fit in memory.
   But if it can reduce RAM usage by 70% at the cost of being <5% slower, I 
think it is worthwhile.
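
   To make the memory side of this trade-off concrete, here is a rough sketch of what "auto infer compact integer type" from the PR title could mean. It is an assumption-laden illustration, not the PR's implementation: `Width`, `narrowest`, and `bytesFor` are hypothetical names, and the sketch assumes the stored values are non-negative indices whose maximum is known up front.

```scala
object CompactTypeSketch {
  // Hypothetical descriptor of a storage width in bytes per value.
  sealed trait Width { def bytes: Int }
  case object ByteWidth extends Width { val bytes = 1 }
  case object ShortWidth extends Width { val bytes = 2 }
  case object IntWidth extends Width { val bytes = 4 }

  /** Picks the narrowest signed integer width that can hold 0..maxValue. */
  def narrowest(maxValue: Int): Width = {
    require(maxValue >= 0, "indices are assumed to be non-negative")
    if (maxValue <= Byte.MaxValue) ByteWidth
    else if (maxValue <= Short.MaxValue) ShortWidth
    else IntWidth
  }

  /** Raw payload size in bytes for numValues values, ignoring object overhead. */
  def bytesFor(numValues: Long, maxValue: Int): Long =
    numValues * narrowest(maxValue).bytes
}
```

   For example, values capped at 31 fit in one byte each instead of four. The actual saving in a running job also depends on object headers and other per-row data, which this sketch ignores.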






[GitHub] [spark] maropu commented on a change in pull request #29062: [SPARK-32237][SQL] Resolve hint in CTE

2020-07-09 Thread GitBox


maropu commented on a change in pull request #29062:
URL: https://github.com/apache/spark/pull/29062#discussion_r452625256



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##
@@ -756,6 +756,19 @@ class DataSourceV2SQLSuite
 }
   }
 
+  test("SPARK-32237: Hint in CTE") {
+val t1 = "testcat.ns1.ns2.tbl"

Review comment:
   nit: `t1` -> `t`








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656485788










[GitHub] [spark] SparkQA removed a comment on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-09 Thread GitBox


SparkQA removed a comment on pull request #28957:
URL: https://github.com/apache/spark/pull/28957#issuecomment-656418767


   **[Test build #125527 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125527/testReport)** for PR 28957 at commit [`5f6ba37`](https://github.com/apache/spark/commit/5f6ba375e696ce65019cdf131febc0b9289a886a).






[GitHub] [spark] SparkQA commented on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-09 Thread GitBox


SparkQA commented on pull request #28957:
URL: https://github.com/apache/spark/pull/28957#issuecomment-656485863


   **[Test build #125527 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125527/testReport)**
 for PR 28957 at commit 
[`5f6ba37`](https://github.com/apache/spark/commit/5f6ba375e696ce65019cdf131febc0b9289a886a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656485788










[GitHub] [spark] maropu commented on a change in pull request #29062: [SPARK-32237][SQL] Resolve hint in CTE

2020-07-09 Thread GitBox


maropu commented on a change in pull request #29062:
URL: https://github.com/apache/spark/pull/29062#discussion_r452624446



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##
@@ -756,6 +756,19 @@ class DataSourceV2SQLSuite
 }
   }
 
+  test("SPARK-32237: Hint in CTE") {

Review comment:
   Why did you put this test here? Do you think this issue is only for V2 
data sources?








[GitHub] [spark] SparkQA commented on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


SparkQA commented on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656485472


   **[Test build #125557 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125557/testReport)**
 for PR 29055 at commit 
[`f595689`](https://github.com/apache/spark/commit/f5956895b1474e3eec63b5140e7635adc8a06d92).






[GitHub] [spark] maropu commented on a change in pull request #29062: [SPARK-32237][SQL] Resolve hint in CTE

2020-07-09 Thread GitBox


maropu commented on a change in pull request #29062:
URL: https://github.com/apache/spark/pull/29062#discussion_r452624446



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##
@@ -756,6 +756,19 @@ class DataSourceV2SQLSuite
 }
   }
 
+  test("SPARK-32237: Hint in CTE") {

Review comment:
   Why did you put this test here? Do you think this issue is only for V2 
data sources?








[GitHub] [spark] SparkQA commented on pull request #29034: [SPARK-32219][SQL] Add SHOW CACHED TABLES Command

2020-07-09 Thread GitBox


SparkQA commented on pull request #29034:
URL: https://github.com/apache/spark/pull/29034#issuecomment-656484439


   **[Test build #125556 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125556/testReport)**
 for PR 29034 at commit 
[`5138cdd`](https://github.com/apache/spark/commit/5138cdd47881a686d624e1bb094f660e52a004e3).






[GitHub] [spark] maropu commented on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


maropu commented on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656484216


   retest this please






[GitHub] [spark] maropu commented on a change in pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


maropu commented on a change in pull request #29045:
URL: https://github.com/apache/spark/pull/29045#discussion_r452623395



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReaderSuite.scala
##
@@ -77,4 +77,43 @@ class OrcColumnarBatchReaderSuite extends QueryTest with 
SharedSparkSession {
   assert(p1.getUTF8String(0) === partitionValues.getUTF8String(0))
 }
   }
+
+  test("orc data created by the hive tables having _col fields name") {
+var error: Throwable = null
+val table = """CREATE TABLE `test_date_hive_orc`
+  | (`_col1` INT,`_col2` STRING,`_col3` INT)
+  |  USING orc""".stripMargin
+spark.sql(table).collect
+spark.sql("insert into test_date_hive_orc values(9, '12', 2020)").collect
+val df = spark.sql("select _col2 from test_date_hive_orc")
+try {
+  val data = df.collect()
+  assert(data.length == 1)
+} catch {
+  case e: Throwable =>
+error = e
+}
+assert(error == null)
+spark.sql(s"DROP TABLE IF EXISTS test_date_hive_orc")

Review comment:
   Please check the other tests carefully then follow how to write tests 
there. How about refactoring it like this?
   ```
   withTable("test_date_hive_orc") {
     spark.sql(
       s"""
          |CREATE TABLE test_date_hive_orc
          |  (col1 INT, col2 STRING, col3 INT)
          |  USING orc
        """.stripMargin)
     spark.sql(
       s"""
          |INSERT INTO test_date_hive_orc VALUES
          |  (9, '12', 2020)
        """.stripMargin)

     val df = spark.sql("SELECT col2 FROM test_date_hive_orc")
     checkAnswer(df, Row(...))
   }
   ```








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29034: [SPARK-32219][SQL] Add SHOW CACHED TABLES Command

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29034:
URL: https://github.com/apache/spark/pull/29034#issuecomment-656483701










[GitHub] [spark] AmplabJenkins commented on pull request #29034: [SPARK-32219][SQL] Add SHOW CACHED TABLES Command

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29034:
URL: https://github.com/apache/spark/pull/29034#issuecomment-656483701










[GitHub] [spark] dilipbiswal edited a comment on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-07-09 Thread GitBox


dilipbiswal edited a comment on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-656483187


   @dongjoon-hyun @maropu @cloud-fan @ulysses-you Thanks a lot.






[GitHub] [spark] dilipbiswal commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-07-09 Thread GitBox


dilipbiswal commented on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-656483187


   @dongjoon-hyun Thank you very much :-)
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656482084


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125514/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656482077


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656482077










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-656481931










[GitHub] [spark] AmplabJenkins commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-656481931










[GitHub] [spark] SparkQA commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


SparkQA commented on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-656481611


   **[Test build #12 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12/testReport)**
 for PR 29045 at commit 
[`75e8833`](https://github.com/apache/spark/commit/75e8833a4c5c1b9cf48c4bc3322cd8143a042b38).






[GitHub] [spark] maropu commented on a change in pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


maropu commented on a change in pull request #29045:
URL: https://github.com/apache/spark/pull/29045#discussion_r452621076



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReaderSuite.scala
##
@@ -77,4 +77,43 @@ class OrcColumnarBatchReaderSuite extends QueryTest with 
SharedSparkSession {
   assert(p1.getUTF8String(0) === partitionValues.getUTF8String(0))
 }
   }
+
+  test("orc data created by the hive tables having _col fields name") {

Review comment:
   Plz add the prefix `test("SPARK-32234: orc data...`.

##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReaderSuite.scala
##
@@ -77,4 +77,43 @@ class OrcColumnarBatchReaderSuite extends QueryTest with 
SharedSparkSession {
   assert(p1.getUTF8String(0) === partitionValues.getUTF8String(0))
 }
   }
+
+  test("orc data created by the hive tables having _col fields name") {
+var error: Throwable = null
+val table = """CREATE TABLE `test_date_hive_orc`
+  | (`_col1` INT,`_col2` STRING,`_col3` INT)
+  |  USING orc""".stripMargin
+spark.sql(table).collect
+spark.sql("insert into test_date_hive_orc values(9, '12', 2020)").collect
+val df = spark.sql("select _col2 from test_date_hive_orc")
+try {
+  val data = df.collect()
+  assert(data.length == 1)
+} catch {
+  case e: Throwable =>
+error = e
+}
+assert(error == null)
+spark.sql(s"DROP TABLE IF EXISTS test_date_hive_orc")
+  }
+
+  test("orc data created by the spark having proper fields name") {

Review comment:
   ditto








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29034: [SPARK-32219][SQL] Add SHOW CACHED TABLES Command

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29034:
URL: https://github.com/apache/spark/pull/29034#issuecomment-656481034


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125534/
   Test FAILed.






[GitHub] [spark] maropu commented on a change in pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


maropu commented on a change in pull request #29045:
URL: https://github.com/apache/spark/pull/29045#discussion_r452620989



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReaderSuite.scala
##
@@ -77,4 +77,43 @@ class OrcColumnarBatchReaderSuite extends QueryTest with 
SharedSparkSession {
   assert(p1.getUTF8String(0) === partitionValues.getUTF8String(0))
 }
   }
+
+  test("orc data created by the hive tables having _col fields name") {
+var error: Throwable = null
+val table = """CREATE TABLE `test_date_hive_orc`
+  | (`_col1` INT,`_col2` STRING,`_col3` INT)
+  |  USING orc""".stripMargin
+spark.sql(table).collect
+spark.sql("insert into test_date_hive_orc values(9, '12', 2020)").collect
+val df = spark.sql("select _col2 from test_date_hive_orc")
+try {
+  val data = df.collect()
+  assert(data.length == 1)
+} catch {
+  case e: Throwable =>
+error = e
+}
+assert(error == null)
+spark.sql(s"DROP TABLE IF EXISTS test_date_hive_orc")
+  }
+
+  test("orc data created by the spark having proper fields name") {
+var error: Throwable = null
+val table = """CREATE TABLE `test_date_spark_orc`
+  | (`d_date_sk` INT,`d_date_id` STRING,`d_year` INT)
+  |  USING orc""".stripMargin
+spark.sql(table).collect
+spark.sql("insert into test_date_spark_orc values(9, '12', 2020)").collect
+val df = spark.sql("select d_date_id from test_date_spark_orc")
+try {
+  val data = df.collect()
+  assert(data.length == 1)
+} catch {
+  case e: Throwable =>
+error = e
+}
+assert(error == null)
+spark.sql(s"DROP TABLE IF EXISTS test_date_spark_orc")
+  }
+

Review comment:
   nit: remove this blank.








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29034: [SPARK-32219][SQL] Add SHOW CACHED TABLES Command

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29034:
URL: https://github.com/apache/spark/pull/29034#issuecomment-656481025


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


SparkQA removed a comment on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656344977


   **[Test build #125514 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125514/testReport)**
 for PR 29055 at commit 
[`f595689`](https://github.com/apache/spark/commit/f5956895b1474e3eec63b5140e7635adc8a06d92).






[GitHub] [spark] AmplabJenkins commented on pull request #29034: [SPARK-32219][SQL] Add SHOW CACHED TABLES Command

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29034:
URL: https://github.com/apache/spark/pull/29034#issuecomment-656481025










[GitHub] [spark] maropu commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


maropu commented on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-656481035


   retest this please






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-656480364


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125520/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29055: [SPARK-32251][SQL][DOCS][TESTS] Fix SQL keyword document

2020-07-09 Thread GitBox


SparkQA commented on pull request #29055:
URL: https://github.com/apache/spark/pull/29055#issuecomment-656480655


   **[Test build #125514 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125514/testReport)**
 for PR 29055 at commit 
[`f595689`](https://github.com/apache/spark/commit/f5956895b1474e3eec63b5140e7635adc8a06d92).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `trait SQLKeywordUtils extends SQLHelper `
 * `class SQLKeywordSuite extends SparkFunSuite with SQLKeywordUtils `
 * `class TableIdentifierParserSuite extends SparkFunSuite with 
SQLKeywordUtils `






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28979: [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28979:
URL: https://github.com/apache/spark/pull/28979#issuecomment-656480264










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-656480361


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29034: [SPARK-32219][SQL] Add SHOW CACHED TABLES Command

2020-07-09 Thread GitBox


SparkQA removed a comment on pull request #29034:
URL: https://github.com/apache/spark/pull/29034#issuecomment-656434656


   **[Test build #125534 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125534/testReport)**
 for PR 29034 at commit 
[`564725a`](https://github.com/apache/spark/commit/564725a54450f6a3fc004b86cfa3f0077dbb25fd).






[GitHub] [spark] SparkQA commented on pull request #28979: [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type

2020-07-09 Thread GitBox


SparkQA commented on pull request #28979:
URL: https://github.com/apache/spark/pull/28979#issuecomment-656480373


   **[Test build #125554 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125554/testReport)**
 for PR 28979 at commit 
[`871c35f`](https://github.com/apache/spark/commit/871c35ffd3d71cef6040849561ae07a8d4e6b370).






[GitHub] [spark] AmplabJenkins commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-656480361










[GitHub] [spark] AmplabJenkins commented on pull request #28979: [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #28979:
URL: https://github.com/apache/spark/pull/28979#issuecomment-656480264










[GitHub] [spark] SparkQA commented on pull request #29034: [SPARK-32219][SQL] Add SHOW CACHED TABLES Command

2020-07-09 Thread GitBox


SparkQA commented on pull request #29034:
URL: https://github.com/apache/spark/pull/29034#issuecomment-656480243


   **[Test build #125534 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125534/testReport)**
 for PR 29034 at commit 
[`564725a`](https://github.com/apache/spark/commit/564725a54450f6a3fc004b86cfa3f0077dbb25fd).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


SparkQA removed a comment on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-656384460


   **[Test build #125520 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125520/testReport)**
 for PR 29045 at commit 
[`75e8833`](https://github.com/apache/spark/commit/75e8833a4c5c1b9cf48c4bc3322cd8143a042b38).






[GitHub] [spark] maropu commented on pull request #28979: [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type

2020-07-09 Thread GitBox


maropu commented on pull request #28979:
URL: https://github.com/apache/spark/pull/28979#issuecomment-656479376


   retest this please






[GitHub] [spark] SparkQA commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-09 Thread GitBox


SparkQA commented on pull request #29045:
URL: https://github.com/apache/spark/pull/29045#issuecomment-656479282


   **[Test build #125520 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125520/testReport)**
 for PR 29045 at commit 
[`75e8833`](https://github.com/apache/spark/commit/75e8833a4c5c1b9cf48c4bc3322cd8143a042b38).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29062: [SPARK-32237][SQL] Resolve hint in CTE

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29062:
URL: https://github.com/apache/spark/pull/29062#issuecomment-656478525










[GitHub] [spark] AmplabJenkins commented on pull request #29062: [SPARK-32237][SQL] Resolve hint in CTE

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29062:
URL: https://github.com/apache/spark/pull/29062#issuecomment-656478525










[GitHub] [spark] SparkQA commented on pull request #29062: [SPARK-32237][SQL] Resolve hint in CTE

2020-07-09 Thread GitBox


SparkQA commented on pull request #29062:
URL: https://github.com/apache/spark/pull/29062#issuecomment-656478192


   **[Test build #125553 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125553/testReport)**
 for PR 29062 at commit 
[`b183cba`](https://github.com/apache/spark/commit/b183cbae8baf4086010dee0c7bd00e46402cc0fa).






[GitHub] [spark] LantaoJin opened a new pull request #29062: [SPARK-32237][SQL] Resolve hint in CTE

2020-07-09 Thread GitBox


LantaoJin opened a new pull request #29062:
URL: https://github.com/apache/spark/pull/29062


   ### Why are the changes needed?
   The following SQL throws an `AnalysisException` in Spark 3.0, although it works in Spark 2.x:
   ```
   WITH cte AS (SELECT /*+ REPARTITION(3) */ T.id, T.data FROM $t1 T)
   SELECT cte.id, cte.data FROM cte
   ```
   Failed to analyze query: org.apache.spark.sql.AnalysisException: cannot resolve '`cte.id`' given input columns: [cte.data, cte.id]; line 3 pos 7;
   'Project ['cte.id, 'cte.data]
   +- SubqueryAlias cte
      +- Project [id#21L, data#22]
         +- SubqueryAlias T
            +- SubqueryAlias testcat.ns1.ns2.tbl
               +- RelationV2[id#21L, data#22] testcat.ns1.ns2.tbl
   
   'Project ['cte.id, 'cte.data]
   +- SubqueryAlias cte
      +- Project [id#21L, data#22]
         +- SubqueryAlias T
            +- SubqueryAlias testcat.ns1.ns2.tbl
               +- RelationV2[id#21L, data#22] testcat.ns1.ns2.tbl
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Added a unit test.
   






[GitHub] [spark] AmplabJenkins commented on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-656476856










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-656476856










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-656475603


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125528/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-09 Thread GitBox


SparkQA commented on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-656476651


   **[Test build #125552 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125552/testReport)**
 for PR 28967 at commit 
[`e7657cd`](https://github.com/apache/spark/commit/e7657cd33d3f3544e0b12e81f2742be0e6b5073e).






[GitHub] [spark] gengliangwang commented on a change in pull request #29057: [SPARK-32245][INFRA] Run Spark tests in Github Actions

2020-07-09 Thread GitBox


gengliangwang commented on a change in pull request #29057:
URL: https://github.com/apache/spark/pull/29057#discussion_r452616560



##
File path: dev/run-tests.py
##
@@ -595,13 +601,28 @@ def main():
 
     changed_modules = None
     changed_files = None
-    if test_env == "amplab_jenkins" and os.environ.get("AMP_JENKINS_PRB"):
+    should_only_test_modules = "TEST_ONLY_MODULES" in os.environ
+    included_tags = []
+    if should_only_test_modules:
+        str_test_modules = [m.strip() for m in os.environ.get("TEST_ONLY_MODULES").split(",")]
+        test_modules = [m for m in modules.all_modules if m.name in str_test_modules]
+        # Directly uses test_modules as changed modules to apply tags and environments
+        # as if all specified test modules are changed.
+        changed_modules = test_modules
+        str_excluded_tags = os.environ.get("TEST_ONLY_EXCLUDED_TAGS", None)
+        str_included_tags = os.environ.get("TEST_ONLY_INCLUDED_TAGS", None)
+        excluded_tags = []
+        if str_excluded_tags:
+            excluded_tags = [t.strip() for t in str_excluded_tags.split(",")]
+        included_tags = []
+        if str_included_tags:
+            included_tags = [t.strip() for t in str_included_tags.split(",")]
+    elif test_env == "amplab_jenkins" and os.environ.get("AMP_JENKINS_PRB"):
         target_branch = os.environ["ghprbTargetBranch"]
         changed_files = identify_changed_files_from_git_commits("HEAD", target_branch=target_branch)
         changed_modules = determine_modules_for_files(changed_files)
         excluded_tags = determine_tags_to_exclude(changed_modules)
-
-    if not changed_modules:
+    else:

Review comment:
   The original logic was to run all the tests if `changed_modules` is 
empty. The current logic is to run all the tests if the run is not from GitHub 
Actions or AMPLab Jenkins.
   But this is trivial, as everything is working now.
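
For readers following the hunk above, here is a minimal sketch of the three-way branch it introduces, assuming only the environment variable names shown in the diff. The helper names `parse_csv_env` and `select_test_mode` are hypothetical and are not part of `dev/run-tests.py`; the returned labels are illustrative only.

```python
import os


def parse_csv_env(name, environ=os.environ):
    """Split a comma-separated environment variable into stripped items.

    Returns an empty list when the variable is unset or empty, which is
    how the diff treats TEST_ONLY_EXCLUDED_TAGS / TEST_ONLY_INCLUDED_TAGS.
    """
    raw = environ.get(name)
    if not raw:
        return []
    return [item.strip() for item in raw.split(",")]


def select_test_mode(environ, test_env):
    """Hypothetical helper: name which of the three branches would run."""
    if "TEST_ONLY_MODULES" in environ:
        # Explicit module list: treated as if those modules had changed.
        return "only_modules"
    elif test_env == "amplab_jenkins" and environ.get("AMP_JENKINS_PRB"):
        # Jenkins pull request builder: derive modules from changed files.
        return "changed_modules"
    else:
        # Anything else: fall back to running all tests.
        return "all_modules"
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to exercise; for example, `select_test_mode({}, "local")` falls through to the final branch, which matches the reviewer's reading that all tests run when the job is neither a module-restricted run nor the Jenkins PR builder.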








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-656475597


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-656475597










[GitHub] [spark] HeartSaVioR commented on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-09 Thread GitBox


HeartSaVioR commented on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-656475406


   Retest this please






[GitHub] [spark] SparkQA removed a comment on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-09 Thread GitBox


SparkQA removed a comment on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-656420486


   **[Test build #125528 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125528/testReport)**
 for PR 28967 at commit 
[`e7657cd`](https://github.com/apache/spark/commit/e7657cd33d3f3544e0b12e81f2742be0e6b5073e).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28979: [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28979:
URL: https://github.com/apache/spark/pull/28979#issuecomment-656474785


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125523/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28979: [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28979:
URL: https://github.com/apache/spark/pull/28979#issuecomment-656474782


   Merged build finished. Test FAILed.






[GitHub] [spark] agrawaldevesh commented on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-09 Thread GitBox


agrawaldevesh commented on pull request #29014:
URL: https://github.com/apache/spark/pull/29014#issuecomment-656474975


   @holdenk, @jiangxb1987, @cloud-fan, @Ngone51 -- this PR is ready for your 
review. Thanks!






[GitHub] [spark] SparkQA commented on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-09 Thread GitBox


SparkQA commented on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-656474900


   **[Test build #125528 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125528/testReport)**
 for PR 28967 at commit 
[`e7657cd`](https://github.com/apache/spark/commit/e7657cd33d3f3544e0b12e81f2742be0e6b5073e).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] agrawaldevesh commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-09 Thread GitBox


agrawaldevesh commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-656474929


   @holdenk, @jiangxb1987, @cloud-fan, @Ngone51 -- this PR is ready for your 
review. Thanks!






[GitHub] [spark] agrawaldevesh commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-09 Thread GitBox


agrawaldevesh commented on pull request #29032:
URL: https://github.com/apache/spark/pull/29032#issuecomment-656474836


   @holdenk, @jiangxb1987, @cloud-fan, @Ngone51 -- this PR is ready for your 
review. Thanks!






[GitHub] [spark] AmplabJenkins commented on pull request #28979: [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #28979:
URL: https://github.com/apache/spark/pull/28979#issuecomment-656474782










[GitHub] [spark] bharatviswa504 commented on pull request #29030: [WIP][SPARK-32093][BUILD] Add hadoop-ozone-filesystem jar.

2020-07-09 Thread GitBox


bharatviswa504 commented on pull request #29030:
URL: https://github.com/apache/spark/pull/29030#issuecomment-656474666


   > t not every app needs to include it, I can see trying to build it into a 
Spark distro. It wouldn't be enabled by default. It could go in hadoop-cloud 
though this isn't really to help use cloud object stores, though then again 
maybe this is meant to be a cloud-based object store in its own right?
   > In any event not sure it makes sense to add to the pom here if it's still 
pretty beta / dev? Anyone can add this to a downstream Spark build already.
   
   I will update this PR once the Ozone GA version is released.
   
   The main intention of this change is to make Ozone support work out of the 
box with the Spark tarball distro, so that Spark distros can contain the 
required Ozone jar.
   
   Ozone is an on-prem object store, so I have added this to the main pom.xml, 
as hadoop-cloud is specifically for cloud stores.






[GitHub] [spark] SparkQA removed a comment on pull request #28979: [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type

2020-07-09 Thread GitBox


SparkQA removed a comment on pull request #28979:
URL: https://github.com/apache/spark/pull/28979#issuecomment-656396975


   **[Test build #125523 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125523/testReport)**
 for PR 28979 at commit 
[`871c35f`](https://github.com/apache/spark/commit/871c35ffd3d71cef6040849561ae07a8d4e6b370).






[GitHub] [spark] bharatviswa504 commented on a change in pull request #29030: [WIP][SPARK-32093][BUILD] Add hadoop-ozone-filesystem jar.

2020-07-09 Thread GitBox


bharatviswa504 commented on a change in pull request #29030:
URL: https://github.com/apache/spark/pull/29030#discussion_r452614277



##
File path: pom.xml
##
@@ -202,6 +202,9 @@
 0.15.1
 
 org.fusesource.leveldbjni
+
+
+0.5.0-beta

Review comment:
   Ozone's next release is 0.6.0. 
   It will most likely be released before August 2020.








[GitHub] [spark] SparkQA commented on pull request #28979: [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type

2020-07-09 Thread GitBox


SparkQA commented on pull request #28979:
URL: https://github.com/apache/spark/pull/28979#issuecomment-656474181


   **[Test build #125523 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125523/testReport)**
 for PR 28979 at commit 
[`871c35f`](https://github.com/apache/spark/commit/871c35ffd3d71cef6040849561ae07a8d4e6b370).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29054: [SPARK-32243][SQL]HiveSessionCatalog call super.makeFunctionExpression should show error message

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29054:
URL: https://github.com/apache/spark/pull/29054#issuecomment-656472339










[GitHub] [spark] AmplabJenkins commented on pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #28996:
URL: https://github.com/apache/spark/pull/28996#issuecomment-656472252










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls

2020-07-09 Thread GitBox


AmplabJenkins removed a comment on pull request #28996:
URL: https://github.com/apache/spark/pull/28996#issuecomment-656472252










[GitHub] [spark] AmplabJenkins commented on pull request #29054: [SPARK-32243][SQL]HiveSessionCatalog call super.makeFunctionExpression should show error message

2020-07-09 Thread GitBox


AmplabJenkins commented on pull request #29054:
URL: https://github.com/apache/spark/pull/29054#issuecomment-656472339










[GitHub] [spark] HeartSaVioR closed pull request #29036: [SPARK-32242][SQL] CliSuite flakiness fix via differentiating cli driver bootup timeout and query execution timeout

2020-07-09 Thread GitBox


HeartSaVioR closed pull request #29036:
URL: https://github.com/apache/spark/pull/29036


   






[GitHub] [spark] SparkQA commented on pull request #29054: [SPARK-32243][SQL]HiveSessionCatalog call super.makeFunctionExpression should show error message

2020-07-09 Thread GitBox


SparkQA commented on pull request #29054:
URL: https://github.com/apache/spark/pull/29054#issuecomment-656472024


   **[Test build #125550 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125550/testReport)**
 for PR 29054 at commit 
[`5dd3169`](https://github.com/apache/spark/commit/5dd31697feb3a01f65b900efa416390486abd4d5).






[GitHub] [spark] SparkQA commented on pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls

2020-07-09 Thread GitBox


SparkQA commented on pull request #28996:
URL: https://github.com/apache/spark/pull/28996#issuecomment-656472034


   **[Test build #125551 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125551/testReport)**
 for PR 28996 at commit 
[`df4e8dc`](https://github.com/apache/spark/commit/df4e8dc6a4bed3959b4317e3ff39da9f8aef5548).






[GitHub] [spark] HeartSaVioR commented on pull request #29036: [SPARK-32242][SQL] CliSuite flakiness fix via differentiating cli driver bootup timeout and query execution timeout

2020-07-09 Thread GitBox


HeartSaVioR commented on pull request #29036:
URL: https://github.com/apache/spark/pull/29036#issuecomment-656471953


   Thanks for the feedback! Merging to master. As with #29039, we can backport 
this whenever we find the flakiness in other branches, so it should be OK 
to start with only the master branch.





