[GitHub] [spark] AmplabJenkins removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
AmplabJenkins removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523301482 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109460/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
AmplabJenkins removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523301479 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
SparkQA removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523287243 **[Test build #109460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109460/testReport)** for PR 25512 at commit [`6ca1568`](https://github.com/apache/spark/commit/6ca15681d1041046e6b21436514e9c7c11682db7).
[GitHub] [spark] AmplabJenkins commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
AmplabJenkins commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523301479 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
AmplabJenkins commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523301482 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109460/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
SparkQA commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
URL: https://github.com/apache/spark/pull/25512#issuecomment-523301365

**[Test build #109460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109460/testReport)** for PR 25512 at commit [`6ca1568`](https://github.com/apache/spark/commit/6ca15681d1041046e6b21436514e9c7c11682db7).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] gengliangwang commented on a change in pull request #25453: [SPARK-28730][SQL] Configurable type coercion policy for table insertion
gengliangwang commented on a change in pull request #25453: [SPARK-28730][SQL] Configurable type coercion policy for table insertion
URL: https://github.com/apache/spark/pull/25453#discussion_r316002612

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala ##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, Cast, NamedExpression}
+import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project}
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.internal.SQLConf.StoreAssignmentPolicy
+import org.apache.spark.sql.types.DataType
+
+object TableOutputResolver {
+  def resolveOutputColumns(
+      tableName: String,
+      expected: Seq[Attribute],
+      query: LogicalPlan,
+      byName: Boolean,
+      conf: SQLConf): LogicalPlan = {
+
+    if (expected.size < query.output.size) {
+      throw new AnalysisException(
+        s"""Cannot write to '$tableName', too many data columns:
+           |Table columns: ${expected.map(c => s"'${c.name}'").mkString(", ")}
+           |Data columns: ${query.output.map(c => s"'${c.name}'").mkString(", ")}""".stripMargin)
+    }
+
+    val errors = new mutable.ArrayBuffer[String]()
+    val resolved: Seq[NamedExpression] = if (byName) {
+      expected.flatMap { tableAttr =>
+        query.resolve(Seq(tableAttr.name), conf.resolver) match {
+          case Some(queryExpr) =>
+            checkField(tableAttr, queryExpr, byName, conf, err => errors += err)
+          case None =>
+            errors += s"Cannot find data for output column '${tableAttr.name}'"
+            None
+        }
+      }
+
+    } else {
+      if (expected.size > query.output.size) {
+        throw new AnalysisException(
+          s"""Cannot write to '$tableName', not enough data columns:
+             |Table columns: ${expected.map(c => s"'${c.name}'").mkString(", ")}
+             |Data columns: ${query.output.map(c => s"'${c.name}'").mkString(", ")}"""
+            .stripMargin)
+      }
+
+      query.output.zip(expected).flatMap {
+        case (queryExpr, tableAttr) =>
+          checkField(tableAttr, queryExpr, byName, conf, err => errors += err)
+      }
+    }
+
+    if (errors.nonEmpty) {
+      throw new AnalysisException(
+        s"Cannot write incompatible data to table '$tableName':\n- ${errors.mkString("\n- ")}")
+    }
+
+    if (resolved == query.output) {
+      query
+    } else {
+      Project(resolved, query)
+    }
+  }
+
+  private def checkField(
+      tableAttr: Attribute,
+      queryExpr: NamedExpression,
+      byName: Boolean,
+      conf: SQLConf,
+      addError: String => Unit): Option[NamedExpression] = {
+
+    lazy val outputField = if (tableAttr.dataType.sameType(queryExpr.dataType) &&
+      tableAttr.name == queryExpr.name &&
+      tableAttr.metadata == queryExpr.metadata) {
+      Some(queryExpr)
+    } else {
+      // Renaming is needed for handling the following cases like
+      // 1) Column names/types do not match, e.g., INSERT INTO TABLE tab1 SELECT 1, 2
+      // 2) Target tables have column metadata
+      Some(Alias(
+        Cast(queryExpr, tableAttr.dataType, Option(conf.sessionLocalTimeZone)),
+        tableAttr.name)(explicitMetadata = Option(tableAttr.metadata)))
+    }
+
+    conf.storeAssignmentPolicy match {
+      case StoreAssignmentPolicy.LEGACY =>
+        outputField
+
+      case StoreAssignmentPolicy.STRICT =>
+        // run the type check first to ensure type errors are present

Review comment:
> btw, we don't need the check queryExpr.nullable && !tableAttr.nullable in the other modes?

IIRC there is no such check in Spark 2.x
[GitHub] [spark] gengliangwang commented on a change in pull request #25453: [SPARK-28730][SQL] Configurable type coercion policy for table insertion
gengliangwang commented on a change in pull request #25453: [SPARK-28730][SQL] Configurable type coercion policy for table insertion
URL: https://github.com/apache/spark/pull/25453#discussion_r316002461

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala ##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, Cast, NamedExpression}
+import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project}
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.internal.SQLConf.StoreAssignmentPolicy
+import org.apache.spark.sql.types.DataType
+
+object TableOutputResolver {
+  def resolveOutputColumns(
+      tableName: String,
+      expected: Seq[Attribute],
+      query: LogicalPlan,
+      byName: Boolean,
+      conf: SQLConf): LogicalPlan = {
+
+    if (expected.size < query.output.size) {
+      throw new AnalysisException(
+        s"""Cannot write to '$tableName', too many data columns:
+           |Table columns: ${expected.map(c => s"'${c.name}'").mkString(", ")}
+           |Data columns: ${query.output.map(c => s"'${c.name}'").mkString(", ")}""".stripMargin)
+    }
+
+    val errors = new mutable.ArrayBuffer[String]()
+    val resolved: Seq[NamedExpression] = if (byName) {
+      expected.flatMap { tableAttr =>
+        query.resolve(Seq(tableAttr.name), conf.resolver) match {
+          case Some(queryExpr) =>
+            checkField(tableAttr, queryExpr, byName, conf, err => errors += err)
+          case None =>
+            errors += s"Cannot find data for output column '${tableAttr.name}'"
+            None
+        }
+      }
+
+    } else {
+      if (expected.size > query.output.size) {
+        throw new AnalysisException(
+          s"""Cannot write to '$tableName', not enough data columns:
+             |Table columns: ${expected.map(c => s"'${c.name}'").mkString(", ")}
+             |Data columns: ${query.output.map(c => s"'${c.name}'").mkString(", ")}"""
+            .stripMargin)
+      }
+
+      query.output.zip(expected).flatMap {
+        case (queryExpr, tableAttr) =>
+          checkField(tableAttr, queryExpr, byName, conf, err => errors += err)
+      }
+    }
+
+    if (errors.nonEmpty) {
+      throw new AnalysisException(
+        s"Cannot write incompatible data to table '$tableName':\n- ${errors.mkString("\n- ")}")
+    }
+
+    if (resolved == query.output) {
+      query
+    } else {
+      Project(resolved, query)
+    }
+  }
+
+  private def checkField(
+      tableAttr: Attribute,
+      queryExpr: NamedExpression,
+      byName: Boolean,
+      conf: SQLConf,
+      addError: String => Unit): Option[NamedExpression] = {
+
+    lazy val outputField = if (tableAttr.dataType.sameType(queryExpr.dataType) &&
+      tableAttr.name == queryExpr.name &&
+      tableAttr.metadata == queryExpr.metadata) {
+      Some(queryExpr)
+    } else {
+      // Renaming is needed for handling the following cases like
+      // 1) Column names/types do not match, e.g., INSERT INTO TABLE tab1 SELECT 1, 2
+      // 2) Target tables have column metadata
+      Some(Alias(
+        Cast(queryExpr, tableAttr.dataType, Option(conf.sessionLocalTimeZone)),
+        tableAttr.name)(explicitMetadata = Option(tableAttr.metadata)))
+    }
+
+    conf.storeAssignmentPolicy match {
+      case StoreAssignmentPolicy.LEGACY =>
+        outputField
+
+      case StoreAssignmentPolicy.STRICT =>
+        // run the type check first to ensure type errors are present

Review comment:
I think this is on purpose in the original code. Running `DataType.canWrite` can expose more errors.
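The resolution flow in the quoted diff (fail fast on a column-count mismatch, then collect every per-column problem before raising a single error) can be illustrated without Spark. The following is a minimal sketch, not the PR's code: `Col`, `ColumnResolutionSketch`, and `resolveColumns` are hypothetical stand-ins for `Attribute` and `resolveOutputColumns`, type names are plain strings, and type compatibility is reduced to simple equality.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a Catalyst attribute: a column name plus a type name.
final case class Col(name: String, dataType: String)

object ColumnResolutionSketch {
  // Returns either every problem found, or the query columns in table order.
  def resolveColumns(
      table: Seq[Col],
      query: Seq[Col],
      byName: Boolean): Either[Seq[String], Seq[Col]] = {
    // Column-count mismatches fail immediately, before any per-column check.
    if (table.size < query.size) return Left(Seq("too many data columns"))
    if (!byName && table.size > query.size) return Left(Seq("not enough data columns"))

    val errors = mutable.ArrayBuffer.empty[String]

    // Assumption: two columns are compatible only when their type names are equal.
    def check(tableCol: Col, queryCol: Col): Option[Col] =
      if (tableCol.dataType == queryCol.dataType) {
        Some(queryCol.copy(name = tableCol.name))
      } else {
        errors += s"Cannot write '${queryCol.dataType}' to '${tableCol.name}' (${tableCol.dataType})"
        None
      }

    val resolved = if (byName) {
      // By name: look up each table column in the query output.
      table.flatMap { t =>
        query.find(_.name == t.name) match {
          case Some(q) => check(t, q)
          case None =>
            errors += s"Cannot find data for output column '${t.name}'"
            None
        }
      }
    } else {
      // By position: pair columns up in order.
      query.zip(table).flatMap { case (q, t) => check(t, q) }
    }

    if (errors.nonEmpty) Left(errors.toSeq) else Right(resolved)
  }
}
```

Collecting errors in a buffer instead of throwing on the first one is what lets a single failure message list all incompatible columns at once.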
[GitHub] [spark] AmplabJenkins removed a comment on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference
AmplabJenkins removed a comment on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference URL: https://github.com/apache/spark/pull/25522#issuecomment-523299825 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference
AmplabJenkins removed a comment on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference URL: https://github.com/apache/spark/pull/25522#issuecomment-523299829 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14522/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference
SparkQA commented on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference URL: https://github.com/apache/spark/pull/25522#issuecomment-523300173 **[Test build #109464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109464/testReport)** for PR 25522 at commit [`efea0b8`](https://github.com/apache/spark/commit/efea0b8fcf8da236efeb4a24d5eb5a6266cb7454).
[GitHub] [spark] wangyum commented on issue #25347: [SPARK-28610][SQL] Allow having a decimal buffer for long sum
wangyum commented on issue #25347: [SPARK-28610][SQL] Allow having a decimal buffer for long sum
URL: https://github.com/apache/spark/pull/25347#issuecomment-523300126

**PostgreSQL**:
```sql
postgres=# create table t1(c1 bigint);
CREATE TABLE
postgres=# insert into t1 values(9223372036854775807), (9223372036854775807);
INSERT 0 2
postgres=# select sum(c1) from t1;
         sum
----------------------
 18446744073709551614
(1 row)
```

**db2**:
```sql
[db2inst1@2f3c821d36b7 ~]$ db2 "create table t1(c1 bigint)"
DB2I The SQL command completed successfully.
[db2inst1@2f3c821d36b7 ~]$ db2 "insert into t1 values(9223372036854775807), (9223372036854775807)"
DB2I The SQL command completed successfully.
[db2inst1@2f3c821d36b7 ~]$ db2 "select sum(c1) from t1"

1
SQL0802N Arithmetic overflow or other arithmetic exception occurred.
SQLSTATE=22003
```

**SQL Server:2019-CTP3.0**
```sql
1> create table t1(c1 bigint)
2> go
1> insert into t1 values(9223372036854775807), (9223372036854775807)
2> go

(2 rows affected)
1>
1> select sum(c1) from t1
2> go
Msg 8115, Level 16, State 2, Server 8c82b3c03354, Line 1
Arithmetic overflow error converting expression to data type bigint.
```

**Vertica**:
```sql
dbadmin=> create table t1(c1 bigint);
CREATE TABLE
dbadmin=> insert into t1 values(9223372036854775807);
 OUTPUT
      1
(1 row)

dbadmin=> insert into t1 values(9223372036854775807);
 OUTPUT
      1
(1 row)

dbadmin=> select sum(c1) from t1;
ERROR 4845: Sum() overflowed
HINT: Try sum_float() instead
dbadmin=> select sum_float(c1) from t1;
      sum_float
----------------------
 1.84467440737096e+19
(1 row)
```

**Oracle**:
```sql
SQL> -- BIGINT -> NUMBER(19) : https://docs.oracle.com/cd/B19306_01/gateways.102/b14270/apa.htm
SQL> create table t1(c1 NUMBER(19));

Table created.

SQL> insert into t1 values(9223372036854775807);

1 row created.

SQL> insert into t1 values(9223372036854775807);

1 row created.

SQL> select sum(c1) from t1;

   SUM(C1)
----------
1.8447E+19

SQL> create table t2 as select sum(c1) as s from t1;

Table created.

SQL> desc t2;
 Name    Null?    Type
 ------- -------- ------
 S                NUMBER
```
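The differences between the databases above come down to which buffer type accumulates the sum. A minimal Scala sketch (plain JVM arithmetic, not Spark's `Sum` implementation; the object and method names are illustrative only) shows the three outcomes for the same two inputs: silent wraparound with a `Long` buffer, an overflow error with checked addition, and the exact result with a decimal buffer.

```scala
object LongSumSketch {
  // The same two rows used in the SQL sessions above.
  val values: Seq[Long] = Seq(Long.MaxValue, Long.MaxValue)

  // Long buffer with ordinary addition: overflow wraps around silently.
  def wrappingSum(xs: Seq[Long]): Long =
    xs.foldLeft(0L)((acc, x) => acc + x)

  // Long buffer with checked addition: throws ArithmeticException on overflow.
  def checkedSum(xs: Seq[Long]): Long =
    xs.foldLeft(0L)((acc, x) => java.lang.Math.addExact(acc, x))

  // Decimal buffer: wide enough to hold the exact result.
  def decimalSum(xs: Seq[Long]): BigDecimal =
    xs.foldLeft(BigDecimal(0))((acc, x) => acc + BigDecimal(x))
}
```

With these inputs, `wrappingSum` yields `-2`, `checkedSum` throws, and `decimalSum` yields `18446744073709551614`, the same value PostgreSQL reports.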
[GitHub] [spark] AmplabJenkins commented on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference
AmplabJenkins commented on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference URL: https://github.com/apache/spark/pull/25522#issuecomment-523299829 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14522/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference
AmplabJenkins commented on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference URL: https://github.com/apache/spark/pull/25522#issuecomment-523299825 Merged build finished. Test PASSed.
[GitHub] [spark] huaxingao opened a new pull request #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference
huaxingao opened a new pull request #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference
URL: https://github.com/apache/spark/pull/25522

### What changes were proposed in this pull request?
Document LOAD DATA statement in SQL Reference

### Why are the changes needed?
To complete the SQL Reference

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Manually checked the new doc.
[GitHub] [spark] AmplabJenkins commented on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs
AmplabJenkins commented on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs URL: https://github.com/apache/spark/pull/25465#issuecomment-523298275 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs
AmplabJenkins removed a comment on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs URL: https://github.com/apache/spark/pull/25465#issuecomment-523298275 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs
AmplabJenkins commented on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs URL: https://github.com/apache/spark/pull/25465#issuecomment-523298279 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109450/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs
AmplabJenkins removed a comment on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs URL: https://github.com/apache/spark/pull/25465#issuecomment-523298279 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109450/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs
SparkQA removed a comment on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs URL: https://github.com/apache/spark/pull/25465#issuecomment-523253999 **[Test build #109450 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109450/testReport)** for PR 25465 at commit [`84e9966`](https://github.com/apache/spark/commit/84e9966a67d1b6bc6ecd9024d57f6256879de9b7).
[GitHub] [spark] SparkQA commented on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs
SparkQA commented on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs
URL: https://github.com/apache/spark/pull/25465#issuecomment-523297905

**[Test build #109450 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109450/testReport)** for PR 25465 at commit [`84e9966`](https://github.com/apache/spark/commit/84e9966a67d1b6bc6ecd9024d57f6256879de9b7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] gengliangwang commented on issue #25453: [SPARK-28730][SQL] Configurable type coercion policy for table insertion
gengliangwang commented on issue #25453: [SPARK-28730][SQL] Configurable type coercion policy for table insertion
URL: https://github.com/apache/spark/pull/25453#issuecomment-523297016

> Why is that? We know that v2 already introduces breaking behavior changes and we can't avoid them. We were previously okay with different behavior between v1 and v2, so I see no reason to support the legacy type coercion in the v2 path.

As per the discussion on the dev list, I think most of us agree that we should make the table insertion behavior configurable. So I assume you are asking for two table insertion flags, one for V1 and one for V2 data sources.
Currently, some data sources have both V1 and V2 implementations, and some have a V1 implementation only. In the future, there may be data sources with a V2 implementation only. For Spark users, I think it makes more sense to choose the table insertion policy with one flag.

I asked for your opinions one week ago in comment https://github.com/apache/spark/pull/25453#issuecomment-521232172 before I started the actual code changes. I hope we can move forward on this PR. The ANSI mode will be added right after this.
Making the legacy policy the default is safest for now. We can discuss the default policy after the ANSI mode is added.
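The case for a single flag can be pictured as one policy value that every insertion path consults, regardless of which data source API version is in use. The sketch below is an illustration only: `StoreAssignmentSketch`, `Policy`, `canInsert`, and the toy widening table are hypothetical and far simpler than Spark's actual cast rules. Only the policy names LEGACY and STRICT come from the diff under review.

```scala
object StoreAssignmentSketch {
  // Hypothetical model of the policy flag; names mirror the diff's LEGACY and STRICT.
  sealed trait Policy
  case object Legacy extends Policy
  case object Strict extends Policy

  // Toy assumption: these are the only widenings treated as lossless here.
  private val safeWidening: Set[(String, String)] =
    Set("int" -> "long", "float" -> "double")

  // One decision function shared by all write paths (V1 or V2 alike).
  def canInsert(policy: Policy, from: String, to: String): Boolean = policy match {
    // Assumption: in this toy model, legacy mode accepts any cast to the column type.
    case Legacy => true
    // Strict mode accepts identical types or a lossless widening only.
    case Strict => from == to || safeWidening(from -> to)
  }
}
```

Under this model, `canInsert(Legacy, "string", "int")` is accepted while `canInsert(Strict, "long", "int")` is rejected, and a user changes the behavior everywhere by switching one value.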
[GitHub] [spark] AmplabJenkins removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
AmplabJenkins removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-523293535 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109449/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
AmplabJenkins removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-523293532 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
AmplabJenkins commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-523293532 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
AmplabJenkins commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-523293535 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109449/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
SparkQA removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-523252601 **[Test build #109449 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109449/testReport)** for PR 25348 at commit [`27598ce`](https://github.com/apache/spark/commit/27598ce9b5ef7bc8224e37df6f14907e766ddd54). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
SparkQA commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-523293178 **[Test build #109449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109449/testReport)** for PR 25348 at commit [`27598ce`](https://github.com/apache/spark/commit/27598ce9b5ef7bc8224e37df6f14907e766ddd54). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog
AmplabJenkins removed a comment on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog URL: https://github.com/apache/spark/pull/25521#issuecomment-523292277 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14520/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true
AmplabJenkins removed a comment on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true URL: https://github.com/apache/spark/pull/25520#issuecomment-523292239 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true
AmplabJenkins removed a comment on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true URL: https://github.com/apache/spark/pull/25520#issuecomment-523292243 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14521/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog
AmplabJenkins removed a comment on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog URL: https://github.com/apache/spark/pull/25521#issuecomment-523292273 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true
SparkQA commented on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true URL: https://github.com/apache/spark/pull/25520#issuecomment-523292677 **[Test build #109463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109463/testReport)** for PR 25520 at commit [`1cc5f4d`](https://github.com/apache/spark/commit/1cc5f4d6eedeed84c5a51507f4ad307c3dcde2e6). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog
SparkQA commented on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog URL: https://github.com/apache/spark/pull/25521#issuecomment-523292685 **[Test build #109462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109462/testReport)** for PR 25521 at commit [`37fe544`](https://github.com/apache/spark/commit/37fe54401c8ad5b1d62dddb6ac7515a38071b6b6). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true
AmplabJenkins commented on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true URL: https://github.com/apache/spark/pull/25520#issuecomment-523292239 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true
AmplabJenkins commented on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true URL: https://github.com/apache/spark/pull/25520#issuecomment-523292243 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14521/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog
AmplabJenkins commented on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog URL: https://github.com/apache/spark/pull/25521#issuecomment-523292273 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog
AmplabJenkins commented on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog URL: https://github.com/apache/spark/pull/25521#issuecomment-523292277 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14520/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog
cloud-fan commented on issue #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog URL: https://github.com/apache/spark/pull/25521#issuecomment-523292130 cc @imback82 @brkyvz @rdblue This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan opened a new pull request #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog
cloud-fan opened a new pull request #25521: [SPARK-28635][SQL][followup] CatalogManager should reflect the changes of default catalog URL: https://github.com/apache/spark/pull/25521 ### What changes were proposed in this pull request? Fix a bug in CatalogManager so that it reflects changes to the default catalog config when reporting the current catalog. ### Why are the changes needed? The current namespace/catalog should be set to None at the beginning, so that we can read the new configs when reporting the current namespace/catalog later. ### Does this PR introduce any user-facing change? No. The current namespace/catalog stuff is still internal right now. ### How was this patch tested? A new test suite. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
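The idea in this PR description (start with "not set" and read the config only when the current catalog is reported) can be sketched in plain Java. This is an illustration only, not Spark's `CatalogManager`: the class names `SessionConf` and `CatalogTracker`, the key `default.catalog`, and the fallback name `session` are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a mutable session configuration.
final class SessionConf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) { values.put(key, value); }

    Optional<String> get(String key) { return Optional.ofNullable(values.get(key)); }
}

// Hypothetical tracker: the current catalog starts unset (the "None" in the
// PR description) instead of capturing the config value at construction time.
final class CatalogTracker {
    static final String DEFAULT_CATALOG_KEY = "default.catalog"; // assumed key name
    static final String BUILT_IN = "session";                    // assumed fallback name

    private final SessionConf conf;
    private Optional<String> currentCatalog = Optional.empty();

    CatalogTracker(SessionConf conf) { this.conf = conf; }

    // While nothing was set explicitly, consult the config on every call,
    // so a later change to the default catalog is visible here.
    String currentCatalog() {
        return currentCatalog.orElseGet(
            () -> conf.get(DEFAULT_CATALOG_KEY).orElse(BUILT_IN));
    }

    // An explicit choice takes precedence over the config from then on.
    void setCurrentCatalog(String name) { currentCatalog = Optional.of(name); }
}
```

Had the constructor copied the config value into the field instead, a config change made after construction would never be reported, which is the kind of bug the description refers to.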
[GitHub] [spark] WeichenXu123 commented on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true
WeichenXu123 commented on issue #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true URL: https://github.com/apache/spark/pull/25520#issuecomment-523291925 @cloud-fan @gatorsmile This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on issue #25359: [WIP][SPARK-28621][SQL] Fix CheckCartesianProducts mismatch actual physical plan
WeichenXu123 commented on issue #25359: [WIP][SPARK-28621][SQL] Fix CheckCartesianProducts mismatch actual physical plan URL: https://github.com/apache/spark/pull/25359#issuecomment-523291228 See new PR https://github.com/apache/spark/pull/25520 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 closed pull request #25359: [WIP][SPARK-28621][SQL] Fix CheckCartesianProducts mismatch actual physical plan
WeichenXu123 closed pull request #25359: [WIP][SPARK-28621][SQL] Fix CheckCartesianProducts mismatch actual physical plan URL: https://github.com/apache/spark/pull/25359 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 opened a new pull request #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true
WeichenXu123 opened a new pull request #25520: [SPARK-28621][SQL] Make spark.sql.crossJoin.enabled default value true URL: https://github.com/apache/spark/pull/25520 ### What changes were proposed in this pull request? Make the default value of `spark.sql.crossJoin.enabled` true. ### Why are the changes needed? For an implicit cross join, we can set up a watchdog to cancel it if it runs for a long time. When `spark.sql.crossJoin.enabled` is false, `CheckCartesianProducts` runs at the logical plan stage, so it may report errors that do not match the actual physical plan, which may confuse end users. So we'd better make the default value of `spark.sql.crossJoin.enabled` true. ### Does this PR introduce any user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
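The watchdog mentioned in this description is not spelled out in the PR. As a generic illustration only (not Spark code, and not necessarily what the author has in mind), a watchdog that cancels a long-running task could look like the sketch below. `Watchdog.runWithTimeout` and its parameters are invented for this example.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a task and cancels it if it exceeds a time budget.
final class Watchdog {
    // Returns the task's result, or an empty Optional if the task was cancelled.
    static <T> Optional<T> runWithTimeout(Callable<T> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<T> future = pool.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            return Optional.empty();
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Cancellation here relies on thread interruption, so it only stops tasks that respond to interrupts; a real query engine would need its own job-cancellation hook.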
[GitHub] [spark] AmplabJenkins removed a comment on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier()
AmplabJenkins removed a comment on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier() URL: https://github.com/apache/spark/pull/25519#issuecomment-523289486 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier()
AmplabJenkins removed a comment on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier() URL: https://github.com/apache/spark/pull/25519#issuecomment-523289491 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14519/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
HyukjinKwon commented on a change in pull request #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#discussion_r315993792 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ## @@ -1917,19 +1921,33 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSparkSession { null ).toDF("i") +// transform(i, x -> x + 1) +val resA = Seq( + Row(Seq(2, 10, 9, 8)), + Row(Seq(6, 9, 10, 8, 3)), + Row(Seq.empty), + Row(null)) + +// transform(i, (x, i) -> x + i) +val resB = Seq( + Row(Seq(1, 10, 10, 10)), + Row(Seq(5, 9, 11, 10, 6)), + Row(Seq.empty), + Row(null)) + def testArrayOfPrimitiveTypeNotContainsNull(): Unit = { - checkAnswer(df.selectExpr("transform(i, x -> x + 1)"), -Seq( - Row(Seq(2, 10, 9, 8)), - Row(Seq(6, 9, 10, 8, 3)), - Row(Seq.empty), - Row(null))) - checkAnswer(df.selectExpr("transform(i, (x, i) -> x + i)"), -Seq( - Row(Seq(1, 10, 10, 10)), - Row(Seq(5, 9, 11, 10, 6)), - Row(Seq.empty), - Row(null))) + checkAnswer(df.selectExpr("transform(i, x -> x + 1)"), resA) + checkAnswer(df.selectExpr("transform(i, (x, i) -> x + i)"), resB) + + checkAnswer(df.select(transform(col("i"), x => x + 1)), resA) + checkAnswer(df.select(transform(col("i"), (x, i) => x + i)), resB) + + checkAnswer(df.select(transform(col("i"), new JFunc { +def call(x: Column) = x + 1 + })), resA) + checkAnswer(df.select(transform(col("i"), new JFunc2 { +def call(x: Column, i: Column) = x + i + })), resB) Review comment: > As for these, the arguments are scala.Function1 and org.apache.spark.api.java.function.Function, Java compiler recognizes both as SAM (Single Abstract Method) type when resolving x -> x, and they are ambiguous. Sounds like a valid concern to me. Does such `x -> x` syntax work in Java too? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
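The ambiguity concern quoted in this review comment can be reproduced without Spark. The sketch below uses two hypothetical interfaces, `ScalaLikeFunction1` and `JavaLikeFunction`, as stand-ins for `scala.Function1` and `org.apache.spark.api.java.function.Function`; it is not the Spark API, only the same overload shape.

```java
// Hypothetical stand-in for scala.Function1 as seen from Java.
@FunctionalInterface
interface ScalaLikeFunction1<A, R> {
    R apply(A a);
}

// Hypothetical stand-in for org.apache.spark.api.java.function.Function.
@FunctionalInterface
interface JavaLikeFunction<A, R> {
    R call(A a) throws Exception;
}

final class Transforms {
    // Two overloads that differ only in the functional interface they accept.
    static String transform(String in, ScalaLikeFunction1<String, String> f) {
        return "scala:" + f.apply(in);
    }

    static String transform(String in, JavaLikeFunction<String, String> f) {
        try {
            return "java:" + f.call(in);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static String viaCast(String in) {
        // transform(in, x -> x + "!");
        // The line above does not compile: javac reports that the reference to
        // transform is ambiguous, because an implicitly typed lambda fits both
        // single-abstract-method types equally well.

        // A cast names the intended interface and selects one overload.
        return transform(in, (JavaLikeFunction<String, String>) x -> x + "!");
    }
}
```

Java callers of such overloads need either a cast or an explicit anonymous class, which is why testing with lambdas from a Java suite surfaces the problem.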
[GitHub] [spark] SparkQA commented on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier()
SparkQA commented on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier() URL: https://github.com/apache/spark/pull/25519#issuecomment-523289865 **[Test build #109461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109461/testReport)** for PR 25519 at commit [`2650742`](https://github.com/apache/spark/commit/26507422214046289a6c0ef92426ac62d1ae1b96). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] wangyum commented on a change in pull request #24715: [SPARK-25474][SQL] Data source tables support fallback to HDFS for size estimation
wangyum commented on a change in pull request #24715: [SPARK-25474][SQL] Data source tables support fallback to HDFS for size estimation URL: https://github.com/apache/spark/pull/24715#discussion_r315993569 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -220,10 +220,20 @@ case class DataSourceAnalysis(conf: SQLConf) extends Rule[LogicalPlan] with Cast * data source. */ class FindDataSourceTable(sparkSession: SparkSession) extends Rule[LogicalPlan] { - private def readDataSourceTable(table: CatalogTable): LogicalPlan = { -val qualifiedTableName = QualifiedTableName(table.database, table.identifier.table) + private def maybeWithTableStats(tableMeta: CatalogTable): CatalogTable = { +if (tableMeta.stats.isEmpty && sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled) { + val sizeInBytes = CommandUtils.getSizeInBytesFallBackToHdfs(sparkSession, tableMeta) + tableMeta.copy(stats = Some(CatalogStatistics(sizeInBytes = BigInt(sizeInBytes)))) +} else { + tableMeta +} + } + + private def readDataSourceTable(tableMeta: CatalogTable): LogicalPlan = { +val qualifiedTableName = QualifiedTableName(tableMeta.database, tableMeta.identifier.table) val catalog = sparkSession.sessionState.catalog catalog.getCachedPlan(qualifiedTableName, () => { + val table = maybeWithTableStats(tableMeta) Review comment: Verify that it always uses the cached relation if one is cached. 
Checkout this [commit](https://github.com/apache/spark/pull/24715/commits/1fce50859b959cd4190f0da5dabf4addd187fb79): ```shell git checkout 1fce50859b959cd4190f0da5dabf4addd187fb79 build/sbt clean package export SPARK_PREPEND_CLASSES=true ./bin/spark-shell ``` ```scala import org.apache.spark.sql.catalyst.QualifiedTableName import org.apache.spark.sql.execution.datasources.LogicalRelation spark.sql("CREATE TABLE t3(id int, c2 int) USING parquet PARTITIONED BY(id)") spark.sql("INSERT INTO TABLE t3 PARTITION(id=1) SELECT 2") // default, fallBackToHdfs=false spark.sql("EXPLAIN COST SELECT * FROM t3").show(false) spark.sessionState.catalog.getCachedTable(QualifiedTableName(spark.sessionState.catalog.getCurrentDatabase, "t3")).asInstanceOf[LogicalRelation].catalogTable.get.stats // enable fallBackToHdfs spark.sql("set spark.sql.statistics.fallBackToHdfs=true") spark.sql("EXPLAIN COST SELECT * FROM t3").show(false) spark.sessionState.catalog.getCachedTable(QualifiedTableName(spark.sessionState.catalog.getCurrentDatabase, "t3")).asInstanceOf[LogicalRelation].catalogTable.get.stats // Invalidate cached relations spark.sessionState.catalog.invalidateAllCachedTables spark.sql("EXPLAIN COST SELECT * FROM t3").show(false) ``` ```scala scala> import org.apache.spark.sql.catalyst.QualifiedTableName import org.apache.spark.sql.catalyst.QualifiedTableName scala> import org.apache.spark.sql.execution.datasources.LogicalRelation import org.apache.spark.sql.execution.datasources.LogicalRelation scala> scala> spark.sql("CREATE TABLE t3(id int, c2 int) USING parquet PARTITIONED BY(id)") res0: org.apache.spark.sql.DataFrame = [] scala> spark.sql("INSERT INTO TABLE t3 PARTITION(id=1) SELECT 2") res1: org.apache.spark.sql.DataFrame = [] scala> scala> // default, fallBackToHdfs=false scala> spark.sql("EXPLAIN COST SELECT * FROM t3").show(false) ++ |plan | ++ |== Optimized Logical Plan == Relation[c2#1,id#2] parquet, Statistics(sizeInBytes=8.0 EiB) == Physical Plan == *(1) ColumnarToRow +- 
FileScan parquet default.t3[c2#1,id#2] Batched: true, DataFilters: [], Format: Parquet, Location:
[GitHub] [spark] AmplabJenkins commented on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier()
AmplabJenkins commented on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier() URL: https://github.com/apache/spark/pull/25519#issuecomment-523289486 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier()
AmplabJenkins commented on issue #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier() URL: https://github.com/apache/spark/pull/25519#issuecomment-523289491 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14519/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
HyukjinKwon commented on a change in pull request #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#discussion_r315993279 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ## @@ -1917,19 +1921,33 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSparkSession { null ).toDF("i") +// transform(i, x -> x + 1) +val resA = Seq( + Row(Seq(2, 10, 9, 8)), + Row(Seq(6, 9, 10, 8, 3)), + Row(Seq.empty), + Row(null)) + +// transform(i, (x, i) -> x + i) +val resB = Seq( + Row(Seq(1, 10, 10, 10)), + Row(Seq(5, 9, 11, 10, 6)), + Row(Seq.empty), + Row(null)) + def testArrayOfPrimitiveTypeNotContainsNull(): Unit = { - checkAnswer(df.selectExpr("transform(i, x -> x + 1)"), -Seq( - Row(Seq(2, 10, 9, 8)), - Row(Seq(6, 9, 10, 8, 3)), - Row(Seq.empty), - Row(null))) - checkAnswer(df.selectExpr("transform(i, (x, i) -> x + i)"), -Seq( - Row(Seq(1, 10, 10, 10)), - Row(Seq(5, 9, 11, 10, 6)), - Row(Seq.empty), - Row(null))) + checkAnswer(df.selectExpr("transform(i, x -> x + 1)"), resA) + checkAnswer(df.selectExpr("transform(i, (x, i) -> x + i)"), resB) + + checkAnswer(df.select(transform(col("i"), x => x + 1)), resA) + checkAnswer(df.select(transform(col("i"), (x, i) => x + i)), resB) + + checkAnswer(df.select(transform(col("i"), new JFunc { +def call(x: Column) = x + 1 + })), resA) + checkAnswer(df.select(transform(col("i"), new JFunc2 { +def call(x: Column, i: Column) = x + i + })), resB) Review comment: @nvander1, let's move the Java-specific tests to the Java test suites. Ideally they should live there, and it's better to clarify such doubts. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 opened a new pull request #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier()
WeichenXu123 opened a new pull request #25519: [SPARK-28483][Core][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier() URL: https://github.com/apache/spark/pull/25519 ### What changes were proposed in this pull request? ### Why are the changes needed? Dealing with interrupted exception in BarrierTaskContext.barrier() ### Does this PR introduce any user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ueshin commented on a change in pull request #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
ueshin commented on a change in pull request #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#discussion_r315992363 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ## @@ -1917,19 +1921,33 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSparkSession { null ).toDF("i") +// transform(i, x -> x + 1) +val resA = Seq( + Row(Seq(2, 10, 9, 8)), + Row(Seq(6, 9, 10, 8, 3)), + Row(Seq.empty), + Row(null)) + +// transform(i, (x, i) -> x + i) +val resB = Seq( + Row(Seq(1, 10, 10, 10)), + Row(Seq(5, 9, 11, 10, 6)), + Row(Seq.empty), + Row(null)) + def testArrayOfPrimitiveTypeNotContainsNull(): Unit = { - checkAnswer(df.selectExpr("transform(i, x -> x + 1)"), -Seq( - Row(Seq(2, 10, 9, 8)), - Row(Seq(6, 9, 10, 8, 3)), - Row(Seq.empty), - Row(null))) - checkAnswer(df.selectExpr("transform(i, (x, i) -> x + i)"), -Seq( - Row(Seq(1, 10, 10, 10)), - Row(Seq(5, 9, 11, 10, 6)), - Row(Seq.empty), - Row(null))) + checkAnswer(df.selectExpr("transform(i, x -> x + 1)"), resA) + checkAnswer(df.selectExpr("transform(i, (x, i) -> x + i)"), resB) + + checkAnswer(df.select(transform(col("i"), x => x + 1)), resA) + checkAnswer(df.select(transform(col("i"), (x, i) => x + i)), resB) + + checkAnswer(df.select(transform(col("i"), new JFunc { +def call(x: Column) = x + 1 + })), resA) + checkAnswer(df.select(transform(col("i"), new JFunc2 { +def call(x: Column, i: Column) = x + i + })), resB) Review comment: The Java API tests you added to `DataFrameFunctionsSuite` are not useful. Please move them to `JavaDataFrameSuite` or create `JavaDataFrameFunctionsSuite` with lambda expressions as users usually do like above and see what happens. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
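The expected values `resA` and `resB` in the quoted test differ only in whether the lambda also receives the element index. The following is a minimal plain-Java sketch of that semantics, not Spark code. The input row `[1, 9, 8, 7]` is inferred from the expected results (the DataFrame definition is outside the quoted hunk), and `TransformModel`, `transform` and `transformIndexed` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Plain-Java model of the two lambda forms in the quoted test; no Spark involved.
final class TransformModel {

    // Models transform(arr, x -> f(x)); a null array stays null, as in Row(null).
    static List<Integer> transform(List<Integer> arr, Function<Integer, Integer> f) {
        if (arr == null) {
            return null;
        }
        List<Integer> out = new ArrayList<>();
        for (Integer x : arr) {
            out.add(f.apply(x));
        }
        return out;
    }

    // Models transform(arr, (x, i) -> f(x, i)), where i is the zero-based index.
    static List<Integer> transformIndexed(List<Integer> arr,
                                          BiFunction<Integer, Integer, Integer> f) {
        if (arr == null) {
            return null;
        }
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < arr.size(); i++) {
            out.add(f.apply(arr.get(i), i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed input row, inferred from resA/resB rather than taken from the diff.
        List<Integer> row = Arrays.asList(1, 9, 8, 7);
        System.out.println(transform(row, x -> x + 1));              // [2, 10, 9, 8]
        System.out.println(transformIndexed(row, (x, i) -> x + i));  // [1, 10, 10, 10]
    }
}
```

Under the same assumption, the second row `[5, 8, 9, 7, 2]` yields `[6, 9, 10, 8, 3]` and `[5, 9, 11, 10, 6]`, matching the second entries of `resA` and `resB`.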
[GitHub] [spark] SparkQA removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page
SparkQA removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523286026 **[Test build #109459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109459/testReport)** for PR 25430 at commit [`172f4e6`](https://github.com/apache/spark/commit/172f4e6cbe97e436346206312f761be01eae9f36).
[GitHub] [spark] AmplabJenkins commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page
AmplabJenkins commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523287956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109459/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page
AmplabJenkins removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523287956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109459/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page
AmplabJenkins removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523287949 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page
AmplabJenkins commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523287949 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page
SparkQA commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523287891 **[Test build #109459 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109459/testReport)** for PR 25430 at commit [`172f4e6`](https://github.com/apache/spark/commit/172f4e6cbe97e436346206312f761be01eae9f36). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML
AmplabJenkins removed a comment on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML URL: https://github.com/apache/spark/pull/25383#issuecomment-523287225 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML
AmplabJenkins removed a comment on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML URL: https://github.com/apache/spark/pull/25383#issuecomment-523287226 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109454/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML
AmplabJenkins commented on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML URL: https://github.com/apache/spark/pull/25383#issuecomment-523287225 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
SparkQA commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523287243 **[Test build #109460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109460/testReport)** for PR 25512 at commit [`6ca1568`](https://github.com/apache/spark/commit/6ca15681d1041046e6b21436514e9c7c11682db7).
[GitHub] [spark] AmplabJenkins commented on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML
AmplabJenkins commented on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML URL: https://github.com/apache/spark/pull/25383#issuecomment-523287226 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109454/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML
SparkQA removed a comment on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML URL: https://github.com/apache/spark/pull/25383#issuecomment-523275035 **[Test build #109454 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109454/testReport)** for PR 25383 at commit [`4eb97be`](https://github.com/apache/spark/commit/4eb97be97cc4e509e74181dbab428548e6da28e3).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
AmplabJenkins removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523286932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14518/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML
SparkQA commented on issue #25383: [SPARK-13677][ML] Implement Tree-Based Feature Transformation for ML URL: https://github.com/apache/spark/pull/25383#issuecomment-523287073 **[Test build #109454 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109454/testReport)** for PR 25383 at commit [`4eb97be`](https://github.com/apache/spark/commit/4eb97be97cc4e509e74181dbab428548e6da28e3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
AmplabJenkins removed a comment on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523286929 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
AmplabJenkins commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523286929 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
AmplabJenkins commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523286932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14518/ Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
URL: https://github.com/apache/spark/pull/25368#discussion_r315991219

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalog/v2/CatalogManager.scala ##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalog.v2
+
+import scala.collection.mutable
+import scala.util.control.NonFatal
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+
+/**
+ * A thread-safe manager for [[CatalogPlugin]]s. It tracks all the registered catalogs, and allow
+ * the caller to look up a catalog by name.
+ */
+class CatalogManager(conf: SQLConf) extends Logging {
+
+  private val catalogs = mutable.HashMap.empty[String, CatalogPlugin]
+
+  def catalog(name: String): CatalogPlugin = synchronized {
+    catalogs.getOrElseUpdate(name, Catalogs.load(name, conf))
+  }
+
+  def defaultCatalog: Option[CatalogPlugin] = {
+    conf.defaultV2Catalog.flatMap { catalogName =>
+      try {
+        Some(catalog(catalogName))
+      } catch {
+        case NonFatal(e) =>
+          logError(s"Cannot load default v2 catalog: $catalogName", e)
+          None
+      }
+    }
+  }
+
+  def v2SessionCatalog: Option[CatalogPlugin] = {
+    try {
+      Some(catalog(CatalogManager.SESSION_CATALOG_NAME))
+    } catch {
+      case NonFatal(e) =>
+        logError("Cannot load v2 session catalog", e)
+        None
+    }
+  }
+
+  private def getDefaultNamespace(c: CatalogPlugin) = c match {
+    case c: SupportsNamespaces => c.defaultNamespace()
+    case _ => Array.empty[String]
+  }
+
+  private var _currentNamespace = {
+    // The builtin catalog use "default" as the default database.
+    defaultCatalog.map(getDefaultNamespace).getOrElse(Array("default"))

Review comment:
Good catch! The `currentNamespace` and `currentCatalog` are not used anywhere and I was planning to add tests when I implement switching current namespace/catalog. Let me add some simple tests to reflect runtime config changes first.
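The quoted `CatalogManager` combines two ideas: a lock-protected load-on-first-use cache (`catalog`) and lookups that turn a load failure into an empty result (`defaultCatalog`, `v2SessionCatalog`). The following is a minimal plain-Java sketch of that pattern, not the PR's implementation. `CatalogCacheSketch`, `tryCatalog` and the `loader` function are hypothetical; a `String` stands in for `CatalogPlugin`, the loader stands in for `Catalogs.load(name, conf)`, and catching `RuntimeException` is only a rough analogue of Scala's `NonFatal`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-in for the quoted CatalogManager; a String plays the role of CatalogPlugin.
final class CatalogCacheSketch {
    private final Map<String, String> catalogs = new HashMap<>();
    private final Function<String, String> loader; // assumed stand-in for Catalogs.load(name, conf)

    CatalogCacheSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    // Like catalog(name): load on first use while holding the lock, then serve from the map.
    synchronized String catalog(String name) {
        String cached = catalogs.get(name);
        if (cached == null) {
            cached = loader.apply(name); // a throwing loader leaves the map unchanged
            catalogs.put(name, cached);
        }
        return cached;
    }

    // Like defaultCatalog: report the failure and return empty instead of propagating it.
    Optional<String> tryCatalog(String name) {
        try {
            return Optional.of(catalog(name));
        } catch (RuntimeException e) { // rough analogue of NonFatal
            System.err.println("Cannot load catalog: " + name);
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        int[] loads = {0};
        CatalogCacheSketch manager = new CatalogCacheSketch(name -> {
            if (name.equals("broken")) {
                throw new IllegalStateException("no such plugin");
            }
            loads[0]++;
            return "plugin:" + name;
        });
        System.out.println(manager.catalog("testcat"));                // plugin:testcat
        System.out.println(manager.catalog("testcat"));                // plugin:testcat (cached)
        System.out.println(loads[0]);                                  // 1
        System.out.println(manager.tryCatalog("broken").isPresent());  // false
    }
}
```

In this sketch, anything computed once from a lookup at construction time would not see a later change to the underlying configuration, which is the kind of behavior the tests mentioned in the review comment would exercise.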
[GitHub] [spark] AmplabJenkins removed a comment on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator
AmplabJenkins removed a comment on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator URL: https://github.com/apache/spark/pull/25515#issuecomment-523286206 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109448/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator
AmplabJenkins removed a comment on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator URL: https://github.com/apache/spark/pull/25515#issuecomment-523286205 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator
AmplabJenkins commented on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator URL: https://github.com/apache/spark/pull/25515#issuecomment-523286205 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator
AmplabJenkins commented on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator URL: https://github.com/apache/spark/pull/25515#issuecomment-523286206 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109448/ Test PASSed.
[GitHub] [spark] WeichenXu123 commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions
WeichenXu123 commented on issue #25512: [WIP][SPARK-28782][SQL] Support explode function on aggregate expressions URL: https://github.com/apache/spark/pull/25512#issuecomment-523286194 Jenkins retest this please.
[GitHub] [spark] SparkQA commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page
SparkQA commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523286026 **[Test build #109459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109459/testReport)** for PR 25430 at commit [`172f4e6`](https://github.com/apache/spark/commit/172f4e6cbe97e436346206312f761be01eae9f36).
[GitHub] [spark] SparkQA removed a comment on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator
SparkQA removed a comment on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator URL: https://github.com/apache/spark/pull/25515#issuecomment-523252585 **[Test build #109448 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109448/testReport)** for PR 25515 at commit [`e0327a2`](https://github.com/apache/spark/commit/e0327a24e48c4ba7a483ef2590c9dd72c6bedfc5).
[GitHub] [spark] SparkQA commented on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator
SparkQA commented on issue #25515: [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator URL: https://github.com/apache/spark/pull/25515#issuecomment-523285855 **[Test build #109448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109448/testReport)** for PR 25515 at commit [`e0327a2`](https://github.com/apache/spark/commit/e0327a24e48c4ba7a483ef2590c9dd72c6bedfc5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page
AmplabJenkins removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523285672 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14517/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page
AmplabJenkins removed a comment on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523285669 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users
AmplabJenkins commented on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users URL: https://github.com/apache/spark/pull/25516#issuecomment-523285595 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109441/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page
AmplabJenkins commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523285672 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14517/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page
AmplabJenkins commented on issue #25430: [SPARK-28540][WEBUI] Document Environment page URL: https://github.com/apache/spark/pull/25430#issuecomment-523285669 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users
AmplabJenkins removed a comment on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users URL: https://github.com/apache/spark/pull/25516#issuecomment-523285588 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users
AmplabJenkins removed a comment on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users URL: https://github.com/apache/spark/pull/25516#issuecomment-523285595 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109441/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users
AmplabJenkins commented on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users URL: https://github.com/apache/spark/pull/25516#issuecomment-523285588 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users
SparkQA removed a comment on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users URL: https://github.com/apache/spark/pull/25516#issuecomment-523224772 **[Test build #109441 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109441/testReport)** for PR 25516 at commit [`df9eace`](https://github.com/apache/spark/commit/df9eace2c6a6d96783a083050ad58e055ad94886).
[GitHub] [spark] SparkQA commented on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users
SparkQA commented on issue #25516: [SPARK-26895][CORE][branch-2.4] prepareSubmitEnvironment should be called within doAs for proxy users URL: https://github.com/apache/spark/pull/25516#issuecomment-523285289 **[Test build #109441 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109441/testReport)** for PR 25516 at commit [`df9eace`](https://github.com/apache/spark/commit/df9eace2c6a6d96783a083050ad58e055ad94886).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
SparkQA commented on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#issuecomment-523284745 **[Test build #109458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109458/testReport)** for PR 24232 at commit [`182a08b`](https://github.com/apache/spark/commit/182a08ba872c99bc062d8bcd87f0b930f9bae96c).
[GitHub] [spark] keypointt commented on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs
keypointt commented on issue #25465: [SPARK-28747][SQL] merge the two data source v2 fallback configs URL: https://github.com/apache/spark/pull/25465#issuecomment-523284861
> > I'm seeing a lot of renaming from USE_V1_SOURCE_READER_LIST to USE_V1_SOURCE_LIST in test classes
>
> This is the proposed change: merge the confs and have a new name.

I see. Sorry, I missed it. Thank you, Wenchen.
[GitHub] [spark] nvander1 commented on a change in pull request #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
nvander1 commented on a change in pull request #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#discussion_r315989115
## File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ##
@@ -1917,19 +1921,33 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSparkSession {
       null
     ).toDF("i")
 
+    // transform(i, x -> x + 1)
+    val resA = Seq(
+      Row(Seq(2, 10, 9, 8)),
+      Row(Seq(6, 9, 10, 8, 3)),
+      Row(Seq.empty),
+      Row(null))
+
+    // transform(i, (x, i) -> x + i)
+    val resB = Seq(
+      Row(Seq(1, 10, 10, 10)),
+      Row(Seq(5, 9, 11, 10, 6)),
+      Row(Seq.empty),
+      Row(null))
+
     def testArrayOfPrimitiveTypeNotContainsNull(): Unit = {
-      checkAnswer(df.selectExpr("transform(i, x -> x + 1)"),
-        Seq(
-          Row(Seq(2, 10, 9, 8)),
-          Row(Seq(6, 9, 10, 8, 3)),
-          Row(Seq.empty),
-          Row(null)))
-      checkAnswer(df.selectExpr("transform(i, (x, i) -> x + i)"),
-        Seq(
-          Row(Seq(1, 10, 10, 10)),
-          Row(Seq(5, 9, 11, 10, 6)),
-          Row(Seq.empty),
-          Row(null)))
+      checkAnswer(df.selectExpr("transform(i, x -> x + 1)"), resA)
+      checkAnswer(df.selectExpr("transform(i, (x, i) -> x + i)"), resB)
+
+      checkAnswer(df.select(transform(col("i"), x => x + 1)), resA)
+      checkAnswer(df.select(transform(col("i"), (x, i) => x + i)), resB)
+
+      checkAnswer(df.select(transform(col("i"), new JFunc {
+        def call(x: Column) = x + 1
+      })), resA)
+      checkAnswer(df.select(transform(col("i"), new JFunc2 {
+        def call(x: Column, i: Column) = x + i
+      })), resB)
Review comment: Finished adding tests for the Java overloads to the DataFrameFunctionsSuite. If it turns out we need to move them after investigating your error, @ueshin, then I can do that too.
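The expected rows `resA` and `resB` in the hunk above follow from applying each lambda element-wise. What follows is only a plain-Scala model of that behavior on ordinary collections, not Spark code. The helper names `transformEach` and `transformWithIndex` are hypothetical, and the input arrays are an assumption: the hunk does not show them, so they are inferred here by subtracting 1 from each element of `resA`.

```scala
// Plain-Scala model of the semantics the tests exercise; this is NOT the Spark API.
// transformEach / transformWithIndex are hypothetical helper names.
object TransformSemantics {
  // Models transform(i, x -> x + 1): apply f to every element.
  def transformEach(xs: Seq[Int])(f: Int => Int): Seq[Int] = xs.map(f)

  // Models transform(i, (x, i) -> x + i): f also receives the 0-based index.
  def transformWithIndex(xs: Seq[Int])(f: (Int, Int) => Int): Seq[Int] =
    xs.zipWithIndex.map { case (x, idx) => f(x, idx) }

  // ASSUMPTION: inputs inferred from resA (each element minus 1).
  // None stands in for the null row, which stays null in both expected results.
  val inputs: Seq[Option[Seq[Int]]] = Seq(
    Some(Seq(1, 9, 8, 7)),
    Some(Seq(5, 8, 9, 7, 2)),
    Some(Seq.empty[Int]),
    None)

  def main(args: Array[String]): Unit = {
    // A missing (null) array is passed through untouched by Option.map.
    val a = inputs.map(_.map(xs => transformEach(xs)(_ + 1)))
    val b = inputs.map(_.map(xs => transformWithIndex(xs)(_ + _)))
    a.foreach(println)
    b.foreach(println)
  }
}
```

Under that assumption the index-aware variant reproduces `resB` as well, which is a quick consistency check on the inferred inputs.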
[GitHub] [spark] AmplabJenkins commented on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
AmplabJenkins commented on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#issuecomment-523284365 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
AmplabJenkins removed a comment on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#issuecomment-523284365 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
AmplabJenkins removed a comment on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#issuecomment-523284371 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14516/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
AmplabJenkins commented on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#issuecomment-523284371 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14516/ Test PASSed.
[GitHub] [spark] nvander1 edited a comment on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API
nvander1 edited a comment on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API URL: https://github.com/apache/spark/pull/24232#issuecomment-520112443 I'm also adding tests for the overloads that accept Java Functional Interfaces.
- ~transform~
- ~map_filter~
- ~filter~
- ~exists~
- ~forall~
- ~aggregate~
- ~map_zip_with~
- ~transform_keys~
- ~transform_values~
- ~zip_with~
[GitHub] [spark] ajithme commented on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore
ajithme commented on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore URL: https://github.com/apache/spark/pull/25448#issuecomment-523283547
> According to the discussions in this PR, seems there are many other databases allowing the identifiers to start with `_`. Spark can support it as well without hive support. Shall we add the check only in `HiveExternalCatalog.createTable`?

@cloud-fan @HyukjinKwon Thanks for the inputs. Wouldn't Spark then behave differently depending on whether Hive support is enabled or disabled? (Hive cannot support identifiers starting with `_` because of limitations in FileFormat, which also affect the classes that extend it.) I think identifier support inside Spark should be uniform. Please correct me if I am wrong. Otherwise, I am OK with adding the check only in `HiveExternalCatalog.createTable`, so that we get a clearer error message, and with marking this as a limitation when Hive support is enabled.
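The narrower option discussed in this comment, rejecting such names only on the Hive path, might look roughly like the check below. It is a plain-Scala sketch rather than Spark code: `HiveNameCheck`, `validateForHive`, and the error text are hypothetical, and the only rule encoded is the one stated in the comment (a leading `_`).

```scala
// Hypothetical sketch; not part of Spark or HiveExternalCatalog.
object HiveNameCheck {
  // Returns Left(error message) when the name starts with '_',
  // and Right(name) unchanged otherwise.
  def validateForHive(kind: String, name: String): Either[String, String] =
    if (name.startsWith("_"))
      Left(s"$kind name '$name' starts with '_', which is not supported when Hive support is enabled")
    else
      Right(name)
}
```

Returning an `Either` keeps the sketch side-effect free; where such a check would actually live, and how the error would be raised, is left open in the thread.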
[GitHub] [spark] AmplabJenkins removed a comment on issue #25518: [SPARK-26046][SS] Add StreamingQueryManager.removeAllListeners()
AmplabJenkins removed a comment on issue #25518: [SPARK-26046][SS] Add StreamingQueryManager.removeAllListeners() URL: https://github.com/apache/spark/pull/25518#issuecomment-523282738 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109447/ Test PASSed.