[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14034 Merged build finished. Test FAILed.
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14034 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62073/ Test FAILed.
[GitHub] spark issue #14131: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14131 **[Test build #62076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62076/consoleFull)** for PR 14131 at commit [`4d6f654`](https://github.com/apache/spark/commit/4d6f6544be4373a32150fd6d59ba539d3fcb6aab).
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14034 **[Test build #62073 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62073/consoleFull)** for PR 14034 at commit [`2e6f8d8`](https://github.com/apache/spark/commit/2e6f8d8c8b5007302415b7fd984a38fc51be44bf). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #14114: [SPARK-16458][SQL] SessionCatalog should support ...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14114#discussion_r70204413 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -246,9 +246,27 @@ class SessionCatalog( def getTableMetadata(name: TableIdentifier): CatalogTable = { --- End diff -- Yep.
[GitHub] spark issue #14131: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/14131 cc @cloud-fan
[GitHub] spark pull request #14114: [SPARK-16458][SQL] SessionCatalog should support ...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14114#discussion_r70204393 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -425,10 +443,11 @@ class SessionCatalog( def tableExists(name: TableIdentifier): Boolean = synchronized { --- End diff -- Sure.
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/13991 Here it is for branch-2.0. https://github.com/apache/spark/pull/14131
[GitHub] spark pull request #14131: [SPARK-16318][SQL] Implement all remaining xpath ...
GitHub user petermaxlee opened a pull request: https://github.com/apache/spark/pull/14131 [SPARK-16318][SQL] Implement all remaining xpath functions (branch-2.0) ## What changes were proposed in this pull request? This patch implements all remaining xpath functions that Hive supports and that are not natively supported in Spark: xpath_int, xpath_short, xpath_long, xpath_float, xpath_double, xpath_string, and xpath. This is based on https://github.com/apache/spark/pull/13991 but for branch-2.0. ## How was this patch tested? Added unit tests and end-to-end tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/petermaxlee/spark xpath-branch-2.0 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14131.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14131 commit 4d6f6544be4373a32150fd6d59ba539d3fcb6aab Author: petermaxlee Date: 2016-07-11T05:28:34Z [SPARK-16318][SQL] Implement all remaining xpath functions This patch implements all remaining xpath functions that Hive supports and that are not natively supported in Spark: xpath_int, xpath_short, xpath_long, xpath_float, xpath_double, xpath_string, and xpath. Added unit tests and end-to-end tests. Author: petermaxlee Closes #13991 from petermaxlee/SPARK-16318.
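For readers unfamiliar with these functions, here is a hedged sketch of how they behave at the SQL level. The XML literals and expected results are illustrative assumptions based on the Hive-compatible semantics the PR targets, not excerpts from its tests; a `SparkSession` named `spark` is assumed.

```scala
// Illustrative sketch only; assumes a SparkSession named `spark` on a build containing this patch.
val xml = "<a><b>1</b><b>2</b></a>"

// xpath returns every matching node as an array of strings.
spark.sql(s"SELECT xpath('$xml', 'a/b/text()')").show()       // expected: [1, 2]

// The typed variants coerce the XPath result to the named type.
spark.sql(s"SELECT xpath_int('$xml', 'sum(a/b)')").show()     // expected: 3
spark.sql(s"SELECT xpath_double('$xml', 'sum(a/b)')").show()  // expected: 3.0
spark.sql(s"SELECT xpath_string('$xml', 'a/b[2]')").show()    // expected: 2
```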
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/14112 @hhbyyh I think offline testing should be OK for now, since we don't have a unified save/load compatibility test framework yet. It would be good to get this feature into the next RC.
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70204050 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) --- End diff -- `CallMethodUsingReflect`?
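As background for the naming discussion, a hedged sketch of how this expression is exercised from SQL. Only the `randomUUID` call comes from the `@ExpressionDescription` above; the other calls and their expected results are illustrative assumptions.

```scala
// Illustrative sketch only; assumes a SparkSession named `spark` with the expression registered as `reflect`.
// reflect(class, method, args...) invokes the named (typically static) Java method and
// returns the result converted to a string.
spark.sql("SELECT reflect('java.util.UUID', 'randomUUID')").show()    // some random UUID string
spark.sql("SELECT reflect('java.lang.Math', 'max', 2, 3)").show()     // expected: 3
spark.sql("SELECT reflect('java.lang.String', 'valueOf', 1)").show()  // expected: 1
```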
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/13991 @cloud-fan thanks for merging! @yhuai I think the degree to which we want to add more tests also depends on how much we trust the library we are using. XPath (with Query) is almost as complicated as SQL itself.
[GitHub] spark pull request #13991: [SPARK-16318][SQL] Implement all remaining xpath ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13991
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13991 Thanks, merging to master! This doesn't merge cleanly to 2.0, @petermaxlee can you submit a new PR against 2.0? thanks!
[GitHub] spark issue #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13969 You can also remove half of the test cases.
[GitHub] spark issue #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13969 One thing ... it might be better to remove the ability to call non-static methods. At least to me it'd make things slightly simpler and clearer.
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13991 OK. Thanks. Then, it will be good to add more tests for cases that are not covered by those Hive tests.
[GitHub] spark issue #11317: [SPARK-12639] [SQL] Mark Filters Fully Handled By Source...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/11317 test this please
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/13991 BTW I have to say Hive's test coverage in this area is very spotty, so I don't actually think it's great to follow, but I used those.
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/13991 Actually I created the unit tests based on those.
[GitHub] spark pull request #14114: [SPARK-16458][SQL] SessionCatalog should support ...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14114#discussion_r70203557 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -246,9 +246,27 @@ class SessionCatalog( def getTableMetadata(name: TableIdentifier): CatalogTable = { --- End diff -- same thing here - update SessionCatalogSuite.
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13991 As a follow-up task, can you take a look at the following query files and add useful tests in your test? Thanks.
```
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/describe_xpath.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/input_testxpath.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/input_testxpath2.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/input_testxpath3.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/input_testxpath4.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_boolean.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_double.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_float.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_int.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_long.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_short.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_string.q
```
[GitHub] spark pull request #14114: [SPARK-16458][SQL] SessionCatalog should support ...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14114#discussion_r70203553 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -425,10 +443,11 @@ class SessionCatalog( def tableExists(name: TableIdentifier): Boolean = synchronized { --- End diff -- can you update SessionCatalogSuite to reflect this behavior? I think we weren't checking temp tables in the past.
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70203510 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) + extends Expression with CodegenFallback { + + override def prettyName: String = "reflect" + + override def checkInputDataTypes(): TypeCheckResult = { +if (children.size < 2) { + TypeCheckFailure("requires at least two arguments") +} else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) { + // The first two arguments must be string type. + TypeCheckFailure("first two arguments should be string literals") +} else if (!classExists) { + TypeCheckFailure(s"class $className not found") +} else if (method == null) { + TypeCheckFailure(s"cannot find a method that matches the argument types in $className") +} else { + TypeCheckSuccess +} + } + + override def deterministic: Boolean = false + override def nullable: Boolean = true + override val dataType: DataType = StringType + + override def eval(input: InternalRow): Any = { +var i = 0 +while (i < argExprs.length) { + buffer(i) = argExprs(i).eval(input).asInstanceOf[Object] + // Convert if necessary. Based on the types defined in typeMapping, string is the only + // type that needs conversion. 
If we support timestamps, dates, decimals, arrays, or maps + // in the future, proper conversion needs to happen here too. + if (buffer(i).isInstanceOf[UTF8String]) { +buffer(i) = buffer(i).toString + } + i += 1 +} +UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*))) + } + + @transient private lazy val argExprs: Array[Expression] = children.drop(2).toArray + + /** Name of the class -- this has to be called after we verify children has at least two exprs. */ + @transient private lazy val className = children(0).eval().asInstanceOf[UTF8String].toString + + /** True if the class exists and can be loaded. */ + @transient private lazy val classExists = Reflect.classExists(className) + + /** The reflection method. */ + @transient lazy val method: Method = { +val methodName = children(1).eval(null).asInstanceOf[UTF8String].toString +Reflect.findMethod(className, methodName, argExprs.map(_.dataType)).orNull
[GitHub] spark issue #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13969 what's hive's behaviour if calling a non-static method but the class doesn't have a no-arg constructor? null or exception?
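For reference, plain Java reflection fails before the method can even be invoked in that situation, so the question is whether Hive surfaces that failure as NULL or as an error. A hedged sketch with a hypothetical class (not code from the PR):

```scala
// Hypothetical illustration, not code from the PR.
class NoDefaultCtor(x: Int) {
  def describe(): String = s"x = $x"
}

val clazz = classOf[NoDefaultCtor]
try {
  // Fails with NoSuchMethodException: there is no no-arg constructor to build the receiver with.
  val receiver = clazz.getDeclaredConstructor().newInstance()
  clazz.getMethod("describe").invoke(receiver)
} catch {
  case e: ReflectiveOperationException =>
    println(s"reflection failed before the method could be called: $e")
}
```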
[GitHub] spark issue #12414: [SPARK-14657] [ML] RFormula w/o intercept should output ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12414 **[Test build #62074 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62074/consoleFull)** for PR 12414 at commit [`167beae`](https://github.com/apache/spark/commit/167beae592d084ead74d2361dcc3a19d0d53d60b). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14120: [SPARK-16199][SQL] Add a method to list the referenced c...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14120 This can be updated once #14130 is merged.
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70203430 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala --- @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql + +import org.apache.spark.sql.test.SharedSQLContext + +class MiscFunctionsSuite extends QueryTest with SharedSQLContext { + import testImplicits._ + + test("reflect and java_method") { +val df = Seq((1, "one")).toDF("a", "b") +checkAnswer( + df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 'method1', a, b)"), --- End diff -- oh i see, `reflect('org.apache.spark.sql.ReflectClass', 'method1', a, b)` is not equal to `ReflectClass.method1`, but calling the static method defined in the `ReflectClass` class.
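The mechanism behind this (see the follow-up comments later in this digest) is that scalac emits static forwarders: methods of a companion object live on `ReflectClass$.MODULE$`, but matching static methods are also generated on the plain class, which is what `java.lang.reflect` finds. A hedged sketch with a hypothetical top-level object (default package assumed, not code from the PR):

```scala
// Hypothetical illustration of Scala's static forwarders, not code from the PR.
object ReflectDemo {
  def method1(a: Int, b: String): String = s"$a-$b"
}

// scalac compiles the object body into ReflectDemo$.MODULE$, and (when no competing class
// of the same name exists) also emits a static forwarder ReflectDemo.method1(int, String),
// so plain Java reflection on the class name succeeds without an instance.
val m = Class.forName("ReflectDemo").getMethod("method1", classOf[Int], classOf[String])
println(m.invoke(null, Int.box(1), "one"))  // prints: 1-one
```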
[GitHub] spark issue #12414: [SPARK-14657] [ML] RFormula w/o intercept should output ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12414 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62074/ Test FAILed.
[GitHub] spark issue #12414: [SPARK-14657] [ML] RFormula w/o intercept should output ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12414 Merged build finished. Test FAILed.
[GitHub] spark issue #14130: [SPARK-16477] Bump master version to 2.1.0-SNAPSHOT
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14130 **[Test build #62075 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62075/consoleFull)** for PR 14130 at commit [`2915bf1`](https://github.com/apache/spark/commit/2915bf1e79a28a1df0fbc895068fcf8bee2095b0).
[GitHub] spark pull request #14130: [SPARK-16477] Bump master version to 2.1.0-SNAPSH...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/14130 [SPARK-16477] Bump master version to 2.1.0-SNAPSHOT ## What changes were proposed in this pull request? After SPARK-16476, we can finally bump the version number. ## How was this patch tested? N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark SPARK-16477 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14130.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14130 commit 2915bf1e79a28a1df0fbc895068fcf8bee2095b0 Author: Reynold Xin Date: 2016-07-11T05:10:43Z [SPARK-16477] Bump master version to 2.1.0-SNAPSHOT
[GitHub] spark issue #14130: [SPARK-16477] Bump master version to 2.1.0-SNAPSHOT
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14130 cc @srowen
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70202991 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala --- @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql + +import org.apache.spark.sql.test.SharedSQLContext + +class MiscFunctionsSuite extends QueryTest with SharedSQLContext { + import testImplicits._ + + test("reflect and java_method") { +val df = Seq((1, "one")).toDF("a", "b") +checkAnswer( + df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 'method1', a, b)"), --- End diff -- You should decompile the right class (don't decompile the one with a dollar sign). Static methods are generated too.
[GitHub] spark issue #12414: [SPARK-14657] [ML] RFormula w/o intercept should output ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12414 **[Test build #62074 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62074/consoleFull)** for PR 12414 at commit [`167beae`](https://github.com/apache/spark/commit/167beae592d084ead74d2361dcc3a19d0d53d60b).
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70202906 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala --- @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql + +import org.apache.spark.sql.test.SharedSQLContext + +class MiscFunctionsSuite extends QueryTest with SharedSQLContext { + import testImplicits._ + + test("reflect and java_method") { +val df = Seq((1, "one")).toDF("a", "b") +checkAnswer( + df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 'method1', a, b)"), --- End diff -- no, a method defined in a companion object is not a static method, but a normal method defined in a singleton class. You can decompile the class file to check it.
[GitHub] spark issue #13704: [SPARK-15985][SQL] Reduce runtime overhead of a program ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13704 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62071/ Test PASSed.
[GitHub] spark issue #13704: [SPARK-15985][SQL] Reduce runtime overhead of a program ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13704 Merged build finished. Test PASSed.
[GitHub] spark issue #13704: [SPARK-15985][SQL] Reduce runtime overhead of a program ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13704 **[Test build #62071 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62071/consoleFull)** for PR 13704 at commit [`66800fa`](https://github.com/apache/spark/commit/66800faaebf72e492ee7693d81f8dba980f1dab2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...
Github user NarineK commented on a diff in the pull request: https://github.com/apache/spark/pull/14090#discussion_r70202736 --- Diff: docs/sparkr.md --- @@ -306,6 +306,64 @@ head(ldf, 3) {% endhighlight %} + Run a given function on a large dataset grouping by input column(s) and using `gapply` or `gapplyCollect` + +# gapply +Apply a function to each group of a `SparkDataFrame`. The function is to be applied to each group of the `SparkDataFrame` and should have only two parameters: grouping key and R `data.frame` corresponding to +that key. The groups are chosen from `SparkDataFrame`s column(s). +The output of function should be a `data.frame`. Schema specifies the row format of the resulting +`SparkDataFrame`. It must match the R function's output. --- End diff -- Thanks, I was looking at the types.R file and noticed that we have NA's for array, map and struct. https://github.com/apache/spark/blob/master/R/pkg/R/types.R#L42 But I guess in our case we can have array, map and struct mapped to array, map and struct correspondingly?
[GitHub] spark pull request #14128: [SPARK-16476] Restructure MimaExcludes for easier...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14128
[GitHub] spark issue #14128: [SPARK-16476] Restructure MimaExcludes for easier union ...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14128 I'm going to merge this since this is a simple formatting change. I will submit a patch that updates the pom files and show what this can do.
[GitHub] spark issue #14128: [SPARK-16476] Restructure MimaExcludes for easier union ...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14128 Merging in master/2.0.
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70202613 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) + extends Expression with CodegenFallback { + + override def prettyName: String = "reflect" + + override def checkInputDataTypes(): TypeCheckResult = { +if (children.size < 2) { + TypeCheckFailure("requires at least two arguments") +} else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) { + // The first two arguments must be string type. + TypeCheckFailure("first two arguments should be string literals") +} else if (!classExists) { + TypeCheckFailure(s"class $className not found") +} else if (method == null) { + TypeCheckFailure(s"cannot find a method that matches the argument types in $className") +} else { + TypeCheckSuccess +} + } + + override def deterministic: Boolean = false + override def nullable: Boolean = true + override val dataType: DataType = StringType + + override def eval(input: InternalRow): Any = { +var i = 0 +while (i < argExprs.length) { + buffer(i) = argExprs(i).eval(input).asInstanceOf[Object] + // Convert if necessary. Based on the types defined in typeMapping, string is the only + // type that needs conversion. 
If we support timestamps, dates, decimals, arrays, or maps + // in the future, proper conversion needs to happen here too. + if (buffer(i).isInstanceOf[UTF8String]) { +buffer(i) = buffer(i).toString + } + i += 1 +} +UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*))) + } + + @transient private lazy val argExprs: Array[Expression] = children.drop(2).toArray + + /** Name of the class -- this has to be called after we verify children has at least two exprs. */ + @transient private lazy val className = children(0).eval().asInstanceOf[UTF8String].toString + + /** True if the class exists and can be loaded. */ + @transient private lazy val classExists = Reflect.classExists(className) + + /** The reflection method. */ + @transient lazy val method: Method = { +val methodName = children(1).eval(null).asInstanceOf[UTF8String].toString +Reflect.findMethod(className, methodName,
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13991 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62070/ Test PASSed.
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13991 Merged build finished. Test PASSed.
[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14090#discussion_r70202560 --- Diff: docs/sparkr.md --- @@ -306,6 +306,64 @@ head(ldf, 3) {% endhighlight %} + Run a given function on a large dataset grouping by input column(s) and using `gapply` or `gapplyCollect` + +# gapply +Apply a function to each group of a `SparkDataFrame`. The function is to be applied to each group of the `SparkDataFrame` and should have only two parameters: grouping key and R `data.frame` corresponding to +that key. The groups are chosen from `SparkDataFrame`s column(s). +The output of function should be a `data.frame`. Schema specifies the row format of the resulting +`SparkDataFrame`. It must match the R function's output. --- End diff -- This looks good to me!
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13991 **[Test build #62070 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62070/consoleFull)** for PR 13991 at commit [`0c60d87`](https://github.com/apache/spark/commit/0c60d87c0dd1b7e78fd77c2f01b67a2ae8a0151e). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class ParseUrl(children: Seq[Expression])`
[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...
Github user NarineK commented on a diff in the pull request: https://github.com/apache/spark/pull/14090#discussion_r70202321 --- Diff: docs/sparkr.md --- @@ -306,6 +306,64 @@ head(ldf, 3) {% endhighlight %} + Run a given function on a large dataset grouping by input column(s) and using `gapply` or `gapplyCollect` + +# gapply +Apply a function to each group of a `SparkDataFrame`. The function is to be applied to each group of the `SparkDataFrame` and should have only two parameters: grouping key and R `data.frame` corresponding to +that key. The groups are chosen from `SparkDataFrame`s column(s). +The output of function should be a `data.frame`. Schema specifies the row format of the resulting +`SparkDataFrame`. It must match the R function's output. --- End diff -- Thanks @shivaram. Does the following mapping look fine to have in the table?
```
**R         Spark**
byte        byte
integer     integer
float       float
double      double
numeric     double
character   string
string      string
binary      binary
raw         binary
logical     boolean
timestamp   timestamp
date        date
array       array
map         map
struct      struct
```
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14034 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62069/ Test PASSed.
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14034 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14034 **[Test build #62069 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62069/consoleFull)** for PR 14034 at commit [`dec5ad9`](https://github.com/apache/spark/commit/dec5ad95bdd003fe58e92d1245388fa4758d8f49). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14090#discussion_r70202064 --- Diff: docs/sparkr.md --- @@ -306,6 +306,64 @@ head(ldf, 3) {% endhighlight %} + Run a given function on a large dataset grouping by input column(s) and using `gapply` or `gapplyCollect` + +# gapply +Apply a function to each group of a `SparkDataFrame`. The function is to be applied to each group of the `SparkDataFrame` and should have only two parameters: grouping key and R `data.frame` corresponding to +that key. The groups are chosen from `SparkDataFrame`s column(s). +The output of function should be a `data.frame`. Schema specifies the row format of the resulting +`SparkDataFrame`. It must match the R function's output. --- End diff -- Yeah but instead of a pointer to the code it would be great if we could have a table in the documentation. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13704: [SPARK-15985][SQL] Reduce runtime overhead of a p...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r70202042 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2018,6 +2018,8 @@ class Analyzer( fail(child, DateType, walkedTypePath) case (StringType, to: NumericType) => fail(child, to, walkedTypePath) + case (from: ArrayType, to: ArrayType) if !from.containsNull => --- End diff -- I mean MapType. It's similar to ArrayType: its values can be nullable or non-nullable. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
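For context, the nullability flags being discussed live on the type objects themselves; a minimal sketch with illustrative element types:
```scala
import org.apache.spark.sql.types._

// ArrayType tracks whether its elements may be null via containsNull;
// MapType tracks the same for its values via valueContainsNull.
val nonNullElems   = ArrayType(IntegerType, containsNull = false)
val nullableElems  = ArrayType(IntegerType, containsNull = true)

val nonNullValues  = MapType(StringType, IntegerType, valueContainsNull = false)
val nullableValues = MapType(StringType, IntegerType, valueContainsNull = true)

// A cast from nonNullElems to nullableElems (or nonNullValues to nullableValues)
// only loosens the nullability flag, which is the case under discussion here.
```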
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14034 Thank you! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/13991 I guess whatever generates that message is buggy? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14034 LGTM, pending jenkins --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70201874 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala --- @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql + +import org.apache.spark.sql.test.SharedSQLContext + +class MiscFunctionsSuite extends QueryTest with SharedSQLContext { + import testImplicits._ + + test("reflect and java_method") { +val df = Seq((1, "one")).toDF("a", "b") +checkAnswer( + df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 'method1', a, b)"), --- End diff -- I don't get what you mean. Scala does have static methods -- methods that are defined in a companion object is static. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
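To make the static-method point concrete, here is a minimal sketch of the kind of setup being discussed. The object, method, and column names are hypothetical, and it assumes the `reflect` SQL function from this PR is available; the relevant detail is that a method on a top-level Scala object (including a companion object) gets a static forwarder on the corresponding class, so Java reflection sees it as a static method:
```scala
import org.apache.spark.sql.SparkSession

// Hypothetical target: method1 gets a static forwarder on class ReflectTarget,
// so java.lang.reflect can resolve it as a static method.
object ReflectTarget {
  def method1(i: Int, s: String): String = s"$i-$s"
}

object ReflectExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("reflect-sketch").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "one")).toDF("a", "b")
    // Invokes ReflectTarget.method1(a, b) per row via the reflect SQL function.
    df.selectExpr("reflect('ReflectTarget', 'method1', a, b)").show()

    spark.stop()
  }
}
```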
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70201875 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) + extends Expression with CodegenFallback { + + override def prettyName: String = "reflect" + + override def checkInputDataTypes(): TypeCheckResult = { +if (children.size < 2) { + TypeCheckFailure("requires at least two arguments") +} else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) { + // The first two arguments must be string type. + TypeCheckFailure("first two arguments should be string literals") +} else if (!classExists) { + TypeCheckFailure(s"class $className not found") +} else if (method == null) { + TypeCheckFailure(s"cannot find a method that matches the argument types in $className") +} else { + TypeCheckSuccess +} + } + + override def deterministic: Boolean = false + override def nullable: Boolean = true + override val dataType: DataType = StringType + + override def eval(input: InternalRow): Any = { +var i = 0 +while (i < argExprs.length) { + buffer(i) = argExprs(i).eval(input).asInstanceOf[Object] + // Convert if necessary. Based on the types defined in typeMapping, string is the only + // type that needs conversion. 
If we support timestamps, dates, decimals, arrays, or maps + // in the future, proper conversion needs to happen here too. + if (buffer(i).isInstanceOf[UTF8String]) { +buffer(i) = buffer(i).toString + } + i += 1 +} +UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*))) + } + + @transient private lazy val argExprs: Array[Expression] = children.drop(2).toArray + + /** Name of the class -- this has to be called after we verify children has at least two exprs. */ + @transient private lazy val className = children(0).eval().asInstanceOf[UTF8String].toString + + /** True if the class exists and can be loaded. */ + @transient private lazy val classExists = Reflect.classExists(className) + + /** The reflection method. */ + @transient lazy val method: Method = { +val methodName = children(1).eval(null).asInstanceOf[UTF8String].toString +Reflect.findMethod(className, methodName, argExprs.map(_.dataType)).orNull
[GitHub] spark issue #14123: [SPARK-16471] [SQL] Remove Hive-specific CreateHiveTable...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14123 @cloud-fan Yeah! Will be in [WIP] until https://github.com/apache/spark/pull/14071 is merged. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70201841 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) --- End diff -- So what's a good name? I am not attached to Reflect, but I think Reflect should be in the name, if the function is called reflect. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14123: [SPARK-16471] [SQL] Remove Hive-specific CreateHiveTable...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14123 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70201786 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) + extends Expression with CodegenFallback { + + override def prettyName: String = "reflect" + + override def checkInputDataTypes(): TypeCheckResult = { +if (children.size < 2) { + TypeCheckFailure("requires at least two arguments") +} else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) { + // The first two arguments must be string type. + TypeCheckFailure("first two arguments should be string literals") +} else if (!classExists) { + TypeCheckFailure(s"class $className not found") +} else if (method == null) { + TypeCheckFailure(s"cannot find a method that matches the argument types in $className") +} else { + TypeCheckSuccess +} + } + + override def deterministic: Boolean = false + override def nullable: Boolean = true + override val dataType: DataType = StringType + + override def eval(input: InternalRow): Any = { +var i = 0 +while (i < argExprs.length) { + buffer(i) = argExprs(i).eval(input).asInstanceOf[Object] + // Convert if necessary. Based on the types defined in typeMapping, string is the only + // type that needs conversion. 
If we support timestamps, dates, decimals, arrays, or maps + // in the future, proper conversion needs to happen here too. + if (buffer(i).isInstanceOf[UTF8String]) { +buffer(i) = buffer(i).toString + } + i += 1 +} +UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*))) + } + + @transient private lazy val argExprs: Array[Expression] = children.drop(2).toArray + + /** Name of the class -- this has to be called after we verify children has at least two exprs. */ + @transient private lazy val className = children(0).eval().asInstanceOf[UTF8String].toString + + /** True if the class exists and can be loaded. */ + @transient private lazy val classExists = Reflect.classExists(className) + + /** The reflection method. */ + @transient lazy val method: Method = { +val methodName = children(1).eval(null).asInstanceOf[UTF8String].toString +Reflect.findMethod(className, methodName,
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13991 The latest result is `This patch does not merge cleanly.`, so I just want to double-check it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14123: [SPARK-16471] [SQL] Remove Hive-specific CreateHiveTable...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14123 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62068/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70201691 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala --- @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql + +import org.apache.spark.sql.test.SharedSQLContext + +class MiscFunctionsSuite extends QueryTest with SharedSQLContext { + import testImplicits._ + + test("reflect and java_method") { +val df = Seq((1, "one")).toDF("a", "b") +checkAnswer( + df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 'method1', a, b)"), --- End diff -- We should also test it in `JavaDataFrameSuite`, there is no real static method in scala. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14123: [SPARK-16471] [SQL] Remove Hive-specific CreateHiveTable...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14123 **[Test build #62068 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62068/consoleFull)** for PR 14123 at commit [`082040f`](https://github.com/apache/spark/commit/082040f64130795593d551647f4d451a0b6a9a7e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70201642 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) + extends Expression with CodegenFallback { + + override def prettyName: String = "reflect" + + override def checkInputDataTypes(): TypeCheckResult = { +if (children.size < 2) { + TypeCheckFailure("requires at least two arguments") +} else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) { + // The first two arguments must be string type. + TypeCheckFailure("first two arguments should be string literals") +} else if (!classExists) { + TypeCheckFailure(s"class $className not found") +} else if (method == null) { + TypeCheckFailure(s"cannot find a method that matches the argument types in $className") +} else { + TypeCheckSuccess +} + } + + override def deterministic: Boolean = false + override def nullable: Boolean = true + override val dataType: DataType = StringType + + override def eval(input: InternalRow): Any = { +var i = 0 +while (i < argExprs.length) { + buffer(i) = argExprs(i).eval(input).asInstanceOf[Object] + // Convert if necessary. Based on the types defined in typeMapping, string is the only + // type that needs conversion. 
If we support timestamps, dates, decimals, arrays, or maps + // in the future, proper conversion needs to happen here too. + if (buffer(i).isInstanceOf[UTF8String]) { +buffer(i) = buffer(i).toString + } + i += 1 +} +UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*))) + } + + @transient private lazy val argExprs: Array[Expression] = children.drop(2).toArray + + /** Name of the class -- this has to be called after we verify children has at least two exprs. */ + @transient private lazy val className = children(0).eval().asInstanceOf[UTF8String].toString + + /** True if the class exists and can be loaded. */ + @transient private lazy val classExists = Reflect.classExists(className) + + /** The reflection method. */ + @transient lazy val method: Method = { +val methodName = children(1).eval(null).asInstanceOf[UTF8String].toString +Reflect.findMethod(className, methodName, argExprs.map(_.dataType)).orNull
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70201559 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) + extends Expression with CodegenFallback { + + override def prettyName: String = "reflect" + + override def checkInputDataTypes(): TypeCheckResult = { +if (children.size < 2) { + TypeCheckFailure("requires at least two arguments") +} else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) { + // The first two arguments must be string type. + TypeCheckFailure("first two arguments should be string literals") +} else if (!classExists) { + TypeCheckFailure(s"class $className not found") +} else if (method == null) { + TypeCheckFailure(s"cannot find a method that matches the argument types in $className") +} else { + TypeCheckSuccess +} + } + + override def deterministic: Boolean = false + override def nullable: Boolean = true + override val dataType: DataType = StringType + + override def eval(input: InternalRow): Any = { +var i = 0 +while (i < argExprs.length) { + buffer(i) = argExprs(i).eval(input).asInstanceOf[Object] + // Convert if necessary. Based on the types defined in typeMapping, string is the only + // type that needs conversion. 
If we support timestamps, dates, decimals, arrays, or maps + // in the future, proper conversion needs to happen here too. + if (buffer(i).isInstanceOf[UTF8String]) { +buffer(i) = buffer(i).toString + } + i += 1 +} +UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*))) + } + + @transient private lazy val argExprs: Array[Expression] = children.drop(2).toArray + + /** Name of the class -- this has to be called after we verify children has at least two exprs. */ + @transient private lazy val className = children(0).eval().asInstanceOf[UTF8String].toString + + /** True if the class exists and can be loaded. */ + @transient private lazy val classExists = Reflect.classExists(className) + + /** The reflection method. */ + @transient lazy val method: Method = { +val methodName = children(1).eval(null).asInstanceOf[UTF8String].toString +Reflect.findMethod(className, methodName, argExprs.map(_.dataType)).orNull
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70201417 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) --- End diff -- Ya. It's my fault. Sorry for that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/13991 @cloud-fan Jenkins already ran twice successfully before. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70200185 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) --- End diff -- It is also annoying if we search for reflect (based on the name) and then doesn't find an expression with reflect in the name. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70200163 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) --- End diff -- I actually named it JavaMethodReflect before but @dongjoon-hyun asked to use Reflect. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14081 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62072/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14081 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14034 **[Test build #62073 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62073/consoleFull)** for PR 14034 at commit [`2e6f8d8`](https://github.com/apache/spark/commit/2e6f8d8c8b5007302415b7fd984a38fc51be44bf). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14081 **[Test build #62072 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62072/consoleFull)** for PR 14081 at commit [`81611a8`](https://github.com/apache/spark/commit/81611a860031064d482f2d3b2b67f5f4ed0648dd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13704: [SPARK-15985][SQL] Reduce runtime overhead of a p...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r70199583 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2018,6 +2018,8 @@ class Analyzer( fail(child, DateType, walkedTypePath) case (StringType, to: NumericType) => fail(child, to, walkedTypePath) + case (from: ArrayType, to: ArrayType) if !from.containsNull => --- End diff -- I will try improving the `SimplifyCasts` rule to eliminate the cast from an array with non-nullable elements to one with nullable elements. I do not understand the following: will it be handled automatically by improving `SimplifyCasts`, or do we need to improve another rule? > "we can handle map too" I will add unit tests for it. I think it would be good to add a benchmark to show the degree of improvement. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
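For what it is worth, a rough sketch of the kind of pattern such an optimizer extension could add, written against the Catalyst APIs of that era. The rule and object names here are made up and the actual PR may take a different approach; the point is only to illustrate eliminating casts that merely relax nullability:
```scala
import org.apache.spark.sql.catalyst.expressions.Cast
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.types.{ArrayType, MapType}

// Sketch: drop casts that only relax nullability, e.g. array<int NOT NULL> -> array<int>.
// Such a cast cannot fail and returns the same data, so the child can be used directly.
object SimplifyNullabilityOnlyCasts extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan transformAllExpressions {
    case Cast(child, ArrayType(toElem, true))
        if child.dataType == ArrayType(toElem, containsNull = false) =>
      child
    case Cast(child, MapType(kt, vt, true))
        if child.dataType == MapType(kt, vt, valueContainsNull = false) =>
      child
  }
}
```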
[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/14081 I realized that the "PipelineExample"s are included in the docs, while the "SimpleTextClassificationExample"s are not, so it might be better to keep those instead. I just changed the data and regularization value to those of "SimpleTextClassificationExample", which give correct predictions (it looks like these examples were updated at one time by DB to fix this, but the change was not put into the doc example). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14114: [SPARK-16458][SQL] SessionCatalog should support `listCo...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14114 Now, it's back for review again. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70198925 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\nc33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) + extends Expression with CodegenFallback { + + override def prettyName: String = "reflect" + + override def checkInputDataTypes(): TypeCheckResult = { +if (children.size < 2) { + TypeCheckFailure("requires at least two arguments") +} else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) { + // The first two arguments must be string type. + TypeCheckFailure("first two arguments should be string literals") +} else if (!classExists) { + TypeCheckFailure(s"class $className not found") +} else if (method == null) { + TypeCheckFailure(s"cannot find a method that matches the argument types in $className") +} else { + TypeCheckSuccess +} + } + + override def deterministic: Boolean = false + override def nullable: Boolean = true + override val dataType: DataType = StringType + + override def eval(input: InternalRow): Any = { +var i = 0 +while (i < argExprs.length) { --- End diff -- `while` is preferred here. The `eval` method is critical path and `for` loop in scala in slow. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. 
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
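As a side note on the while-versus-for point: a Scala `for` over an array or range desugars into closure-based `foreach` calls, while a `while` loop compiles to a plain JVM loop with no closure allocation, which is why hot paths such as `eval` prefer it. A tiny illustrative sketch, not taken from the PR:
```scala
// Closure-based iteration: each step goes through the function passed to foreach.
def sumFor(xs: Array[Int]): Long = {
  var total = 0L
  for (x <- xs) total += x   // desugars to xs.foreach(x => total += x)
  total
}

// Plain while loop: a simple counter and array access, no closure involved.
def sumWhile(xs: Array[Int]): Long = {
  var total = 0L
  var i = 0
  while (i < xs.length) {
    total += xs(i)
    i += 1
  }
  total
}
```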
[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14081 **[Test build #62072 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62072/consoleFull)** for PR 14081 at commit [`81611a8`](https://github.com/apache/spark/commit/81611a860031064d482f2d3b2b67f5f4ed0648dd). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13969#discussion_r70198848 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.Method + +import scala.util.Try + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class Reflect(children: Seq[Expression]) --- End diff -- `Reflect` is really ambiguous, how about `CallMethod`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14114: [SPARK-16458][SQL] SessionCatalog should support `listCo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14114 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62067/ Test PASSed.
[GitHub] spark issue #14114: [SPARK-16458][SQL] SessionCatalog should support `listCo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14114 Merged build finished. Test PASSed.
[GitHub] spark issue #14114: [SPARK-16458][SQL] SessionCatalog should support `listCo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14114 **[Test build #62067 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62067/consoleFull)** for PR 14114 at commit [`af6692f`](https://github.com/apache/spark/commit/af6692fafda6429d87eff7decb8ec0fdabd036fd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14034 uh... Thanks! Let me do it now.
[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...
Github user NarineK commented on a diff in the pull request: https://github.com/apache/spark/pull/14090#discussion_r70198331

--- Diff: docs/sparkr.md ---
```
@@ -306,6 +306,64 @@ head(ldf, 3)
 {% endhighlight %}
 
+ Run a given function on a large dataset, grouping by the input column(s) and using `gapply` or `gapplyCollect`
+
+# gapply
+Apply a function to each group of a `SparkDataFrame`. The function is applied to each group of the
+`SparkDataFrame` and should have only two parameters: the grouping key and an R `data.frame` corresponding to
+that key. The groups are chosen from the `SparkDataFrame`'s column(s).
+The output of the function should be a `data.frame`. The schema specifies the row format of the resulting
+`SparkDataFrame`. It must match the R function's output.
```
--- End diff --

Or we could probably also refer to this? https://github.com/apache/spark/blob/master/R/pkg/R/types.R#L21
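Not part of the SparkR page under review, but for readers who know the Scala Dataset API, a hedged sketch of the analogous pattern may make the gapply contract clearer: a per-group user function receives the grouping key and that group's rows, and its output determines the result schema. All names below are illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative stand-ins for the grouping key and the declared output schema.
case class Sale(region: String, amount: Double)
case class RegionTotal(region: String, total: Double)

val spark = SparkSession.builder().master("local[*]").appName("gapply-analogy").getOrCreate()
import spark.implicits._

val sales = Seq(Sale("east", 1.0), Sale("east", 2.5), Sale("west", 4.0)).toDS()

val totals = sales
  .groupByKey(_.region)                        // grouping key, as in gapply
  .mapGroups { (key, rows) =>                  // key plus all rows of that group
    RegionTotal(key, rows.map(_.amount).sum)   // output rows define the result schema
  }
totals.show()
```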
[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 @viirya Thanks for your comment! Actually, that is what I want to get @marmbrus's feedback on. It seems forcing the schema to all-nullable already happens when you read/write data via the `read`/`write` API (but not for structured streaming and another API for JSON). So the point of this PR is to make all of them consistent. The reason for making them consistent by forcing the schema to nullable is what he said on the mailing list:

> Sure, but a traditional RDBMS has the opportunity to do validation before
> loading data in. Thats not really an option when you are reading random
> files from S3. This is why Hive and many other systems in this space treat
> all columns as nullable.

Actually, Parquet also reads and writes the schema with nullability preserved correctly if we get rid of `asNullable` (I tested this before), but it seems that is prevented for (I assume) the reason above. @marmbrus, do you mind clarifying here, please? I think we may have to deal with this as a datasource-specific problem.
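To make the "forced to nullable on read" behaviour concrete, here is a hedged sketch of the round trip being discussed; the path and values are illustrative, not taken from the PR.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").appName("nullability-sketch").getOrCreate()

// A DataFrame whose schema declares the column as NOT nullable.
val schema = StructType(Seq(StructField("id", IntegerType, nullable = false)))
val df = spark.createDataFrame(spark.sparkContext.parallelize(Seq(Row(1), Row(2))), schema)
df.printSchema()   // id: integer (nullable = false)

// After a write/read round trip through the file-based API, the column typically
// comes back as nullable = true -- the forcing that this thread is about.
df.write.mode("overwrite").parquet("/tmp/nullability-sketch")
spark.read.parquet("/tmp/nullability-sketch").printSchema()
```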
[GitHub] spark pull request #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of P...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/13778#discussion_r70198204

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
```
@@ -346,14 +346,47 @@ case class LambdaVariable(value: String, isNull: String, dataType: DataType) ext
 object MapObjects {
   private val curId = new java.util.concurrent.atomic.AtomicInteger()
 
+  /**
+   * Construct an instance of MapObjects case class.
+   *
+   * @param function The function applied on the collection elements.
+   * @param inputData An expression that when evaluated returns a collection object.
+   * @param elementType The data type of elements in the collection.
+   */
   def apply(
       function: Expression => Expression,
       inputData: Expression,
       elementType: DataType): MapObjects = {
     val loopValue = "MapObjects_loopValue" + curId.getAndIncrement()
     val loopIsNull = "MapObjects_loopIsNull" + curId.getAndIncrement()
     val loopVar = LambdaVariable(loopValue, loopIsNull, elementType)
-    MapObjects(loopValue, loopIsNull, elementType, function(loopVar), inputData)
+    MapObjects(loopValue, loopIsNull, elementType, function(loopVar), inputData, None)
+  }
+
+  /**
+   * Construct an instance of MapObjects case class.
+   *
+   * @param function The function applied on the collection elements.
+   * @param inputData An expression that when evaluated returns a collection object.
+   * @param elementType The data type of elements in the collection.
+   * @param inputDataType The explicitly given data type of inputData to override the
+   *                      data type inferred from inputData (i.e., inputData.dataType).
+   *                      When a Python UDT whose sqlType is an array is deserialized, the
+   *                      deserializer expression will apply MapObjects on it. However, the
+   *                      data type of inputData is the Python UDT, which is not the array
+   *                      type MapObjects expects. In this case, we need to explicitly use
+   *                      the Python UDT's sqlType as the data type.
```
--- End diff --

Makes sense. I will update it later.
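As a rough sketch of the existing three-argument factory quoted above (the attribute and the per-element function are illustrative; this is not code from the patch):

```scala
import org.apache.spark.sql.catalyst.expressions.{Add, AttributeReference, Literal}
import org.apache.spark.sql.catalyst.expressions.objects.MapObjects
import org.apache.spark.sql.types.{ArrayType, IntegerType}

// An input expression whose data type is a collection (here an array of ints).
val arrayAttr = AttributeReference("xs", ArrayType(IntegerType, containsNull = false))()

// Map "+ 1" over every element; the factory wires up the LambdaVariable plumbing
// and, after this patch, would pass None for the optional inputDataType override.
val mapped = MapObjects(
  element => Add(element, Literal(1)),  // function applied to each element
  arrayAttr,                            // inputData: evaluates to a collection
  IntegerType)                          // elementType of the collection's elements
```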
[GitHub] spark pull request #13248: [SPARK-15194] [ML] Add Python ML API for Multivar...
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/13248#discussion_r70198232

--- Diff: python/pyspark/ml/stat/distribution.py ---
```
@@ -0,0 +1,267 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from pyspark.ml.linalg import DenseVector, DenseMatrix, Vector
+import numpy as np
+
+__all__ = ['MultivariateGaussian']
+
+
+
+class MultivariateGaussian():
+    """
+    This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution. In
+    the event that the covariance matrix is singular, the density will be computed in a
+    reduced dimensional subspace under which the distribution is supported.
+    (see [[http://en.wikipedia.org/wiki/Multivariate_normal_distribution#Degenerate_case]])
+
+    mu The mean vector of the distribution
+    sigma The covariance matrix of the distribution
+
+
+    >>> mu = Vectors.dense([0.0, 0.0])
```
--- End diff --

I see, but the missing import of `Vectors` would fail the doctest.
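For comparison, the JVM-side distribution that this Python class mirrors can be exercised as in the sketch below, using the public `mllib` class; the values are illustrative and this is not code from the PR.

```scala
import org.apache.spark.mllib.linalg.{Matrices, Vectors}
import org.apache.spark.mllib.stat.distribution.MultivariateGaussian

// Standard bivariate normal: zero mean vector, identity covariance matrix.
val mu = Vectors.dense(0.0, 0.0)
val sigma = Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))

val gaussian = new MultivariateGaussian(mu, sigma)
println(gaussian.pdf(Vectors.dense(0.0, 0.0)))   // density at the mean, about 1 / (2 * Pi)
```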
[GitHub] spark pull request #14048: [SPARK-16370][SQL] Union queries should not be ex...
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/14048
[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14048 Hmm. Okay, I didn't prevent all of them. I see. I'll close. Thank you for the decision, @cloud-fan.
[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14048
```
case Union(children) if children.forall(x =>
  x.isInstanceOf[InsertIntoTable] || x.isInstanceOf[InsertIntoHadoopFsRelationCommand]) =>
```
This doesn't indicate a multi-insert, right? A normal `Union` can also look like this.
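For reference, the query shape this pattern is presumably trying to capture is a Hive-style multi-insert, sketched below with illustrative table names (Hive support and existing tables assumed). The concern above is that a plan whose `Union` children are all inserts can also be constructed in other ways, so the shape alone does not prove a multi-insert.

```scala
// A single FROM clause fanning out into several INSERTs; the parser turns this
// into one Union whose children are InsertIntoTable nodes.
spark.sql("""
  FROM src
  INSERT INTO TABLE small_keys SELECT key WHERE key < 10
  INSERT INTO TABLE large_keys SELECT key WHERE key >= 10
""")
```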
[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14048 Ah, I see. You mean Union of `INSERT INTO`s, right?
[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14048 This PR fixes that with minimal effort.
[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14034 you missed one comment: https://github.com/apache/spark/pull/14034/files#r70183958 :)
[GitHub] spark issue #13704: [SPARK-15985][SQL] Reduce runtime overhead of a program ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13704 **[Test build #62071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62071/consoleFull)** for PR 13704 at commit [`66800fa`](https://github.com/apache/spark/commit/66800faaebf72e492ee7693d81f8dba980f1dab2).
[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14048 Ur, what do you mean?

> With your patch, we can still create union queries with side effect which will be executed eagerly.
[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14048 The current one looks like this.
```
case Union(children) if children.forall(x =>
  x.isInstanceOf[InsertIntoTable] || x.isInstanceOf[InsertIntoHadoopFsRelationCommand]) =>
```