[GitHub] spark issue #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in the UI
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15441
**[Test build #67406 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67406/consoleFull)** for PR 15441 at commit [`b1e77ba`](https://github.com/apache/spark/commit/b1e77baaff2bae12e745d623ea27e7cb2ad5e2be).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in the UI
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15441 Merged build finished. Test PASSed.
[GitHub] spark issue #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in the UI
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15441 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67406/ Test PASSed.
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15513 Will review this PR tomorrow. Thanks for your work!
[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15582 I'm going to merge this first. Please submit a follow-up PR to move the other ones.
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14136 I think we can merge this one first, since it is no worse than the Hive one. Then we can think about how to make it more robust. cc @hvanhovell
[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15582 (Actually I'm on an airplane and the internet is not good enough to merge patches).
[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15582 cc @cloud-fan can you merge this?
[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15575 LGTM too. Unfortunately my internet sucks (on a plane) and I can't merge this right now.
[GitHub] spark pull request #14136: [SPARK-16282][SQL] Implement percentile SQL funct...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14136#discussion_r84591451

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala ---
@@ -851,6 +851,42 @@ abstract class AggregationQuerySuite extends QueryTest with SQLTestUtils with Te
     checkAnswer(df3.groupBy().agg(covar_pop("a", "b")), Row(0.0))
   }

+  test("percentile") {
--- End diff --

I don't think you need the test case here.
[GitHub] spark pull request #14136: [SPARK-16282][SQL] Implement percentile SQL funct...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14136#discussion_r84591458

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala ---
@@ -851,6 +851,42 @@ abstract class AggregationQuerySuite extends QueryTest with SQLTestUtils with Te
     checkAnswer(df3.groupBy().agg(covar_pop("a", "b")), Row(0.0))
   }

+  test("percentile") {
--- End diff --

Basically I think this entire suite is mostly historic and pretty much obsolete now.
[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15575 Yeah, LGTM. It doesn't change the current outputPartitioning of operators.
[GitHub] spark pull request #14136: [SPARK-16282][SQL] Implement percentile SQL funct...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14136#discussion_r84591459

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
@@ -136,7 +136,7 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton with SQLTestUtils {
   test("SPARK-2693 udaf aggregates test") {
     checkAnswer(sql("SELECT percentile(key, 1) FROM src LIMIT 1"),
-      sql("SELECT max(key) FROM src").collect().toSeq)
+      sql("SELECT array(max(key)) FROM src").collect().toSeq)
--- End diff --

What is this change about?
[GitHub] spark pull request #15513: [SPARK-17963][SQL][Documentation] Add examples (e...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r84591475

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -1455,50 +1455,59 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
       sql("DESCRIBE FUNCTION log"),
       Row("Class: org.apache.spark.sql.catalyst.expressions.Logarithm") ::
         Row("Function: log") ::
-        Row("Usage: log(b, x) - Returns the logarithm of x with base b.") :: Nil
+        Row("Usage: log(base, expr) - Returns the logarithm of expr with base.") :: Nil
     )
     // predicate operator
     checkAnswer(
       sql("DESCRIBE FUNCTION or"),
       Row("Class: org.apache.spark.sql.catalyst.expressions.Or") ::
         Row("Function: or") ::
-        Row("Usage: a or b - Logical OR.") :: Nil
+        Row("Usage: expr1 or expr2 - Logical OR.") :: Nil
     )
     checkAnswer(
       sql("DESCRIBE FUNCTION !"),
       Row("Class: org.apache.spark.sql.catalyst.expressions.Not") ::
         Row("Function: !") ::
-        Row("Usage: ! a - Logical not") :: Nil
+        Row("Usage: ! expr - Logical not.") :: Nil
     )
     // arithmetic operators
     checkAnswer(
       sql("DESCRIBE FUNCTION +"),
       Row("Class: org.apache.spark.sql.catalyst.expressions.Add") ::
         Row("Function: +") ::
-        Row("Usage: a + b - Returns a+b.") :: Nil
+        Row("Usage: expr1 + expr2 - Returns expr1+expr2.") :: Nil
     )
     // comparison operators
     checkAnswer(
       sql("DESCRIBE FUNCTION <"),
       Row("Class: org.apache.spark.sql.catalyst.expressions.LessThan") ::
         Row("Function: <") ::
-        Row("Usage: a < b - Returns TRUE if a is less than b.") :: Nil
+        Row("Usage: expr1 < expr2 - Returns TRUE if expr1 is less than expr2.") :: Nil
     )
     // STRING
     checkAnswer(
       sql("DESCRIBE FUNCTION 'concat'"),
       Row("Class: org.apache.spark.sql.catalyst.expressions.Concat") ::
         Row("Function: concat") ::
         Row("Usage: concat(str1, str2, ..., strN) " +
-          "- Returns the concatenation of str1, str2, ..., strN") :: Nil
+          "- Returns the concatenation of str1, str2, ..., strN.") :: Nil
     )
     // extended mode
     checkAnswer(
       sql("DESCRIBE FUNCTION EXTENDED ^"),
       Row("Class: org.apache.spark.sql.catalyst.expressions.BitwiseXor") ::
-        Row("Extended Usage:\n> SELECT 3 ^ 5; 2") ::
+        Row(
+          """Extended Usage:
+            |Arguments:
+            |  expr1 - a integral numeric expression.
--- End diff --

I will sweep the same instances as well!
[GitHub] spark pull request #15513: [SPARK-17963][SQL][Documentation] Add examples (e...
Github user jodersky commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r84591482

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ---
@@ -125,7 +129,7 @@ case class DescribeFunctionCommand(
     if (isExtended) {
       result :+
-        Row(s"Extended Usage:\n${replaceFunctionName(info.getExtended, info.getName)}")
+        Row(s"Extended Usage:${replaceFunctionName(info.getExtended, info.getName)}")
--- End diff --

Indeed, annotations require constant parameters (probably due to JVM requirements). Since `stripMargin` is a method on a string wrapper, it unfortunately cannot be used as an annotation argument.
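One way to live with the constant-argument constraint described above, sketched here under assumptions, is to keep the annotation text a plain string literal (margin characters included) and apply `stripMargin` only when the text is read back. The names `rawExtended` and `renderExtended` are hypothetical and do not come from the PR; the literal stands in for what an annotation argument might hold.

```scala
object ExtendedUsageSketch {
  // Hypothetical stand-in for an annotation argument: a plain literal with
  // margin characters left in place, since no method call is allowed there.
  val rawExtended: String =
    """
      |Arguments:
      |  expr1 - an integral numeric expression."""

  // Margin stripping happens at read time instead of at the annotation site.
  def renderExtended(raw: String): String =
    s"Extended Usage:${raw.stripMargin}"
}
```

Because the literal in this sketch begins with a line break, no explicit `\n` is needed after `Extended Usage:`, which matches the shape of the change in the diff above.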
[GitHub] spark pull request #14136: [SPARK-16282][SQL] Implement percentile SQL funct...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14136#discussion_r84591480

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -613,6 +613,46 @@ object functions {
   def min(columnName: String): Column = min(Column(columnName))

   /**
+   * Aggregate function: returns the exact percentile(s) of the expression in a group at pc with
--- End diff --

Let's not add the Scala API for this for now. I'm not sure if we want to encourage users to use it -- it is super expensive.
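For context on the cost concern above: an exact percentile cannot be answered from a running summary, so every value in the group has to be kept and ordered first. The sketch below is not the PR's implementation; `ExactPercentile` is a hypothetical helper that uses linear interpolation between the two nearest ranks, which is one common definition and only an assumption about the semantics intended here.

```scala
object ExactPercentile {
  // Exact percentile of a group; p is a fraction in [0, 1].
  // The whole group is materialized and sorted, which is the expensive part.
  def percentile(values: Seq[Double], p: Double): Double = {
    require(values.nonEmpty, "group must not be empty")
    require(p >= 0.0 && p <= 1.0, "p must be in [0, 1]")
    val sorted = values.sorted
    val position = p * (sorted.length - 1)
    val lower = math.floor(position).toInt
    val upper = math.ceil(position).toInt
    if (lower == upper) {
      sorted(lower)
    } else {
      // Interpolate linearly between the two neighbouring ranks.
      val fraction = position - lower
      sorted(lower) * (1 - fraction) + sorted(upper) * fraction
    }
  }
}
```

Under this definition, `percentile(values, 1.0)` is simply the maximum of the group, which is the relationship the `HiveUDFSuite` test quoted earlier in this thread checks against `max(key)`.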
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14136 Is there a way we can add some expression level unit tests for this? I realized we don't really have existing infrastructure for unit testing aggregate expressions (only for non-aggregate expressions).
[GitHub] spark issue #15597: [SPARK-18063][SQL] Failed to infer constraints over mult...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15597 cc @sameeragarwal
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354 **[Test build #67407 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67407/consoleFull)** for PR 15354 at commit [`d74c96d`](https://github.com/apache/spark/commit/d74c96d1b2c6b72e373131d4e1fecd90cc46cb0b).
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15354

@marmbrus Sure (I didn't mean I am not going to do this..), I just handled the case in this PR for `to_json`.

BTW, I would like to note that the same problems exist in other JSON-related functionality. For example, I might have to add

```scala
override def checkInputDataTypes(): TypeCheckResult = {
  ...
  JacksonUtils.verifySchema(child.dataType.asInstanceOf[StructType])
  ...
}
```

for `from_json` as well. Let me open a follow-up to add this logic and tests for the JSON-related functionality.
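The elided parts of the snippet above are not shown in the thread, so the following is only a sketch of the general pattern: verify the schema up front and turn a verification failure into a type-check failure at analysis time rather than a runtime error. Everything here is a simplified stand-in written for illustration (`UnsupportedType`, the tiny `DataType` hierarchy, and this `verifySchema`); none of it is Spark's actual code.

```scala
object SchemaCheckSketch {
  // Minimal stand-ins for illustration; not Spark's DataType hierarchy.
  sealed trait DataType
  case object StringType extends DataType
  case object LongType extends DataType
  case object UnsupportedType extends DataType // hypothetical non-JSON-able type
  final case class StructField(name: String, dataType: DataType)
  final case class StructType(fields: Seq[StructField]) extends DataType

  sealed trait TypeCheckResult
  case object TypeCheckSuccess extends TypeCheckResult
  final case class TypeCheckFailure(message: String) extends TypeCheckResult

  // Walks the schema and throws on the first field that cannot be converted.
  def verifySchema(schema: StructType): Unit = {
    def verify(name: String, dataType: DataType): Unit = dataType match {
      case StringType | LongType => ()
      case st: StructType => st.fields.foreach(f => verify(f.name, f.dataType))
      case other =>
        throw new UnsupportedOperationException(
          s"Unable to convert column $name of type $other to JSON.")
    }
    schema.fields.foreach(f => verify(f.name, f.dataType))
  }

  // Reports schema problems as a type-check result instead of failing later.
  def checkInputDataTypes(childType: DataType): TypeCheckResult = childType match {
    case st: StructType =>
      try {
        verifySchema(st)
        TypeCheckSuccess
      } catch {
        case e: UnsupportedOperationException => TypeCheckFailure(e.getMessage)
      }
    case other => TypeCheckFailure(s"Expected a struct type but got $other.")
  }
}
```

The same wrapper shape could in principle be reused by any expression that takes a schema, which is the kind of follow-up the comment proposes for `from_json`.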
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354 **[Test build #67408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67408/consoleFull)** for PR 15354 at commit [`8603462`](https://github.com/apache/spark/commit/8603462b3e09ee2187d57be504e99b12dad50486).
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354
**[Test build #67407 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67407/consoleFull)** for PR 15354 at commit [`d74c96d`](https://github.com/apache/spark/commit/d74c96d1b2c6b72e373131d4e1fecd90cc46cb0b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Merged build finished. Test FAILed.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67407/ Test FAILed.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354 **[Test build #67409 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67409/consoleFull)** for PR 15354 at commit [`518f48d`](https://github.com/apache/spark/commit/518f48d19e37235e228570817eb5847f1f05c28a).
[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592806

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala ---
@@ -97,7 +99,15 @@ object FileSourceStrategy extends Strategy with Logging {
         dataColumns
           .filter(requiredAttributes.contains)
           .filterNot(partitionColumns.contains)
-      val outputSchema = readDataColumns.toStructType
+      val outputSchema = if (fsRelation.sqlContext.conf.isParquetNestColumnPruning
+        && fsRelation.fileFormat.isInstanceOf[ParquetFileFormat]) {
+        val totalSchema = readDataColumns.toStructType
+        val prunedSchema = StructType(
+          generateStructFieldsContainsNesting(projects, totalSchema))
+        // Merge schema in same StructType and merge with filterAttributes
+        prunedSchema.fields.map(f => StructType(Array(f))).reduceLeft(_ merge _)
+          .merge(filterAttributes.toSeq.toStructType)
+      } else readDataColumns.toStructType
--- End diff --

Fix done.
[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592805

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala ---
@@ -97,7 +99,15 @@ object FileSourceStrategy extends Strategy with Logging {
         dataColumns
           .filter(requiredAttributes.contains)
           .filterNot(partitionColumns.contains)
-      val outputSchema = readDataColumns.toStructType
+      val outputSchema = if (fsRelation.sqlContext.conf.isParquetNestColumnPruning
+        && fsRelation.fileFormat.isInstanceOf[ParquetFileFormat]) {
+        val totalSchema = readDataColumns.toStructType
--- End diff --

Fix done.
[GitHub] spark issue #14957: [SPARK-4502][SQL]Support parquet nested struct pruning a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14957 **[Test build #67410 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67410/consoleFull)** for PR 14957 at commit [`ab8f5ec`](https://github.com/apache/spark/commit/ab8f5ec15b2682ee40ea0483e0b6642b2a14c7ad).
[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592818

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala ---
@@ -126,4 +136,52 @@ object FileSourceStrategy extends Strategy with Logging {
     case _ => Nil
   }
+
+  private def generateStructFieldsContainsNesting(projects: Seq[Expression],
+    totalSchema: StructType) : Seq[StructField] = {
+    def generateStructField(curField: List[String],
+      node: Expression) : Seq[StructField] = {
--- End diff --

Fix done.
[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592816

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala ---
@@ -126,4 +136,52 @@ object FileSourceStrategy extends Strategy with Logging {
     case _ => Nil
   }
+
+  private def generateStructFieldsContainsNesting(projects: Seq[Expression],
+    totalSchema: StructType) : Seq[StructField] = {
--- End diff --

Code style fix done. No problem, I'll add tests for the private function generateStructFieldsContainsNesting in the next patch; this patch fixes all the code style and naming problems.
[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592821

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala ---
@@ -126,4 +136,52 @@ object FileSourceStrategy extends Strategy with Logging {
     case _ => Nil
   }
+
+  private def generateStructFieldsContainsNesting(projects: Seq[Expression],
+    totalSchema: StructType) : Seq[StructField] = {
+    def generateStructField(curField: List[String],
+      node: Expression) : Seq[StructField] = {
+      node match {
+        case ai: GetArrayItem =>
+          // Here we drop the previous for simplify array and map support.
+          // Same strategy in GetArrayStructFields and GetMapValue
+          generateStructField(List.empty[String], ai.child)
+        case asf: GetArrayStructFields =>
+          generateStructField(List.empty[String], asf.child)
+        case mv: GetMapValue =>
+          generateStructField(List.empty[String], mv.child)
+        case attr: AttributeReference =>
+          Seq(getFieldRecursively(totalSchema, attr.name :: curField))
+        case sf: GetStructField =>
+          generateStructField(sf.name.get :: curField, sf.child)
+        case _ =>
+          if (node.children.nonEmpty) {
+            node.children.flatMap(child => generateStructField(curField, child))
+          } else {
+            Seq.empty[StructField]
+          }
+      }
+    }
+
+    def getFieldRecursively(totalSchema: StructType,
+      name: List[String]): StructField = {
--- End diff --

Fix done.
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15513 **[Test build #67411 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67411/consoleFull)** for PR 15513 at commit [`feafdc2`](https://github.com/apache/spark/commit/feafdc21317b4d05921b73f6231d8b0d498bad43).
[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14957#discussion_r84592865

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala ---
    @@ -126,4 +136,52 @@ object FileSourceStrategy extends Strategy with Logging {

           case _ => Nil
         }
    +
    +  private def generateStructFieldsContainsNesting(projects: Seq[Expression],
    +      totalSchema: StructType) : Seq[StructField] = {
    +    def generateStructField(curField: List[String],
    +        node: Expression) : Seq[StructField] = {
    +      node match {
    +        case ai: GetArrayItem =>
    +          // Here we drop the previous for simplify array and map support.
    +          // Same strategy in GetArrayStructFields and GetMapValue
    +          generateStructField(List.empty[String], ai.child)
    +        case asf: GetArrayStructFields =>
    +          generateStructField(List.empty[String], asf.child)
    +        case mv: GetMapValue =>
    +          generateStructField(List.empty[String], mv.child)
    +        case attr: AttributeReference =>
    +          Seq(getFieldRecursively(totalSchema, attr.name :: curField))
    +        case sf: GetStructField =>
    +          generateStructField(sf.name.get :: curField, sf.child)
    +        case _ =>
    +          if (node.children.nonEmpty) {
    +            node.children.flatMap(child => generateStructField(curField, child))
    +          } else {
    +            Seq.empty[StructField]
    +          }
    +      }
    +    }
    +
    +    def getFieldRecursively(totalSchema: StructType,
    +        name: List[String]): StructField = {
    +      if (name.length > 1) {
    +        val curField = name.head
    +        val curFieldType = totalSchema(curField)
    +        curFieldType.dataType match {
    +          case st: StructType =>
    +            val newField = getFieldRecursively(StructType(st.fields), name.drop(1))
    +            StructField(curFieldType.name, StructType(Seq(newField)),
    +              curFieldType.nullable, curFieldType.metadata)
    +          case _ =>
    +            throw new IllegalArgumentException(s"""Field "$curField" is not struct field.""")
    +        }
    +      } else {
    +        totalSchema(name.head)
    +      }
    +    }
    --- End diff --

    The function getFieldRecursively needs to return a StructField that contains every level of nesting along the path, not only the leaf field.
    For example, if the fullSchema is:
    ```
    root
     |-- col: struct (nullable = true)
     |    |-- s1: struct (nullable = true)
     |    |    |-- s1_1: long (nullable = true)
     |    |    |-- s1_2: long (nullable = true)
     |    |-- str: string (nullable = true)
     |-- num: long (nullable = true)
     |-- str: string (nullable = true)
    ```
    the function should return:
    ```
    StructField(col,StructType(StructField(s1,StructType(StructField(s1_1,LongType,true)),true)),true)
    ```
    So I probably can't use the simplified function getnestedField, because it returns only the last StructField:
    ```
    StructField(s1_1,LongType,true)
    ```
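The behaviour described in this comment can be tried outside the PR with a small standalone sketch. This is not the code from the pull request: the object name `NestedFieldSketch`, the pattern-matching structure, and the empty-path check are assumptions made for illustration. It uses only the public `org.apache.spark.sql.types` API and rebuilds the example schema from the comment by hand.

```scala
import org.apache.spark.sql.types._

// Hypothetical standalone sketch; NestedFieldSketch is not part of Spark or the PR.
object NestedFieldSketch {

  // Walk `path` through `schema` and rebuild one single-field StructType per
  // level, so the result keeps the whole nesting chain down to the leaf.
  def getFieldRecursively(schema: StructType, path: List[String]): StructField =
    path match {
      case Nil =>
        // Assumption: an empty path is rejected explicitly.
        throw new IllegalArgumentException("Empty field path.")
      case leaf :: Nil =>
        schema(leaf)
      case head :: rest =>
        val cur = schema(head)
        cur.dataType match {
          case st: StructType =>
            val child = getFieldRecursively(st, rest)
            // Keep name, nullability and metadata; narrow the type to one child.
            cur.copy(dataType = StructType(Seq(child)))
          case _ =>
            throw new IllegalArgumentException(s"""Field "$head" is not a struct field.""")
        }
    }

  // The example schema from the comment above, built by hand.
  val fullSchema: StructType = new StructType()
    .add("col", new StructType()
      .add("s1", new StructType()
        .add("s1_1", LongType)
        .add("s1_2", LongType))
      .add("str", StringType))
    .add("num", LongType)
    .add("str", StringType)

  def main(args: Array[String]): Unit = {
    // Prints the pruned field for the path col -> s1 -> s1_1.
    println(getFieldRecursively(fullSchema, List("col", "s1", "s1_1")))
  }
}
```

Calling `getFieldRecursively(fullSchema, List("col", "s1", "s1_1"))` yields a `col` field whose struct contains only `s1`, which in turn contains only `s1_1`; a helper that returns just the leaf would lose the two enclosing levels.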
[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592876 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -212,6 +212,11 @@ object SQLConf { .booleanConf .createWithDefault(true) + val PARQUET_NEST_COLUMN_PRUNING = SQLConfigBuilder("spark.sql.parquet.nestColumnPruning") --- End diff -- rename done
[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592883 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -661,6 +666,8 @@ private[sql] class SQLConf extends Serializable with CatalystConf with Logging { def isParquetINT96AsTimestamp: Boolean = getConf(PARQUET_INT96_AS_TIMESTAMP) + def isParquetNestColumnPruning: Boolean = getConf(PARQUET_NEST_COLUMN_PRUNING) --- End diff -- rename done
[GitHub] spark pull request #14957: [SPARK-4502][SQL]Support parquet nested struct pr...
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/14957#discussion_r84592888 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala --- @@ -571,6 +571,37 @@ class ParquetQuerySuite extends QueryTest with ParquetTest with SharedSQLContext } } + test("SPARK-4502 parquet nested fields pruning") { +// Schema of "test-data/nested-array-struct.parquet": +//root +//|-- col: struct (nullable = true) +//||-- s1: struct (nullable = true) +//|||-- s1_1: long (nullable = true) +//|||-- s1_2: long (nullable = true) +//||-- str: string (nullable = true) +//|-- num: long (nullable = true) +//|-- str: string (nullable = true) +val df = readResourceParquetFile("test-data/nested-struct.snappy.parquet") +df.createOrReplaceTempView("tmp_table") --- End diff -- fix done
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354 **[Test build #67408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67408/consoleFull)** for PR 15354 at commit [`8603462`](https://github.com/apache/spark/commit/8603462b3e09ee2187d57be504e99b12dad50486). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67408/ Test PASSed.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Merged build finished. Test PASSed.
[GitHub] spark pull request #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened e...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/15601 [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder for DataFrame for set operations

## What changes were proposed in this pull request?

This PR backports https://github.com/apache/spark/pull/15072. Please note that the test code differs slightly from master: https://github.com/apache/spark/pull/14786 was merged only into master, so this branch does not support type-widening between `DateType` and `TimestampType`. Both types were therefore taken out of the test.

## How was this patch tested?

Unit test in `DataFrameSuite`.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark backport-17123 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15601.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15601

commit b233d09977d3ca1afead55f8c8d6057c5643a500 Author: hyukjinkwon Date: 2016-10-22T18:09:04Z Backport "Use type-widened encoder for DataFrame rather than existing encoder to allow type-widening from set operations"
[GitHub] spark issue #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15601 **[Test build #67412 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67412/consoleFull)** for PR 15601 at commit [`b233d09`](https://github.com/apache/spark/commit/b233d09977d3ca1afead55f8c8d6057c5643a500).
[GitHub] spark issue #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15601 **[Test build #67413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67413/consoleFull)** for PR 15601 at commit [`9154463`](https://github.com/apache/spark/commit/9154463561c20b26c92e145e4b3039239fb46f5e).
[GitHub] spark issue #15599: [SPARK-18022][SQL] java.lang.NullPointerException instea...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15599 I think we don't know exactly what the real exception is. The NPE occurs while handling the exception.
[GitHub] spark issue #15413: [SPARK-17847][ML][WIP] Copy GaussianMixture implementati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15413 **[Test build #67414 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67414/consoleFull)** for PR 15413 at commit [`8b94909`](https://github.com/apache/spark/commit/8b9490981bf21ba9d53880efbfdee1447b01a4a9).
[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15575 LGTM - merging to master. Thanks!
[GitHub] spark pull request #15575: [SPARK-18038] [SQL] Move output partitioning defi...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15575
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354 **[Test build #67409 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67409/consoleFull)** for PR 15354 at commit [`518f48d`](https://github.com/apache/spark/commit/518f48d19e37235e228570817eb5847f1f05c28a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15553: [SPARK-18008] [build] Add support for -Dmaven.test.skip=...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15553 For development time, use an IDE or maybe SBT to do incremental compilation, which is even faster. I don't otherwise see a strong use case for not compiling tests. Yes it's slower to compile tests, but not slow compared to running them, and the downside is missing compile errors and the extra complexity of an already creaking build.
[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15582 Merging to master. Thanks! Ping me for the follow-up.
[GitHub] spark issue #14957: [SPARK-4502][SQL]Support parquet nested struct pruning a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14957 **[Test build #67410 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67410/consoleFull)** for PR 14957 at commit [`ab8f5ec`](https://github.com/apache/spark/commit/ab8f5ec15b2682ee40ea0483e0b6642b2a14c7ad). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67409/ Test PASSed.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Merged build finished. Test PASSed.
[GitHub] spark issue #14957: [SPARK-4502][SQL]Support parquet nested struct pruning a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14957 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67410/ Test PASSed.
[GitHub] spark issue #14957: [SPARK-4502][SQL]Support parquet nested struct pruning a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14957 Merged build finished. Test PASSed.
[GitHub] spark pull request #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnal...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15582
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15513 **[Test build #67411 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67411/consoleFull)** for PR 15513 at commit [`feafdc2`](https://github.com/apache/spark/commit/feafdc21317b4d05921b73f6231d8b0d498bad43). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15513 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67411/ Test PASSed.
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15513 Merged build finished. Test PASSed.
[GitHub] spark issue #15413: [SPARK-17847][ML][WIP] Copy GaussianMixture implementati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15413 **[Test build #67414 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67414/consoleFull)** for PR 15413 at commit [`8b94909`](https://github.com/apache/spark/commit/8b9490981bf21ba9d53880efbfdee1447b01a4a9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15413: [SPARK-17847][ML][WIP] Copy GaussianMixture implementati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15413 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67414/ Test PASSed.
[GitHub] spark issue #15413: [SPARK-17847][ML][WIP] Copy GaussianMixture implementati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15413 Merged build finished. Test PASSed.
[GitHub] spark issue #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15601 **[Test build #67412 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67412/consoleFull)** for PR 15601 at commit [`b233d09`](https://github.com/apache/spark/commit/b233d09977d3ca1afead55f8c8d6057c5643a500). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67412/ Test PASSed.
[GitHub] spark issue #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15601 Merged build finished. Test PASSed.
[GitHub] spark issue #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder ...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15601 LGTM. Merging to branch-2.0. Could you close this?
[GitHub] spark issue #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15601 **[Test build #67413 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67413/consoleFull)** for PR 15601 at commit [`9154463`](https://github.com/apache/spark/commit/9154463561c20b26c92e145e4b3039239fb46f5e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15601 Merged build finished. Test PASSed.
[GitHub] spark issue #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67413/ Test PASSed.
[GitHub] spark pull request #15601: [SPARK-17123][SQL][BRANCH-2.0] Use type-widened e...
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/15601
[GitHub] spark issue #15433: [SPARK-17822][SPARKR] Use weak reference in JVMObjectTra...
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/15433 Closing this, since it may not be the right way to do this.
[GitHub] spark pull request #15433: [SPARK-17822][SPARKR] Use weak reference in JVMOb...
Github user techaddict closed the pull request at: https://github.com/apache/spark/pull/15433
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15513 **[Test build #67415 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67415/consoleFull)** for PR 15513 at commit [`2656c62`](https://github.com/apache/spark/commit/2656c6265ef5dc02a0fd06b4f8154fd12939bdad).
[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15595 **[Test build #67416 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67416/consoleFull)** for PR 15595 at commit [`e691714`](https://github.com/apache/spark/commit/e691714120fd3dd57c1f9d2add06554dff9a1eae).
[GitHub] spark pull request #15513: [SPARK-17963][SQL][Documentation] Add examples (e...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r84597530 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -970,9 +1270,19 @@ case class Round(child: Expression, scale: Expression) * also known as Gaussian rounding or bankers' rounding. * round(2.5) = 2.0, round(3.5) = 4.0. */ +// scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(x, d) - Round x to d decimal places using HALF_EVEN rounding mode.", - extended = "> SELECT _FUNC_(2.5, 0);\n 2.0") + usage = "_FUNC_(expr, d) - Round expr to d decimal places using HALF_EVEN rounding mode.", --- End diff -- Add `Returns` at the beginning.
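The HALF_EVEN mode named in the usage string above can be illustrated with a minimal sketch. It uses `java.math.BigDecimal` from the JDK, not Spark's own expression code, and `roundHalfEven` is a hypothetical helper name introduced only for this illustration. Ties are rounded toward the nearest even digit, which matches the `round(2.5) = 2.0, round(3.5) = 4.0` note in the doc comment.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object HalfEvenDemo {
  // Hypothetical helper: round the decimal string `x` to `d` decimal places
  // using HALF_EVEN (bankers' rounding). A string input avoids binary
  // floating-point representation issues in the demonstration.
  def roundHalfEven(x: String, d: Int): JBigDecimal =
    new JBigDecimal(x).setScale(d, RoundingMode.HALF_EVEN)

  def main(args: Array[String]): Unit = {
    // Ties go to the even neighbour: 2.5 rounds down, 3.5 rounds up.
    println(roundHalfEven("2.5", 0))  // 2
    println(roundHalfEven("3.5", 0))  // 4
    println(roundHalfEven("2.25", 1)) // 2.2
    println(roundHalfEven("2.35", 1)) // 2.4
  }
}
```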
[GitHub] spark pull request #15513: [SPARK-17963][SQL][Documentation] Add examples (e...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r84597610 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/xpath.scala --- @@ -150,8 +206,16 @@ case class XPathString(xml: Expression, path: Expression) extends XPathExtract { // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(xml, xpath) - Returns a string array of values within xml nodes that match the xpath expression", - extended = "> SELECT _FUNC_('b1b2b3c1c2','a/b/text()');\n['b1','b2','b3']") + usage = "_FUNC_(xml, xpath) - Returns a string array of values within the nodes of xml that match the XPath expression.", + extended = """ +Arguments: + xml - a string expression that represents XML document. + path - a string literal that represents XPath expression. --- End diff -- What is the difference between `a string expression` and `a string literal`?
[GitHub] spark pull request #15513: [SPARK-17963][SQL][Documentation] Add examples (e...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r84597661 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala --- @@ -692,9 +722,11 @@ case class DenseRank(children: Seq[Expression]) extends RankLike { * change in rank. This is an internal parameter and will be assigned by the * Analyser. */ -@ExpressionDescription(usage = - """_FUNC_() - PERCENT_RANK() The PercentRank function computes the percentage - ranking of a value in a group of values.""") +@ExpressionDescription( + usage = """ +_FUNC_() - PERCENT_RANK() The PercentRank function computes the percentage --- End diff -- `The PercentRank function computes the percentage` -> `Computes the percentage`
[GitHub] spark pull request #15513: [SPARK-17963][SQL][Documentation] Add examples (e...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r84597837 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala --- @@ -49,21 +49,29 @@ import org.apache.spark.sql.types._ * DEFAULT_PERCENTILE_ACCURACY. */ @ExpressionDescription( - usage = -""" - _FUNC_(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric + usage = """ +_FUNC_(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column `col` at the given percentage. The value of percentage must be between 0.0 and 1.0. The `accuracy` parameter (default: 1) is a positive integer literal which controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields better accuracy, `1.0/accuracy` is the relative error of the approximation. - - _FUNC_(col, array(percentage1 [, percentage2]...) [, accuracy]) - Returns the approximate - percentile array of column `col` at the given percentage array. Each value of the - percentage array must be between 0.0 and 1.0. The `accuracy` parameter (default: 1) is - a positive integer literal which controls approximation accuracy at the cost of memory. - Higher value of `accuracy` yields better accuracy, `1.0/accuracy` is the relative error of - the approximation. -""") + When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. + In this case, returns the approximate percentile array of column `col` at the given + percentage array. + """, + extended = """ +Arguments: + col - a numeric expression. + percentage - a numeric literal or an array literal of numeric type that defines the +percentile between 0.0 and 1.0. For example, 0.5 means 50-percentile. + accuracy - a numeric literal that defines approximation accuracy. --- End diff -- What is the difference between `a numeric expression` and `a numeric literal`? 
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15513 I prefer minimizing words that carry no information. Below is an example.
```
Arguments:
  class - a string literal that represents a fully-qualified class name.
  method - a string literal that represents a method name.
  arg - a boolean, numeric or string expression that represents arguments for the method.
```
->
```
Arguments:
  class - a fully-qualified class name. Data type: string.
  method - a method name. Data type: string.
  arg - arguments for the method. Data type: boolean, numeric or string
```
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15513 Another example:
```
Arguments:
  expr1 - a numeric expression.
  expr2 - a numeric expression.
```
->
```
Arguments:
  expr1 - Data type: numeric.
  expr2 - Data type: numeric.
```
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15513 We should also add the default value in the argument description. Below is an example.
```
Arguments:
  col - a numeric expression.
  percentage - a numeric literal or an array literal of numeric type that defines the percentile between 0.0 and 1.0. For example, 0.5 means 50-percentile.
  accuracy - a numeric literal that defines approximation accuracy.
```
->
```
Arguments:
  col - Data type: numeric.
  percentage - the percentile between 0.0 and 1.0. For example, 0.5 means 50-percentile. Data type: numeric or an array expression of numeric type.
  accuracy - approximation accuracy. Data type: numeric. Default: 1.
```
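The proposed argument layout (description first, then `Data type:`, then an optional `Default:`) could be captured in a small formatter. This is only a sketch: `ArgDoc` and `render` are hypothetical names invented for illustration and are not part of Spark's `ExpressionDescription` API.

```scala
object ArgDocSketch {
  // Hypothetical model of one documented argument. The description and the
  // default value are optional; the data type is always shown.
  final case class ArgDoc(
      name: String,
      description: Option[String],
      dataType: String,
      default: Option[String] = None)

  // Renders the "Arguments:" section in the layout suggested in the comment.
  def render(args: Seq[ArgDoc]): String = {
    val lines = args.map { a =>
      val parts =
        a.description.map(_ + ".").toSeq ++
          Seq(s"Data type: ${a.dataType}.") ++
          a.default.map(d => s"Default: $d.").toSeq
      s"  ${a.name} - ${parts.mkString(" ")}"
    }
    ("Arguments:" +: lines).mkString("\n")
  }

  def main(args: Array[String]): Unit =
    println(render(Seq(
      ArgDoc("col", None, "numeric"),
      ArgDoc("accuracy", Some("approximation accuracy"), "numeric", Some("1")))))
}
```

Keeping the data type and default as separate fields means the wording stays uniform across functions, which is the consistency the comment asks for.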
[GitHub] spark issue #15417: [SPARK-17851][SQL][TESTS] Make sure all test sqls in cat...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15417 Sorry, will review it soon. Thanks!
[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15595 **[Test build #67416 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67416/consoleFull)** for PR 15595 at commit [`e691714`](https://github.com/apache/spark/commit/e691714120fd3dd57c1f9d2add06554dff9a1eae). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15595 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67416/ Test PASSed.
[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15595 Merged build finished. Test PASSed.
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15513 **[Test build #67415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67415/consoleFull)** for PR 15513 at commit [`2656c62`](https://github.com/apache/spark/commit/2656c6265ef5dc02a0fd06b4f8154fd12939bdad). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15513 Merged build finished. Test PASSed.
[GitHub] spark issue #15513: [SPARK-17963][SQL][Documentation] Add examples (extend) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15513 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67415/ Test PASSed.
[GitHub] spark pull request #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in...
Github user ajbozarth commented on a diff in the pull request: https://github.com/apache/spark/pull/15441#discussion_r84599733 --- Diff: core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala --- @@ -651,6 +671,15 @@ class UISeleniumSuite extends SparkFunSuite with WebBrowser with Matchers with B } } + def getResponseCode(url: URL, method: String): Int = { +val connection = url.openConnection().asInstanceOf[HttpURLConnection] +connection.setRequestMethod(method) +connection.connect() +val code = connection.getResponseCode() +connection.disconnect() --- End diff -- Thanks, I didn't know Scala could do that (how it returns); still learning new things every day.
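The remark about "how it returns" can be illustrated with a minimal sketch. In Scala, `try { ... } finally { ... }` is an expression whose value is the last expression of the `try` body, so the temporary `val code` is not required. The rewritten `getResponseCode` below is an assumption about what the reviewer suggested (the quoted diff is truncated), and `tryFinallyValue` is a hypothetical helper added only to show the evaluation order without network access.

```scala
import java.net.{HttpURLConnection, URL}

object ResponseCodeSketch {
  // Assumed shape of the suggested rewrite: the value of the try body is the
  // method's result, and disconnect() still runs afterwards in finally.
  def getResponseCode(url: URL, method: String): Int = {
    val connection = url.openConnection().asInstanceOf[HttpURLConnection]
    connection.setRequestMethod(method)
    try {
      connection.connect()
      connection.getResponseCode()
    } finally {
      connection.disconnect()
    }
  }

  // Hypothetical network-free illustration: returns 42, and the cleanup
  // callback runs before the value reaches the caller.
  def tryFinallyValue(cleanup: () => Unit): Int =
    try {
      42
    } finally {
      cleanup()
    }
}
```

Placing the cleanup in `finally` also means the connection is released if `connect()` or `getResponseCode()` throws, which the straight-line version in the diff would not guarantee.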
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14136 It is relatively straightforward to test `ImperativeAggregate`s. For some inspiration, see: - org.apache.spark.sql.catalyst.expressions.aggregate.HyperLogLogPlusPlusSuite - org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentileSuite
[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15595 LGTM - merging to master. Thanks!
[GitHub] spark pull request #15595: [SPARK-18058][SQL] Comparing column types ignorin...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15595
[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...
Github user CodingCat commented on the issue: https://github.com/apache/spark/pull/15595 sure, doing that now
[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15595 @CodingCat could you open a backport for branch-2.0?
[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15595 Thanks!
[GitHub] spark issue #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in the UI
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15441 **[Test build #67417 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67417/consoleFull)** for PR 15441 at commit [`0162c81`](https://github.com/apache/spark/commit/0162c81d8b7a090015ededf1da2e8ffad3f5f3e7).
[GitHub] spark pull request #15595: [SPARK-18058][SQL] Comparing column types ignorin...
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15595#discussion_r84600424 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala --- @@ -377,4 +377,23 @@ class AnalysisSuite extends AnalysisTest { assertExpressionType(sum(Divide(Decimal(1), 2.0)), DoubleType) assertExpressionType(sum(Divide(1.0, Decimal(2.0))), DoubleType) } + + test("SPARK-18058: union and set operations shall not care about the nullability" + +" when comparing column types") { +val firstTable = LocalRelation( + AttributeReference("a", +StructType(Seq(StructField("a", IntegerType, nullable = true))), nullable = false)()) +val secondTable = LocalRelation( + AttributeReference("a", +StructType(Seq(StructField("a", IntegerType, nullable = false))), nullable = false)()) + +val unionPlan = Union(firstTable, secondTable) +assertAnalysisSuccess(unionPlan) + +val r1 = Except(firstTable, secondTable) --- End diff -- nit: this var could have been named better. Actually, you could have gotten rid of the vars by doing:
```
assertAnalysisSuccess(Union(firstTable, secondTable))
assertAnalysisSuccess(Except(firstTable, secondTable))
assertAnalysisSuccess(Intersect(firstTable, secondTable))
```
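The idea the test exercises, comparing column types while ignoring nullability, can be sketched without a Spark dependency. The types below are a deliberately tiny, hypothetical model that only borrows the names `StructType`, `StructField` and `IntegerType`; they are not Spark's classes, and `sameTypeIgnoringNullability` is an illustrative helper, not the method the pull request changes.

```scala
object NullabilitySketch {
  // Hypothetical miniature type model, for illustration only.
  sealed trait DataType
  case object IntegerType extends DataType
  case object StringType extends DataType
  final case class StructField(name: String, dataType: DataType, nullable: Boolean)
  final case class StructType(fields: Seq[StructField]) extends DataType
  final case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType

  // Normalise a type by marking every nested field and element as nullable.
  def asNullable(dt: DataType): DataType = dt match {
    case StructType(fields) =>
      StructType(fields.map(f => f.copy(dataType = asNullable(f.dataType), nullable = true)))
    case ArrayType(elementType, _) =>
      ArrayType(asNullable(elementType), containsNull = true)
    case other => other
  }

  // Two types match when they are equal after nullability is normalised away.
  def sameTypeIgnoringNullability(a: DataType, b: DataType): Boolean =
    asNullable(a) == asNullable(b)
}
```

With this model, the two struct types from the test (one with `nullable = true`, one with `nullable = false`) differ under plain equality but compare as the same type once nullability is ignored.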
[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/15595 Just saw that this got merged. I had a tiny nit but it's ok without it.
[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15541 **[Test build #67418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67418/consoleFull)** for PR 15541 at commit [`6b29002`](https://github.com/apache/spark/commit/6b29002c29fecdbe32159dd0d31f53716630de46).
[GitHub] spark issue #15589: [SPARKR][branch-2.0] R merge API doc and example fix
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15589 merged to branch-2.0 and then cherry-picked to master for the error message fix not in master.
[GitHub] spark pull request #15589: [SPARKR][branch-2.0] R merge API doc and example ...
Github user felixcheung closed the pull request at: https://github.com/apache/spark/pull/15589