[spark] branch master updated (7d6e3fb -> 5effa8e)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 7d6e3fb  [SPARK-33074][SQL] Classify dialect exceptions in JDBC v2 Table Catalog
     add 5effa8e  [SPARK-33091][SQL] Avoid using map instead of foreach to avoid potential side effect at callers of OrcUtils.readCatalystSchema

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala | 2 +-
 .../sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33091][SQL] Avoid using map instead of foreach to avoid potential side effect at callers of OrcUtils.readCatalystSchema
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 782ab8e  [SPARK-33091][SQL] Avoid using map instead of foreach to avoid potential side effect at callers of OrcUtils.readCatalystSchema

782ab8e is described below

commit 782ab8e244252696c50b4b432d07a56c374b8680
Author: HyukjinKwon
AuthorDate: Thu Oct 8 16:29:15 2020 +0900

    [SPARK-33091][SQL] Avoid using map instead of foreach to avoid potential side effect at callers of OrcUtils.readCatalystSchema

    ### What changes were proposed in this pull request?

    This is a followup of SPARK-32646; a new JIRA was filed to control the fixed versions properly.

    When you use `map`, the result might be lazily evaluated and never executed. To avoid this, it is better to use `foreach`. See also SPARK-16694. The current code does not appear to cause any bugs for now, but it is best to fix it to avoid potential issues.

    ### Why are the changes needed?

    To avoid potential issues from `map` being lazy and not executed.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Ran related tests. The CI in this PR should verify the change.

    Closes #29974 from HyukjinKwon/SPARK-32646.

    Authored-by: HyukjinKwon
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 5effa8ea261ba59214afedc2853d1b248b330ca6)
    Signed-off-by: Takeshi Yamamuro
---
 .../org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala | 2 +-
 .../sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala
index 69badb4..c540007 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala
@@ -185,7 +185,7 @@ class OrcFileFormat
     } else {
       // ORC predicate pushdown
       if (orcFilterPushDown) {
-        OrcUtils.readCatalystSchema(filePath, conf, ignoreCorruptFiles).map { fileSchema =>
+        OrcUtils.readCatalystSchema(filePath, conf, ignoreCorruptFiles).foreach { fileSchema =>
           OrcFilters.createFilter(fileSchema, filters).foreach { f =>
             OrcInputFormat.setSearchArgument(conf, f, fileSchema.fieldNames)
           }
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala
index 1f38128..b0ddee0 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala
@@ -69,7 +69,7 @@ case class OrcPartitionReaderFactory(
   private def pushDownPredicates(filePath: Path, conf: Configuration): Unit = {
     if (orcFilterPushDown) {
-      OrcUtils.readCatalystSchema(filePath, conf, ignoreCorruptFiles).map { fileSchema =>
+      OrcUtils.readCatalystSchema(filePath, conf, ignoreCorruptFiles).foreach { fileSchema =>
        OrcFilters.createFilter(fileSchema, filters).foreach { f =>
          OrcInputFormat.setSearchArgument(conf, f, fileSchema.fieldNames)
        }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
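A note on the `map` vs `foreach` point above: the sketch below is a minimal, standalone Scala illustration (not part of the patch) of the failure mode the commit message describes. On a lazy collection such as a view, a discarded `map` never runs its body, while `foreach` is strict. `OrcUtils.readCatalystSchema` appears to return an `Option`, whose `map` is eager, which matches the commit's note that the current code causes no bug yet.

```scala
// Minimal sketch: why foreach is preferred over map for side effects.
object MapVsForeach {
  def main(args: Array[String]): Unit = {
    var applied = 0

    // On a lazy view, map only records the transformation; since the
    // resulting view is discarded and never forced, the body never runs.
    Seq(1, 2, 3).view.map { _ => applied += 1 }
    println(s"after discarded map on a view: applied = $applied") // 0

    // foreach is evaluated strictly, purely for its side effect.
    Seq(1, 2, 3).view.foreach { _ => applied += 1 }
    println(s"after foreach on a view: applied = $applied") // 3
  }
}
```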
[spark] branch master updated (72da6f8 -> 94d648d)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 72da6f8  [SPARK-33002][PYTHON] Remove non-API annotations
     add 94d648d  [SPARK-33036][SQL] Refactor RewriteCorrelatedScalarSubquery code to replace exprIds in a bottom-up manner

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/optimizer/subquery.scala | 80 ++
 1 file changed, 51 insertions(+), 29 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (1299c8a -> 5af62a2)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 1299c8a  [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
     add 5af62a2  [SPARK-33052][SQL][TEST] Make all the database versions up-to-date for integration tests

No new revisions were added by this update.

Summary of changes:
 .../src/test/resources/mariadb_docker_entrypoint.sh           |  2 +-
 .../scala/org/apache/spark/sql/jdbc/DB2IntegrationSuite.scala |  9 -
 .../org/apache/spark/sql/jdbc/DB2KrbIntegrationSuite.scala    |  9 -
 .../apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala    |  4 +++-
 .../apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala   | 10 +-
 .../org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala     | 11 +-
 .../org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala  |  9 -
 .../apache/spark/sql/jdbc/PostgresKrbIntegrationSuite.scala   |  9 -
 8 files changed, 54 insertions(+), 9 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (9b88aca -> 82721ce)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 9b88aca  [SPARK-33030][R] Add nth_value to SparkR
     add 82721ce  [SPARK-32741][SQL][FOLLOWUP] Run plan integrity check only for effective plan changes

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (8657742 -> d6f3138)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 8657742  [SPARK-32996][WEB-UI][FOLLOWUP] Move ExecutorSummarySuite to proper path
     add d6f3138  [SPARK-32859][SQL] Introduce physical rule to decide bucketing dynamically

No new revisions were added by this update.

Summary of changes:
 .../catalyst/optimizer/CostBasedJoinReorder.scala  |   2 +-
 .../org/apache/spark/sql/internal/SQLConf.scala    |  13 ++
 .../spark/sql/execution/DataSourceScanExec.scala   |  37 ++--
 .../spark/sql/execution/QueryExecution.scala       |   3 +-
 .../bucketing/DisableUnnecessaryBucketedScan.scala | 161 +++
 .../org/apache/spark/sql/DataFrameJoinSuite.scala  |   2 +-
 .../scala/org/apache/spark/sql/SubquerySuite.scala |   2 +-
 .../DisableUnnecessaryBucketedScanSuite.scala      | 221 +
 ...ecessaryBucketedScanWithHiveSupportSuite.scala} |   5 +-
 9 files changed, 427 insertions(+), 19 deletions(-)
 create mode 100644 sql/core/src/main/scala/org/apache/spark/sql/execution/bucketing/DisableUnnecessaryBucketedScan.scala
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/sources/DisableUnnecessaryBucketedScanSuite.scala
 copy sql/hive/src/test/scala/org/apache/spark/sql/sources/{BucketedReadWithHiveSupportSuite.scala => DisableUnnecessaryBucketedScanWithHiveSupportSuite.scala} (89%)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
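The summary above only lists changed files. As a hedged sketch of how the new physical rule can be exercised (the conf key below is assumed to be the SQLConf entry this PR adds; verify it against your Spark version), a bucketed table scanned by a query that makes no use of the bucket column can fall back to a plain file scan:

```scala
// Hedged sketch, not from the patch. The conf key
// "spark.sql.sources.bucketing.autoBucketedScan.enabled" is an assumption
// based on the SQLConf change in the file summary; check your Spark version.
import org.apache.spark.sql.SparkSession

object AutoBucketedScanSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("auto-bucketed-scan-sketch")
      .getOrCreate()
    import spark.implicits._

    // Write a table bucketed by "id".
    Seq((1, "a"), (2, "b"), (3, "a")).toDF("id", "v")
      .write.bucketBy(8, "id").saveAsTable("bucketed_t")

    spark.conf.set("spark.sql.sources.bucketing.autoBucketedScan.enabled", "true") // assumed key

    // The grouping key is not the bucket column, so the bucketed layout
    // gives no benefit and the rule may plan a regular (non-bucketed) scan.
    spark.sql("SELECT v, count(*) FROM bucketed_t GROUP BY v").explain()
  }
}
```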
[spark] branch branch-3.0 updated: [SQL][DOC][MINOR] Corrects input table names in the examples of CREATE FUNCTION doc
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new bc29602  [SQL][DOC][MINOR] Corrects input table names in the examples of CREATE FUNCTION doc

bc29602 is described below

commit bc29602e740393aa00c3154986000eaf1be2f965
Author: iRakson
AuthorDate: Thu Oct 1 20:50:16 2020 +0900

    [SQL][DOC][MINOR] Corrects input table names in the examples of CREATE FUNCTION doc

    ### What changes were proposed in this pull request?

    Fix a typo.

    ### Why are the changes needed?

    To maintain consistency: the correct table name should be used in the SELECT command.

    ### Does this PR introduce _any_ user-facing change?

    Yes. The CREATE FUNCTION doc now shows the correct table name.

    ### How was this patch tested?

    Manually; doc changes only.

    Closes #29920 from iRakson/fixTypo.

    Authored-by: iRakson
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit d3dbe1a9076c8a76be0590ca071bfbec6114813b)
    Signed-off-by: Takeshi Yamamuro
---
 docs/sql-ref-syntax-ddl-create-function.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/sql-ref-syntax-ddl-create-function.md b/docs/sql-ref-syntax-ddl-create-function.md
index aa6c1fa..dfa4f4f 100644
--- a/docs/sql-ref-syntax-ddl-create-function.md
+++ b/docs/sql-ref-syntax-ddl-create-function.md
@@ -112,7 +112,7 @@ SHOW USER FUNCTIONS;
 +------------------+

 -- Invoke the function. Every selected value should be incremented by 10.
-SELECT simple_udf(c1) AS function_return_value FROM t1;
+SELECT simple_udf(c1) AS function_return_value FROM test;
 +---------------------+
 |function_return_value|
 +---------------------+
@@ -150,7 +150,7 @@ CREATE OR REPLACE FUNCTION simple_udf AS 'SimpleUdfR'
     USING JAR '/tmp/SimpleUdfR.jar';

 -- Invoke the function. Every selected value should be incremented by 20.
-SELECT simple_udf(c1) AS function_return_value FROM t1;
+SELECT simple_udf(c1) AS function_return_value FROM test;
 +---------------------+
 |function_return_value|
 +---------------------+

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
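The fix above swaps `t1` for `test` in the doc's SELECT examples. A hedged end-to-end sketch of the corrected flow follows; `SimpleUdfR` and `/tmp/SimpleUdfR.jar` are the doc's illustrative artifacts and must exist for the CREATE FUNCTION step to succeed:

```scala
// Sketch of the corrected doc flow: the SELECT targets the table created
// earlier in the example ("test"), not "t1". The UDF class and jar are the
// doc's example artifacts, not something shipped with Spark.
import org.apache.spark.sql.SparkSession

object CreateFunctionDocFlow {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("create-function-doc-flow")
      .enableHiveSupport() // persistent functions need a persistent catalog
      .getOrCreate()

    spark.sql("CREATE TABLE test (c1 INT)")
    spark.sql("INSERT INTO test VALUES (1), (2)")
    spark.sql("CREATE OR REPLACE FUNCTION simple_udf AS 'SimpleUdfR' USING JAR '/tmp/SimpleUdfR.jar'")
    // The corrected example: query "test", the table that actually exists.
    spark.sql("SELECT simple_udf(c1) AS function_return_value FROM test").show()
  }
}
```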
[spark] branch master updated (5651284 -> d3dbe1a)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 5651284  [SPARK-32992][SQL] Map Oracle's ROWID type to StringType in read via JDBC
     add d3dbe1a  [SQL][DOC][MINOR] Corrects input table names in the examples of CREATE FUNCTION doc

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-ddl-create-function.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (28ed3a5 -> 5651284)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 28ed3a5  [SPARK-32723][WEBUI] Upgrade to jQuery 3.5.1
     add 5651284  [SPARK-32992][SQL] Map Oracle's ROWID type to StringType in read via JDBC

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala   | 11 +++
 .../main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala |  6 ++
 2 files changed, 17 insertions(+)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
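SPARK-32992 maps Oracle's ROWID to StringType on JDBC reads. A hedged sketch of what this enables (connection details are placeholders, and an Oracle JDBC driver must be on the classpath):

```scala
// Illustrative only: URL, credentials, and table are placeholders. Per the
// PR title, a selected ROWID column now arrives typed as string.
import org.apache.spark.sql.SparkSession

object OracleRowIdRead {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("oracle-rowid-read")
      .getOrCreate()

    val df = spark.read.format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/SERVICE") // placeholder
      .option("user", "user")                                   // placeholder
      .option("password", "secret")                             // placeholder
      .option("query", "SELECT ROWID AS row_id, name FROM employees")
      .load()

    df.printSchema() // row_id: string
  }
}
```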
[spark] branch branch-3.0 updated: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new db6ba04  [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

db6ba04 is described below

commit db6ba049c43e2aa1521ed39c9f2b802ad04d111f
Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
AuthorDate: Thu Oct 1 08:15:53 2020 +0900

    [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

    ### What changes were proposed in this pull request?

    Update the sql-ref docs; the following keywords are added in this PR:
    - CLUSTERED BY
    - SORTED BY
    - INTO num_buckets BUCKETS

    ### Why are the changes needed?

    To let more users know how to use these SQL keywords.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ![image](https://user-images.githubusercontent.com/46367746/94428281-0a6b8080-01c3-11eb-9ff3-899f8da602ca.png)
    ![image](https://user-images.githubusercontent.com/46367746/94428285-0d667100-01c3-11eb-8a54-90e7641d917b.png)
    ![image](https://user-images.githubusercontent.com/46367746/94428288-0f303480-01c3-11eb-9e1d-023538aa6e2d.png)

    ### How was this patch tested?

    Generated the HTML docs and checked them manually.

    Closes #29883 from GuoPhilipse/add-sql-missing-keywords.

    Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
    Co-authored-by: GuoPhilipse
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 3bdbb5546d2517dda6f71613927cc1783c87f319)
    Signed-off-by: Takeshi Yamamuro
---
 docs/sql-ref-syntax-ddl-create-table-datasource.md |  7 -
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 32 ++
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md
index d334447..ba0516a 100644
--- a/docs/sql-ref-syntax-ddl-create-table-datasource.md
+++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md
@@ -67,7 +67,12 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTIES.

 * **SORTED BY**

-    Determines the order in which the data is stored in buckets. Default is Ascending order.
+    Specifies an ordering of bucket columns. Optionally, one can use ASC for an ascending order or DESC for a descending order after any column names in the SORTED BY clause.
+    If not specified, ASC is assumed by default.
+
+* **INTO num_buckets BUCKETS**
+
+    Specifies buckets numbers, which is used in `CLUSTERED BY` clause.

 * **LOCATION**

diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
index 7bf847d..3a8c8d5 100644
--- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
+++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
@@ -31,6 +31,9 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier
     [ COMMENT table_comment ]
     [ PARTITIONED BY ( col_name2[:] col_type2 [ COMMENT col_comment2 ], ... )
         | ( col_name1, col_name2, ... ) ]
+    [ CLUSTERED BY ( col_name1, col_name2, ...)
+        [ SORTED BY ( col_name1 [ ASC | DESC ], col_name2 [ ASC | DESC ], ... ) ]
+        INTO num_buckets BUCKETS ]
     [ ROW FORMAT row_format ]
     [ STORED AS file_format ]
     [ LOCATION path ]
@@ -65,6 +68,21 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTIES.

     Partitions are created on the table, based on the columns specified.

+* **CLUSTERED BY**
+
+    Partitions created on the table will be bucketed into fixed buckets based on the column specified for bucketing.
+
+    **NOTE:** Bucketing is an optimization technique that uses buckets (and bucketing columns) to determine data partitioning and avoid data shuffle.
+
+* **SORTED BY**
+
+    Specifies an ordering of bucket columns. Optionally, one can use ASC for an ascending order or DESC for a descending order after any column names in the SORTED BY clause.
+    If not specified, ASC is assumed by default.
+
+* **INTO num_buckets BUCKETS**
+
+    Specifies buckets numbers, which is used in `CLUSTERED BY` clause.
+
 * **row_format**

     Use the `SERDE` clause to specify a custom SerDe for one table. Otherwise, use the `DELIMITED` clause to use the native SerDe and specify the delimiter, escape character, null character and so on.

@@ -203,6 +221,20 @@ CREATE EXTERNAL TABLE family (id INT, name STRING)
     STORED AS INPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleInputFormat'
     OUTPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleOutputFormat'
     LOCATION '/tmp/family/';
+
+--Use `CLUSTERED BY` clause to create bucket table without `SORTED BY`
+CREATE TABLE clustered_by_test1 (ID INT, AGE STRING)
+    CLUSTERED BY (ID)
[spark] branch branch-3.0 updated: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new db6ba04  [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
db6ba04 is described below

commit db6ba049c43e2aa1521ed39c9f2b802ad04d111f
Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
AuthorDate: Thu Oct 1 08:15:53 2020 +0900

    [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

    ### What changes were proposed in this pull request?
    Update the sql-ref docs. This PR adds the following missing keywords:
    CLUSTERED BY
    SORTED BY
    INTO num_buckets BUCKETS

    ### Why are the changes needed?
    To help more users understand how to use these SQL keywords.

    ### Does this PR introduce _any_ user-facing change?
    No

    ![image](https://user-images.githubusercontent.com/46367746/94428281-0a6b8080-01c3-11eb-9ff3-899f8da602ca.png)
    ![image](https://user-images.githubusercontent.com/46367746/94428285-0d667100-01c3-11eb-8a54-90e7641d917b.png)
    ![image](https://user-images.githubusercontent.com/46367746/94428288-0f303480-01c3-11eb-9e1d-023538aa6e2d.png)

    ### How was this patch tested?
    Generated the HTML docs and checked the rendered pages.

    Closes #29883 from GuoPhilipse/add-sql-missing-keywords.

    Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
    Co-authored-by: GuoPhilipse
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 3bdbb5546d2517dda6f71613927cc1783c87f319)
    Signed-off-by: Takeshi Yamamuro
---
 docs/sql-ref-syntax-ddl-create-table-datasource.md |  7 -
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 32 ++
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md
index d334447..ba0516a 100644
--- a/docs/sql-ref-syntax-ddl-create-table-datasource.md
+++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md
@@ -67,7 +67,12 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI

 * **SORTED BY**

-    Determines the order in which the data is stored in buckets. Default is Ascending order.
+    Specifies an ordering of bucket columns. Optionally, one can use ASC for an ascending order or DESC for a descending order after any column names in the SORTED BY clause.
+    If not specified, ASC is assumed by default.
+
+* **INTO num_buckets BUCKETS**
+
+    Specifies the number of buckets, which is used in the `CLUSTERED BY` clause.

 * **LOCATION**

diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
index 7bf847d..3a8c8d5 100644
--- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
+++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
@@ -31,6 +31,9 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier
     [ COMMENT table_comment ]
     [ PARTITIONED BY ( col_name2[:] col_type2 [ COMMENT col_comment2 ], ... )
        | ( col_name1, col_name2, ... ) ]
+    [ CLUSTERED BY ( col_name1, col_name2, ...)
+        [ SORTED BY ( col_name1 [ ASC | DESC ], col_name2 [ ASC | DESC ], ... ) ]
+        INTO num_buckets BUCKETS ]
     [ ROW FORMAT row_format ]
     [ STORED AS file_format ]
     [ LOCATION path ]
@@ -65,6 +68,21 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI

     Partitions are created on the table, based on the columns specified.

+* **CLUSTERED BY**
+
+    Partitions created on the table will be bucketed into a fixed number of buckets based on the columns specified for bucketing.
+
+    **NOTE:** Bucketing is an optimization technique that uses buckets (and bucketing columns) to determine data partitioning and avoid data shuffle.
+
+* **SORTED BY**
+
+    Specifies an ordering of bucket columns. Optionally, one can use ASC for an ascending order or DESC for a descending order after any column names in the SORTED BY clause.
+    If not specified, ASC is assumed by default.
+
+* **INTO num_buckets BUCKETS**
+
+    Specifies the number of buckets, which is used in the `CLUSTERED BY` clause.
+
 * **row_format**

     Use the `SERDE` clause to specify a custom SerDe for one table. Otherwise, use the `DELIMITED` clause to use the native SerDe and specify the delimiter, escape character, null character and so on.

@@ -203,6 +221,20 @@ CREATE EXTERNAL TABLE family (id INT, name STRING)
 STORED AS INPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleInputFormat'
 OUTPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleOutputFormat'
 LOCATION '/tmp/family/';
+
+--Use the `CLUSTERED BY` clause to create a bucketed table without `SORTED BY`
+CREATE TABLE clustered_by_test1 (ID INT, AGE STRING)
+CLUSTERED BY (ID)
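The quoted example above is cut off mid-statement. As a self-contained sketch of the documented clauses working together (the table and column names are invented, and a Hive-enabled local session is assumed, since plain `CREATE TABLE` without `USING` defaults to the Hive format here):

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: a table bucketed by id into 4 buckets, with rows inside
// each bucket sorted by age descending. Names (bucketing_demo, id, age)
// are illustrative only.
val spark = SparkSession.builder()
  .appName("bucketing-demo")
  .master("local[*]")
  .enableHiveSupport() // assumed: Hive catalog available for Hive-format tables
  .getOrCreate()

spark.sql(
  """CREATE TABLE IF NOT EXISTS bucketing_demo (id INT, age STRING)
    |CLUSTERED BY (id)
    |SORTED BY (age DESC)
    |INTO 4 BUCKETS""".stripMargin)

// The bucketing spec should then be visible in the table metadata.
spark.sql("DESCRIBE EXTENDED bucketing_demo").show(truncate = false)
```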
[spark] branch master updated (ece8d8e -> 3bdbb55)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from ece8d8e [SPARK-33006][K8S][DOCS] Add dynamic PVC usage example into K8s doc add 3bdbb55 [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs No new revisions were added by this update. Summary of changes: docs/sql-ref-syntax-ddl-create-table-datasource.md | 7 - docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 32 ++ 2 files changed, 38 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (cc06266 -> 3a299aa)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from cc06266 [SPARK-33019][CORE] Use spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1 by default add 3a299aa [SPARK-32741][SQL] Check if the same ExprId refers to the unique attribute in logical plans No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 11 +++- .../spark/sql/catalyst/optimizer/Optimizer.scala | 15 +++-- .../spark/sql/catalyst/optimizer/subquery.scala| 51 +--- .../sql/catalyst/plans/logical/LogicalPlan.scala | 70 ++ .../optimizer/FoldablePropagationSuite.scala | 4 +- .../plans/logical/LogicalPlanIntegritySuite.scala | 51 .../sql/execution/adaptive/AQEOptimizer.scala | 8 ++- .../apache/spark/sql/streaming/StreamSuite.scala | 7 +-- 8 files changed, 181 insertions(+), 36 deletions(-) create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlanIntegritySuite.scala - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
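Since the summary above only lists changed files, a rough standalone sketch of the invariant being introduced may help. This is illustrative only, not the actual helpers added to LogicalPlan.scala: the idea is that one `ExprId` should never stand for two structurally different attributes anywhere in a plan.

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Illustrative sketch: gather the output attributes of every operator in the
// plan, group them by ExprId, and check that a given ExprId always denotes
// the same (name, dataType) pair. A mismatch means an ExprId was reused for
// a different attribute, which is the corruption this check guards against.
def exprIdsAreConsistent(plan: LogicalPlan): Boolean = {
  val attrs = plan.collect { case p => p.output }.flatten
  attrs.groupBy(_.exprId).values.forall { sameId =>
    sameId.map(a => (a.name, a.dataType)).distinct.size <= 1
  }
}
```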
[spark] branch master updated (b53da23 -> acfee3c)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b53da23 [MINOR][SQL] Improve examples for `percentile_approx()` add acfee3c [SPARK-32870][DOCS][SQL] Make sure that all expressions have their ExpressionDescription filled No new revisions were added by this update. Summary of changes: .../sql/catalyst/analysis/FunctionRegistry.scala | 2 +- .../expressions/CallMethodViaReflection.scala | 3 +- .../spark/sql/catalyst/expressions/Cast.scala | 3 +- .../expressions/MonotonicallyIncreasingID.scala| 8 +- .../catalyst/expressions/SparkPartitionID.scala| 8 +- .../expressions/aggregate/CountMinSketchAgg.scala | 7 ++ .../expressions/aggregate/bitwiseAggregates.scala | 1 + .../sql/catalyst/expressions/arithmetic.scala | 38 ++--- .../catalyst/expressions/bitwiseExpressions.scala | 12 ++- .../expressions/collectionOperations.scala | 18 +++-- .../catalyst/expressions/complexTypeCreator.scala | 20 +++-- .../expressions/conditionalExpressions.scala | 6 +- .../catalyst/expressions/datetimeExpressions.scala | 40 +- .../sql/catalyst/expressions/generators.scala | 12 ++- .../spark/sql/catalyst/expressions/hash.scala | 18 +++-- .../sql/catalyst/expressions/inputFileBlock.scala | 27 ++- .../sql/catalyst/expressions/jsonExpressions.scala | 6 +- .../spark/sql/catalyst/expressions/misc.scala | 16 +++- .../sql/catalyst/expressions/predicates.scala | 61 --- .../catalyst/expressions/windowExpressions.scala | 91 ++ .../spark/sql/catalyst/expressions/xml/xpath.scala | 24 -- .../sql-functions/sql-expression-schema.md | 48 ++-- .../test/resources/sql-tests/results/cast.sql.out | 2 + .../apache/spark/sql/ExpressionsSchemaSuite.scala | 10 ++- .../spark/sql/execution/command/DDLSuite.scala | 6 +- .../sql/expressions/ExpressionInfoSuite.scala | 37 - 26 files changed, 404 insertions(+), 120 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
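For context, `ExpressionDescription` is the annotation whose fields (usage, examples, notes, since) feed `DESCRIBE FUNCTION` and the generated `sql-expression-schema.md`; this commit audits that the built-in expressions fill it in. A quick way to inspect that metadata, assuming a local Spark session (`abs` is just an arbitrary built-in):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("expr-docs").master("local[*]").getOrCreate()

// The usage and example text printed here comes straight from the
// expression's ExpressionDescription annotation.
spark.sql("DESCRIBE FUNCTION EXTENDED abs").show(truncate = false)
```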
[spark] branch master updated (4ced588 -> 68e0d5f)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 4ced588 [SPARK-32635][SQL] Fix foldable propagation add 68e0d5f [SPARK-32902][SQL] Logging plan changes for AQE No new revisions were added by this update. Summary of changes: .../execution/adaptive/AdaptiveSparkPlanExec.scala | 45 +- .../adaptive/AdaptiveQueryExecSuite.scala | 20 ++ 2 files changed, 56 insertions(+), 9 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
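Judging from the changed files, this wires AQE's re-optimized plans into Spark's existing plan-change logging. A hedged sketch of how one might surface those logs; note that the log-level config key is an assumption here, since its name has varied across releases (`spark.sql.optimizer.planChangeLog.level` in earlier versions, `spark.sql.planChangeLog.level` later):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().appName("aqe-plan-log").master("local[*]").getOrCreate()

// Enable AQE; with the plan-change log level raised, plan rewrites made
// during adaptive execution are emitted to the logs as the query runs.
// ASSUMPTION: the exact config key below depends on your Spark version.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.planChangeLog.level", "WARN")

// Any shuffle-inducing query exercises AQE's re-optimization.
spark.range(1000).groupBy((col("id") % 7).as("k")).count().collect()
```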
[spark] branch branch-3.0 updated: [SPARK-32635][SQL] Fix foldable propagation
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new ecc2f5d  [SPARK-32635][SQL] Fix foldable propagation
ecc2f5d is described below

commit ecc2f5d9e227b62f418d65708f516ffe8e690f96
Author: Peter Toth
AuthorDate: Fri Sep 18 08:17:23 2020 +0900

    [SPARK-32635][SQL] Fix foldable propagation

    ### What changes were proposed in this pull request?
    This PR rewrites the `FoldablePropagation` rule to replace attribute references in a node with foldables coming only from the node's children.

    Before this PR, in the case of this example (with `spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation` set):
    ```scala
    val a = Seq("1").toDF("col1").withColumn("col2", lit("1"))
    val b = Seq("2").toDF("col1").withColumn("col2", lit("2"))
    val aub = a.union(b)
    val c = aub.filter($"col1" === "2").cache()
    val d = Seq("2").toDF("col4")
    val r = d.join(aub, $"col2" === $"col4").select("col4")
    val l = c.select("col2")
    val df = l.join(r, $"col2" === $"col4", "LeftOuter")
    df.show()
    ```
    foldable propagation happens incorrectly:
    ```
     Join LeftOuter, (col2#6 = col4#34)   Join LeftOuter, (col2#6 = col4#34)
    !:- Project [col2#6]   :- Project [1 AS col2#6]
     :  +- InMemoryRelation [col1#4, col2#6], StorageLevel(disk, memory, deserialized, 1 replicas)   :  +- InMemoryRelation [col1#4, col2#6], StorageLevel(disk, memory, deserialized, 1 replicas)
     :     +- Union   :     +- Union
     :        :- *(1) Project [value#1 AS col1#4, 1 AS col2#6]   :        :- *(1) Project [value#1 AS col1#4, 1 AS col2#6]
     :        :  +- *(1) Filter (isnotnull(value#1) AND (value#1 = 2))   :        :  +- *(1) Filter (isnotnull(value#1) AND (value#1 = 2))
     :        :     +- *(1) LocalTableScan [value#1]   :        :     +- *(1) LocalTableScan [value#1]
     :        +- *(2) Project [value#10 AS col1#13, 2 AS col2#15]   :        +- *(2) Project [value#10 AS col1#13, 2 AS col2#15]
     :           +- *(2) Filter (isnotnull(value#10) AND (value#10 = 2))   :           +- *(2) Filter (isnotnull(value#10) AND (value#10 = 2))
     :              +- *(2) LocalTableScan [value#10]   :              +- *(2) LocalTableScan [value#10]
     +- Project [col4#34]   +- Project [col4#34]
        +- Join Inner, (col2#6 = col4#34)      +- Join Inner, (col2#6 = col4#34)
           :- Project [value#31 AS col4#34]         :- Project [value#31 AS col4#34]
           :  +- LocalRelation [value#31]         :  +- LocalRelation [value#31]
           +- Project [col2#6]         +- Project [col2#6]
              +- Union false, false            +- Union false, false
                 :- Project [1 AS col2#6]               :- Project [1 AS col2#6]
                 :  +- LocalRelation [value#1]               :  +- LocalRelation [value#1]
                 +- Project [2 AS col2#15]               +- Project [2 AS col2#15]
                    +- LocalRelation [value#10]                  +- LocalRelation [value#10]
    ```
    and so the result is wrong:
    ```
    +----+----+
    |col2|col4|
    +----+----+
    |   1|null|
    +----+----+
    ```

    After this PR foldable propagation will not happen incorrectly and the result is correct:
    ```
    +----+----+
    |col2|col4|
    +----+----+
    |   2|   2|
    +----+----+
    ```

    ### Why are the changes needed?
    To fix a correctness issue.

    ### Does this PR introduce _any_ user-facing change?
    Yes, fixes a correctness issue.

    ###
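The core constraint of the rewrite, that only foldables produced by a node's direct children may be substituted into that node, can be sketched roughly as follows. This is illustrative only and not the rule's actual body, which also has to handle unions, outer joins, and other operators that can change attribute values:

```scala
import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeMap}
import org.apache.spark.sql.catalyst.plans.logical.Project

// Illustrative sketch: collect the foldable aliases a Project child exposes.
// Only these, gathered per direct child, are safe substitution candidates in
// the parent operator; harvesting foldables from anywhere deeper in the
// subtree is what produced the wrong results above.
def foldableAliases(child: Project): AttributeMap[Alias] = AttributeMap(
  child.projectList.collect {
    case a: Alias if a.child.foldable => (a.toAttribute, a)
  }
)
```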
[spark] branch master updated (ea3b979 -> 4ced588)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from ea3b979 [SPARK-32889][SQL] orc table column name supports special characters add 4ced588 [SPARK-32635][SQL] Fix foldable propagation No new revisions were added by this update. Summary of changes: .../sql/catalyst/expressions/AttributeMap.scala| 2 + .../sql/catalyst/expressions/AttributeMap.scala| 2 + .../spark/sql/catalyst/optimizer/expressions.scala | 121 - .../org/apache/spark/sql/DataFrameSuite.scala | 12 ++ 4 files changed, 88 insertions(+), 49 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-32635][SQL] Fix foldable propagation
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 4ced588  [SPARK-32635][SQL] Fix foldable propagation
4ced588 is described below

commit 4ced58862c707aa916f7a55d15c3887c94c9b210
Author: Peter Toth
AuthorDate: Fri Sep 18 08:17:23 2020 +0900

[SPARK-32635][SQL] Fix foldable propagation

### What changes were proposed in this pull request?

This PR rewrites the `FoldablePropagation` rule to replace attribute references in a node with foldables coming only from the node's children.

Before this PR, in the case of this example (with `spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation` set):

```scala
val a = Seq("1").toDF("col1").withColumn("col2", lit("1"))
val b = Seq("2").toDF("col1").withColumn("col2", lit("2"))
val aub = a.union(b)
val c = aub.filter($"col1" === "2").cache()
val d = Seq("2").toDF("col4")
val r = d.join(aub, $"col2" === $"col4").select("col4")
val l = c.select("col2")
val df = l.join(r, $"col2" === $"col4", "LeftOuter")
df.show()
```

foldable propagation happens incorrectly (a two-column before/after plan comparison; `!` marks the line the rule changed):

```
 Join LeftOuter, (col2#6 = col4#34)   Join LeftOuter, (col2#6 = col4#34)
!:- Project [col2#6]   :- Project [1 AS col2#6]
 :  +- InMemoryRelation [col1#4, col2#6], StorageLevel(disk, memory, deserialized, 1 replicas)   :  +- InMemoryRelation [col1#4, col2#6], StorageLevel(disk, memory, deserialized, 1 replicas)
 :     +- Union   :     +- Union
 :        :- *(1) Project [value#1 AS col1#4, 1 AS col2#6]   :        :- *(1) Project [value#1 AS col1#4, 1 AS col2#6]
 :        :  +- *(1) Filter (isnotnull(value#1) AND (value#1 = 2))   :        :  +- *(1) Filter (isnotnull(value#1) AND (value#1 = 2))
 :        :     +- *(1) LocalTableScan [value#1]   :        :     +- *(1) LocalTableScan [value#1]
 :        +- *(2) Project [value#10 AS col1#13, 2 AS col2#15]   :        +- *(2) Project [value#10 AS col1#13, 2 AS col2#15]
 :           +- *(2) Filter (isnotnull(value#10) AND (value#10 = 2))   :           +- *(2) Filter (isnotnull(value#10) AND (value#10 = 2))
 :              +- *(2) LocalTableScan [value#10]   :              +- *(2) LocalTableScan [value#10]
 +- Project [col4#34]   +- Project [col4#34]
    +- Join Inner, (col2#6 = col4#34)      +- Join Inner, (col2#6 = col4#34)
       :- Project [value#31 AS col4#34]         :- Project [value#31 AS col4#34]
       :  +- LocalRelation [value#31]         :  +- LocalRelation [value#31]
       +- Project [col2#6]         +- Project [col2#6]
          +- Union false, false            +- Union false, false
             :- Project [1 AS col2#6]               :- Project [1 AS col2#6]
             :  +- LocalRelation [value#1]               :  +- LocalRelation [value#1]
             +- Project [2 AS col2#15]               +- Project [2 AS col2#15]
                +- LocalRelation [value#10]                  +- LocalRelation [value#10]
```

and so the result is wrong:

```
+----+----+
|col2|col4|
+----+----+
|   1|null|
+----+----+
```

After this PR, foldable propagation no longer happens incorrectly and the result is correct:

```
+----+----+
|col2|col4|
+----+----+
|   2|   2|
+----+----+
```

### Why are the changes needed?

To fix a correctness issue.

### Does this PR introduce _any_ user-facing change?

Yes, it fixes a correctness issue.

### How was this patch tested?
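[Editor's note] To make the fix's core idea concrete, here is a minimal, self-contained sketch of child-scoped foldable propagation. It is an illustration under simplified assumptions, not Spark's actual `FoldablePropagation` implementation: the `Plan`, `Expr`, and `propagate` names are hypothetical.

```scala
// Hypothetical toy model: a plan node exposes named output expressions.
sealed trait Expr
case class Attr(name: String) extends Expr   // a reference to a column
case class Lit(value: Any) extends Expr      // a foldable (constant) value

case class Plan(output: Map[String, Expr], children: Seq[Plan]) {
  // Foldables this node exposes: output columns bound directly to literals.
  def foldables: Map[String, Lit] =
    output.collect { case (name, l: Lit) => name -> l }
}

// Replace attribute references only with foldables collected from the
// node's direct children. The bug was substituting foldables gathered
// elsewhere in the tree (e.g. from under another branch of a join),
// which is unsound once operators like caching or outer joins intervene.
def propagate(plan: Plan): Plan = {
  val newChildren = plan.children.map(propagate)            // rewrite bottom-up
  val fromChildren = newChildren.flatMap(_.foldables).toMap // child foldables only
  val rewritten = plan.output.map {
    case (name, Attr(ref)) if fromChildren.contains(ref) => name -> fromChildren(ref)
    case other => other
  }
  Plan(rewritten, newChildren)
}
```

Restricting the substitution scope to direct children is what prevents a literal such as `1 AS col2#6` from leaking across the cached union in the repro above.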
[spark] branch branch-3.0 updated: [SPARK-32688][SQL][TEST] Add special values to LiteralGenerator for float and double
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new cb6a0d0  [SPARK-32688][SQL][TEST] Add special values to LiteralGenerator for float and double
cb6a0d0 is described below

commit cb6a0d08cc020d9a2c19173c9023a9f5e565dd6c
Author: Tanel Kiis
AuthorDate: Wed Sep 16 12:13:15 2020 +0900

[SPARK-32688][SQL][TEST] Add special values to LiteralGenerator for float and double

### What changes were proposed in this pull request?

The `LiteralGenerator` for the float and double data types was supposed to yield special values (NaN, +-inf) among others, but `Gen.chooseNum` does not yield values that are outside the defined range. Moreover, over a wide range of floats and doubles, `Gen.chooseNum` does not yield values in the "everyday" range, as noted in https://github.com/typelevel/scalacheck/issues/113 .

There is a similar class, `RandomDataGenerator`, used in some other tests; `-0.0` and `-0.0f` were added as special values there too.

These changes revealed an inconsistency in the equality check between `-0.0` and `0.0`.

### Why are the changes needed?

The `LiteralGenerator` is mostly used in the `checkConsistencyBetweenInterpretedAndCodegen` method in `MathExpressionsSuite`. This change would have caught the bug fixed in #29495 .

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Locally reverted #29495 and verified that the existing test cases caught the bug.

Closes #29515 from tanelk/SPARK-32688.

Authored-by: Tanel Kiis
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit 6051755bfe23a0e4564bf19476ec34cd7fd6008d)
Signed-off-by: Takeshi Yamamuro
---
 .../org/apache/spark/sql/RandomDataGenerator.scala    |  4 ++--
 .../sql/catalyst/expressions/LiteralGenerator.scala   | 19 +++
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
index 6a5bdc4..3e2dc3f 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
@@ -260,10 +260,10 @@ object RandomDataGenerator {
         new MathContext(precision)).bigDecimal)
     case DoubleType => randomNumeric[Double](
       rand, r => longBitsToDouble(r.nextLong()), Seq(Double.MinValue, Double.MinPositiveValue,
-        Double.MaxValue, Double.PositiveInfinity, Double.NegativeInfinity, Double.NaN, 0.0))
+        Double.MaxValue, Double.PositiveInfinity, Double.NegativeInfinity, Double.NaN, 0.0, -0.0))
     case FloatType => randomNumeric[Float](
       rand, r => intBitsToFloat(r.nextInt()), Seq(Float.MinValue, Float.MinPositiveValue,
-        Float.MaxValue, Float.PositiveInfinity, Float.NegativeInfinity, Float.NaN, 0.0f))
+        Float.MaxValue, Float.PositiveInfinity, Float.NegativeInfinity, Float.NaN, 0.0f, -0.0f))
     case ByteType => randomNumeric[Byte](
       rand, _.nextInt().toByte, Seq(Byte.MinValue, Byte.MaxValue, 0.toByte))
     case IntegerType => randomNumeric[Int](
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
index d92eb01..c8e3b0e 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
@@ -68,16 +68,27 @@ object LiteralGenerator {
   lazy val longLiteralGen: Gen[Literal] =
     for { l <- Arbitrary.arbLong.arbitrary } yield Literal.create(l, LongType)

+  // The floatLiteralGen and doubleLiteralGen will 50% of the time yield arbitrary values
+  // and 50% of the time will yield some special values that are more likely to reveal
+  // corner cases. This behavior is similar to the integral value generators.
   lazy val floatLiteralGen: Gen[Literal] =
     for {
-      f <- Gen.chooseNum(Float.MinValue / 2, Float.MaxValue / 2,
-        Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity)
+      f <- Gen.oneOf(
+        Gen.oneOf(
+          Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity, Float.MinPositiveValue,
+          Float.MaxValue, -Float.MaxValue, 0.0f, -0.0f, 1.0f, -1.0f),
+        Arbitrary.arbFloat.arbitrary
+      )
     } yield Literal.create(f, FloatType)

   lazy val doubleLiteralGen: Gen[Literal] =
     for {
-      f <- Gen.chooseNum(Double.MinValue / 2, Double.MaxValue / 2, -
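[Editor's note] The generator pattern in this patch, mixing a pool of hand-picked corner cases with a fully arbitrary generator, is reusable beyond Spark. A small self-contained ScalaCheck sketch follows; the property and value names are illustrative, not from the patch. It shows how such a generator surfaces the `0.0`/`-0.0` inconsistency the commit message mentions.

```scala
import org.scalacheck.{Arbitrary, Gen, Prop}

// 50/50 mix of hand-picked special values and fully arbitrary floats,
// mirroring the approach the patch takes in LiteralGenerator.
val floatWithSpecials: Gen[Float] = Gen.oneOf(
  Gen.oneOf(
    Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity,
    Float.MinPositiveValue, Float.MaxValue, -Float.MaxValue,
    0.0f, -0.0f, 1.0f, -1.0f),
  Arbitrary.arbFloat.arbitrary
)

// 0.0f == -0.0f is true for Float equality, yet the two values have
// different bit patterns, so bit-level comparisons disagree with ==.
// This property therefore FAILS once the generator yields both zeros --
// exactly the kind of corner case a plain Gen.chooseNum never produced.
val zeroSignProp: Prop =
  Prop.forAll(floatWithSpecials, floatWithSpecials) { (a, b) =>
    a != b || java.lang.Float.floatToRawIntBits(a) ==
      java.lang.Float.floatToRawIntBits(b)
  }
```

A generator restricted to a numeric range would pass this property for years; the special-value pool finds the counterexample on roughly the first few hundred cases.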
[spark] branch master updated (b46c730 -> 6051755)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b46c730  [SPARK-32704][SQL][TESTS][FOLLOW-UP] Check any physical rule instead of a specific rule in the test
     add 6051755  [SPARK-32688][SQL][TEST] Add special values to LiteralGenerator for float and double

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/RandomDataGenerator.scala    |  4 ++--
 .../sql/catalyst/expressions/LiteralGenerator.scala   | 19 +++
 2 files changed, 17 insertions(+), 6 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (5e82548 -> 7a17158)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 5e82548  [SPARK-32844][SQL] Make `DataFrameReader.table` take the specified options for datasource v1
     add 7a17158  [SPARK-32868][SQL] Add more order irrelevant aggregates to EliminateSorts

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/dsl/package.scala     |  6 +
 .../spark/sql/catalyst/optimizer/Optimizer.scala    |  2 +-
 .../catalyst/optimizer/EliminateSortsSuite.scala    | 26 --
 3 files changed, 26 insertions(+), 8 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
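[Editor's note] For readers new to `EliminateSorts`: the rule drops sorts whose ordering cannot affect the query result, and this commit widens the set of aggregates treated as order-irrelevant. A hedged illustration follows; `min`/`max` are classic order-irrelevant aggregates, while the exact list this commit adds lives in the diff and is not restated here.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{max, min}

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("eliminate-sorts-demo")
  .getOrCreate()
import spark.implicits._

val df = Seq(3, 1, 2).toDF("v")

// min and max do not depend on input order, so the global sort below is
// wasted work; an order-irrelevance-aware optimizer can remove it.
val agg = df.sort($"v".desc).agg(min($"v"), max($"v"))

// If EliminateSorts applies, the optimized logical plan contains no Sort node.
agg.explain(true)
```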
[spark] branch master updated (b4be6a6 -> 4269c2c)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b4be6a6  [SPARK-32845][SS][TESTS] Add sinkParameter to check sink options robustly in DataStreamReaderWriterSuite
     add 4269c2c  [SPARK-32851][SQL][TEST] Tests should fail if errors happen when generating projection code

No new revisions were added by this update.

Summary of changes:
 .../src/test/scala/org/apache/spark/sql/test/SharedSparkSession.scala  | 2 ++
 sql/hive/src/test/scala/org/apache/spark/sql/hive/test/TestHive.scala | 2 ++
 2 files changed, 4 insertions(+)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
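[Editor's note] By default, Spark falls back to interpreted execution when code generation fails, which can hide codegen bugs in tests. Below is a sketch of how a test session can be made to fail hard instead; the two conf keys shown (`spark.sql.codegen.fallback` and the internal, test-oriented `spark.sql.codegen.factoryMode`) are an assumption about what this commit sets, and the authoritative two lines are in the commit's diff itself.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Assumed conf keys -- see the commit diff for what Spark's test
// harness actually sets in SharedSparkSession and TestHive.
val conf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("codegen-strict-tests")
  // Do not fall back to interpreted mode when generated code fails to compile.
  .set("spark.sql.codegen.fallback", "false")
  // Testing knob: always use generated code for projections.
  .set("spark.sql.codegen.factoryMode", "CODEGEN_ONLY")

val spark = SparkSession.builder().config(conf).getOrCreate()
```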
[spark] branch branch-3.0 updated: [SPARK-32677][SQL][DOCS][MINOR] Improve code comment in CreateFunctionCommand
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new cf14897  [SPARK-32677][SQL][DOCS][MINOR] Improve code comment in CreateFunctionCommand
cf14897 is described below

commit cf14897d355efbf4acb3497ef1b74cd3a9c35d59
Author: Wenchen Fan
AuthorDate: Fri Sep 11 09:22:56 2020 +0900

[SPARK-32677][SQL][DOCS][MINOR] Improve code comment in CreateFunctionCommand

### What changes were proposed in this pull request?

We made a mistake in https://github.com/apache/spark/pull/29502, as there is no code comment to explain why we can't load the UDF class when creating functions. This PR improves the code comment.

### Why are the changes needed?

To avoid making the same mistake.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

N/A

Closes #29713 from cloud-fan/comment.

Authored-by: Wenchen Fan
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit 328d81a2d1131742bcfba5117896c093db39e721)
Signed-off-by: Takeshi Yamamuro
---
 .../main/scala/org/apache/spark/sql/execution/command/functions.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
index 6fdc7f4..d55d696 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
@@ -88,7 +88,9 @@ case class CreateFunctionCommand(
     } else {
       // For a permanent, we will store the metadata into underlying external catalog.
       // This function will be loaded into the FunctionRegistry when a query uses it.
-      // We do not load it into FunctionRegistry right now.
+      // We do not load it into FunctionRegistry right now, to avoid loading the resource and
+      // UDF class immediately, as the Spark application to create the function may not have
+      // access to the resource and/or UDF class.
       catalog.createFunction(func, ignoreIfExists)
     }
   }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
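[Editor's note] The behavior the new comment documents is observable from SQL. A hedged sketch, assuming an existing SparkSession named `spark`; the JAR path and UDF class below are made-up placeholders, not real artifacts.

```scala
// Registering a permanent function only writes metadata to the external
// catalog; the JAR is not fetched and the UDF class is not loaded here.
// So this succeeds even if THIS application cannot reach the JAR.
spark.sql("""
  CREATE FUNCTION my_upper AS 'com.example.udf.MyUpper'
  USING JAR 'hdfs://warehouse/udfs/my-udfs.jar'
""")

// The resource is loaded lazily by whichever application first uses the
// function in a query -- that application is the one that needs JAR access.
spark.sql("SELECT my_upper('hello')").show()
```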
[spark] branch master updated (5f468cc -> 328d81a)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 5f468cc  [SPARK-32822][SQL] Change the number of partitions to zero when a range is empty with WholeStageCodegen disabled or falled back
     add 328d81a  [SPARK-32677][SQL][DOCS][MINOR] Improve code comment in CreateFunctionCommand

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/sql/execution/command/functions.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org