[spark] branch master updated (7d6e3fb -> 5effa8e)

2020-10-08 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 7d6e3fb  [SPARK-33074][SQL] Classify dialect exceptions in JDBC v2 Table Catalog
 add 5effa8e  [SPARK-33091][SQL] Avoid using map instead of foreach to avoid potential side effect at callers of OrcUtils.readCatalystSchema

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala  | 2 +-
 .../sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-33091][SQL] Avoid using map instead of foreach to avoid potential side effect at callers of OrcUtils.readCatalystSchema

2020-10-08 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 782ab8e  [SPARK-33091][SQL] Avoid using map instead of foreach to avoid potential side effect at callers of OrcUtils.readCatalystSchema
782ab8e is described below

commit 782ab8e244252696c50b4b432d07a56c374b8680
Author: HyukjinKwon 
AuthorDate: Thu Oct 8 16:29:15 2020 +0900

[SPARK-33091][SQL] Avoid using map instead of foreach to avoid potential side effect at callers of OrcUtils.readCatalystSchema

### What changes were proposed in this pull request?

This is a followup of SPARK-32646; a new JIRA was filed to control the fixed versions properly.

When you use `map`, it might be lazily evaluated and not executed. To avoid this, we should use `foreach` instead. See also SPARK-16694. The current code does not appear to cause any bug for now, but it is best to fix it to avoid potential issues.
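
For illustration (not part of the original commit message): `Option.map` itself is eager, but the same map-for-side-effects pattern on a lazy receiver such as an `Iterator` silently defers the effect, while `foreach` runs eagerly and states the intent. A minimal self-contained Scala sketch:

```scala
// Sketch only: demonstrates why side effects belong in `foreach`, not `map`.
object MapVsForeach {
  def main(args: Array[String]): Unit = {
    var sum = 0

    // `Iterator.map` is lazy: the body does not run until the result is forced.
    val mapped = Iterator(1, 2, 3).map { n => sum += n; n }
    println(sum)            // 0 -- the side effect has not happened yet
    mapped.foreach(_ => ()) // forcing the iterator finally runs it
    println(sum)            // 6

    // `foreach` runs eagerly and makes the side-effect-only intent explicit.
    sum = 0
    Iterator(1, 2, 3).foreach { n => sum += n }
    println(sum)            // 6
  }
}
```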

### Why are the changes needed?

To avoid potential issues from `map` being lazy and not executed.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Ran related tests. CI in this PR should verify.

Closes #29974 from HyukjinKwon/SPARK-32646.

Authored-by: HyukjinKwon 
Signed-off-by: Takeshi Yamamuro 
(cherry picked from commit 5effa8ea261ba59214afedc2853d1b248b330ca6)
Signed-off-by: Takeshi Yamamuro 
---
 .../org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala  | 2 +-
 .../sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala
index 69badb4..c540007 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala
@@ -185,7 +185,7 @@ class OrcFileFormat
   } else {
 // ORC predicate pushdown
 if (orcFilterPushDown) {
-      OrcUtils.readCatalystSchema(filePath, conf, ignoreCorruptFiles).map { fileSchema =>
+      OrcUtils.readCatalystSchema(filePath, conf, ignoreCorruptFiles).foreach { fileSchema =>
 OrcFilters.createFilter(fileSchema, filters).foreach { f =>
   OrcInputFormat.setSearchArgument(conf, f, fileSchema.fieldNames)
 }
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala
index 1f38128..b0ddee0 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala
@@ -69,7 +69,7 @@ case class OrcPartitionReaderFactory(
 
   private def pushDownPredicates(filePath: Path, conf: Configuration): Unit = {
 if (orcFilterPushDown) {
-      OrcUtils.readCatalystSchema(filePath, conf, ignoreCorruptFiles).map { fileSchema =>
+      OrcUtils.readCatalystSchema(filePath, conf, ignoreCorruptFiles).foreach { fileSchema =>
 OrcFilters.createFilter(fileSchema, filters).foreach { f =>
   OrcInputFormat.setSearchArgument(conf, f, fileSchema.fieldNames)
 }


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (72da6f8 -> 94d648d)

2020-10-07 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 72da6f8  [SPARK-33002][PYTHON] Remove non-API annotations
 add 94d648d  [SPARK-33036][SQL] Refactor RewriteCorrelatedScalarSubquery code to replace exprIds in a bottom-up manner

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/optimizer/subquery.scala | 80 ++
 1 file changed, 51 insertions(+), 29 deletions(-)
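
For background, Catalyst rules typically rewrite trees bottom-up with `transformUp`, so a parent node always sees children that have already been rewritten. A self-contained toy sketch of that traversal (an illustration, not the Spark code):

```scala
// Toy bottom-up transform: children are rewritten before their parents,
// the property a bottom-up exprId replacement relies on. Not Spark code.
object BottomUp {
  sealed trait Node
  case class Leaf(id: Int) extends Node
  case class Branch(children: Seq[Node]) extends Node

  def transformUp(n: Node)(rule: PartialFunction[Node, Node]): Node = {
    val rewrittenChildren = n match {
      case Branch(cs) => Branch(cs.map(transformUp(_)(rule)))
      case leaf       => leaf
    }
    rule.applyOrElse(rewrittenChildren, (same: Node) => same)
  }

  def main(args: Array[String]): Unit = {
    val tree = Branch(Seq(Leaf(1), Branch(Seq(Leaf(2)))))
    // Renumber ids; every Branch already sees renumbered leaves.
    println(transformUp(tree) { case Leaf(id) => Leaf(id + 100) })
    // Branch(List(Leaf(101), Branch(List(Leaf(102)))))
  }
}
```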


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (1299c8a -> 5af62a2)

2020-10-03 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 1299c8a  [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
 add 5af62a2  [SPARK-33052][SQL][TEST] Make all the database versions up-to-date for integration tests

No new revisions were added by this update.

Summary of changes:
 .../src/test/resources/mariadb_docker_entrypoint.sh   |  2 +-
 .../scala/org/apache/spark/sql/jdbc/DB2IntegrationSuite.scala |  9 -
 .../org/apache/spark/sql/jdbc/DB2KrbIntegrationSuite.scala|  9 -
 .../apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala|  4 +++-
 .../apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala   | 10 +-
 .../org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala | 11 +--
 .../org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala  |  9 -
 .../apache/spark/sql/jdbc/PostgresKrbIntegrationSuite.scala   |  9 -
 8 files changed, 54 insertions(+), 9 deletions(-)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (9b88aca -> 82721ce)

2020-10-02 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 9b88aca  [SPARK-33030][R] Add nth_value to SparkR
 add 82721ce  [SPARK-32741][SQL][FOLLOWUP] Run plan integrity check only for effective plan changes

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (8657742 -> d6f3138)

2020-10-01 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 8657742  [SPARK-32996][WEB-UI][FOLLOWUP] Move ExecutorSummarySuite to proper path
 add d6f3138  [SPARK-32859][SQL] Introduce physical rule to decide bucketing dynamically

No new revisions were added by this update.

Summary of changes:
 .../catalyst/optimizer/CostBasedJoinReorder.scala  |   2 +-
 .../org/apache/spark/sql/internal/SQLConf.scala|  13 ++
 .../spark/sql/execution/DataSourceScanExec.scala   |  37 ++--
 .../spark/sql/execution/QueryExecution.scala   |   3 +-
 .../bucketing/DisableUnnecessaryBucketedScan.scala | 161 +++
 .../org/apache/spark/sql/DataFrameJoinSuite.scala  |   2 +-
 .../scala/org/apache/spark/sql/SubquerySuite.scala |   2 +-
 .../DisableUnnecessaryBucketedScanSuite.scala  | 221 +
 ...ecessaryBucketedScanWithHiveSupportSuite.scala} |   5 +-
 9 files changed, 427 insertions(+), 19 deletions(-)
 create mode 100644 sql/core/src/main/scala/org/apache/spark/sql/execution/bucketing/DisableUnnecessaryBucketedScan.scala
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/sources/DisableUnnecessaryBucketedScanSuite.scala
 copy sql/hive/src/test/scala/org/apache/spark/sql/sources/{BucketedReadWithHiveSupportSuite.scala => DisableUnnecessaryBucketedScanWithHiveSupportSuite.scala} (89%)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SQL][DOC][MINOR] Corrects input table names in the examples of CREATE FUNCTION doc

2020-10-01 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new bc29602  [SQL][DOC][MINOR] Corrects input table names in the examples of CREATE FUNCTION doc
bc29602 is described below

commit bc29602e740393aa00c3154986000eaf1be2f965
Author: iRakson 
AuthorDate: Thu Oct 1 20:50:16 2020 +0900

[SQL][DOC][MINOR] Corrects input table names in the examples of CREATE FUNCTION doc

### What changes were proposed in this pull request?
Fix Typo

### Why are the changes needed?
To maintain consistency: the correct table name should be used in the SELECT command.

### Does this PR introduce _any_ user-facing change?
Yes. The CREATE FUNCTION doc now shows the correct table name.

### How was this patch tested?
Manually; these are documentation-only changes.

Closes #29920 from iRakson/fixTypo.

Authored-by: iRakson 
Signed-off-by: Takeshi Yamamuro 
(cherry picked from commit d3dbe1a9076c8a76be0590ca071bfbec6114813b)
Signed-off-by: Takeshi Yamamuro 
---
 docs/sql-ref-syntax-ddl-create-function.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/sql-ref-syntax-ddl-create-function.md b/docs/sql-ref-syntax-ddl-create-function.md
index aa6c1fa..dfa4f4f 100644
--- a/docs/sql-ref-syntax-ddl-create-function.md
+++ b/docs/sql-ref-syntax-ddl-create-function.md
@@ -112,7 +112,7 @@ SHOW USER FUNCTIONS;
 +--+
 
 -- Invoke the function. Every selected value should be incremented by 10.
-SELECT simple_udf(c1) AS function_return_value FROM t1;
+SELECT simple_udf(c1) AS function_return_value FROM test;
 +-+
 |function_return_value|
 +-+
@@ -150,7 +150,7 @@ CREATE OR REPLACE FUNCTION simple_udf AS 'SimpleUdfR'
 USING JAR '/tmp/SimpleUdfR.jar';
 
 -- Invoke the function. Every selected value should be incremented by 20.
-SELECT simple_udf(c1) AS function_return_value FROM t1;
+SELECT simple_udf(c1) AS function_return_value FROM test;
 +-+
 |function_return_value|
 +-+
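
For context, the corrected examples evidently rely on an earlier step in the same doc page that creates a table named `test` (an assumption; that setup is not shown in this diff). A hypothetical minimal setup that makes the corrected invocation work:

```sql
-- Hypothetical setup implied by the corrected examples (not the doc's exact text).
CREATE TABLE test (c1 INT);
INSERT INTO test VALUES (1), (2);

-- With the fix, the invocation targets the table that actually exists:
SELECT simple_udf(c1) AS function_return_value FROM test;
```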


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (5651284 -> d3dbe1a)

2020-10-01 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5651284  [SPARK-32992][SQL] Map Oracle's ROWID type to StringType in read via JDBC
 add d3dbe1a  [SQL][DOC][MINOR] Corrects input table names in the examples of CREATE FUNCTION doc

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-ddl-create-function.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (28ed3a5 -> 5651284)

2020-10-01 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 28ed3a5  [SPARK-32723][WEBUI] Upgrade to jQuery 3.5.1
 add 5651284  [SPARK-32992][SQL] Map Oracle's ROWID type to StringType in read via JDBC

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala | 11 +++
 .../main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala  |  6 ++
 2 files changed, 17 insertions(+)
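
For context, a hedged sketch of how such a mapping is typically expressed through Spark's public `JdbcDialect` developer API; this is an illustration, not the actual patch:

```scala
// Sketch: map Oracle's ROWID (java.sql.Types.ROWID) to StringType so the
// column can be read via JDBC. Illustrative only, not the actual patch.
import java.sql.Types
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
import org.apache.spark.sql.types.{DataType, MetadataBuilder, StringType}

object RowIdAsStringDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:oracle")

  // Return Some(catalystType) to override the default JDBC type mapping.
  override def getCatalystType(
      sqlType: Int,
      typeName: String,
      size: Int,
      md: MetadataBuilder): Option[DataType] =
    if (sqlType == Types.ROWID) Some(StringType) else None
}

// Usage: JdbcDialects.registerDialect(RowIdAsStringDialect)
```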


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

2020-09-30 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new db6ba04  [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
db6ba04 is described below

commit db6ba049c43e2aa1521ed39c9f2b802ad04d111f
Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
AuthorDate: Thu Oct 1 08:15:53 2020 +0900

[SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

### What changes were proposed in this pull request?
Update the sql-ref docs; the following keywords are added in this PR:

CLUSTERED BY
SORTED BY
INTO num_buckets BUCKETS

### Why are the changes needed?
Let more users know how to use these SQL keywords.

### Does this PR introduce _any_ user-facing change?
No

![image](https://user-images.githubusercontent.com/46367746/94428281-0a6b8080-01c3-11eb-9ff3-899f8da602ca.png)

![image](https://user-images.githubusercontent.com/46367746/94428285-0d667100-01c3-11eb-8a54-90e7641d917b.png)

![image](https://user-images.githubusercontent.com/46367746/94428288-0f303480-01c3-11eb-9e1d-023538aa6e2d.png)

### How was this patch tested?
Generated the HTML docs as a test.

Closes #29883 from GuoPhilipse/add-sql-missing-keywords.

Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
Co-authored-by: GuoPhilipse 
Signed-off-by: Takeshi Yamamuro 
(cherry picked from commit 3bdbb5546d2517dda6f71613927cc1783c87f319)
Signed-off-by: Takeshi Yamamuro 
---
 docs/sql-ref-syntax-ddl-create-table-datasource.md |  7 -
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 32 ++
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md
index d334447..ba0516a 100644
--- a/docs/sql-ref-syntax-ddl-create-table-datasource.md
+++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md
@@ -67,7 +67,12 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
 
 * **SORTED BY**
 
-Determines the order in which the data is stored in buckets. Default is Ascending order.
+Specifies an ordering of bucket columns. Optionally, one can use ASC for an ascending order or DESC for a descending order after any column names in the SORTED BY clause.
+If not specified, ASC is assumed by default.
+   
+* **INTO num_buckets BUCKETS**
+
+Specifies buckets numbers, which is used in `CLUSTERED BY` clause.
 
 * **LOCATION**
 
diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
index 7bf847d..3a8c8d5 100644
--- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
+++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
@@ -31,6 +31,9 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier
 [ COMMENT table_comment ]
 [ PARTITIONED BY ( col_name2[:] col_type2 [ COMMENT col_comment2 ], ... ) 
 | ( col_name1, col_name2, ... ) ]
+[ CLUSTERED BY ( col_name1, col_name2, ...) 
+[ SORTED BY ( col_name1 [ ASC | DESC ], col_name2 [ ASC | DESC ], ... ) ] 
+INTO num_buckets BUCKETS ]
 [ ROW FORMAT row_format ]
 [ STORED AS file_format ]
 [ LOCATION path ]
@@ -65,6 +68,21 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
 
 Partitions are created on the table, based on the columns specified.
 
+* **CLUSTERED BY**
+
+Partitions created on the table will be bucketed into fixed buckets based on the column specified for bucketing.
+
+**NOTE:** Bucketing is an optimization technique that uses buckets (and bucketing columns) to determine data partitioning and avoid data shuffle.
+
+* **SORTED BY**
+
+Specifies an ordering of bucket columns. Optionally, one can use ASC for an ascending order or DESC for a descending order after any column names in the SORTED BY clause.
+If not specified, ASC is assumed by default.
+
+* **INTO num_buckets BUCKETS**
+
+Specifies buckets numbers, which is used in `CLUSTERED BY` clause.
+
 * **row_format**
 
 Use the `SERDE` clause to specify a custom SerDe for one table. Otherwise, use the `DELIMITED` clause to use the native SerDe and specify the delimiter, escape character, null character and so on.
@@ -203,6 +221,20 @@ CREATE EXTERNAL TABLE family (id INT, name STRING)
 STORED AS INPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleInputFormat'
 OUTPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleOutputFormat'
 LOCATION '/tmp/family/';
+
+--Use `CLUSTERED BY` clause to create bucket table without `SORTED BY`
+CREATE TABLE clustered_by_test1 (ID INT, AGE STRING)
+CLUSTERED BY (ID)
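
The digest cuts the diff off at this point. For reference, here is a minimal, self-contained sketch of the same bucketed-table DDL driven through the Scala API; the second table, the bucket counts, and the local session setup are illustrative assumptions, not content taken from the commit:

```scala
import org.apache.spark.sql.SparkSession

object BucketedTableExample {
  def main(args: Array[String]): Unit = {
    // Hive support is required for Hive-format CREATE TABLE statements.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("BucketedTableExample")
      .enableHiveSupport()
      .getOrCreate()

    // CLUSTERED BY without SORTED BY: rows are hashed into 4 buckets by ID.
    spark.sql(
      """CREATE TABLE clustered_by_test1 (ID INT, AGE STRING)
        |CLUSTERED BY (ID)
        |INTO 4 BUCKETS""".stripMargin)

    // CLUSTERED BY with SORTED BY: rows within each bucket stay sorted by ID.
    spark.sql(
      """CREATE TABLE clustered_by_test2 (ID INT, NAME STRING)
        |CLUSTERED BY (ID)
        |SORTED BY (ID ASC)
        |INTO 3 BUCKETS""".stripMargin)

    // Verify the bucketing metadata recorded for the table.
    spark.sql("DESC FORMATTED clustered_by_test1").show(truncate = false)
    spark.stop()
  }
}
```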

[spark] branch master updated (ece8d8e -> 3bdbb55)

2020-09-30 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ece8d8e  [SPARK-33006][K8S][DOCS] Add dynamic PVC usage example into K8s doc
 add 3bdbb55  [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-ddl-create-table-datasource.md |  7 -
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 32 ++
 2 files changed, 38 insertions(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (cc06266 -> 3a299aa)

2020-09-30 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from cc06266  [SPARK-33019][CORE] Use spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1 by default
 add 3a299aa  [SPARK-32741][SQL] Check if the same ExprId refers to the unique attribute in logical plans

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala | 11 +++-
 .../spark/sql/catalyst/optimizer/Optimizer.scala   | 15 +++--
 .../spark/sql/catalyst/optimizer/subquery.scala| 51 +---
 .../sql/catalyst/plans/logical/LogicalPlan.scala   | 70 ++
 .../optimizer/FoldablePropagationSuite.scala   |  4 +-
 .../plans/logical/LogicalPlanIntegritySuite.scala  | 51 
 .../sql/execution/adaptive/AQEOptimizer.scala  |  8 ++-
 .../apache/spark/sql/streaming/StreamSuite.scala   |  7 +--
 8 files changed, 181 insertions(+), 36 deletions(-)
 create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlanIntegritySuite.scala
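
As context for the new `LogicalPlanIntegritySuite`: the check enforces that a single `ExprId` always denotes one and the same attribute across a plan. A rough, self-contained illustration of that invariant follows; it is a toy model only, and none of the names below are Spark's actual API:

```scala
// Toy stand-in for a resolved attribute; not Spark's real class.
case class ToyAttribute(exprId: Long, name: String, dataType: String)

object ExprIdIntegrityCheck {
  // The invariant: every ExprId maps to exactly one (name, dataType) pair.
  def hasUniqueExprIds(attrs: Seq[ToyAttribute]): Boolean =
    attrs.groupBy(_.exprId).values.forall(_.distinct.size == 1)

  def main(args: Array[String]): Unit = {
    val ok  = Seq(ToyAttribute(1, "col1", "int"), ToyAttribute(2, "col2", "string"))
    val bad = ok :+ ToyAttribute(1, "col1", "string") // ExprId 1 reused with a new type
    println(hasUniqueExprIds(ok))  // true
    println(hasUniqueExprIds(bad)) // false
  }
}
```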


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (b53da23 -> acfee3c)

2020-09-22 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from b53da23  [MINOR][SQL] Improve examples for `percentile_approx()`
 add acfee3c  [SPARK-32870][DOCS][SQL] Make sure that all expressions have their ExpressionDescription filled

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/analysis/FunctionRegistry.scala   |  2 +-
 .../expressions/CallMethodViaReflection.scala  |  3 +-
 .../spark/sql/catalyst/expressions/Cast.scala  |  3 +-
 .../expressions/MonotonicallyIncreasingID.scala|  8 +-
 .../catalyst/expressions/SparkPartitionID.scala|  8 +-
 .../expressions/aggregate/CountMinSketchAgg.scala  |  7 ++
 .../expressions/aggregate/bitwiseAggregates.scala  |  1 +
 .../sql/catalyst/expressions/arithmetic.scala  | 38 ++---
 .../catalyst/expressions/bitwiseExpressions.scala  | 12 ++-
 .../expressions/collectionOperations.scala | 18 +++--
 .../catalyst/expressions/complexTypeCreator.scala  | 20 +++--
 .../expressions/conditionalExpressions.scala   |  6 +-
 .../catalyst/expressions/datetimeExpressions.scala | 40 +-
 .../sql/catalyst/expressions/generators.scala  | 12 ++-
 .../spark/sql/catalyst/expressions/hash.scala  | 18 +++--
 .../sql/catalyst/expressions/inputFileBlock.scala  | 27 ++-
 .../sql/catalyst/expressions/jsonExpressions.scala |  6 +-
 .../spark/sql/catalyst/expressions/misc.scala  | 16 +++-
 .../sql/catalyst/expressions/predicates.scala  | 61 ---
 .../catalyst/expressions/windowExpressions.scala   | 91 ++
 .../spark/sql/catalyst/expressions/xml/xpath.scala | 24 --
 .../sql-functions/sql-expression-schema.md | 48 ++--
 .../test/resources/sql-tests/results/cast.sql.out  |  2 +
 .../apache/spark/sql/ExpressionsSchemaSuite.scala  | 10 ++-
 .../spark/sql/execution/command/DDLSuite.scala |  6 +-
 .../sql/expressions/ExpressionInfoSuite.scala  | 37 -
 26 files changed, 404 insertions(+), 120 deletions(-)
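
The `ExpressionDescription` metadata this commit fills in is what `DESCRIBE FUNCTION EXTENDED` surfaces to users. A small sketch of inspecting it; the built-in `abs` is an arbitrary choice for illustration:

```scala
import org.apache.spark.sql.SparkSession

object InspectExpressionDescription {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("InspectExpressionDescription")
      .getOrCreate()

    // Prints the usage and the extended usage (examples, since) registered
    // for the function via its ExpressionDescription.
    spark.sql("DESCRIBE FUNCTION EXTENDED abs").show(truncate = false)
    spark.stop()
  }
}
```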


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (4ced588 -> 68e0d5f)

2020-09-17 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 4ced588  [SPARK-32635][SQL] Fix foldable propagation
 add 68e0d5f  [SPARK-32902][SQL] Logging plan changes for AQE

No new revisions were added by this update.

Summary of changes:
 .../execution/adaptive/AdaptiveSparkPlanExec.scala | 45 +-
 .../adaptive/AdaptiveQueryExecSuite.scala  | 20 ++
 2 files changed, 56 insertions(+), 9 deletions(-)
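
For context, the plan-change logger is driven by SQL configuration. Below is a sketch of turning it on together with AQE; the key `spark.sql.optimizer.planChangeLog.level` is the Spark 3.0-era name and is an assumption here, since the exact key can differ across versions:

```scala
import org.apache.spark.sql.SparkSession

object PlanChangeLogExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("PlanChangeLogExample")
      // Assumed key: log every rule that changes a plan at WARN level.
      .config("spark.sql.optimizer.planChangeLog.level", "WARN")
      // Enable adaptive query execution so its re-optimizations are logged too.
      .config("spark.sql.adaptive.enabled", "true")
      .getOrCreate()

    import spark.implicits._
    // A shuffle-producing query; with AQE enabled, plan changes show up in the log.
    Seq(1, 2, 3).toDF("v").groupBy($"v").count().collect()
    spark.stop()
  }
}
```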


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32635][SQL] Fix foldable propagation

2020-09-17 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new ecc2f5d  [SPARK-32635][SQL] Fix foldable propagation
ecc2f5d is described below

commit ecc2f5d9e227b62f418d65708f516ffe8e690f96
Author: Peter Toth 
AuthorDate: Fri Sep 18 08:17:23 2020 +0900

[SPARK-32635][SQL] Fix foldable propagation

### What changes were proposed in this pull request?
This PR rewrites the `FoldablePropagation` rule to replace attribute references in a node with foldables coming only from the node's children.

Before this PR, in the case of this example (with the setting `spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation`):
```scala
val a = Seq("1").toDF("col1").withColumn("col2", lit("1"))
val b = Seq("2").toDF("col1").withColumn("col2", lit("2"))
val aub = a.union(b)
val c = aub.filter($"col1" === "2").cache()
val d = Seq("2").toDF( "col4")
val r = d.join(aub, $"col2" === $"col4").select("col4")
val l = c.select("col2")
val df = l.join(r, $"col2" === $"col4", "LeftOuter")
df.show()
```
foldable propagation happens incorrectly. In the optimizer log below, the plans before and after the rule differ only in the line marked with `!`, where `Project [col2#6]` is wrongly rewritten to `Project [1 AS col2#6]`:
```
 Join LeftOuter, (col2#6 = col4#34)
!:- Project [col2#6]   =>   :- Project [1 AS col2#6]
 :  +- InMemoryRelation [col1#4, col2#6], StorageLevel(disk, memory, deserialized, 1 replicas)
 :     +- Union
 :        :- *(1) Project [value#1 AS col1#4, 1 AS col2#6]
 :        :  +- *(1) Filter (isnotnull(value#1) AND (value#1 = 2))
 :        :     +- *(1) LocalTableScan [value#1]
 :        +- *(2) Project [value#10 AS col1#13, 2 AS col2#15]
 :           +- *(2) Filter (isnotnull(value#10) AND (value#10 = 2))
 :              +- *(2) LocalTableScan [value#10]
 +- Project [col4#34]
    +- Join Inner, (col2#6 = col4#34)
       :- Project [value#31 AS col4#34]
       :  +- LocalRelation [value#31]
       +- Project [col2#6]
          +- Union false, false
             :- Project [1 AS col2#6]
             :  +- LocalRelation [value#1]
             +- Project [2 AS col2#15]
                +- LocalRelation [value#10]
```
and so the result is wrong:
```
+----+----+
|col2|col4|
+----+----+
|   1|null|
+----+----+
```

After this PR foldable propagation will not happen incorrectly and the result is correct:
```
+----+----+
|col2|col4|
+----+----+
|   2|   2|
+----+----+
```

### Why are the changes needed?
To fix a correctness issue.

### Does this PR introduce _any_ user-facing change?
Yes, fixes a correctness issue.

### 

[spark] branch master updated (ea3b979 -> 4ced588)

2020-09-17 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ea3b979  [SPARK-32889][SQL] orc table column name supports special characters
 add 4ced588  [SPARK-32635][SQL] Fix foldable propagation

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/expressions/AttributeMap.scala|   2 +
 .../sql/catalyst/expressions/AttributeMap.scala|   2 +
 .../spark/sql/catalyst/optimizer/expressions.scala | 121 -
 .../org/apache/spark/sql/DataFrameSuite.scala  |  12 ++
 4 files changed, 88 insertions(+), 49 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32635][SQL] Fix foldable propagation

2020-09-17 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new ecc2f5d  [SPARK-32635][SQL] Fix foldable propagation
ecc2f5d is described below

commit ecc2f5d9e227b62f418d65708f516ffe8e690f96
Author: Peter Toth 
AuthorDate: Fri Sep 18 08:17:23 2020 +0900

[SPARK-32635][SQL] Fix foldable propagation

### What changes were proposed in this pull request?
This PR rewrites `FoldablePropagation` rule to replace attribute references 
in a node with foldables coming only from the node's children.

Before this PR in the case of this example (with 
setting`spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation`):
```scala
val a = Seq("1").toDF("col1").withColumn("col2", lit("1"))
val b = Seq("2").toDF("col1").withColumn("col2", lit("2"))
val aub = a.union(b)
val c = aub.filter($"col1" === "2").cache()
val d = Seq("2").toDF( "col4")
val r = d.join(aub, $"col2" === $"col4").select("col4")
val l = c.select("col2")
val df = l.join(r, $"col2" === $"col4", "LeftOuter")
df.show()
```
foldable propagation happens incorrectly:
```
 Join LeftOuter, (col2#6 = col4#34) 
 Join LeftOuter, (col2#6 = col4#34)
!:- Project [col2#6]
 :- Project [1 AS col2#6]
 :  +- InMemoryRelation [col1#4, col2#6], StorageLevel(disk, memory, 
deserialized, 1 replicas)   :  +- InMemoryRelation [col1#4, col2#6], 
StorageLevel(disk, memory, deserialized, 1 replicas)
 :+- Union  
 :+- Union
 :   :- *(1) Project [value#1 AS col1#4, 1 AS col2#6]   
 :   :- *(1) Project [value#1 AS col1#4, 1 AS 
col2#6]
 :   :  +- *(1) Filter (isnotnull(value#1) AND (value#1 = 2))   
 :   :  +- *(1) Filter (isnotnull(value#1) AND 
(value#1 = 2))
 :   : +- *(1) LocalTableScan [value#1] 
 :   : +- *(1) LocalTableScan [value#1]
 :   +- *(2) Project [value#10 AS col1#13, 2 AS col2#15]
 :   +- *(2) Project [value#10 AS col1#13, 2 AS 
col2#15]
 :  +- *(2) Filter (isnotnull(value#10) AND (value#10 = 2)) 
 :  +- *(2) Filter (isnotnull(value#10) AND 
(value#10 = 2))
 : +- *(2) LocalTableScan [value#10]
 : +- *(2) LocalTableScan [value#10]
 +- Project [col4#34]   
 +- Project [col4#34]
+- Join Inner, (col2#6 = col4#34)   
+- Join Inner, (col2#6 = col4#34)
   :- Project [value#31 AS col4#34] 
   :- Project [value#31 AS col4#34]
   :  +- LocalRelation [value#31]   
   :  +- LocalRelation [value#31]
   +- Project [col2#6]  
   +- Project [col2#6]
  +- Union false, false 
  +- Union false, false
 :- Project [1 AS col2#6]   
 :- Project [1 AS col2#6]
 :  +- LocalRelation [value#1]  
 :  +- LocalRelation [value#1]
 +- Project [2 AS col2#15]  
 +- Project [2 AS col2#15]
+- LocalRelation [value#10] 
+- LocalRelation [value#10]

```
and so the result is wrong:
```
+++
|col2|col4|
+++
|   1|null|
+++
```

After this PR foldable propagation will not happen incorrectly and the 
result is correct:
```
+++
|col2|col4|
+++
|   2|   2|
+++
```

### Why are the changes needed?
To fix a correctness issue.

### Does this PR introduce _any_ user-facing change?
Yes, fixes a correctness issue.

### 

[spark] branch master updated (ea3b979 -> 4ced588)

2020-09-17 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ea3b979  [SPARK-32889][SQL] orc table column name supports special 
characters
 add 4ced588  [SPARK-32635][SQL] Fix foldable propagation

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/expressions/AttributeMap.scala|   2 +
 .../sql/catalyst/expressions/AttributeMap.scala|   2 +
 .../spark/sql/catalyst/optimizer/expressions.scala | 121 -
 .../org/apache/spark/sql/DataFrameSuite.scala  |  12 ++
 4 files changed, 88 insertions(+), 49 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32635][SQL] Fix foldable propagation

2020-09-17 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new ecc2f5d  [SPARK-32635][SQL] Fix foldable propagation
ecc2f5d is described below

commit ecc2f5d9e227b62f418d65708f516ffe8e690f96
Author: Peter Toth 
AuthorDate: Fri Sep 18 08:17:23 2020 +0900

[SPARK-32635][SQL] Fix foldable propagation

### What changes were proposed in this pull request?
This PR rewrites `FoldablePropagation` rule to replace attribute references 
in a node with foldables coming only from the node's children.

Before this PR in the case of this example (with 
setting`spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation`):
```scala
val a = Seq("1").toDF("col1").withColumn("col2", lit("1"))
val b = Seq("2").toDF("col1").withColumn("col2", lit("2"))
val aub = a.union(b)
val c = aub.filter($"col1" === "2").cache()
val d = Seq("2").toDF( "col4")
val r = d.join(aub, $"col2" === $"col4").select("col4")
val l = c.select("col2")
val df = l.join(r, $"col2" === $"col4", "LeftOuter")
df.show()
```
foldable propagation happens incorrectly:
```
 Join LeftOuter, (col2#6 = col4#34) 
 Join LeftOuter, (col2#6 = col4#34)
!:- Project [col2#6]
 :- Project [1 AS col2#6]
 :  +- InMemoryRelation [col1#4, col2#6], StorageLevel(disk, memory, 
deserialized, 1 replicas)   :  +- InMemoryRelation [col1#4, col2#6], 
StorageLevel(disk, memory, deserialized, 1 replicas)
 :+- Union  
 :+- Union
 :   :- *(1) Project [value#1 AS col1#4, 1 AS col2#6]   
 :   :- *(1) Project [value#1 AS col1#4, 1 AS 
col2#6]
 :   :  +- *(1) Filter (isnotnull(value#1) AND (value#1 = 2))   
 :   :  +- *(1) Filter (isnotnull(value#1) AND 
(value#1 = 2))
 :   : +- *(1) LocalTableScan [value#1] 
 :   : +- *(1) LocalTableScan [value#1]
 :   +- *(2) Project [value#10 AS col1#13, 2 AS col2#15]
 :   +- *(2) Project [value#10 AS col1#13, 2 AS 
col2#15]
 :  +- *(2) Filter (isnotnull(value#10) AND (value#10 = 2)) 
 :  +- *(2) Filter (isnotnull(value#10) AND 
(value#10 = 2))
 : +- *(2) LocalTableScan [value#10]
 : +- *(2) LocalTableScan [value#10]
 +- Project [col4#34]   
 +- Project [col4#34]
+- Join Inner, (col2#6 = col4#34)   
+- Join Inner, (col2#6 = col4#34)
   :- Project [value#31 AS col4#34] 
   :- Project [value#31 AS col4#34]
   :  +- LocalRelation [value#31]   
   :  +- LocalRelation [value#31]
   +- Project [col2#6]  
   +- Project [col2#6]
  +- Union false, false 
  +- Union false, false
 :- Project [1 AS col2#6]   
 :- Project [1 AS col2#6]
 :  +- LocalRelation [value#1]  
 :  +- LocalRelation [value#1]
 +- Project [2 AS col2#15]  
 +- Project [2 AS col2#15]
+- LocalRelation [value#10] 
+- LocalRelation [value#10]

```
and so the result is wrong:
```
+++
|col2|col4|
+++
|   1|null|
+++
```

After this PR foldable propagation will not happen incorrectly and the 
result is correct:
```
+++
|col2|col4|
+++
|   2|   2|
+++
```

### Why are the changes needed?
To fix a correctness issue.

### Does this PR introduce _any_ user-facing change?
Yes, fixes a correctness issue.

### 

[spark] branch master updated (ea3b979 -> 4ced588)

2020-09-17 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ea3b979  [SPARK-32889][SQL] orc table column name supports special 
characters
 add 4ced588  [SPARK-32635][SQL] Fix foldable propagation

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/expressions/AttributeMap.scala|   2 +
 .../sql/catalyst/expressions/AttributeMap.scala|   2 +
 .../spark/sql/catalyst/optimizer/expressions.scala | 121 -
 .../org/apache/spark/sql/DataFrameSuite.scala  |  12 ++
 4 files changed, 88 insertions(+), 49 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-32635][SQL] Fix foldable propagation

2020-09-17 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 4ced588  [SPARK-32635][SQL] Fix foldable propagation
4ced588 is described below

commit 4ced58862c707aa916f7a55d15c3887c94c9b210
Author: Peter Toth 
AuthorDate: Fri Sep 18 08:17:23 2020 +0900

[SPARK-32635][SQL] Fix foldable propagation

### What changes were proposed in this pull request?
This PR rewrites the `FoldablePropagation` rule so that attribute references
in a node are replaced only with foldables coming from that node's children.

Before this PR, for the following example (run with
`spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation` set):
```scala
val a = Seq("1").toDF("col1").withColumn("col2", lit("1"))
val b = Seq("2").toDF("col1").withColumn("col2", lit("2"))
val aub = a.union(b)
val c = aub.filter($"col1" === "2").cache()
val d = Seq("2").toDF("col4")
val r = d.join(aub, $"col2" === $"col4").select("col4")
val l = c.select("col2")
val df = l.join(r, $"col2" === $"col4", "LeftOuter")
df.show()
```
foldable propagation happens incorrectly:
```
Before FoldablePropagation:

Join LeftOuter, (col2#6 = col4#34)
:- Project [col2#6]
:  +- InMemoryRelation [col1#4, col2#6], StorageLevel(disk, memory, deserialized, 1 replicas)
:        +- Union
:           :- *(1) Project [value#1 AS col1#4, 1 AS col2#6]
:           :  +- *(1) Filter (isnotnull(value#1) AND (value#1 = 2))
:           :     +- *(1) LocalTableScan [value#1]
:           +- *(2) Project [value#10 AS col1#13, 2 AS col2#15]
:              +- *(2) Filter (isnotnull(value#10) AND (value#10 = 2))
:                 +- *(2) LocalTableScan [value#10]
+- Project [col4#34]
   +- Join Inner, (col2#6 = col4#34)
      :- Project [value#31 AS col4#34]
      :  +- LocalRelation [value#31]
      +- Project [col2#6]
         +- Union false, false
            :- Project [1 AS col2#6]
            :  +- LocalRelation [value#1]
            +- Project [2 AS col2#15]
               +- LocalRelation [value#10]

After FoldablePropagation (incorrect; the rest of the plan is unchanged):

Join LeftOuter, (col2#6 = col4#34)
:- Project [1 AS col2#6]
:  +- InMemoryRelation [col1#4, col2#6], StorageLevel(disk, memory, deserialized, 1 replicas)
...

```
and so the result is wrong:
```
+----+----+
|col2|col4|
+----+----+
|   1|null|
+----+----+
```

After this PR, the incorrect foldable propagation no longer happens and the
result is correct:
```
+----+----+
|col2|col4|
+----+----+
|   2|   2|
+----+----+
```

### Why are the changes needed?
To fix a correctness issue.

### Does this PR introduce _any_ user-facing change?
Yes, fixes a correctness issue.

### How was this patch tested?

[spark] branch branch-3.0 updated: [SPARK-32688][SQL][TEST] Add special values to LiteralGenerator for float and double

2020-09-15 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new cb6a0d0  [SPARK-32688][SQL][TEST] Add special values to 
LiteralGenerator for float and double
cb6a0d0 is described below

commit cb6a0d08cc020d9a2c19173c9023a9f5e565dd6c
Author: Tanel Kiis 
AuthorDate: Wed Sep 16 12:13:15 2020 +0900

[SPARK-32688][SQL][TEST] Add special values to LiteralGenerator for float 
and double

### What changes were proposed in this pull request?

The `LiteralGenerator` for the float and double datatypes was supposed to yield
special values (NaN, ±Inf) among others, but the `Gen.chooseNum` method does not
yield values that lie outside the given range. `Gen.chooseNum` over a wide range
of floats and doubles also does not yield values in the "everyday" range, as
noted in https://github.com/typelevel/scalacheck/issues/113.
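
As a hedged sketch of the mixing pattern the diff below adopts (the generator and property names here are mine, not from the patch), this is how a 50/50 split between hand-picked special values and arbitrary values surfaces a corner case that a purely range-based generator would be very unlikely to hit:

```scala
import org.scalacheck.{Arbitrary, Gen, Prop}

// Half the time pick a hand-chosen special value, half the time an arbitrary float.
val floatGen: Gen[Float] = Gen.oneOf(
  Gen.oneOf(Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity,
    Float.MinPositiveValue, Float.MaxValue, -Float.MaxValue, 0.0f, -0.0f, 1.0f, -1.0f),
  Arbitrary.arbFloat.arbitrary)

// A plausible-looking property that the special values disprove: it fails as soon
// as -0.0f is generated, because 1.0f / -0.0f is negative infinity.
val prop = Prop.forAll(floatGen) { f => f != 0.0f || 1.0f / f > 0.0f }
prop.check  // expected to report a falsifying example, e.g. -0.0f
```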

There is a similar class, `RandomDataGenerator`, that is used in some other
tests; `-0.0` and `-0.0f` were added as special values there too.

These changes revealed an inconsistency in the equality check between
`-0.0` and `0.0`.
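
For reference, the `-0.0` vs `0.0` subtlety can be seen with plain Java/Scala semantics, no Spark required (a small illustrative snippet, not part of the patch):

```scala
val pos = 0.0
val neg = -0.0
assert(pos == neg)                                  // IEEE 754 equality: equal
assert(java.lang.Double.compare(pos, neg) > 0)      // total ordering: 0.0 > -0.0
assert(java.lang.Double.doubleToRawLongBits(pos) !=
  java.lang.Double.doubleToRawLongBits(neg))        // different bit patterns
assert(!java.lang.Double.valueOf(pos).equals(
  java.lang.Double.valueOf(neg)))                   // boxed equality: not equal
```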

### Why are the changes needed?

The `LiteralGenerator` is mostly used in the 
`checkConsistencyBetweenInterpretedAndCodegen` method in 
`MathExpressionsSuite`. This change would have caught the bug fixed in #29495.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Locally reverted #29495 and verified that the existing test cases caught 
the bug.

Closes #29515 from tanelk/SPARK-32688.

Authored-by: Tanel Kiis 
Signed-off-by: Takeshi Yamamuro 
(cherry picked from commit 6051755bfe23a0e4564bf19476ec34cd7fd6008d)
Signed-off-by: Takeshi Yamamuro 
---
 .../org/apache/spark/sql/RandomDataGenerator.scala|  4 ++--
 .../sql/catalyst/expressions/LiteralGenerator.scala   | 19 +++
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
index 6a5bdc4..3e2dc3f 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
@@ -260,10 +260,10 @@ object RandomDataGenerator {
           new MathContext(precision)).bigDecimal)
       case DoubleType => randomNumeric[Double](
         rand, r => longBitsToDouble(r.nextLong()), Seq(Double.MinValue, Double.MinPositiveValue,
-          Double.MaxValue, Double.PositiveInfinity, Double.NegativeInfinity, Double.NaN, 0.0))
+          Double.MaxValue, Double.PositiveInfinity, Double.NegativeInfinity, Double.NaN, 0.0, -0.0))
       case FloatType => randomNumeric[Float](
         rand, r => intBitsToFloat(r.nextInt()), Seq(Float.MinValue, Float.MinPositiveValue,
-          Float.MaxValue, Float.PositiveInfinity, Float.NegativeInfinity, Float.NaN, 0.0f))
+          Float.MaxValue, Float.PositiveInfinity, Float.NegativeInfinity, Float.NaN, 0.0f, -0.0f))
       case ByteType => randomNumeric[Byte](
         rand, _.nextInt().toByte, Seq(Byte.MinValue, Byte.MaxValue, 0.toByte))
       case IntegerType => randomNumeric[Int](
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
index d92eb01..c8e3b0e 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala
@@ -68,16 +68,27 @@ object LiteralGenerator {
   lazy val longLiteralGen: Gen[Literal] =
     for { l <- Arbitrary.arbLong.arbitrary } yield Literal.create(l, LongType)
 
+  // The floatLiteralGen and doubleLiteralGen will 50% of the time yield arbitrary values
+  // and 50% of the time will yield some special values that are more likely to reveal
+  // corner cases. This behavior is similar to the integral value generators.
   lazy val floatLiteralGen: Gen[Literal] =
     for {
-      f <- Gen.chooseNum(Float.MinValue / 2, Float.MaxValue / 2,
-        Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity)
+      f <- Gen.oneOf(
+        Gen.oneOf(
+          Float.NaN, Float.PositiveInfinity, Float.NegativeInfinity, Float.MinPositiveValue,
+          Float.MaxValue, -Float.MaxValue, 0.0f, -0.0f, 1.0f, -1.0f),
+        Arbitrary.arbFloat.arbitrary
+      )
     } yield Literal.create(f, FloatType)
 
   lazy val doubleLiteralGen: Gen[Literal] =
     for {
-      f <- Gen.chooseNum(Double.MinValue / 2, Double.MaxValue / 2,

[spark] branch master updated (b46c730 -> 6051755)

2020-09-15 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from b46c730  [SPARK-32704][SQL][TESTS][FOLLOW-UP] Check any physical rule 
instead of a specific rule in the test
 add 6051755  [SPARK-32688][SQL][TEST] Add special values to 
LiteralGenerator for float and double

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/RandomDataGenerator.scala|  4 ++--
 .../sql/catalyst/expressions/LiteralGenerator.scala   | 19 +++
 2 files changed, 17 insertions(+), 6 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (5e82548 -> 7a17158)

2020-09-14 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5e82548  [SPARK-32844][SQL] Make `DataFrameReader.table` take the 
specified options for datasource v1
 add 7a17158  [SPARK-32868][SQL] Add more order irrelevant aggregates to 
EliminateSorts

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/dsl/package.scala|  6 +
 .../spark/sql/catalyst/optimizer/Optimizer.scala   |  2 +-
 .../catalyst/optimizer/EliminateSortsSuite.scala   | 26 --
 3 files changed, 26 insertions(+), 8 deletions(-)
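
A hedged illustration of what the rule enables (my own example, not taken from the patch): an order-irrelevant aggregate such as `max` makes a sort beneath it useless, so `EliminateSorts` can drop the `Sort` node from the optimized plan.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, max}

val spark = SparkSession.builder().master("local[1]").appName("eliminate-sorts").getOrCreate()

// The global sort contributes nothing to max(id), so the optimizer removes it.
val df = spark.range(100).sort(col("id").desc).agg(max("id"))
df.explain()  // optimized plan aggregates directly over Range, with no Sort node
```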


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (b4be6a6 -> 4269c2c)

2020-09-11 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from b4be6a6  [SPARK-32845][SS][TESTS] Add sinkParameter to check sink 
options robustly in DataStreamReaderWriterSuite
 add 4269c2c  [SPARK-32851][SQL][TEST] Tests should fail if errors happen 
when generating projection code

No new revisions were added by this update.

Summary of changes:
 .../src/test/scala/org/apache/spark/sql/test/SharedSparkSession.scala   | 2 ++
 sql/hive/src/test/scala/org/apache/spark/sql/hive/test/TestHive.scala   | 2 ++
 2 files changed, 4 insertions(+)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32677][SQL][DOCS][MINOR] Improve code comment in CreateFunctionCommand

2020-09-10 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new cf14897  [SPARK-32677][SQL][DOCS][MINOR] Improve code comment in 
CreateFunctionCommand
cf14897 is described below

commit cf14897d355efbf4acb3497ef1b74cd3a9c35d59
Author: Wenchen Fan 
AuthorDate: Fri Sep 11 09:22:56 2020 +0900

[SPARK-32677][SQL][DOCS][MINOR] Improve code comment in 
CreateFunctionCommand

### What changes were proposed in this pull request?

We made a mistake in https://github.com/apache/spark/pull/29502: there was no
code comment explaining why we can't load the UDF class when creating a
function. This PR improves the code comment.

### Why are the changes needed?

To avoid making the same mistake.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

N/A

Closes #29713 from cloud-fan/comment.

Authored-by: Wenchen Fan 
Signed-off-by: Takeshi Yamamuro 
(cherry picked from commit 328d81a2d1131742bcfba5117896c093db39e721)
Signed-off-by: Takeshi Yamamuro 
---
 .../main/scala/org/apache/spark/sql/execution/command/functions.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
index 6fdc7f4..d55d696 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
@@ -88,7 +88,9 @@ case class CreateFunctionCommand(
     } else {
       // For a permanent, we will store the metadata into underlying external catalog.
       // This function will be loaded into the FunctionRegistry when a query uses it.
-      // We do not load it into FunctionRegistry right now.
+      // We do not load it into FunctionRegistry right now, to avoid loading the resource and
+      // UDF class immediately, as the Spark application to create the function may not have
+      // access to the resource and/or UDF class.
       catalog.createFunction(func, ignoreIfExists)
     }
 }
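
To make the deferred loading concrete, a hedged example (the function name, UDF class, and jar path are all made up; `spark` is an existing SparkSession): `CREATE FUNCTION` for a permanent function only records metadata in the external catalog, and the jar and UDF class are resolved lazily, on first use.

```scala
// No resource or class loading happens here; only catalog metadata is written.
spark.sql(
  """CREATE FUNCTION my_upper AS 'com.example.MyUpper'
    |USING JAR 'hdfs:///udfs/my-udfs.jar'""".stripMargin)

// The jar is fetched and 'com.example.MyUpper' is loaded on first use.
spark.sql("SELECT my_upper('hello')").show()
```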


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (5f468cc -> 328d81a)

2020-09-10 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5f468cc  [SPARK-32822][SQL] Change the number of partitions to zero 
when a range is empty with WholeStageCodegen disabled or falled back
 add 328d81a  [SPARK-32677][SQL][DOCS][MINOR] Improve code comment in 
CreateFunctionCommand

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/sql/execution/command/functions.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org


