[spark] branch master updated (3dca81e -> 1450b5e)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 3dca81e  [SPARK-32669][SQL][TEST] Expression unit tests should explore all cases that can lead to null result
     add 1450b5e  [MINOR][DOCS] fix typo for docs,log message and comments

No new revisions were added by this update.

Summary of changes:
 .../src/main/java/org/apache/spark/network/util/TransportConf.java      | 2 +-
 core/src/main/java/org/apache/spark/api/plugin/DriverPlugin.java        | 2 +-
 .../scala/org/apache/spark/resource/ResourceDiscoveryScriptPlugin.scala | 2 +-
 docs/job-scheduling.md                                                  | 2 +-
 docs/sql-ref-syntax-qry-select-groupby.md                               | 2 +-
 docs/sql-ref-syntax-qry-select-hints.md                                 | 2 +-
 docs/sql-ref.md                                                         | 2 +-
 launcher/src/main/java/org/apache/spark/launcher/LauncherServer.java    | 2 +-
 sbin/decommission-worker.sh                                             | 2 +-
 .../main/java/org/apache/spark/sql/connector/catalog/TableCatalog.java  | 2 +-
 .../main/scala/org/apache/spark/sql/catalyst/QueryPlanningTracker.scala | 2 +-
 .../spark/sql/execution/datasources/v2/ShowTablePropertiesExec.scala    | 2 +-
 12 files changed, 12 insertions(+), 12 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6dd37cb -> 3dca81e)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6dd37cb  [SPARK-32682][INFRA] Use workflow_dispatch to enable manual test triggers
     add 3dca81e  [SPARK-32669][SQL][TEST] Expression unit tests should explore all cases that can lead to null result

No new revisions were added by this update.

Summary of changes:
 .../catalyst/expressions/ComplexTypeSuite.scala | 31 +-
 .../expressions/ExpressionEvalHelper.scala      |  5
 2 files changed, 23 insertions(+), 13 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (e277ef1 -> 6dd37cb)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from e277ef1  [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis
     add 6dd37cb  [SPARK-32682][INFRA] Use workflow_dispatch to enable manual test triggers

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_and_test.yml | 9 +
 dev/run-tests.py                     | 8 +++-
 2 files changed, 16 insertions(+), 1 deletion(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (9a79bbc -> 8f0fef1)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 9a79bbc  [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version
     add 8f0fef1  [SPARK-32399][SQL] Full outer shuffled hash join

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/unsafe/map/BytesToBytesMap.java   |  70 ++
 .../unsafe/map/AbstractBytesToBytesMapSuite.java   |  27 ++-
 .../spark/sql/catalyst/optimizer/joins.scala       |  27 ++-
 .../org/apache/spark/sql/internal/SQLConf.scala    |   5 +-
 .../spark/sql/execution/SparkStrategies.scala      |   6 +-
 .../spark/sql/execution/joins/HashJoin.scala       |   4 +-
 .../spark/sql/execution/joins/HashedRelation.scala | 175 ++-
 .../sql/execution/joins/ShuffledHashJoinExec.scala | 239 -
 .../spark/sql/execution/joins/ShuffledJoin.scala   |  23 +-
 .../sql/execution/joins/SortMergeJoinExec.scala    |  20 --
 .../scala/org/apache/spark/sql/JoinSuite.scala     |  66 ++
 .../sql/execution/joins/HashedRelationSuite.scala  |  79 +++
 12 files changed, 693 insertions(+), 48 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
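[Editor's note] SPARK-32399 teaches `ShuffledHashJoinExec` to produce full outer joins by remembering which build-side rows ever matched, so the unmatched ones can be emitted with nulls at the end. The single-partition Python sketch below illustrates only the core idea; Spark's real implementation works per shuffle partition on `BytesToBytesMap`, handles null join keys and code generation, and the function name and row shapes here are made up for illustration.

```python
from collections import defaultdict


def full_outer_hash_join(build, probe, build_key, probe_key):
    """Full outer join via a hash table on the build side.

    Each build row carries a "matched" flag; after probing, any build
    row whose flag is still False is emitted paired with None, and any
    probe row with no hash-table hit is emitted as (None, probe_row).
    """
    table = defaultdict(list)
    for row in build:
        table[build_key(row)].append([row, False])  # [row, matched?]

    out = []
    for row in probe:
        entries = table.get(probe_key(row))
        if entries:
            for entry in entries:
                out.append((entry[0], row))
                entry[1] = True  # mark build row as matched
        else:
            out.append((None, row))  # probe-only row

    for entries in table.values():
        for build_row, matched in entries:
            if not matched:
                out.append((build_row, None))  # build-only row
    return out


result = full_outer_hash_join(
    build=[(1, "a"), (2, "b")],
    probe=[(2, "x"), (3, "y")],
    build_key=lambda r: r[0],
    probe_key=lambda r: r[0],
)
```

Keeping the matched flags inside the hash table is what distinguishes this from an ordinary (inner/left) hash join: the extra final pass over the table is the only added cost.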
[spark] branch branch-3.0 updated (89765f5 -> 81d7747)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 89765f5  [SPARK-32018][SQL][FOLLOWUP][3.0] Throw exception on decimal value overflow of sum aggregation
     add 81d7747  [MINOR][SQL] Fixed approx_count_distinct rsd param description

No new revisions were added by this update.

Summary of changes:
 R/pkg/R/functions.R                                              | 2 +-
 python/pyspark/sql/functions.py                                  | 4 ++--
 .../expressions/aggregate/ApproxCountDistinctForIntervals.scala  | 3 ++-
 .../sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala | 4 ++--
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala   | 4 ++--
 sql/core/src/main/scala/org/apache/spark/sql/functions.scala     | 4 ++--
 6 files changed, 11 insertions(+), 10 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (14003d4 -> 10edeaf)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 14003d4  [SPARK-32590][SQL] Remove fullOutput from RowDataSourceScanExec
     add 10edeaf  [MINOR][SQL] Fixed approx_count_distinct rsd param description

No new revisions were added by this update.

Summary of changes:
 R/pkg/R/functions.R                                              | 2 +-
 python/pyspark/sql/functions.py                                  | 4 ++--
 .../expressions/aggregate/ApproxCountDistinctForIntervals.scala  | 3 ++-
 .../sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala | 4 ++--
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala   | 4 ++--
 sql/core/src/main/scala/org/apache/spark/sql/functions.scala     | 4 ++--
 6 files changed, 11 insertions(+), 10 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
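[Editor's note] The `rsd` parameter whose description this commit fixes is the maximum relative standard deviation `approx_count_distinct` will tolerate: internally the HyperLogLog++ sketch is sized so its expected error stays under `rsd`. The sketch below illustrates that relationship using the textbook HLL constant 1.04/sqrt(m); Spark's `HyperLogLogPlusPlus` uses its own closed-form expression with a slightly different constant, and both helper names here are illustrative.

```python
import math


def hll_relative_error(p: int) -> float:
    # Expected relative standard deviation of a HyperLogLog sketch
    # with m = 2**p registers (textbook approximation: 1.04 / sqrt(m)).
    return 1.04 / math.sqrt(2 ** p)


def precision_for_rsd(rsd: float) -> int:
    # Smallest precision p whose expected error fits within rsd.
    # Illustrative only -- not Spark's exact sizing formula.
    p = 4
    while hll_relative_error(p) > rsd:
        p += 1
    return p
```

With the default `rsd = 0.05` this sizing rule lands on a few hundred registers; halving the tolerated error roughly quadruples the sketch size, which is why tight `rsd` values get expensive.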
[spark-website] branch asf-site updated: Added Data Pipelines to Powered By page
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 78a971a  Added Data Pipelines to Powered By page
78a971a is described below

commit 78a971a9a2d13fe66d6a5ac8b18f5846d41fcd34
Author: roland1982
AuthorDate: Fri Aug 14 22:06:03 2020 +0900

    Added Data Pipelines to Powered By page

    Author: roland1982
    Author: roland-ondeviceresearch

    Closes #285 from roland1982/patch-1.
---
 powered-by.md        | 2 ++
 site/powered-by.html | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/powered-by.md b/powered-by.md
index 098f9ce..150d402 100644
--- a/powered-by.md
+++ b/powered-by.md
@@ -88,6 +88,8 @@ and external data sources, driving holistic and actionable insights.
 - We provided a <a href="https://www.databricks.com/product">cloud-optimized platform</a> to run Spark and ML
   applications on Amazon Web Services and Azure, as well as a comprehensive
   <a href="https://databricks.com/training">training program</a>.
+- <a href="https://datapipelines.com">Data Pipelines</a>
+  - Build and schedule ETL pipelines step-by-step via a simple no-code UI.
 - <a href="http://dianping.com">Dianping.com</a>
 - <a href="http://www.drawbrid.ge/">Drawbridge</a>
 - <a href="http://www.ebay.com/">eBay Inc.</a>

diff --git a/site/powered-by.html b/site/powered-by.html
index efe64b2..e19436b 100644
--- a/site/powered-by.html
+++ b/site/powered-by.html
@@ -321,6 +321,11 @@ to run Spark and ML applications on Amazon Web Services and Azure, as well as a
 <a href="https://databricks.com/training">training program</a>.
+<a href="https://datapipelines.com">Data Pipelines</a>
+
+Build and schedule ETL pipelines step-by-step via a simple no-code UI.
+
+
 <a href="http://dianping.com">Dianping.com</a>
 <a href="http://www.drawbrid.ge/">Drawbridge</a>
 <a href="http://www.ebay.com/">eBay Inc.</a>

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-32576][SQL][TEST][FOLLOWUP] Add tests for all the character array types in PostgresIntegrationSuite
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 843ff03  [SPARK-32576][SQL][TEST][FOLLOWUP] Add tests for all the character array types in PostgresIntegrationSuite
843ff03 is described below

commit 843ff0367e45034bfc1e174a939f336bcc8d2391
Author: Takeshi Yamamuro
AuthorDate: Mon Aug 10 19:05:50 2020 +0900

    [SPARK-32576][SQL][TEST][FOLLOWUP] Add tests for all the character array types in PostgresIntegrationSuite

    ### What changes were proposed in this pull request?

    This is a follow-up PR of #29192 that adds integration tests for character arrays in `PostgresIntegrationSuite`.

    ### Why are the changes needed?

    For better test coverage.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Add tests.

    Closes #29397 from maropu/SPARK-32576-FOLLOWUP.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro

    (cherry picked from commit 7990ea14090c13e1fd1e42bc519b54144bd3aa76)
    Signed-off-by: Takeshi Yamamuro
---
 .../spark/sql/jdbc/PostgresIntegrationSuite.scala | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala
index 1914491..2b676be 100644
--- a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala
+++ b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala
@@ -84,6 +84,13 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite {
     ).executeUpdate()
     conn.prepareStatement("INSERT INTO char_types VALUES " +
       "('abcd', 'efgh', 'ijkl', 'mnop', 'q')").executeUpdate()
+
+    conn.prepareStatement("CREATE TABLE char_array_types (" +
+      "c0 char(4)[], c1 character(4)[], c2 character varying(4)[], c3 varchar(4)[], c4 bpchar[])"
+    ).executeUpdate()
+    conn.prepareStatement("INSERT INTO char_array_types VALUES " +
+      """('{"a", "bcd"}', '{"ef", "gh"}', '{"i", "j", "kl"}', '{"mnop"}', '{"q", "r"}')"""
+    ).executeUpdate()
   }

   test("Type mapping for various types") {
@@ -236,4 +243,16 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite {
     assert(row(0).getString(3) === "mnop")
     assert(row(0).getString(4) === "q")
   }
+
+  test("SPARK-32576: character array type tests") {
+    val df = sqlContext.read.jdbc(jdbcUrl, "char_array_types", new Properties)
+    val row = df.collect()
+    assert(row.length == 1)
+    assert(row(0).length === 5)
+    assert(row(0).getSeq[String](0) === Seq("a   ", "bcd "))
+    assert(row(0).getSeq[String](1) === Seq("ef  ", "gh  "))
+    assert(row(0).getSeq[String](2) === Seq("i", "j", "kl"))
+    assert(row(0).getSeq[String](3) === Seq("mnop"))
+    assert(row(0).getSeq[String](4) === Seq("q", "r"))
+  }
 }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
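[Editor's note] The space-padded expected values in the test above come from PostgreSQL's blank-padded character semantics: `char(n)`/`character(n)` (the `bpchar` type) stores values padded with spaces to exactly n characters, so arrays of `char(4)` round-trip through JDBC with trailing blanks, while `varchar(4)` and bare `bpchar` do not pad. A one-function illustration (`bpchar_pad` is a hypothetical helper, not a Spark or PostgreSQL API):

```python
def bpchar_pad(value: str, n: int) -> str:
    # PostgreSQL char(n)/character(n) ("bpchar", blank-padded char)
    # stores values space-padded to exactly n characters, which is
    # why char(4)[] columns come back as "a   ", "bcd ", etc.
    return value.ljust(n)
```

This is also why the earlier `char_types` test compares `getString(4)` against un-padded `"q"`: that column is a bare `bpchar` with no declared length, which PostgreSQL treats like unpadded text.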
[spark] branch branch-3.0 updated: [SPARK-32576][SQL][TEST][FOLLOWUP] Add tests for all the character array types in PostgresIntegrationSuite
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 843ff03  [SPARK-32576][SQL][TEST][FOLLOWUP] Add tests for all the character array types in PostgresIntegrationSuite
843ff03 is described below

commit 843ff0367e45034bfc1e174a939f336bcc8d2391
Author: Takeshi Yamamuro
AuthorDate: Mon Aug 10 19:05:50 2020 +0900

    [SPARK-32576][SQL][TEST][FOLLOWUP] Add tests for all the character array types in PostgresIntegrationSuite

    ### What changes were proposed in this pull request?

    This is a follow-up PR of #29192 that adds integration tests for character arrays in `PostgresIntegrationSuite`.

    ### Why are the changes needed?

    For better test coverage.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Add tests.

    Closes #29397 from maropu/SPARK-32576-FOLLOWUP.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 7990ea14090c13e1fd1e42bc519b54144bd3aa76)
    Signed-off-by: Takeshi Yamamuro
---
 .../spark/sql/jdbc/PostgresIntegrationSuite.scala | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala
index 1914491..2b676be 100644
--- a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala
+++ b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala
@@ -84,6 +84,13 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite {
     ).executeUpdate()
     conn.prepareStatement("INSERT INTO char_types VALUES " +
       "('abcd', 'efgh', 'ijkl', 'mnop', 'q')").executeUpdate()
+
+    conn.prepareStatement("CREATE TABLE char_array_types (" +
+      "c0 char(4)[], c1 character(4)[], c2 character varying(4)[], c3 varchar(4)[], c4 bpchar[])"
+    ).executeUpdate()
+    conn.prepareStatement("INSERT INTO char_array_types VALUES " +
+      """('{"a", "bcd"}', '{"ef", "gh"}', '{"i", "j", "kl"}', '{"mnop"}', '{"q", "r"}')"""
+    ).executeUpdate()
   }

   test("Type mapping for various types") {
@@ -236,4 +243,16 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite {
     assert(row(0).getString(3) === "mnop")
     assert(row(0).getString(4) === "q")
   }
+
+  test("SPARK-32576: character array type tests") {
+    val df = sqlContext.read.jdbc(jdbcUrl, "char_array_types", new Properties)
+    val row = df.collect()
+    assert(row.length == 1)
+    assert(row(0).length === 5)
+    assert(row(0).getSeq[String](0) === Seq("a   ", "bcd "))
+    assert(row(0).getSeq[String](1) === Seq("ef  ", "gh  "))
+    assert(row(0).getSeq[String](2) === Seq("i", "j", "kl"))
+    assert(row(0).getSeq[String](3) === Seq("mnop"))
+    assert(row(0).getSeq[String](4) === Seq("q", "r"))
+  }
 }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
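A note on the expected values in the new test: PostgreSQL blank-pads `char(n)`/`character(n)` values, including array elements, to exactly n characters, while `varchar(n)` and bare `bpchar` preserve the stored length. The following self-contained Scala sketch (illustrative only, not part of the patch) reproduces the padding the assertions rely on using `padTo`:

```scala
// Sketch: PostgreSQL blank-pads char(n)/character(n) values to n characters
// on the way back over JDBC, while varchar(n) and bare bpchar do not pad.
// padTo mimics that server-side padding for the char(4)[] column c0.
object CharPaddingSketch {
  // Pad the way a char(4) column's values come back from PostgreSQL.
  def padChar(v: String, n: Int): String = v.padTo(n, ' ')

  def main(args: Array[String]): Unit = {
    val char4 = Seq("a", "bcd").map(padChar(_, 4))
    assert(char4 == Seq("a   ", "bcd "))  // what getSeq[String](0) should see
    val bpchar = Seq("q", "r")            // bare bpchar: stored as-is, no padding
    assert(bpchar == Seq("q", "r"))
    println(char4.map(s => s"'$s'").mkString(", "))
  }
}
```

This is why the test expects trailing spaces only for the `char(4)[]` and `character(4)[]` columns, and unpadded strings for the `varchar(4)[]` and `bpchar[]` columns.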
[spark] branch master updated (fc62d72 -> 7990ea1)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from fc62d72  [MINOR] add test_createDataFrame_empty_partition in pyspark arrow tests
 add 7990ea1  [SPARK-32576][SQL][TEST][FOLLOWUP] Add tests for all the character array types in PostgresIntegrationSuite

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/jdbc/PostgresIntegrationSuite.scala | 19 +++
 1 file changed, 19 insertions(+)
[spark] branch branch-3.0 updated: [SPARK-32538][CORE][TEST] Use local time zone for the timestamp logged in unit-tests.log
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new c7af0be  [SPARK-32538][CORE][TEST] Use local time zone for the timestamp logged in unit-tests.log
c7af0be is described below

commit c7af0be530d8c2a4935acf9a593034fb85d5572f
Author: Kousuke Saruta
AuthorDate: Fri Aug 7 11:29:18 2020 +0900

    [SPARK-32538][CORE][TEST] Use local time zone for the timestamp logged in unit-tests.log

    ### What changes were proposed in this pull request?

    This PR lets the logger log timestamps based on the local time zone during tests. `SparkFunSuite` fixes the default time zone to America/Los_Angeles, so the timestamps logged in unit-tests.log are also based on that fixed time zone.

    ### Why are the changes needed?

    It's confusing for developers whose time zone is not America/Los_Angeles.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Ran the existing tests and confirmed unit-tests.log. If your local time zone is America/Los_Angeles, you can test by setting the environment variable `TZ` as follows.
    ```
    $ TZ=Asia/Tokyo build/sbt "testOnly org.apache.spark.executor.ExecutorSuite"
    $ tail core/target/unit-tests.log
    ```

    Closes #29356 from sarutak/fix-unit-test-log-timezone.

    Authored-by: Kousuke Saruta
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4e267f3eb9ca0df18647c859b75b61b1af800120)
    Signed-off-by: Takeshi Yamamuro
---
 core/src/test/scala/org/apache/spark/SparkFunSuite.scala | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/core/src/test/scala/org/apache/spark/SparkFunSuite.scala b/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
index ec641f8..d402074 100644
--- a/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
+++ b/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
@@ -64,6 +64,12 @@ abstract class SparkFunSuite
   with Logging {
 // scalastyle:on

+  // Initialize the logger forcibly to let the logger log timestamp
+  // based on the local time zone depending on environments.
+  // The default time zone will be set to America/Los_Angeles later
+  // so this initialization is necessary here.
+  log
+
   // Timezone is fixed to America/Los_Angeles for those timezone sensitive tests (timestamp_*)
   TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))

   // Add Locale setting
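The ordering issue the patch fixes comes from the fact that date formatters capture the JVM default time zone when they are created, so initializing the logger before `TimeZone.setDefault(...)` runs makes log timestamps follow the developer's local zone. A small stdlib-only Scala sketch (an analogy, not the Spark/log4j code itself) shows the underlying behavior:

```scala
import java.text.SimpleDateFormat
import java.util.{Date, TimeZone}

// Sketch: a SimpleDateFormat created under one default time zone formats
// instants in that zone; formatting the same instant under two defaults
// gives different wall-clock timestamps, which is what log lines show.
object LogTimeZoneSketch {
  def main(args: Array[String]): Unit = {
    val epoch = new Date(0L) // 1970-01-01T00:00:00Z, fixed instant

    def fmt(tz: String): String = {
      TimeZone.setDefault(TimeZone.getTimeZone(tz))
      new SimpleDateFormat("yyyy-MM-dd HH:mm").format(epoch)
    }

    val la = fmt("America/Los_Angeles")   // UTC-8 at the epoch
    val tokyo = fmt("Asia/Tokyo")         // UTC+9
    assert(la == "1969-12-31 16:00")
    assert(tokyo == "1970-01-01 09:00")
    println(s"LA: $la, Tokyo: $tokyo")
  }
}
```

Referencing `log` before the `TimeZone.setDefault` call forces the logging framework to build its formatter while the environment's local zone is still the default, which is exactly what the six added lines achieve.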
[spark] branch master updated (75c2c53 -> 4e267f3)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 75c2c53  [SPARK-32506][TESTS] Flaky test: StreamingLinearRegressionWithTests
 add 4e267f3  [SPARK-32538][CORE][TEST] Use local time zone for the timestamp logged in unit-tests.log

No new revisions were added by this update.

Summary of changes:
 core/src/test/scala/org/apache/spark/SparkFunSuite.scala | 6 ++
 1 file changed, 6 insertions(+)
[spark] branch branch-2.4 updated: [SPARK-28818][SQL][2.4] Respect source column nullability in the arrays created by `freqItems()`
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 91f2a25 [SPARK-28818][SQL][2.4] Respect source column nullability in the arrays created by `freqItems()` 91f2a25 is described below commit 91f2a2548ad0f825fc4b5c67264e11abb76bbd9d Author: Matt Hawes AuthorDate: Mon Aug 3 08:55:28 2020 +0900 [SPARK-28818][SQL][2.4] Respect source column nullability in the arrays created by `freqItems()` ### What changes were proposed in this pull request? This PR replaces the hard-coded non-nullability of the array elements returned by `freqItems()` with a nullability that reflects the original schema. Essentially [the functional change](https://github.com/apache/spark/pull/25575/files#diff-bf59bb9f3dc351f5bf6624e5edd2dcf4R122) to the schema generation is: ``` StructField(name + "_freqItems", ArrayType(dataType, false)) ``` Becomes: ``` StructField(name + "_freqItems", ArrayType(dataType, originalField.nullable)) ``` Respecting the original nullability prevents issues when Spark depends on `ArrayType`'s `containsNull` being accurate. The example that uncovered this is calling `collect()` on the dataframe (see [ticket](https://issues.apache.org/jira/browse/SPARK-28818) for full repro). Though it's likely that there a several places where this could cause a problem. I've also refactored a small amount of the surrounding code to remove some unnecessary steps and group together related operations. Note: This is the backport PR of #25575 and the credit should be MGHawes. ### Why are the changes needed? I think it's pretty clear why this change is needed. It fixes a bug that currently prevents users from calling `df.freqItems.collect()` along with potentially causing other, as yet unknown, issues. ### Does this PR introduce any user-facing change? 
Nullability of columns when calling freqItems on them is now respected after the change. ### How was this patch tested? I added a test that specifically tests the carry-through of the nullability as well as explicitly calling `collect()` to catch the exact regression that was observed. I also ran the test against the old version of the code and it fails as expected. Closes #29327 from maropu/SPARK-28818-2.4. Lead-authored-by: Matt Hawes Co-authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro --- .../spark/sql/execution/stat/FrequentItems.scala | 19 .../org/apache/spark/sql/DataFrameStatSuite.scala | 26 +- 2 files changed, 35 insertions(+), 10 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala index 86f6307..f21efd4 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala @@ -89,11 +89,6 @@ object FrequentItems extends Logging { // number of max items to keep counts for val sizeOfMap = (1 / support).toInt val countMaps = Seq.tabulate(numCols)(i => new FreqItemCounter(sizeOfMap)) -val originalSchema = df.schema -val colInfo: Array[(String, DataType)] = cols.map { name => - val index = originalSchema.fieldIndex(name) - (name, originalSchema.fields(index).dataType) -}.toArray val freqItems = df.select(cols.map(Column(_)) : _*).rdd.treeAggregate(countMaps)( seqOp = (counts, row) => { @@ -117,10 +112,16 @@ object FrequentItems extends Logging { ) val justItems = freqItems.map(m => m.baseMap.keys.toArray) val resultRow = Row(justItems : _*) -// append frequent Items to the column name for easy debugging -val outputCols = colInfo.map { v => - StructField(v._1 + "_freqItems", ArrayType(v._2, false)) -} + +val originalSchema = df.schema +val outputCols = cols.map { name => + val index = 
originalSchema.fieldIndex(name) + val originalField = originalSchema.fields(index) + + // append frequent Items to the column name for easy debugging + StructField(name + "_freqItems", ArrayType(originalField.dataType, originalField.nullable)) +}.toArray + val schema = StructType(outputCols).toAttributes Dataset.ofRows(df.sparkSession, LocalRelation.fromExternalRows(schema, Seq(resultRow))) } diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala index 8eae353..23a1fc4 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuit
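The schema change described in the commit message above can be sketched in plain Scala with no Spark dependency. Note the `ArrayType` and `StructField` case classes below are simplified stand-ins for the Catalyst types of the same names, used only to illustrate how the fix threads the source column's nullability through to the output schema:

```scala
// Minimal stand-ins for the Catalyst schema types referenced in the diff.
// These are NOT the real org.apache.spark.sql.types classes.
case class ArrayType(elementType: String, containsNull: Boolean)
case class StructField(name: String, dataType: String, nullable: Boolean)

object FreqItemsSchemaSketch {
  // Before the fix: element nullability was hard-coded to false,
  // regardless of whether the source column could contain nulls.
  def oldOutputField(f: StructField): (String, ArrayType) =
    (f.name + "_freqItems", ArrayType(f.dataType, containsNull = false))

  // After the fix: element nullability mirrors the source column,
  // so ArrayType.containsNull stays accurate for downstream code
  // such as collect().
  def newOutputField(f: StructField): (String, ArrayType) =
    (f.name + "_freqItems", ArrayType(f.dataType, containsNull = f.nullable))

  def main(args: Array[String]): Unit = {
    val nullableCol = StructField("city", "string", nullable = true)
    // The old schema claims the array can never hold nulls; the new one
    // preserves the source column's nullability.
    println(oldOutputField(nullableCol))
    println(newOutputField(nullableCol))
  }
}
```

For a non-nullable source column the two versions agree; the behavioral difference only appears when the input column is nullable, which is exactly the case that broke `collect()` in SPARK-28818.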
[spark] branch branch-2.4 updated (4a8f692 -> 91f2a25)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from 4a8f692 [SPARK-32397][BUILD] Allow specifying of time for build to keep time consistent between modules add 91f2a25 [SPARK-28818][SQL][2.4] Respect source column nullability in the arrays created by `freqItems()` No new revisions were added by this update. Summary of changes: .../spark/sql/execution/stat/FrequentItems.scala | 19 .../org/apache/spark/sql/DataFrameStatSuite.scala | 26 +- 2 files changed, 35 insertions(+), 10 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (8de4333 -> 8323c8e)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 8de4333 [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs add 8323c8e [SPARK-32059][SQL] Allow nested schema pruning thru window/sort plans No new revisions were added by this update. Summary of changes: .../catalyst/optimizer/NestedColumnAliasing.scala | 16 +++ .../optimizer/NestedColumnAliasingSuite.scala | 140 - .../execution/datasources/SchemaPruningSuite.scala | 63 ++ 3 files changed, 217 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 6ed93c3 [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs 6ed93c3 is described below commit 6ed93c3e86c60323328b44cab45faa9ae3050dab Author: GuoPhilipse AuthorDate: Tue Jul 28 09:41:53 2020 +0900 [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs ### What changes were proposed in this pull request? Update the sql-ref docs; the following keywords are added in this PR: CASE/ELSE WHEN/THEN MAP KEYS TERMINATED BY NULL DEFINED AS LINES TERMINATED BY ESCAPED BY COLLECTION ITEMS TERMINATED BY PIVOT LATERAL VIEW OUTER? ROW FORMAT SERDE ROW FORMAT DELIMITED FIELDS TERMINATED BY IGNORE NULLS FIRST LAST ### Why are the changes needed? Let more users know how to use these SQL keywords. ### Does this PR introduce _any_ user-facing change? ![image](https://user-images.githubusercontent.com/46367746/88148830-c6dc1f80-cc31-11ea-81ea-13bc9dc34550.png) ![image](https://user-images.githubusercontent.com/46367746/88148968-fb4fdb80-cc31-11ea-8649-e8297cf5813e.png) ![image](https://user-images.githubusercontent.com/46367746/88149000-073b9d80-cc32-11ea-9aa4-f914ecd72663.png) ![image](https://user-images.githubusercontent.com/46367746/88149021-0f93d880-cc32-11ea-86ed-7db8672b5aac.png) ### How was this patch tested? No Closes #29056 from GuoPhilipse/add-missing-keywords. 
Lead-authored-by: GuoPhilipse Co-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com> Signed-off-by: Takeshi Yamamuro (cherry picked from commit 8de43338be879f0cfeebca328dbbcfd1e5bd70da) Signed-off-by: Takeshi Yamamuro --- docs/_data/menu-sql.yaml | 6 + docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 94 +++- docs/sql-ref-syntax-qry-select-case.md | 109 ++ docs/sql-ref-syntax-qry-select-clusterby.md| 3 + docs/sql-ref-syntax-qry-select-distribute-by.md| 3 + docs/sql-ref-syntax-qry-select-groupby.md | 27 + docs/sql-ref-syntax-qry-select-having.md | 3 + docs/sql-ref-syntax-qry-select-lateral-view.md | 125 + docs/sql-ref-syntax-qry-select-limit.md| 3 + docs/sql-ref-syntax-qry-select-orderby.md | 3 + docs/sql-ref-syntax-qry-select-pivot.md| 101 + docs/sql-ref-syntax-qry-select-sortby.md | 3 + docs/sql-ref-syntax-qry-select-where.md| 3 + docs/sql-ref-syntax-qry-select.md | 56 + docs/sql-ref-syntax-qry.md | 3 + docs/sql-ref-syntax.md | 3 + 16 files changed, 520 insertions(+), 25 deletions(-) diff --git a/docs/_data/menu-sql.yaml b/docs/_data/menu-sql.yaml index eea657e..22fae0c 100644 --- a/docs/_data/menu-sql.yaml +++ b/docs/_data/menu-sql.yaml @@ -187,6 +187,12 @@ url: sql-ref-syntax-qry-select-tvf.html - text: Window Function url: sql-ref-syntax-qry-select-window.html +- text: CASE Clause + url: sql-ref-syntax-qry-select-case.html +- text: LATERAL VIEW Clause + url: sql-ref-syntax-qry-select-lateral-view.html +- text: PIVOT Clause + url: sql-ref-syntax-qry-select-pivot.html - text: EXPLAIN url: sql-ref-syntax-qry-explain.html - text: Auxiliary Statements diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index 38f8856..7bf847d 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -36,6 +36,14 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ LOCATION path ] [ TBLPROPERTIES ( key1=val1, key2=val2, 
... ) ] [ AS select_statement ] + +row_format: +: SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ] +| DELIMITED [ FIELDS TERMINATED BY fields_termiated_char [ ESCAPED BY escaped_char ] ] +[ COLLECTION ITEMS TERMINATED BY collection_items_termiated_char ] +[ MAP KEYS TERMINATED BY map_key_termiated_char ] +[ LINES TERMINATED BY row_termiated_char ] +[ NULL DEFINED AS null_char ] ``` Note that, the clauses between the columns definition clause and the AS SELECT clause can come in @@ -51,15 +59,55 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI * **EXTERNAL** -Table is defined using the path provided as LOCATION, does not use d
[spark] branch master updated (f7542d3 -> 8de4333)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from f7542d3 [SPARK-32457][ML] logParam thresholds in DT/GBT/FM/LR/MLP add 8de4333 [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs No new revisions were added by this update. Summary of changes: docs/_data/menu-sql.yaml | 6 + docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 94 +++- docs/sql-ref-syntax-qry-select-case.md | 109 ++ docs/sql-ref-syntax-qry-select-clusterby.md| 3 + docs/sql-ref-syntax-qry-select-distribute-by.md| 3 + docs/sql-ref-syntax-qry-select-groupby.md | 27 + docs/sql-ref-syntax-qry-select-having.md | 3 + docs/sql-ref-syntax-qry-select-lateral-view.md | 125 + docs/sql-ref-syntax-qry-select-limit.md| 3 + docs/sql-ref-syntax-qry-select-orderby.md | 3 + docs/sql-ref-syntax-qry-select-pivot.md| 101 + docs/sql-ref-syntax-qry-select-sortby.md | 3 + docs/sql-ref-syntax-qry-select-where.md| 3 + docs/sql-ref-syntax-qry-select.md | 56 + docs/sql-ref-syntax-qry.md | 3 + docs/sql-ref-syntax.md | 3 + 16 files changed, 520 insertions(+), 25 deletions(-) create mode 100644 docs/sql-ref-syntax-qry-select-case.md create mode 100644 docs/sql-ref-syntax-qry-select-lateral-view.md create mode 100644 docs/sql-ref-syntax-qry-select-pivot.md - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
Lead-authored-by: GuoPhilipse Co-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com> Signed-off-by: Takeshi Yamamuro (cherry picked from commit 8de43338be879f0cfeebca328dbbcfd1e5bd70da) Signed-off-by: Takeshi Yamamuro --- docs/_data/menu-sql.yaml | 6 + docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 94 +++- docs/sql-ref-syntax-qry-select-case.md | 109 ++ docs/sql-ref-syntax-qry-select-clusterby.md| 3 + docs/sql-ref-syntax-qry-select-distribute-by.md| 3 + docs/sql-ref-syntax-qry-select-groupby.md | 27 + docs/sql-ref-syntax-qry-select-having.md | 3 + docs/sql-ref-syntax-qry-select-lateral-view.md | 125 + docs/sql-ref-syntax-qry-select-limit.md| 3 + docs/sql-ref-syntax-qry-select-orderby.md | 3 + docs/sql-ref-syntax-qry-select-pivot.md| 101 + docs/sql-ref-syntax-qry-select-sortby.md | 3 + docs/sql-ref-syntax-qry-select-where.md| 3 + docs/sql-ref-syntax-qry-select.md | 56 + docs/sql-ref-syntax-qry.md | 3 + docs/sql-ref-syntax.md | 3 + 16 files changed, 520 insertions(+), 25 deletions(-) diff --git a/docs/_data/menu-sql.yaml b/docs/_data/menu-sql.yaml index eea657e..22fae0c 100644 --- a/docs/_data/menu-sql.yaml +++ b/docs/_data/menu-sql.yaml @@ -187,6 +187,12 @@ url: sql-ref-syntax-qry-select-tvf.html - text: Window Function url: sql-ref-syntax-qry-select-window.html +- text: CASE Clause + url: sql-ref-syntax-qry-select-case.html +- text: LATERAL VIEW Clause + url: sql-ref-syntax-qry-select-lateral-view.html +- text: PIVOT Clause + url: sql-ref-syntax-qry-select-pivot.html - text: EXPLAIN url: sql-ref-syntax-qry-explain.html - text: Auxiliary Statements diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index 38f8856..7bf847d 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -36,6 +36,14 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ LOCATION path ] [ TBLPROPERTIES ( key1=val1, key2=val2, 
... ) ] [ AS select_statement ] + +row_format: +: SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ] +| DELIMITED [ FIELDS TERMINATED BY fields_termiated_char [ ESCAPED BY escaped_char ] ] +[ COLLECTION ITEMS TERMINATED BY collection_items_termiated_char ] +[ MAP KEYS TERMINATED BY map_key_termiated_char ] +[ LINES TERMINATED BY row_termiated_char ] +[ NULL DEFINED AS null_char ] ``` Note that, the clauses between the columns definition clause and the AS SELECT clause can come in @@ -51,15 +59,55 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI * **EXTERNAL** -Table is defined using the path provided as LOCATION, does not use d
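The `row_format` grammar added in the diff above describes how Hive's `DELIMITED` format splits a text row into fields, collection items, and NULLs. As a rough illustration only (a hypothetical parser sketch, not Spark's or Hive's actual text reader), a row stored with `FIELDS TERMINATED BY ','`, `COLLECTION ITEMS TERMINATED BY '|'`, and `NULL DEFINED AS '\N'` could be decoded like this:

```python
# Hypothetical sketch of how a DELIMITED row format splits one text row.
# Not the real Spark/Hive implementation -- it only illustrates the grammar:
#   FIELDS TERMINATED BY ','
#   COLLECTION ITEMS TERMINATED BY '|'
#   NULL DEFINED AS '\N'

def parse_delimited_row(line, field_sep=",", item_sep="|", null_char="\\N"):
    fields = []
    for raw in line.split(field_sep):
        if raw == null_char:
            fields.append(None)                  # NULL DEFINED AS
        elif item_sep in raw:
            fields.append(raw.split(item_sep))   # collection (array) column
        else:
            fields.append(raw)
    return fields

print(parse_delimited_row("1,alice,red|green,\\N"))
# -> ['1', 'alice', ['red', 'green'], None]
```

This also shows why the optional `ESCAPED BY` clause exists: without an escape character, a literal field separator inside a value would be split incorrectly.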
[spark] branch master updated (184074d -> b151194)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 184074d [SPARK-31999][SQL] Add REFRESH FUNCTION command add b151194 [SPARK-32392][SQL] Reduce duplicate error log for executing sql statement operation in thrift server No new revisions were added by this update. Summary of changes: .../spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala | 4 +--- sql/hive-thriftserver/src/test/resources/log4j.properties | 2 ++ 2 files changed, 3 insertions(+), 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (0432379 -> 39181ff)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 0432379 [SPARK-24266][K8S] Restart the watcher when we receive a version changed from k8s add 39181ff [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/internal/SQLConf.scala| 22 ++- .../spark/sql/execution/QueryExecution.scala | 4 +- .../bucketing/CoalesceBucketsInJoin.scala | 177 + .../bucketing/CoalesceBucketsInSortMergeJoin.scala | 132 --- .../scala/org/apache/spark/sql/ExplainSuite.scala | 2 +- ...uite.scala => CoalesceBucketsInJoinSuite.scala} | 136 .../spark/sql/sources/BucketedReadSuite.scala | 14 +- 7 files changed, 309 insertions(+), 178 deletions(-) create mode 100644 sql/core/src/main/scala/org/apache/spark/sql/execution/bucketing/CoalesceBucketsInJoin.scala delete mode 100644 sql/core/src/main/scala/org/apache/spark/sql/execution/bucketing/CoalesceBucketsInSortMergeJoin.scala rename sql/core/src/test/scala/org/apache/spark/sql/execution/bucketing/{CoalesceBucketsInSortMergeJoinSuite.scala => CoalesceBucketsInJoinSuite.scala} (55%) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (98504e9 -> 004aea8)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 98504e9 [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls add 004aea8 [SPARK-32154][SQL] Use ExpressionEncoder for the return type of ScalaUDF to convert to catalyst type No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 7 +- .../sql/catalyst/encoders/ExpressionEncoder.scala | 8 +- .../spark/sql/catalyst/expressions/ScalaUDF.scala | 26 - .../org/apache/spark/sql/UDFRegistration.scala | 120 - .../sql/expressions/UserDefinedFunction.scala | 2 + .../scala/org/apache/spark/sql/functions.scala | 60 ++- .../test/scala/org/apache/spark/sql/UDFSuite.scala | 106 +++--- 7 files changed, 235 insertions(+), 94 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 29e098b [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs 29e098b is described below commit 29e098bc08477f00b5f83dfc5e73181668bb5f20 Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com> AuthorDate: Thu Jul 9 16:14:33 2020 +0900 [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs ### What changes were proposed in this pull request? update REGEXP usage and examples in sql-ref-syntx-qry-select-like.cmd ### Why are the changes needed? make the usage of REGEXP known to more users ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No tests Closes #29009 from GuoPhilipse/update-migrate-guide. Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com> Co-authored-by: GuoPhilipse Signed-off-by: Takeshi Yamamuro (cherry picked from commit 09cc6c51eaa489733551e0507d129b06d683207c) Signed-off-by: Takeshi Yamamuro --- docs/sql-ref-syntax-qry-select-like.md | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/docs/sql-ref-syntax-qry-select-like.md b/docs/sql-ref-syntax-qry-select-like.md index feb5eb7..6211faa8 100644 --- a/docs/sql-ref-syntax-qry-select-like.md +++ b/docs/sql-ref-syntax-qry-select-like.md @@ -26,7 +26,7 @@ A LIKE predicate is used to search for a specific pattern. ### Syntax ```sql -[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern } +[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | [ RLIKE | REGEXP ] regex_pattern } ``` ### Parameters @@ -44,7 +44,7 @@ A LIKE predicate is used to search for a specific pattern. * **regex_pattern** -Specifies a regular expression search pattern to be searched by the `RLIKE` clause. 
+Specifies a regular expression search pattern to be searched by the `RLIKE` or `REGEXP` clause. ### Examples @@ -90,6 +90,14 @@ SELECT * FROM person WHERE name RLIKE 'M+'; |200|Mary|null| +---+++ +SELECT * FROM person WHERE name REGEXP 'M+'; ++---+++ +| id|name| age| ++---+++ +|300|Mike| 80| +|200|Mary|null| ++---+++ + SELECT * FROM person WHERE name LIKE '%\_%'; +---+--+---+ | id| name|age| - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
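The patch above documents `REGEXP` as a synonym for `RLIKE`: both apply an unanchored regular-expression search, so a pattern like `'M+'` matches any name containing at least one `M`. A small illustrative sketch of that matching semantics in plain Python (using `re.search`, which mirrors how an RLIKE-style predicate behaves; this mimics the docs' `person` table and is not Spark's engine):

```python
import re

# Illustrative semantics of SQL's RLIKE / REGEXP predicate:
# an unanchored regex search over the string (re.search, not re.fullmatch).
# Sample rows mirror the `person` table used in the quoted docs.
person = [(100, "John", 30), (200, "Mary", None), (300, "Mike", 80)]

def rlike(value, pattern):
    return re.search(pattern, value) is not None

matches = [row for row in person if rlike(row[1], "M+")]
print(matches)  # -> [(200, 'Mary', None), (300, 'Mike', 80)]
```

As in the documented example, `John` is excluded because the search is case-sensitive and `'M+'` requires at least one capital `M`.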
[spark] branch master updated (8c5bee59 -> 09cc6c5)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 8c5bee59 [SPARK-28067][SPARK-32018] Fix decimal overflow issues add 09cc6c5 [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs No new revisions were added by this update. Summary of changes: docs/sql-ref-syntax-qry-select-like.md | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (8c5bee59 -> 09cc6c5)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 8c5bee59  [SPARK-28067][SPARK-32018] Fix decimal overflow issues
  add 09cc6c5   [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-qry-select-like.md | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-30703][SQL][FOLLOWUP] Update SqlBase.g4 invalid comment
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 37dc51a  [SPARK-30703][SQL][FOLLOWUP] Update SqlBase.g4 invalid comment
37dc51a is described below

commit 37dc51a5dddc2958c18831e9c0809c3b495aa719
Author: ulysses
AuthorDate: Wed Jul 8 11:30:47 2020 +0900

    [SPARK-30703][SQL][FOLLOWUP] Update SqlBase.g4 invalid comment

    ### What changes were proposed in this pull request?

    Modify the comment of `SqlBase.g4`.

    ### Why are the changes needed?

    `docs/sql-keywords.md` has already moved to `docs/sql-ref-ansi-compliance.md#sql-keywords`.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    No need.

    Closes #29033 from ulysses-you/SPARK-30703-FOLLOWUP.

    Authored-by: ulysses
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 65286aec4b3c4e93d8beac6dd1b097ce97d53fd8)
    Signed-off-by: Takeshi Yamamuro
---
 .../src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
index 5821a74..75dae8f 100644
--- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
+++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
@@ -1441,8 +1441,7 @@ nonReserved
 ;

 // NOTE: If you add a new token in the list below, you should update the list of keywords
-// in `docs/sql-keywords.md`. If the token is a non-reserved keyword,
-// please update `ansiNonReserved` and `nonReserved` as well.
+// and reserved tag in `docs/sql-ref-ansi-compliance.md#sql-keywords`.
 //
 // Start of the keywords list
[spark] branch master updated (b5297c4 -> 65286ae)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from b5297c4  [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
  add 65286ae  [SPARK-30703][SQL][FOLLOWUP] Update SqlBase.g4 invalid comment

No new revisions were added by this update.

Summary of changes:
 .../src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
[spark] branch master updated: [SPARK-30703][SQL][FOLLOWUP] Update SqlBase.g4 invalid comment
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 65286ae  [SPARK-30703][SQL][FOLLOWUP] Update SqlBase.g4 invalid comment
65286ae is described below

commit 65286aec4b3c4e93d8beac6dd1b097ce97d53fd8
Author: ulysses
AuthorDate: Wed Jul 8 11:30:47 2020 +0900

    [SPARK-30703][SQL][FOLLOWUP] Update SqlBase.g4 invalid comment

    ### What changes were proposed in this pull request?

    Modify the comment of `SqlBase.g4`.

    ### Why are the changes needed?

    `docs/sql-keywords.md` has already moved to `docs/sql-ref-ansi-compliance.md#sql-keywords`.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    No need.

    Closes #29033 from ulysses-you/SPARK-30703-FOLLOWUP.

    Authored-by: ulysses
    Signed-off-by: Takeshi Yamamuro
---
 .../src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
index 691fde8..b383e03 100644
--- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
+++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
@@ -1461,8 +1461,7 @@ nonReserved
 ;

 // NOTE: If you add a new token in the list below, you should update the list of keywords
-// in `docs/sql-keywords.md`. If the token is a non-reserved keyword,
-// please update `ansiNonReserved` and `nonReserved` as well.
+// and reserved tag in `docs/sql-ref-ansi-compliance.md#sql-keywords`.
 //
 // Start of the keywords list
[spark] branch master updated (a9247c3 -> 7b86838)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from a9247c3  [SPARK-32033][SS][DSTEAMS] Use new poll API in Kafka connector executor side to avoid infinite wait
  add 7b86838  [SPARK-31350][SQL] Coalesce bucketed tables for sort merge join if applicable

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/internal/SQLConf.scala    |  20 +++
 .../spark/sql/execution/DataSourceScanExec.scala   |  29 ++-
 .../spark/sql/execution/QueryExecution.scala       |   2 +
 .../bucketing/CoalesceBucketsInSortMergeJoin.scala | 132 ++
 .../execution/datasources/FileSourceStrategy.scala |   1 +
 .../org/apache/spark/sql/DataFrameJoinSuite.scala  |   2 +-
 .../scala/org/apache/spark/sql/ExplainSuite.scala  |  17 ++
 .../scala/org/apache/spark/sql/SubquerySuite.scala |   2 +-
 .../CoalesceBucketsInSortMergeJoinSuite.scala      | 194 +
 .../spark/sql/sources/BucketedReadSuite.scala      | 137 ++-
 10 files changed, 523 insertions(+), 13 deletions(-)
 create mode 100644 sql/core/src/main/scala/org/apache/spark/sql/execution/bucketing/CoalesceBucketsInSortMergeJoin.scala
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/bucketing/CoalesceBucketsInSortMergeJoinSuite.scala
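SPARK-31350 lets a sort-merge join over two bucketed tables avoid a shuffle by coalescing the larger bucket count down to the smaller one when one count divides the other. A minimal sketch of that divisibility rule (illustrative only, not Spark's `CoalesceBucketsInSortMergeJoin` code; function names are made up):

```python
def coalesce_buckets(left_buckets: int, right_buckets: int):
    """Return the coalesced bucket count if the two sides are compatible
    (one count divides the other), else None (a shuffle would be needed)."""
    big, small = max(left_buckets, right_buckets), min(left_buckets, right_buckets)
    if small > 0 and big % small == 0:
        return small
    return None

def coalesced_bucket_id(bucket_id: int, to_buckets: int) -> int:
    # Bucket i of the larger side merges into bucket i % to_buckets. This
    # preserves hash partitioning because, when to_buckets divides the
    # original count, (hash % big) % small == hash % small.
    return bucket_id % to_buckets

print(coalesce_buckets(8, 4))  # 4  -> the 8-bucket side is read as 4 buckets
print(coalesce_buckets(8, 6))  # None -> incompatible counts
```

The key property is that reducing to a divisor keeps rows with equal join keys in matching buckets on both sides, which is what makes the shuffle-free sort-merge join correct.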
[spark] 02/02: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit ed69190ce0762f3b741b8d175ef8d02da45f3183 Author: Takeshi Yamamuro AuthorDate: Tue Jun 16 00:27:45 2020 +0900 [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords ### What changes were proposed in this pull request? This PR intends to move keywords `ANTI`, `SEMI`, and `MINUS` from reserved to non-reserved. ### Why are the changes needed? To comply with the ANSI/SQL standard. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added tests. Closes #28807 from maropu/SPARK-26905-2. Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro --- docs/sql-ref-ansi-compliance.md| 6 +- .../apache/spark/sql/catalyst/parser/SqlBase.g4| 3 + .../resources/ansi-sql-2016-reserved-keywords.txt | 401 + .../parser/TableIdentifierParserSuite.scala| 24 +- 4 files changed, 429 insertions(+), 5 deletions(-) diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md index eab194c..e5ca7e9d 100644 --- a/docs/sql-ref-ansi-compliance.md +++ b/docs/sql-ref-ansi-compliance.md @@ -135,7 +135,7 @@ Below is a list of all the keywords in Spark SQL. |ALTER|non-reserved|non-reserved|reserved| |ANALYZE|non-reserved|non-reserved|non-reserved| |AND|reserved|non-reserved|reserved| -|ANTI|reserved|strict-non-reserved|non-reserved| +|ANTI|non-reserved|strict-non-reserved|non-reserved| |ANY|reserved|non-reserved|reserved| |ARCHIVE|non-reserved|non-reserved|non-reserved| |ARRAY|non-reserved|non-reserved|reserved| @@ -264,7 +264,7 @@ Below is a list of all the keywords in Spark SQL. 
|MAP|non-reserved|non-reserved|non-reserved| |MATCHED|non-reserved|non-reserved|non-reserved| |MERGE|non-reserved|non-reserved|non-reserved| -|MINUS|reserved|strict-non-reserved|non-reserved| +|MINUS|not-reserved|strict-non-reserved|non-reserved| |MINUTE|reserved|non-reserved|reserved| |MONTH|reserved|non-reserved|reserved| |MSCK|non-reserved|non-reserved|non-reserved| @@ -325,7 +325,7 @@ Below is a list of all the keywords in Spark SQL. |SCHEMA|non-reserved|non-reserved|non-reserved| |SECOND|reserved|non-reserved|reserved| |SELECT|reserved|non-reserved|reserved| -|SEMI|reserved|strict-non-reserved|non-reserved| +|SEMI|non-reserved|strict-non-reserved|non-reserved| |SEPARATED|non-reserved|non-reserved|non-reserved| |SERDE|non-reserved|non-reserved|non-reserved| |SERDEPROPERTIES|non-reserved|non-reserved|non-reserved| diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 index 14a6687..5821a74 100644 --- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 +++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 @@ -994,6 +994,7 @@ ansiNonReserved | AFTER | ALTER | ANALYZE +| ANTI | ARCHIVE | ARRAY | ASC @@ -1126,10 +1127,12 @@ ansiNonReserved | ROW | ROWS | SCHEMA +| SEMI | SEPARATED | SERDE | SERDEPROPERTIES | SET +| SETMINUS | SETS | SHOW | SKEWED diff --git a/sql/catalyst/src/test/resources/ansi-sql-2016-reserved-keywords.txt b/sql/catalyst/src/test/resources/ansi-sql-2016-reserved-keywords.txt new file mode 100644 index 000..921491a --- /dev/null +++ b/sql/catalyst/src/test/resources/ansi-sql-2016-reserved-keywords.txt @@ -0,0 +1,401 @@ +-- This file comes from: https://github.com/postgres/postgres/tree/master/doc/src/sgml/keywords +ABS +ACOS +ALL +ALLOCATE +ALTER +AND +ANY +ARE +ARRAY +ARRAY_AGG +ARRAY_MAX_CARDINALITY +AS +ASENSITIVE +ASIN +ASYMMETRIC +AT +ATAN +ATOMIC +AUTHORIZATION +AVG +BEGIN 
+BEGIN_FRAME +BEGIN_PARTITION +BETWEEN +BIGINT +BINARY +BLOB +BOOLEAN +BOTH +BY +CALL +CALLED +CARDINALITY +CASCADED +CASE +CAST +CEIL +CEILING +CHAR +CHAR_LENGTH +CHARACTER +CHARACTER_LENGTH +CHECK +CLASSIFIER +CLOB +CLOSE +COALESCE +COLLATE +COLLECT +COLUMN +COMMIT +CONDITION +CONNECT +CONSTRAINT +CONTAINS +CONVERT +COPY +CORR +CORRESPONDING +COS +COSH +COUNT +COVAR_POP +COVAR_SAMP +CREATE +CROSS +CUBE +CUME_DIST +CURRENT +CURRENT_CATALOG +CURRENT_DATE +CURRENT_DEFAULT_TRANSFORM_GROUP +CURRENT_PATH +CURRENT_ROLE +CURRENT_ROW +CURRENT_SCHEMA +CURRENT_TIME +CURRENT_TIMESTAMP +CURRENT_TRANSFORM_GROUP_FOR_TYPE +CURRENT_USER +CURSOR +CYCLE +DATE +DAY +DEALLOCATE +DEC +DECIMAL +DECFLOAT +DECLARE +DEFAULT +DEFINE +DELETE +DENSE_RANK +DEREF +DESCRIBE +DETERMINISTIC +DISCONNECT +DISTINCT +DOUBLE +DROP +DYNAMIC +EACH +ELEMENT +ELSE +EMPTY +END +END_FRAME +END_PARTITION +END-EXEC +EQUALS +ESCAPE +EVERY +EXCEPT +EXEC +EXECUTE +EXISTS +EXP +EXTERNAL +EXTRACT +FALSE +FETCH +FILTER +FIRST_VALUE
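The practical effect of moving `ANTI`, `SEMI`, and `MINUS` from reserved to non-reserved is that, even under ANSI mode, those words can now appear as identifiers. A toy membership check in that spirit (the sets below are tiny illustrative samples, not Spark's full keyword lists, and this deliberately ignores the strict-non-reserved subtleties of the default mode):

```python
# Sample slices of the keyword table from docs/sql-ref-ansi-compliance.md.
ANSI_RESERVED = {"ALL", "AND", "ANY", "SELECT", "WHERE"}
ANSI_NON_RESERVED = {"ANTI", "SEMI", "MINUS", "ADD", "AFTER"}

def usable_as_identifier(word: str, ansi_mode: bool = True) -> bool:
    """Under ANSI mode, only non-reserved words may serve as identifiers;
    reserved words must be quoted (e.g. with backticks) instead."""
    if not ansi_mode:
        return True  # simplification: default mode treats most words as non-reserved
    return word.upper() not in ANSI_RESERVED

print(usable_as_identifier("anti"))    # True after SPARK-26905
print(usable_as_identifier("select"))  # False: SELECT stays reserved
```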
[spark] 01/02: [SPARK-31950][SQL][TESTS] Extract SQL keywords from the SqlBase.g4 file
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit b70c68ae458d929cbf28a084cecf8252b4a3849f Author: Takeshi Yamamuro AuthorDate: Sat Jun 13 07:12:27 2020 +0900 [SPARK-31950][SQL][TESTS] Extract SQL keywords from the SqlBase.g4 file ### What changes were proposed in this pull request? This PR intends to extract SQL reserved/non-reserved keywords from the ANTLR grammar file (`SqlBase.g4`) directly. This approach is based on the cloud-fan suggestion: https://github.com/apache/spark/pull/28779#issuecomment-642033217 ### Why are the changes needed? It is hard to maintain a full set of the keywords in `TableIdentifierParserSuite`, so it would be nice if we could extract them from the `SqlBase.g4` file directly. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing tests. Closes #28802 from maropu/SPARK-31950-2. Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro --- .../apache/spark/sql/catalyst/parser/SqlBase.g4| 4 + .../parser/TableIdentifierParserSuite.scala| 432 + 2 files changed, 110 insertions(+), 326 deletions(-) diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 index 208a503..14a6687 100644 --- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 +++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 @@ -989,6 +989,7 @@ alterColumnAction // You can find the full keywords list by searching "Start of the keywords list" in this file. // The non-reserved keywords are listed below. Keywords not in this list are reserved keywords. 
ansiNonReserved +//--ANSI-NON-RESERVED-START : ADD | AFTER | ALTER @@ -1165,6 +1166,7 @@ ansiNonReserved | VIEW | VIEWS | WINDOW +//--ANSI-NON-RESERVED-END ; // When `SQL_standard_keyword_behavior=false`, there are 2 kinds of keywords in Spark SQL. @@ -1442,6 +1444,7 @@ nonReserved // // Start of the keywords list // +//--SPARK-KEYWORD-LIST-START ADD: 'ADD'; AFTER: 'AFTER'; ALL: 'ALL'; @@ -1694,6 +1697,7 @@ WHERE: 'WHERE'; WINDOW: 'WINDOW'; WITH: 'WITH'; YEAR: 'YEAR'; +//--SPARK-KEYWORD-LIST-END // // End of the keywords list // diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala index bd617bf..04969e3 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala @@ -16,9 +16,14 @@ */ package org.apache.spark.sql.catalyst.parser +import java.util.Locale + +import scala.collection.mutable + import org.apache.spark.SparkFunSuite import org.apache.spark.sql.catalyst.TableIdentifier import org.apache.spark.sql.catalyst.plans.SQLHelper +import org.apache.spark.sql.catalyst.util.fileToString import org.apache.spark.sql.internal.SQLConf class TableIdentifierParserSuite extends SparkFunSuite with SQLHelper { @@ -285,334 +290,109 @@ class TableIdentifierParserSuite extends SparkFunSuite with SQLHelper { "where", "with") - // All the keywords in `docs/sql-keywords.md` are listed below: - val allCandidateKeywords = Set( -"add", -"after", -"all", -"alter", -"analyze", -"and", -"anti", -"any", -"archive", -"array", -"as", -"asc", -"at", -"authorization", -"between", -"both", -"bucket", -"buckets", -"by", -"cache", -"cascade", -"case", -"cast", -"change", -"check", -"clear", -"cluster", -"clustered", -"codegen", -"collate", -"collection", -"column", -"columns", -"comment", 
-"commit", -"compact", -"compactions", -"compute", -"concatenate", -"constraint", -"cost", -"create", -"cross", -"cube", -"current", -"current_date"
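The sentinel comments added in SPARK-31950 (`//--SPARK-KEYWORD-LIST-START` / `//--SPARK-KEYWORD-LIST-END`) exist so the test suite can extract the keyword tokens from `SqlBase.g4` mechanically instead of maintaining a hand-written list. A minimal sketch of that extraction in Python (the real suite does this in Scala over the full grammar file; the snippet here is a stand-in):

```python
import re

# A stand-in for the marked region of SqlBase.g4.
GRAMMAR_SNIPPET = """\
//--SPARK-KEYWORD-LIST-START
ADD: 'ADD';
AFTER: 'AFTER';
ALL: 'ALL';
WINDOW: 'WINDOW';
//--SPARK-KEYWORD-LIST-END
"""

def extract_keywords(grammar: str) -> list:
    # Take only the text between the START and END markers, then pull the
    # token name before each ':' at the start of a line.
    section = grammar.split("//--SPARK-KEYWORD-LIST-START")[1]
    section = section.split("//--SPARK-KEYWORD-LIST-END")[0]
    return re.findall(r"^([A-Z_]+):", section, flags=re.MULTILINE)

print(extract_keywords(GRAMMAR_SNIPPET))  # ['ADD', 'AFTER', 'ALL', 'WINDOW']
```

Deriving the list from the grammar itself means a newly added token can never silently drift out of sync with the keyword tests.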