(spark) branch master updated: [SPARK-48303][CORE] Reorganize `LogKeys`
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
new 5643cfb71d34 [SPARK-48303][CORE] Reorganize `LogKeys`

5643cfb71d34 is described below

commit 5643cfb71d343133a185aa257f137074f41abfb3
Author: panbingkun
AuthorDate: Thu May 16 23:20:23 2024 -0700

[SPARK-48303][CORE] Reorganize `LogKeys`

### What changes were proposed in this pull request?
The PR aims to reorganize `LogKeys`, which includes:

- Remove some unused `LogKeys`:
ACTUAL_BROADCAST_OUTPUT_STATUS_SIZE, DEFAULT_COMPACTION_INTERVAL, DRIVER_LIBRARY_PATH_KEY, EXISTING_JARS, EXPECTED_ANSWER, FILTERS, HAS_R_PACKAGE, JAR_ENTRY, LOG_KEY_FILE, NUM_ADDED_MASTERS, NUM_ADDED_WORKERS, NUM_PARTITION_VALUES, OUTPUT_LINE, OUTPUT_LINE_NUMBER, PARTITIONS_SIZE, RULE_BATCH_NAME, SERIALIZE_OUTPUT_LENGTH, SHELL_COMMAND, STREAM_SOURCE

- Merge `PARAMETER` into `PARAM` (some names were spelled in full and some abbreviated, which was inconsistent):
ESTIMATOR_PARAMETER_MAP -> ESTIMATOR_PARAM_MAP
FUNCTION_PARAMETER -> FUNCTION_PARAM
METHOD_PARAMETER_TYPES -> METHOD_PARAM_TYPES

- Merge `NUMBER` into `NUM` (abbreviation):
MIN_VERSION_NUMBER -> MIN_VERSION_NUM
RULE_NUMBER_OF_RUNS -> NUM_RULE_OF_RUNS
VERSION_NUMBER -> VERSION_NUM

- Merge `TOTAL` into `NUM`:
TOTAL_RECORDS_READ -> NUM_RECORDS_READ
TRAIN_WORD_COUNT -> NUM_TRAIN_WORD

- Use `NUM` as a prefix:
CHECKSUM_FILE_NUM -> NUM_CHECKSUM_FILE
DATA_FILE_NUM -> NUM_DATA_FILE
INDEX_FILE_NUM -> NUM_INDEX_FILE

- Rename `COUNT` to `NUM`:
EXECUTOR_DESIRED_COUNT -> NUM_EXECUTOR_DESIRED
EXECUTOR_LAUNCH_COUNT -> NUM_EXECUTOR_LAUNCH
EXECUTOR_TARGET_COUNT -> NUM_EXECUTOR_TARGET
KAFKA_PULLS_COUNT -> NUM_KAFKA_PULLS
KAFKA_RECORDS_PULLED_COUNT -> NUM_KAFKA_RECORDS_PULLED
MIN_FREQUENT_PATTERN_COUNT -> MIN_NUM_FREQUENT_PATTERN
POD_COUNT -> NUM_POD
POD_SHARED_SLOT_COUNT -> NUM_POD_SHARED_SLOT
POD_TARGET_COUNT -> NUM_POD_TARGET
RETRY_COUNT -> NUM_RETRY

- Fix a typo:
MALFORMATTED_STIRNG -> MALFORMATTED_STRING

- Other:
MAX_LOG_NUM_POLICY -> MAX_NUM_LOG_POLICY
WEIGHTED_NUM -> NUM_WEIGHTED_EXAMPLES

Changes elsewhere in the code follow from the renames above.

### Why are the changes needed?
To make `LogKeys` easier to understand and more consistent.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #46612 from panbingkun/reorganize_logkey.

Authored-by: panbingkun
Signed-off-by: Gengliang Wang
---
.../network/shuffle/RetryingBlockTransferor.java   | 6 +-
.../scala/org/apache/spark/internal/LogKey.scala   | 68 --
.../sql/connect/client/GrpcRetryHandler.scala      | 8 +--
.../sql/kafka010/KafkaOffsetReaderAdmin.scala      | 4 +-
.../sql/kafka010/KafkaOffsetReaderConsumer.scala   | 4 +-
.../sql/kafka010/consumer/KafkaDataConsumer.scala  | 6 +-
.../streaming/kinesis/KinesisBackedBlockRDD.scala  | 4 +-
.../org/apache/spark/api/r/RBackendHandler.scala   | 4 +-
.../spark/deploy/history/FsHistoryProvider.scala   | 2 +-
.../org/apache/spark/deploy/master/Master.scala    | 2 +-
.../apache/spark/ml/tree/impl/RandomForest.scala   | 4 +-
.../apache/spark/ml/tuning/CrossValidator.scala    | 4 +-
.../spark/ml/tuning/TrainValidationSplit.scala     | 4 +-
.../org/apache/spark/mllib/feature/Word2Vec.scala  | 4 +-
.../org/apache/spark/mllib/fpm/PrefixSpan.scala    | 4 +-
.../apache/spark/mllib/linalg/VectorsSuite.scala   | 4 +-
.../cluster/k8s/ExecutorPodsAllocator.scala        | 6 +-
...ernetesLocalDiskShuffleExecutorComponents.scala | 6 +-
.../apache/spark/deploy/yarn/YarnAllocator.scala   | 6 +-
.../catalyst/expressions/V2ExpressionUtils.scala   | 4 +-
.../spark/sql/catalyst/rules/RuleExecutor.scala    | 6 +-
.../sql/execution/streaming/state/RocksDB.scala    | 18 +++---
.../streaming/state/RocksDBFileManager.scala       | 22 +++
.../state/RocksDBStateStoreProvider.scala          | 6 +-
.../apache/hive/service/server/HiveServer2.java    | 2 +-
.../spark/sql/hive/client/HiveClientImpl.scala     | 2 +-
.../org/apache/spark/streaming/Checkpoint.scala    | 4 +-
.../streaming/util/FileBasedWriteAheadLog.scala    | 4 +-
28 files changed, 101 insertions(+), 117 deletions(-)

diff --git a/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RetryingBlockTransferor.java b/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/Retryin
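The bulk of the commit above is a mechanical rename of log-key constants. As an illustration only, the rename table can be expressed as a lookup; `migrate_log_key` is a hypothetical helper (the real change edits the Scala `LogKey` enumeration directly), and `RENAMES` holds just a subset of the renames listed in the commit message:

```python
# Hypothetical helper for illustration only; the real commit edits the
# Scala LogKey enumeration directly. This table is a subset of the
# renames listed in the commit message.
RENAMES = {
    "ESTIMATOR_PARAMETER_MAP": "ESTIMATOR_PARAM_MAP",
    "MIN_VERSION_NUMBER": "MIN_VERSION_NUM",
    "TOTAL_RECORDS_READ": "NUM_RECORDS_READ",
    "CHECKSUM_FILE_NUM": "NUM_CHECKSUM_FILE",
    "EXECUTOR_DESIRED_COUNT": "NUM_EXECUTOR_DESIRED",
    "MALFORMATTED_STIRNG": "MALFORMATTED_STRING",
}

def migrate_log_key(key: str) -> str:
    """Map an old LogKey name to its reorganized form (identity if unchanged)."""
    return RENAMES.get(key, key)

assert migrate_log_key("EXECUTOR_DESIRED_COUNT") == "NUM_EXECUTOR_DESIRED"
assert migrate_log_key("APP_ID") == "APP_ID"  # untouched keys pass through
```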
(spark) branch master updated (74a1a76e811a -> e07f1af03edf)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 74a1a76e811a [MINOR][PYTHON][TESTS] Call `test_apply_schema_to_dict_and_rows` in `test_apply_schema_to_row` add e07f1af03edf [SPARK-48317][PYTHON][CONNECT][TESTS] Enable `test_udtf_with_analyze_using_archive` and `test_udtf_with_analyze_using_file` No new revisions were added by this update. Summary of changes: python/pyspark/sql/tests/connect/test_parity_udtf.py | 7 ++- python/pyspark/sql/tests/test_udtf.py| 14 -- 2 files changed, 10 insertions(+), 11 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch master updated (889820c1ff39 -> 74a1a76e811a)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 889820c1ff39 [SPARK-41625][PYTHON][CONNECT][TESTS][FOLLOW-UP] Enable `DataFrameObservationParityTests.test_observe_str` add 74a1a76e811a [MINOR][PYTHON][TESTS] Call `test_apply_schema_to_dict_and_rows` in `test_apply_schema_to_row` No new revisions were added by this update. Summary of changes: python/pyspark/sql/tests/connect/test_parity_types.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
(spark) branch master updated: [SPARK-41625][PYTHON][CONNECT][TESTS][FOLLOW-UP] Enable `DataFrameObservationParityTests.test_observe_str`
This is an automated email from the ASF dual-hosted git repository. ruifengz pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 889820c1ff39 [SPARK-41625][PYTHON][CONNECT][TESTS][FOLLOW-UP] Enable `DataFrameObservationParityTests.test_observe_str` 889820c1ff39 is described below commit 889820c1ff392983c52b55d80bd8d80be22785ab Author: Hyukjin Kwon AuthorDate: Fri May 17 11:57:34 2024 +0800 [SPARK-41625][PYTHON][CONNECT][TESTS][FOLLOW-UP] Enable `DataFrameObservationParityTests.test_observe_str` ### What changes were proposed in this pull request? This PR proposes to enable `DataFrameObservationParityTests.test_observe_str`. ### Why are the changes needed? To make sure of the test coverage. ### Does this PR introduce _any_ user-facing change? No, test-only. ### How was this patch tested? CI in this PR. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #46630 from HyukjinKwon/SPARK-41625-followup. Authored-by: Hyukjin Kwon Signed-off-by: Ruifeng Zheng --- python/pyspark/sql/tests/connect/test_parity_observation.py | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/python/pyspark/sql/tests/connect/test_parity_observation.py b/python/pyspark/sql/tests/connect/test_parity_observation.py index a7b0009357b6..e16053d5a082 100644 --- a/python/pyspark/sql/tests/connect/test_parity_observation.py +++ b/python/pyspark/sql/tests/connect/test_parity_observation.py @@ -25,10 +25,7 @@ class DataFrameObservationParityTests( DataFrameObservationTestsMixin, ReusedConnectTestCase, ): -# TODO(SPARK-41625): Support Structured Streaming -@unittest.skip("Fails in Spark Connect, should enable.") -def test_observe_str(self): -super().test_observe_str() +pass if __name__ == "__main__":
(spark) branch master updated: [SPARK-48316][PS][CONNECT][TESTS] Fix comments for SparkFrameMethodsParityTests.test_coalesce and test_repartition
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 714fc8cd872d [SPARK-48316][PS][CONNECT][TESTS] Fix comments for SparkFrameMethodsParityTests.test_coalesce and test_repartition 714fc8cd872d is described below commit 714fc8cd872d6f583a6066e9ddb4a51caa51caf3 Author: Hyukjin Kwon AuthorDate: Fri May 17 12:09:49 2024 +0900 [SPARK-48316][PS][CONNECT][TESTS] Fix comments for SparkFrameMethodsParityTests.test_coalesce and test_repartition ### What changes were proposed in this pull request? This PR proposes to fix the skip comments for `SparkFrameMethodsParityTests.test_coalesce` and `SparkFrameMethodsParityTests.test_repartition` in Spark Connect, noting that the tests cannot avoid RDD usage. ### Why are the changes needed? To make sure of the test coverage. ### Does this PR introduce _any_ user-facing change? No, test-only. ### How was this patch tested? CI in this PR. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #46629 from HyukjinKwon/SPARK-48316.
Authored-by: Hyukjin Kwon Signed-off-by: Hyukjin Kwon --- python/pyspark/pandas/tests/connect/test_parity_frame_spark.py | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/python/pyspark/pandas/tests/connect/test_parity_frame_spark.py b/python/pyspark/pandas/tests/connect/test_parity_frame_spark.py index 24626a9164e8..c3672647b71b 100644 --- a/python/pyspark/pandas/tests/connect/test_parity_frame_spark.py +++ b/python/pyspark/pandas/tests/connect/test_parity_frame_spark.py @@ -28,7 +28,9 @@ class SparkFrameMethodsParityTests( def test_checkpoint(self): super().test_checkpoint() -@unittest.skip("Test depends on RDD which is not supported from Spark Connect.") +@unittest.skip( +"Test depends on RDD, and cannot use SQL expression due to Catalyst optimization" +) def test_coalesce(self): super().test_coalesce() @@ -36,7 +38,9 @@ class SparkFrameMethodsParityTests( def test_local_checkpoint(self): super().test_local_checkpoint() -@unittest.skip("Test depends on RDD which is not supported from Spark Connect.") +@unittest.skip( +"Test depends on RDD, and cannot use SQL expression due to Catalyst optimization" +) def test_repartition(self): super().test_repartition()
(spark) branch master updated (b0e535217bf8 -> 403619a3974c)
This is an automated email from the ASF dual-hosted git repository. yao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from b0e535217bf8 [SPARK-48301][SQL][FOLLOWUP] Update the error message add 403619a3974c [SPARK-48306][SQL] Improve UDT in error message No new revisions were added by this update. Summary of changes: .../src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala | 2 +- .../scala/org/apache/spark/sql/errors/DataTypeErrorsBase.scala | 3 ++- .../scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala | 10 +- .../main/scala/org/apache/spark/sql/hive/HiveInspectors.scala | 5 +++-- .../sql/hive/execution/HiveScriptTransformationSuite.scala | 9 - .../org/apache/spark/sql/hive/orc/HiveOrcSourceSuite.scala | 4 ++-- 6 files changed, 17 insertions(+), 16 deletions(-)
(spark) branch master updated: [SPARK-48301][SQL][FOLLOWUP] Update the error message
This is an automated email from the ASF dual-hosted git repository. ruifengz pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new b0e535217bf8 [SPARK-48301][SQL][FOLLOWUP] Update the error message b0e535217bf8 is described below commit b0e535217bf891f2320f2419d213e1c700e15b41 Author: Ruifeng Zheng AuthorDate: Fri May 17 09:56:06 2024 +0800 [SPARK-48301][SQL][FOLLOWUP] Update the error message ### What changes were proposed in this pull request? Update the error message. ### Why are the changes needed? We don't support `CREATE PROCEDURE` in Spark; this addresses https://github.com/apache/spark/pull/46608#discussion_r1604205064 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? CI. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #46628 from zhengruifeng/nit_error. Authored-by: Ruifeng Zheng Signed-off-by: Ruifeng Zheng --- common/utils/src/main/resources/error/error-conditions.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/common/utils/src/main/resources/error/error-conditions.json b/common/utils/src/main/resources/error/error-conditions.json index 5d750ade7867..69889435b02e 100644 --- a/common/utils/src/main/resources/error/error-conditions.json +++ b/common/utils/src/main/resources/error/error-conditions.json @@ -2677,7 +2677,7 @@ }, "CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE" : { "message" : [ - "CREATE PROCEDURE or CREATE FUNCTION with both IF NOT EXISTS and REPLACE is not allowed." + "Cannot create a routine with both IF NOT EXISTS and REPLACE specified." ] }, "CREATE_TEMP_FUNC_WITH_DATABASE" : {
(spark) branch master updated: [SPARK-48310][PYTHON][CONNECT] Cached properties must return copies
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 05e1706e5aa6 [SPARK-48310][PYTHON][CONNECT] Cached properties must return copies 05e1706e5aa6 is described below commit 05e1706e5aa66a592e61b03263683a2dbbc64afe Author: Martin Grund AuthorDate: Fri May 17 10:28:36 2024 +0900 [SPARK-48310][PYTHON][CONNECT] Cached properties must return copies

### What changes were proposed in this pull request?
When a consumer modifies the result value of a cached property, it would modify the value held in the cache. Before this patch:

```python
df_columns = df.columns
for col in ['id', 'name']:
    df_columns.remove(col)
assert len(df_columns) == len(df.columns)
```

But this is wrong, and this patch fixes it so that:

```python
df_columns = df.columns
for col in ['id', 'name']:
    df_columns.remove(col)
assert len(df_columns) != len(df.columns)
```

### Why are the changes needed?
Correctness of the API.

### Does this PR introduce _any_ user-facing change?
No, this makes the code consistent with Spark classic.

### How was this patch tested?
UT

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #46621 from grundprinzip/grundprinzip/SPARK-48310.
Authored-by: Martin Grund Signed-off-by: Hyukjin Kwon --- python/pyspark/sql/connect/dataframe.py| 3 ++- .../sql/tests/connect/test_parity_dataframe.py | 24 ++ 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/python/pyspark/sql/connect/dataframe.py b/python/pyspark/sql/connect/dataframe.py index ccaaa15f3190..05300909cdce 100644 --- a/python/pyspark/sql/connect/dataframe.py +++ b/python/pyspark/sql/connect/dataframe.py @@ -43,6 +43,7 @@ from typing import ( Type, ) +import copy import sys import random import pyarrow as pa @@ -1787,7 +1788,7 @@ class DataFrame(ParentDataFrame): if self._cached_schema is None: query = self._plan.to_proto(self._session.client) self._cached_schema = self._session.client.schema(query) -return self._cached_schema +return copy.deepcopy(self._cached_schema) def isLocal(self) -> bool: query = self._plan.to_proto(self._session.client) diff --git a/python/pyspark/sql/tests/connect/test_parity_dataframe.py b/python/pyspark/sql/tests/connect/test_parity_dataframe.py index 343f485553a9..c9888a6a8f1a 100644 --- a/python/pyspark/sql/tests/connect/test_parity_dataframe.py +++ b/python/pyspark/sql/tests/connect/test_parity_dataframe.py @@ -19,6 +19,7 @@ import unittest from pyspark.sql.tests.test_dataframe import DataFrameTestsMixin from pyspark.testing.connectutils import ReusedConnectTestCase +from pyspark.sql.types import StructType, StructField, IntegerType, StringType class DataFrameParityTests(DataFrameTestsMixin, ReusedConnectTestCase): @@ -26,6 +27,29 @@ class DataFrameParityTests(DataFrameTestsMixin, ReusedConnectTestCase): df = self.spark.createDataFrame(data=[{"foo": "bar"}, {"foo": "baz"}]) super().check_help_command(df) +def test_cached_property_is_copied(self): +schema = StructType( +[ +StructField("id", IntegerType(), True), +StructField("name", StringType(), True), +StructField("age", IntegerType(), True), +StructField("city", StringType(), True), +] +) +# Create some dummy data +data = [ +(1, "Alice", 30, "New York"), 
+(2, "Bob", 25, "San Francisco"), +(3, "Cathy", 29, "Los Angeles"), +(4, "David", 35, "Chicago"), +] +df = self.spark.createDataFrame(data, schema) +df_columns = df.columns +assert len(df.columns) == 4 +for col in ["id", "name"]: +df_columns.remove(col) +assert len(df.columns) == 4 + @unittest.skip("Spark Connect does not support RDD but the tests depend on them.") def test_toDF_with_schema_string(self): super().test_toDF_with_schema_string()
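The fix in `dataframe.py` above returns `copy.deepcopy(...)` from the cached-schema accessor so callers cannot mutate the cache. A minimal, self-contained sketch of the same idea, using a toy `Frame` class (hypothetical, not the real Spark `DataFrame`):

```python
import copy
import functools

class Frame:
    """Toy stand-in for a DataFrame: demonstrates why a cached
    property must hand out copies of mutable values."""

    def __init__(self, columns):
        self._source_columns = list(columns)

    @functools.cached_property
    def _cached_columns(self):
        # Expensive computation, done once; the cached object must never leak.
        return list(self._source_columns)

    @property
    def columns(self):
        # Return a copy so callers cannot mutate the cached value in place.
        return copy.deepcopy(self._cached_columns)

df = Frame(["id", "name", "age", "city"])
cols = df.columns          # caller gets a private copy
for c in ["id", "name"]:
    cols.remove(c)         # mutating the copy ...

assert cols == ["age", "city"]
assert df.columns == ["id", "name", "age", "city"]  # ... leaves the cache intact
```

Without the `copy.deepcopy`, the caller's `remove` calls would shrink the cached list itself, which is exactly the bug the commit fixes.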
(spark) branch master updated (e9d4152a319a -> 153053fe6c3d)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from e9d4152a319a [SPARK-48031][SQL][FOLLOW-UP] Use ANSI-enabled cast in view lookup test add 153053fe6c3d [SPARK-48268][CORE] Add a configuration for SparkContext.setCheckpointDir No new revisions were added by this update. Summary of changes: core/src/main/scala/org/apache/spark/SparkContext.scala | 2 ++ .../scala/org/apache/spark/internal/config/package.scala | 9 + .../test/scala/org/apache/spark/CheckpointSuite.scala| 16 docs/configuration.md| 9 + 4 files changed, 36 insertions(+)
(spark) branch master updated: [SPARK-48031][SQL][FOLLOW-UP] Use ANSI-enabled cast in view lookup test
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new e9d4152a319a [SPARK-48031][SQL][FOLLOW-UP] Use ANSI-enabled cast in view lookup test e9d4152a319a is described below commit e9d4152a319af4ad138ad1a6eb87bdf0b051ec9e Author: Hyukjin Kwon AuthorDate: Fri May 17 08:35:37 2024 +0900 [SPARK-48031][SQL][FOLLOW-UP] Use ANSI-enabled cast in view lookup test ### What changes were proposed in this pull request? This PR is a followup of https://github.com/apache/spark/pull/46267 that uses ANSI-enabled cast in the tests. It intentionally uses ANSI-enabled cast in `castColToType` when you look up a view. ### Why are the changes needed? In order to fix the scheduled CI build without ANSI: - https://github.com/apache/spark/actions/runs/9072308206/job/24960016975 - https://github.com/apache/spark/actions/runs/9072308206/job/24960019187 ``` [info] - look up view relation *** FAILED *** (72 milliseconds) [info] == FAIL: Plans do not match === [info]'SubqueryAlias spark_catalog.db3.view1 'SubqueryAlias spark_catalog.db3.view1 [info]+- View (`spark_catalog`.`db3`.`view1`, ['col1, 'col2, 'a, 'b]) +- View (`spark_catalog`.`db3`.`view1`, ['col1, 'col2, 'a, 'b]) [info] +- 'Project [cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, col1, 0, 1) as int) AS col1#0, cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, col2, 0, 1) as string) AS col2#0, cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, a, 0, 1) as int) AS a#0, cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, b, 0, 1) as string) AS b#0] +- 'Project [cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1 [...] 
[info] +- 'Project [*] +- 'Project [*] [info] +- 'UnresolvedRelation [tbl1], [], false ``` ``` [info] - look up view created before Spark 3.0 *** FAILED *** (452 milliseconds) [info] == FAIL: Plans do not match === [info]'SubqueryAlias spark_catalog.db3.view2 'SubqueryAlias spark_catalog.db3.view2 [info]+- View (`db3`.`view2`, ['col1, 'col2, 'a, 'b]) +- View (`db3`.`view2`, ['col1, 'col2, 'a, 'b]) [info] +- 'Project [cast(getviewcolumnbynameandordinal(`db3`.`view2`, col1, 0, 1) as int) AS col1#0, cast(getviewcolumnbynameandordinal(`db3`.`view2`, col2, 0, 1) as string) AS col2#0, cast(getviewcolumnbynameandordinal(`db3`.`view2`, a, 0, 1) as int) AS a#0, cast(getviewcolumnbynameandordinal(`db3`.`view2`, b, 0, 1) as string) AS b#0] +- 'Project [cast(getviewcolumnbynameandordinal(`db3`.`view2`, col1, 0, 1) as int) AS col1#0, cast(getviewcolumnbynameandordinal(`db3`.`view [...] [info] +- 'Project [*] +- 'Project [
(spark) branch dependabot/bundler/docs/rexml-3.2.8 deleted (was 96e70ab579c3)
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a change to branch dependabot/bundler/docs/rexml-3.2.8 in repository https://gitbox.apache.org/repos/asf/spark.git was 96e70ab579c3 Bump rexml from 3.2.6 to 3.2.8 in /docs The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
(spark) branch master updated: [SPARK-48294][SQL] Handle lowercase in nestedTypeMissingElementTypeError
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 59f88c372522 [SPARK-48294][SQL] Handle lowercase in nestedTypeMissingElementTypeError 59f88c372522 is described below commit 59f88c3725222b84b2d0b51ba40a769d99866b56 Author: Michael Zhang AuthorDate: Thu May 16 14:58:25 2024 -0700 [SPARK-48294][SQL] Handle lowercase in nestedTypeMissingElementTypeError ### What changes were proposed in this pull request? Handle lowercase values inside nestedTypeMissingElementTypeError to prevent match errors. ### Why are the changes needed? The previous match error was not user-friendly. Now it gives an actionable `INCOMPLETE_TYPE_DEFINITION` error. ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? Newly added tests pass. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #46623 from michaelzhan-db/SPARK-48294.
Authored-by: Michael Zhang Signed-off-by: Gengliang Wang --- .../apache/spark/sql/errors/QueryParsingErrors.scala | 2 +- .../spark/sql/errors/QueryParsingErrorsSuite.scala| 19 +++ 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala b/sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala index 5eafd4d915a4..816fa546a138 100644 --- a/sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala +++ b/sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala @@ -289,7 +289,7 @@ private[sql] object QueryParsingErrors extends DataTypeErrorsBase { def nestedTypeMissingElementTypeError( dataType: String, ctx: PrimitiveDataTypeContext): Throwable = { -dataType match { +dataType.toUpperCase(Locale.ROOT) match { case "ARRAY" => new ParseException( errorClass = "INCOMPLETE_TYPE_DEFINITION.ARRAY", diff --git a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala index 29ab6e994e42..b7fb65091ef7 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala @@ -647,6 +647,13 @@ class QueryParsingErrorsSuite extends QueryTest with SharedSparkSession with SQL sqlState = "42K01", parameters = Map("elementType" -> ""), context = ExpectedContext(fragment = "ARRAY", start = 30, stop = 34)) +// Create column of array type without specifying element type in lowercase +checkError( + exception = parseException("CREATE TABLE tbl_120691 (col1 array)"), + errorClass = "INCOMPLETE_TYPE_DEFINITION.ARRAY", + sqlState = "42K01", + parameters = Map("elementType" -> ""), + context = ExpectedContext(fragment = "array", start = 30, stop = 34)) } test("INCOMPLETE_TYPE_DEFINITION: struct type definition is incomplete") { @@ -674,6 
+681,12 @@ class QueryParsingErrorsSuite extends QueryTest with SharedSparkSession with SQL errorClass = "PARSE_SYNTAX_ERROR", sqlState = "42601", parameters = Map("error" -> "'<'", "hint" -> ": missing ')'")) +// Create column of struct type without specifying field type in lowercase +checkError( + exception = parseException("CREATE TABLE tbl_120691 (col1 struct)"), + errorClass = "INCOMPLETE_TYPE_DEFINITION.STRUCT", + sqlState = "42K01", + context = ExpectedContext(fragment = "struct", start = 30, stop = 35)) } test("INCOMPLETE_TYPE_DEFINITION: map type definition is incomplete") { @@ -695,6 +708,12 @@ class QueryParsingErrorsSuite extends QueryTest with SharedSparkSession with SQL errorClass = "PARSE_SYNTAX_ERROR", sqlState = "42601", parameters = Map("error" -> "'<'", "hint" -> ": missing ')'")) +// Create column of map type without specifying key/value types in lowercase +checkError( + exception = parseException("SELECT CAST(map('1',2) AS map)"), + errorClass = "INCOMPLETE_TYPE_DEFINITION.MAP", + sqlState = "42K01", + context = ExpectedContext(fragment = "map", start = 26, stop = 28)) } test("INVALID_ESC: Escape string must contain only one character") {
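The one-line fix above normalizes the type keyword with `toUpperCase(Locale.ROOT)` before matching, so `array`, `struct` and `map` in any case hit a branch. A toy Python sketch of the same pattern, using a hypothetical `error_class_for` helper (the real logic lives in Scala's `QueryParsingErrors`):

```python
# Hypothetical Python sketch; the real helper is Scala's
# QueryParsingErrors.nestedTypeMissingElementTypeError.
_ERROR_CLASSES = {
    "ARRAY": "INCOMPLETE_TYPE_DEFINITION.ARRAY",
    "STRUCT": "INCOMPLETE_TYPE_DEFINITION.STRUCT",
    "MAP": "INCOMPLETE_TYPE_DEFINITION.MAP",
}

def error_class_for(data_type: str) -> str:
    # Normalize case first, mirroring dataType.toUpperCase(Locale.ROOT),
    # so "array", "Array" and "ARRAY" all match a branch instead of
    # falling through to a MatchError-style failure.
    try:
        return _ERROR_CLASSES[data_type.upper()]
    except KeyError:
        raise ValueError(f"unexpected incomplete type keyword: {data_type}")

assert error_class_for("array") == "INCOMPLETE_TYPE_DEFINITION.ARRAY"
assert error_class_for("Map") == "INCOMPLETE_TYPE_DEFINITION.MAP"
```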
(spark) branch dependabot/bundler/docs/rexml-3.2.8 created (now 96e70ab579c3)
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a change to branch dependabot/bundler/docs/rexml-3.2.8 in repository https://gitbox.apache.org/repos/asf/spark.git at 96e70ab579c3 Bump rexml from 3.2.6 to 3.2.8 in /docs No new revisions were added by this update.
(spark) branch master updated: [SPARK-48291][CORE][FOLLOWUP] Rename Java *LoggerSuite* as *SparkLoggerSuite*
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 283b2ff42221 [SPARK-48291][CORE][FOLLOWUP] Rename Java *LoggerSuite* as *SparkLoggerSuite* 283b2ff42221 is described below commit 283b2ff422218b025e7b0170e4b7ed31a1294a80 Author: panbingkun AuthorDate: Thu May 16 11:55:20 2024 -0700 [SPARK-48291][CORE][FOLLOWUP] Rename Java *LoggerSuite* as *SparkLoggerSuite* ### What changes were proposed in this pull request? This PR follows up https://github.com/apache/spark/pull/46600. Similarly, to maintain consistency, `PatternLoggerSuite`, `LoggerSuiteBase` and `StructuredLoggerSuite` should be renamed to `PatternSparkLoggerSuite`, `SparkLoggerSuiteBase` and `StructuredSparkLoggerSuite`. ### Why are the changes needed? After `org.apache.spark.internal.Logger` was renamed to `org.apache.spark.internal.SparkLogger` and `org.apache.spark.internal.LoggerFactory` to `org.apache.spark.internal.SparkLoggerFactory`, the related UTs' names should also be renamed, so that developers can easily locate the related UTs. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #46615 from panbingkun/SPARK-48291_follow_up.
Authored-by: panbingkun Signed-off-by: Gengliang Wang --- .../util/{PatternLoggerSuite.java => PatternSparkLoggerSuite.java} | 7 --- .../spark/util/{LoggerSuiteBase.java => SparkLoggerSuiteBase.java} | 2 +- ...{StructuredLoggerSuite.java => StructuredSparkLoggerSuite.java} | 6 +++--- common/utils/src/test/resources/log4j2.properties | 4 ++-- 4 files changed, 10 insertions(+), 9 deletions(-) diff --git a/common/utils/src/test/java/org/apache/spark/util/PatternLoggerSuite.java b/common/utils/src/test/java/org/apache/spark/util/PatternSparkLoggerSuite.java similarity index 91% rename from common/utils/src/test/java/org/apache/spark/util/PatternLoggerSuite.java rename to common/utils/src/test/java/org/apache/spark/util/PatternSparkLoggerSuite.java index 33de91697efa..2d370bad4cc8 100644 --- a/common/utils/src/test/java/org/apache/spark/util/PatternLoggerSuite.java +++ b/common/utils/src/test/java/org/apache/spark/util/PatternSparkLoggerSuite.java @@ -22,9 +22,10 @@ import org.apache.logging.log4j.Level; import org.apache.spark.internal.SparkLogger; import org.apache.spark.internal.SparkLoggerFactory; -public class PatternLoggerSuite extends LoggerSuiteBase { +public class PatternSparkLoggerSuite extends SparkLoggerSuiteBase { - private static final SparkLogger LOGGER = SparkLoggerFactory.getLogger(PatternLoggerSuite.class); + private static final SparkLogger LOGGER = +SparkLoggerFactory.getLogger(PatternSparkLoggerSuite.class); private String toRegexPattern(Level level, String msg) { return msg @@ -39,7 +40,7 @@ public class PatternLoggerSuite extends LoggerSuiteBase { @Override String className() { -return PatternLoggerSuite.class.getSimpleName(); +return PatternSparkLoggerSuite.class.getSimpleName(); } @Override diff --git a/common/utils/src/test/java/org/apache/spark/util/LoggerSuiteBase.java b/common/utils/src/test/java/org/apache/spark/util/SparkLoggerSuiteBase.java similarity index 99% rename from 
common/utils/src/test/java/org/apache/spark/util/LoggerSuiteBase.java rename to common/utils/src/test/java/org/apache/spark/util/SparkLoggerSuiteBase.java index ecc0a75070c7..46bfe3415080 100644 --- a/common/utils/src/test/java/org/apache/spark/util/LoggerSuiteBase.java +++ b/common/utils/src/test/java/org/apache/spark/util/SparkLoggerSuiteBase.java @@ -30,7 +30,7 @@ import org.apache.spark.internal.SparkLogger; import org.apache.spark.internal.LogKeys; import org.apache.spark.internal.MDC; -public abstract class LoggerSuiteBase { +public abstract class SparkLoggerSuiteBase { abstract SparkLogger logger(); abstract String className(); diff --git a/common/utils/src/test/java/org/apache/spark/util/StructuredLoggerSuite.java b/common/utils/src/test/java/org/apache/spark/util/StructuredSparkLoggerSuite.java similarity index 95% rename from common/utils/src/test/java/org/apache/spark/util/StructuredLoggerSuite.java rename to common/utils/src/test/java/org/apache/spark/util/StructuredSparkLoggerSuite.java index 110e7cc7794e..416f0b6172c0 100644 --- a/common/utils/src/test/java/org/apache/spark/util/StructuredLoggerSuite.java +++ b/common/utils/src/test/java/org/apache/spark/util/StructuredSparkLoggerSuite.java @@ -24,10 +24,10 @@ import org.apache.logging.log4j.Level; import org.apache.spark.internal.SparkLogger; import org.apache.spark.internal.SparkLoggerFactory; -public class StructuredLoggerSuite extends LoggerSuiteBase { +public class StructuredSparkLoggerSuite extends SparkLoggerSuiteBase { private static final
(spark) branch master updated: [SPARK-48308][CORE] Unify getting data schema without partition columns in FileSourceStrategy
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 57948c865e06 [SPARK-48308][CORE] Unify getting data schema without partition columns in FileSourceStrategy

57948c865e06 is described below

commit 57948c865e064469a75c92f8b58c632b9b40fdd3
Author: Johan Lasperas
AuthorDate: Thu May 16 22:38:02 2024 +0800

    [SPARK-48308][CORE] Unify getting data schema without partition columns in FileSourceStrategy

    ### What changes were proposed in this pull request?
    Compute the schema of the data without partition columns only once in FileSourceStrategy.

    ### Why are the changes needed?
    In FileSourceStrategy, the schema of the data excluding partition columns is computed twice, in slightly different ways: once via an AttributeSet (`partitionSet`) and once using the attributes directly (`partitionColumns`). These do not have the same semantics: an AttributeSet compares attributes by expression id only, while comparing the actual attributes also uses the name, type, nullability and metadata. We want to use the former here.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Existing tests

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #46619 from johanl-db/reuse-schema-without-partition-columns.
Authored-by: Johan Lasperas
Signed-off-by: Wenchen Fan
---
 .../apache/spark/sql/execution/datasources/FileSourceStrategy.scala | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
index 8333c276cdd8..d31cb111924b 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
@@ -216,9 +216,8 @@ object FileSourceStrategy extends Strategy with PredicateHelper with Logging {
       val requiredExpressions: Seq[NamedExpression] = filterAttributes.toSeq ++ projects
       val requiredAttributes = AttributeSet(requiredExpressions)
-      val readDataColumns = dataColumns
+      val readDataColumns = dataColumnsWithoutPartitionCols
         .filter(requiredAttributes.contains)
-        .filterNot(partitionColumns.contains)
       // Metadata attributes are part of a column of type struct up to this point. Here we extract
       // this column from the schema and specify a matcher for that.

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
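The semantic gap described in the commit message — expression-id-only comparison versus full attribute comparison — can be modeled outside Spark. A minimal sketch (in Python rather than Spark's Scala; `Attr` and its fields are illustrative stand-ins for Catalyst attributes, not real Spark API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attr:
    """Illustrative stand-in for a Catalyst attribute (not Spark API)."""
    expr_id: int
    name: str
    data_type: str

def prune_by_id(columns, partition_attrs):
    # AttributeSet-style: membership keyed on the expression id only.
    part_ids = {a.expr_id for a in partition_attrs}
    return [c for c in columns if c.expr_id not in part_ids]

def prune_by_fields(columns, partition_attrs):
    # Direct-attribute style: name, type and id must all match.
    part = set(partition_attrs)
    return [c for c in columns if c not in part]

# A partition column the analyzer has since renamed: same expr_id, new name.
data = [Attr(1, "value", "int"), Attr(2, "event_date", "string")]
partitions = [Attr(2, "date", "string")]

print([a.name for a in prune_by_id(data, partitions)])      # ['value']
print([a.name for a in prune_by_fields(data, partitions)])  # ['value', 'event_date']
```

Pruning by expression id keeps the partition column out regardless of later renames, while full-field comparison silently stops matching — which is why the PR standardizes on the AttributeSet-based variant.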
(spark) branch master updated: [SPARK-48301][SQL] Rename `CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE` to `CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE`
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 3d3d18f14ba2 [SPARK-48301][SQL] Rename `CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE` to `CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE`

3d3d18f14ba2 is described below

commit 3d3d18f14ba29074ca3ff8b661449ad45d84369e
Author: Ruifeng Zheng
AuthorDate: Thu May 16 20:58:15 2024 +0800

    [SPARK-48301][SQL] Rename `CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE` to `CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE`

    ### What changes were proposed in this pull request?
    Rename `CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE` to `CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE`.

    ### Why are the changes needed?
    Disallowing `IF NOT EXISTS` together with `REPLACE` is a standard restriction, not one specific to functions. Renaming the error condition makes it reusable.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Updated tests

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #46608 from zhengruifeng/sql_rename_if_not_exists_replace.

Lead-authored-by: Ruifeng Zheng
Co-authored-by: Ruifeng Zheng
Signed-off-by: Ruifeng Zheng
---
 common/utils/src/main/resources/error/error-conditions.json           | 4 ++--
 .../main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala   | 2 +-
 .../scala/org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala   | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-conditions.json b/common/utils/src/main/resources/error/error-conditions.json
index 75067a1920f7..5d750ade7867 100644
--- a/common/utils/src/main/resources/error/error-conditions.json
+++ b/common/utils/src/main/resources/error/error-conditions.json
@@ -2675,9 +2675,9 @@
         "ANALYZE TABLE(S) ... COMPUTE STATISTICS ... must be either NOSCAN or empty."
       ]
     },
-    "CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE" : {
+    "CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE" : {
       "message" : [
-        "CREATE FUNCTION with both IF NOT EXISTS and REPLACE is not allowed."
+        "CREATE PROCEDURE or CREATE FUNCTION with both IF NOT EXISTS and REPLACE is not allowed."
       ]
     },
     "CREATE_TEMP_FUNC_WITH_DATABASE" : {
diff --git a/sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala b/sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
index d07aa6741a14..5eafd4d915a4 100644
--- a/sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
+++ b/sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
@@ -576,7 +576,7 @@ private[sql] object QueryParsingErrors extends DataTypeErrorsBase {
   def createFuncWithBothIfNotExistsAndReplaceError(ctx: CreateFunctionContext): Throwable = {
     new ParseException(
-      errorClass = "INVALID_SQL_SYNTAX.CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE",
+      errorClass = "INVALID_SQL_SYNTAX.CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE",
       ctx)
   }
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala
index 5babce0ddb8d..29ab6e994e42 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala
@@ -288,7 +288,7 @@ class QueryParsingErrorsSuite extends QueryTest with SharedSparkSession with SQL
       stop = 27))
   }
-  test("INVALID_SQL_SYNTAX.CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE: " +
+  test("INVALID_SQL_SYNTAX.CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE: " +
     "Create function with both if not exists and replace") {
     val sqlText =
       """CREATE OR REPLACE FUNCTION IF NOT EXISTS func1 as
@@ -297,7 +297,7 @@ class QueryParsingErrorsSuite extends QueryTest with SharedSparkSession with SQL
     checkError(
       exception = parseException(sqlText),
-      errorClass = "INVALID_SQL_SYNTAX.CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE",
+      errorClass = "INVALID_SQL_SYNTAX.CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE",
       sqlState = "42000",
       context = ExpectedContext(
         fragment = sqlText,
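The restriction this error condition guards — `IF NOT EXISTS` says "do nothing if the routine exists" while `OR REPLACE` says "overwrite it if it exists", so the two directives contradict each other — can be sketched with a few lines of validation logic (Python, not Spark's parser; the function name is illustrative, the error-class string is the one from the diff above):

```python
class ParseError(Exception):
    """Stand-in for Spark's ParseException, carrying an error class string."""
    def __init__(self, error_class):
        super().__init__(error_class)
        self.error_class = error_class

def validate_create_routine(if_not_exists: bool, replace: bool) -> None:
    # IF NOT EXISTS and OR REPLACE prescribe opposite behavior when the
    # routine already exists, so specifying both is rejected at parse time.
    if if_not_exists and replace:
        raise ParseError(
            "INVALID_SQL_SYNTAX.CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE")

validate_create_routine(if_not_exists=True, replace=False)   # accepted
validate_create_routine(if_not_exists=False, replace=True)   # accepted
try:
    # e.g. CREATE OR REPLACE FUNCTION IF NOT EXISTS func1 ...
    validate_create_routine(if_not_exists=True, replace=True)
except ParseError as e:
    print(e.error_class)
```

Because the check is about the clause combination rather than about functions specifically, the renamed `CREATE_ROUTINE_...` condition can serve procedures as well.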
(spark) branch master updated (fa83d0f8fce7 -> 4be0828e6e6a)
wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

 from fa83d0f8fce7 [SPARK-48296][SQL] Codegen Support for `to_xml`
  add 4be0828e6e6a [SPARK-48288] Add source data type for connector cast expression

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/connector/expressions/Cast.java   | 18 +-
 .../sql/connector/util/V2ExpressionSQLBuilder.java     | 6 +++---
 .../spark/sql/catalyst/util/V2ExpressionBuilder.scala  | 2 +-
 .../scala/org/apache/spark/sql/jdbc/JdbcDialects.scala | 4 ++--
 4 files changed, 23 insertions(+), 7 deletions(-)
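The idea behind carrying a source data type on a connector cast expression can be sketched abstractly (a hypothetical model in Python, not Spark's actual `Cast.java` or `JdbcDialects` API; the class shape and the unsupported-pair example are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cast:
    """Illustrative model of a cast expression that records the source type
    alongside the target type (not Spark's connector API)."""
    expression: str
    source_type: str
    target_type: str

def to_sql(cast: Cast) -> str:
    # A SQL builder only needs the target type to print the cast...
    return f"CAST({cast.expression} AS {cast.target_type})"

def dialect_supports(cast: Cast) -> bool:
    # ...but a dialect can consult the source type to decide whether the
    # remote database would evaluate this cast the way Spark does.
    # (Hypothetical example of an unsupported source/target pair.)
    unsupported = {("string", "boolean")}
    return (cast.source_type, cast.target_type) not in unsupported

c = Cast("price", source_type="decimal(10,2)", target_type="string")
print(to_sql(c))            # CAST(price AS string)
print(dialect_supports(c))  # True
```

Without the source type, a pushdown decision would have to be made from the target type alone, which is not always enough information.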
(spark) branch master updated (3bd845ea930a -> fa83d0f8fce7)
yao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

 from 3bd845ea930a [SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar
  add fa83d0f8fce7 [SPARK-48296][SQL] Codegen Support for `to_xml`

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/expressions/xmlExpressions.scala | 11 ++-
 .../org/apache/spark/sql/XmlFunctionsSuite.scala  | 19 ++-
 2 files changed, 24 insertions(+), 6 deletions(-)
(spark) branch branch-3.5 updated: [SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar
yao pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new c1dd4a5df693 [SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar

c1dd4a5df693 is described below

commit c1dd4a5df69340884f3f0f0c28ce916bf9e30159
Author: Kent Yao
AuthorDate: Thu May 16 17:29:47 2024 +0800

    [SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar

    ### What changes were proposed in this pull request?
    TRANSFORM with char/varchar has been accidentally broken since 3.1, failing with a scala.MatchError. This PR fixes it.

    ### Why are the changes needed?
    Bugfix.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    New tests

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #46603 from yaooqinn/SPARK-48297.

Authored-by: Kent Yao
Signed-off-by: Kent Yao
(cherry picked from commit 3bd845ea930a4709b7a2f0447b5f8af64c697239)
Signed-off-by: Kent Yao
---
 .../org/apache/spark/sql/catalyst/parser/AstBuilder.scala      |  4 +++-
 .../resources/sql-tests/analyzer-results/transform.sql.out     | 11 +++
 sql/core/src/test/resources/sql-tests/inputs/transform.sql     |  6 +-
 .../src/test/resources/sql-tests/results/transform.sql.out     | 10 ++
 4 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
index 5d68aed9245a..f38d41af445e 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
@@ -787,7 +787,9 @@ class AstBuilder extends DataTypeAstBuilder with SQLConfHelper with Logging {
     // Create the attributes.
     val (attributes, schemaLess) = if (transformClause.colTypeList != null) {
       // Typed return columns.
-      (DataTypeUtils.toAttributes(createSchema(transformClause.colTypeList)), false)
+      val schema = createSchema(transformClause.colTypeList)
+      val replacedSchema = CharVarcharUtils.replaceCharVarcharWithStringInSchema(schema)
+      (DataTypeUtils.toAttributes(replacedSchema), false)
     } else if (transformClause.identifierSeq != null) {
       // Untyped return columns.
       val attrs = visitIdentifierSeq(transformClause.identifierSeq).map { name =>
diff --git a/sql/core/src/test/resources/sql-tests/analyzer-results/transform.sql.out b/sql/core/src/test/resources/sql-tests/analyzer-results/transform.sql.out
index ceca433a1c91..aa595c551f79 100644
--- a/sql/core/src/test/resources/sql-tests/analyzer-results/transform.sql.out
+++ b/sql/core/src/test/resources/sql-tests/analyzer-results/transform.sql.out
@@ -1035,3 +1035,14 @@ ScriptTransformation cat, [a#x, b#x], ScriptInputOutputSchema(List(),List(),None
 +- Project [a#x, b#x]
    +- SubqueryAlias complex_trans
       +- LocalRelation [a#x, b#x]
+
+
+-- !query
+SELECT TRANSFORM (a, b)
+  USING 'cat' AS (a CHAR(10), b VARCHAR(10))
+FROM VALUES('apache', 'spark') t(a, b)
+-- !query analysis
+ScriptTransformation cat, [a#x, b#x], ScriptInputOutputSchema(List(),List(),None,None,List(),List(),None,None,false)
++- Project [a#x, b#x]
+   +- SubqueryAlias t
+      +- LocalRelation [a#x, b#x]
diff --git a/sql/core/src/test/resources/sql-tests/inputs/transform.sql b/sql/core/src/test/resources/sql-tests/inputs/transform.sql
index 922a1d817778..8570496d439e 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/transform.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/transform.sql
@@ -415,4 +415,8 @@ FROM (
   ORDER BY a
 ) map_output
 SELECT TRANSFORM(a, b)
-  USING 'cat' AS (a, b);
\ No newline at end of file
+  USING 'cat' AS (a, b);
+
+SELECT TRANSFORM (a, b)
+  USING 'cat' AS (a CHAR(10), b VARCHAR(10))
+FROM VALUES('apache', 'spark') t(a, b);
diff --git a/sql/core/src/test/resources/sql-tests/results/transform.sql.out b/sql/core/src/test/resources/sql-tests/results/transform.sql.out
index ab726b93c07c..7975392fd014 100644
--- a/sql/core/src/test/resources/sql-tests/results/transform.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/transform.sql.out
@@ -837,3 +837,13 @@ struct
 3	3	3
 3	3	3
+
+
+-- !query
+SELECT TRANSFORM (a, b)
+  USING 'cat' AS (a CHAR(10), b VARCHAR(10))
+FROM VALUES('apache', 'spark') t(a, b)
+-- !query schema
+struct
+-- !query output
+apache	spark
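The shape of the fix above — normalize char/varchar annotations to plain strings before building output attributes — can be sketched outside Spark. A Python stand-in: `replace_char_varchar_with_string` mirrors the role of `CharVarcharUtils.replaceCharVarcharWithStringInSchema`, and `to_attributes` models where the real `DataTypeUtils.toAttributes` would hit a `scala.MatchError`; neither is the real implementation.

```python
import re

# Illustrative schema: list of (column name, type string) pairs.
CHAR_VARCHAR = re.compile(r"^(?:char|varchar)\(\d+\)$", re.IGNORECASE)

def replace_char_varchar_with_string(schema):
    """char(n)/varchar(n) are analysis-time annotations; internally Spark
    processes these columns as plain strings, so normalize them first."""
    return [(name, "string" if CHAR_VARCHAR.match(tpe) else tpe)
            for name, tpe in schema]

def to_attributes(schema):
    # Stand-in for DataTypeUtils.toAttributes: the unpatched code path
    # fails (scala.MatchError in the real code) on char/varchar types.
    for name, tpe in schema:
        if CHAR_VARCHAR.match(tpe):
            raise ValueError(f"MatchError: unexpected type {tpe} for {name}")
    return schema

# AS (a CHAR(10), b VARCHAR(10)) from the new test case:
schema = [("a", "char(10)"), ("b", "varchar(10)")]
print(to_attributes(replace_char_varchar_with_string(schema)))
# [('a', 'string'), ('b', 'string')]
```

Applying the replacement before creating the attributes is exactly the one-step reordering the patch introduces in `AstBuilder`.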
(spark) branch master updated (b53d78e94f6e -> 3bd845ea930a)
yao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

 from b53d78e94f6e [SPARK-48036][DOCS][FOLLOWUP] Update sql-ref-ansi-compliance.md
  add 3bd845ea930a [SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/parser/AstBuilder.scala      |  4 +++-
 .../resources/sql-tests/analyzer-results/transform.sql.out     | 11 +++
 sql/core/src/test/resources/sql-tests/inputs/transform.sql     |  6 +-
 .../src/test/resources/sql-tests/results/transform.sql.out     | 10 ++
 4 files changed, 29 insertions(+), 2 deletions(-)
(spark) branch master updated (0ba8ddc9ce5b -> b53d78e94f6e)
yao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

 from 0ba8ddc9ce5b [SPARK-48293][SS] Add test for when ForeachBatchUserFuncException wraps interrupted exception due to query stop
  add b53d78e94f6e [SPARK-48036][DOCS][FOLLOWUP] Update sql-ref-ansi-compliance.md

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-ansi-compliance.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)