[GitHub] [spark] HyukjinKwon commented on pull request #31527: [MINOR] Add a note about pip installation test in RC for release vote template
HyukjinKwon commented on pull request #31527: URL: https://github.com/apache/spark/pull/31527#issuecomment-775147783 Thanks @srowen. Merged to master. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31489: [SPARK-34377][SQL] Add new parquet datasource options to control datetime rebasing in read
cloud-fan commented on a change in pull request #31489: URL: https://github.com/apache/spark/pull/31489#discussion_r572042517 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/parquet/ParquetPartitionReaderFactory.scala ## @@ -52,14 +52,16 @@ import org.apache.spark.util.SerializableConfiguration * @param readDataSchema Required schema of Parquet files. * @param partitionSchema Schema of partitions. * @param filters Filters to be pushed down in the batch scan. + * @param parquetOptions The options of Parquet datasource that are set for the read. */ case class ParquetPartitionReaderFactory( sqlConf: SQLConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, -filters: Array[Filter]) extends FilePartitionReaderFactory with Logging { +filters: Array[Filter], +parquetOptions: ParquetOptions) extends FilePartitionReaderFactory with Logging { Review comment: ah nvm, we read the confs and put it in `val`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31489: [SPARK-34377][SQL] Add new parquet datasource options to control datetime rebasing in read
cloud-fan commented on a change in pull request #31489: URL: https://github.com/apache/spark/pull/31489#discussion_r572042040 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/parquet/ParquetPartitionReaderFactory.scala ## @@ -52,14 +52,16 @@ import org.apache.spark.util.SerializableConfiguration * @param readDataSchema Required schema of Parquet files. * @param partitionSchema Schema of partitions. * @param filters Filters to be pushed down in the batch scan. + * @param parquetOptions The options of Parquet datasource that are set for the read. */ case class ParquetPartitionReaderFactory( sqlConf: SQLConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, -filters: Array[Filter]) extends FilePartitionReaderFactory with Logging { +filters: Array[Filter], +parquetOptions: ParquetOptions) extends FilePartitionReaderFactory with Logging { Review comment: Does it really work? The 2 fields of `ParquetOptions` are `transient`, and become null after (de)serialization. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk commented on a change in pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
MaxGekk commented on a change in pull request #31524: URL: https://github.com/apache/spark/pull/31524#discussion_r572041366 ## File path: docs/sql-ref-syntax-ddl-alter-table.md ## @@ -27,6 +27,9 @@ license: | `ALTER TABLE RENAME TO` statement changes the table name of an existing table in the database. +If the table is cached, the commands clear cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed. +The partition rename command keeps table dependents as cached but the table rename does not. In this way dependents should be re-cached explicitly after table renaming. Review comment: Look at the implementation https://github.com/apache/spark/blob/2c243c93d9b107ed19fa21002d54deccd17ce50a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala#L201-L210 It uncaches tables (and its dependents), and caches ONLY the table back. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31245: [SPARK-34157][SQL] Unify output of SHOW TABLES and pass output attributes properly
AmplabJenkins removed a comment on pull request #31245: URL: https://github.com/apache/spark/pull/31245#issuecomment-775143616 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135015/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
AmplabJenkins removed a comment on pull request #31524: URL: https://github.com/apache/spark/pull/31524#issuecomment-775143615 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135026/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31073: [SPARK-33995][SQL] Expose make_interval as a Scala function
AmplabJenkins removed a comment on pull request #31073: URL: https://github.com/apache/spark/pull/31073#issuecomment-774492810 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log commit time in all File Writer
AmplabJenkins removed a comment on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775143617 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31073: [SPARK-33995][SQL] Expose make_interval as a Scala function
SparkQA commented on pull request #31073: URL: https://github.com/apache/spark/pull/31073#issuecomment-775144967 **[Test build #135029 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135029/testReport)** for PR 31073 at commit [`6c0f6bb`](https://github.com/apache/spark/commit/6c0f6bbd7036851f27b0540fc687508342477c0c). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #31527: [MINOR] Add a note about pip installation test in RC for release vote template
HyukjinKwon commented on pull request #31527: URL: https://github.com/apache/spark/pull/31527#issuecomment-775144875 cc @srowen and @holdenk FYI This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon opened a new pull request #31527: [MINOR] Add a note about pip installation test in RC for release vote template
HyukjinKwon opened a new pull request #31527: URL: https://github.com/apache/spark/pull/31527 ### What changes were proposed in this pull request? This PR proposes to add a note about pip installation test in RC for release vote template. ### Why are the changes needed? To promote PySpark users to test PyPi distribution and pip installation. ### Does this PR introduce _any_ user-facing change? No. It will be used for release vote. ### How was this patch tested? N/A This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log commit time in all File Writer
AmplabJenkins commented on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775143621 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #31245: [SPARK-34157][SQL] Unify output of SHOW TABLES and pass output attributes properly
AmplabJenkins commented on pull request #31245: URL: https://github.com/apache/spark/pull/31245#issuecomment-775143616 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135015/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system
cloud-fan commented on a change in pull request #31466: URL: https://github.com/apache/spark/pull/31466#discussion_r572038364 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ## @@ -154,10 +154,14 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession with SQLHelper // Fewer shuffle partitions to speed up testing. .set(SQLConf.SHUFFLE_PARTITIONS, 4) + // SPARK-32106 Since we add SQL test 'transform.sql' will use `cat` command, Review comment: `SQL test 'transform.sql' will use` -> `SQL test 'transform.sql' which uses` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
AmplabJenkins commented on pull request #31524: URL: https://github.com/apache/spark/pull/31524#issuecomment-775143615 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135026/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk commented on a change in pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
MaxGekk commented on a change in pull request #31524: URL: https://github.com/apache/spark/pull/31524#discussion_r572038254 ## File path: docs/sql-ref-syntax-ddl-alter-table.md ## @@ -187,6 +194,8 @@ ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name `ALTER TABLE SET` command can also be used for changing the file location and file format for existing tables. +If the table is cached, the `ALTER TABLE .. SET LOCATION` command clears cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed. Review comment: - It is not applicable to all commands - I modified not only `ALTER TABLE` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk commented on a change in pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
MaxGekk commented on a change in pull request #31524: URL: https://github.com/apache/spark/pull/31524#discussion_r572037646 ## File path: docs/sql-ref-syntax-ddl-alter-table.md ## @@ -27,6 +27,9 @@ license: | `ALTER TABLE RENAME TO` statement changes the table name of an existing table in the database. +If the table is cached, the commands clear cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed. +The partition rename command keeps table dependents as cached but the table rename does not. In this way dependents should be re-cached explicitly after table renaming. Review comment: because the table rename command does not do that. It just cache table itself. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
cloud-fan commented on a change in pull request #31524: URL: https://github.com/apache/spark/pull/31524#discussion_r572036282 ## File path: docs/sql-ref-syntax-ddl-alter-table.md ## @@ -187,6 +194,8 @@ ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name `ALTER TABLE SET` command can also be used for changing the file location and file format for existing tables. +If the table is cached, the `ALTER TABLE .. SET LOCATION` command clears cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed. Review comment: We should probably put this in the beginning only once: ``` ### Description `ALTER TABLE` statement changes the schema or properties of a table. If the table is cached ... ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
cloud-fan commented on a change in pull request #31524: URL: https://github.com/apache/spark/pull/31524#discussion_r572035577 ## File path: docs/sql-ref-syntax-ddl-alter-table.md ## @@ -27,6 +27,9 @@ license: | `ALTER TABLE RENAME TO` statement changes the table name of an existing table in the database. +If the table is cached, the commands clear cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed. +The partition rename command keeps table dependents as cached but the table rename does not. In this way dependents should be re-cached explicitly after table renaming. Review comment: I don't get it. Won't the cache be filled lazily? Why we need to re-cache explicitly? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31378: [SPARK-34240][SQL] Unify output of `SHOW TBLPROPERTIES` clause's output attribute's schema and ExprID
SparkQA commented on pull request #31378: URL: https://github.com/apache/spark/pull/31378#issuecomment-775139197 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39610/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30869: [SPARK-33865][SQL] When HiveDDL, we need check avro schema too
cloud-fan commented on a change in pull request #30869: URL: https://github.com/apache/spark/pull/30869#discussion_r572030438 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala ## @@ -162,6 +162,8 @@ trait FileFormat { * By default all data types are supported. */ def supportDataType(dataType: DataType): Boolean = true + + def checkFieldNames(names: Seq[String]): Unit = {} Review comment: Can we add some document for this API? We can probably require the exception type as `AnalysisException` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
SparkQA commented on pull request #31524: URL: https://github.com/apache/spark/pull/31524#issuecomment-775136139 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39609/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30869: [SPARK-33865][SQL] When HiveDDL, we need check avro schema too
cloud-fan commented on a change in pull request #30869: URL: https://github.com/apache/spark/pull/30869#discussion_r57203 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ## @@ -2876,6 +2877,33 @@ class HiveDDLSuite } } + test("SPARK-33865: Hive DDL with avro should check col name") { +withTable("t1") { + withView("v") { +spark.range(1).createTempView("v") + +val e1 = intercept[SchemaParseException] { Review comment: Shall we unify the exception type? It's weird if different formats use different exception types. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log commit time in all File Writer
SparkQA removed a comment on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775044219 **[Test build #135021 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135021/testReport)** for PR 31520 at commit [`555f5b4`](https://github.com/apache/spark/commit/555f5b48bae9a36c11d8038a437fba9a18499206). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log commit time in all File Writer
SparkQA commented on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775130059 **[Test build #135021 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135021/testReport)** for PR 31520 at commit [`555f5b4`](https://github.com/apache/spark/commit/555f5b48bae9a36c11d8038a437fba9a18499206). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk commented on a change in pull request #31245: [SPARK-34157][SQL] Unify output of SHOW TABLES and pass output attributes properly
MaxGekk commented on a change in pull request #31245: URL: https://github.com/apache/spark/pull/31245#discussion_r572023668 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala ## @@ -119,6 +101,32 @@ trait ShowTablesSuiteBase extends command.ShowTablesSuiteBase { assert(errMsg.contains("Database from v1 session catalog is not specified")) } } + + test("SPARK-34157 Unify output of SHOW TABLES and pass output attributes properly") { Review comment: The consistent format is `SPARK-34157:` w/o any gaps ;-) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #31245: [SPARK-34157][SQL] Unify output of SHOW TABLES and pass output attributes properly
HyukjinKwon commented on a change in pull request #31245: URL: https://github.com/apache/spark/pull/31245#discussion_r572020314 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala ## @@ -119,6 +101,32 @@ trait ShowTablesSuiteBase extends command.ShowTablesSuiteBase { assert(errMsg.contains("Database from v1 session catalog is not specified")) } } + + test("SPARK-34157 Unify output of SHOW TABLES and pass output attributes properly") { +withNamespace(s"$catalog.ns") { + sql(s"CREATE NAMESPACE $catalog.ns") + sql(s"USE $catalog.ns") + withTable("tbl") { +sql("CREATE TABLE tbl(col1 int, col2 string) USING parquet") +checkAnswer(sql("show tables"), Row("ns", "tbl", false)) +assert(sql("show tables").schema.fieldNames.deep == + Seq("namespace", "tableName", "isTemporary")) +assert(sql("show table extended like 'tbl'").collect()(0).length == 4) +assert(sql("show table extended like 'tbl'").schema.fieldNames.deep == Review comment: Seems like this broke Scala 2.13 build. I made a followup here ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala ## @@ -119,6 +101,32 @@ trait ShowTablesSuiteBase extends command.ShowTablesSuiteBase { assert(errMsg.contains("Database from v1 session catalog is not specified")) } } + + test("SPARK-34157 Unify output of SHOW TABLES and pass output attributes properly") { +withNamespace(s"$catalog.ns") { + sql(s"CREATE NAMESPACE $catalog.ns") + sql(s"USE $catalog.ns") + withTable("tbl") { +sql("CREATE TABLE tbl(col1 int, col2 string) USING parquet") +checkAnswer(sql("show tables"), Row("ns", "tbl", false)) +assert(sql("show tables").schema.fieldNames.deep == + Seq("namespace", "tableName", "isTemporary")) +assert(sql("show table extended like 'tbl'").collect()(0).length == 4) +assert(sql("show table extended like 'tbl'").schema.fieldNames.deep == Review comment: Seems like this broke Scala 2.13 build. I made a followup here https://github.com/apache/spark/pull/31526 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #31245: [SPARK-34157][SQL] Unify output of SHOW TABLES and pass output attributes properly
HyukjinKwon commented on a change in pull request #31245: URL: https://github.com/apache/spark/pull/31245#discussion_r572020055 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala ## @@ -119,6 +101,32 @@ trait ShowTablesSuiteBase extends command.ShowTablesSuiteBase { assert(errMsg.contains("Database from v1 session catalog is not specified")) } } + + test("SPARK-34157 Unify output of SHOW TABLES and pass output attributes properly") { Review comment: Let's keep the format consistent next time: `SPARK-34157 :` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
SparkQA removed a comment on pull request #31524: URL: https://github.com/apache/spark/pull/31524#issuecomment-775116106 **[Test build #135026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135026/testReport)** for PR 31524 at commit [`e514ada`](https://github.com/apache/spark/commit/e514adab4b9c76cc9deff1e3b640b65ea26a366d). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
SparkQA commented on pull request #31524: URL: https://github.com/apache/spark/pull/31524#issuecomment-775123809 **[Test build #135026 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135026/testReport)** for PR 31524 at commit [`e514ada`](https://github.com/apache/spark/commit/e514adab4b9c76cc9deff1e3b640b65ea26a366d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #31245: [SPARK-34157][SQL] Unify output of SHOW TABLES and pass output attributes properly
SparkQA removed a comment on pull request #31245: URL: https://github.com/apache/spark/pull/31245#issuecomment-774968499 **[Test build #135015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135015/testReport)** for PR 31245 at commit [`815d36b`](https://github.com/apache/spark/commit/815d36b35c294d110cf0770d2f0850d9ae1da151). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31245: [SPARK-34157][SQL] Unify output of SHOW TABLES and pass output attributes properly
SparkQA commented on pull request #31245: URL: https://github.com/apache/spark/pull/31245#issuecomment-775122966 **[Test build #135015 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135015/testReport)** for PR 31245 at commit [`815d36b`](https://github.com/apache/spark/commit/815d36b35c294d110cf0770d2f0850d9ae1da151). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon closed pull request #31525: [MINOR][INFRA][DOC][3.1] Change the facetFilters of Docsearch to 3.1.1
HyukjinKwon closed pull request #31525: URL: https://github.com/apache/spark/pull/31525 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #31525: [MINOR][INFRA][DOC][3.1] Change the facetFilters of Docsearch to 3.1.1
HyukjinKwon commented on pull request #31525: URL: https://github.com/apache/spark/pull/31525#issuecomment-775121171 Merged to branch-3.1. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #31073: [SPARK-33995][SQL] Expose make_interval as a Scala function
cloud-fan commented on pull request #31073: URL: https://github.com/apache/spark/pull/31073#issuecomment-775120028 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31378: [SPARK-34240][SQL] Unify output of `SHOW TBLPROPERTIES` clause's output attribute's schema and ExprID
cloud-fan commented on a change in pull request #31378: URL: https://github.com/apache/spark/pull/31378#discussion_r572015001 ## File path: sql/core/src/test/resources/sql-tests/inputs/show-tblproperties.sql ## @@ -6,6 +6,13 @@ SHOW TBLPROPERTIES tbl; SHOW TBLPROPERTIES tbl("p1"); SHOW TBLPROPERTIES tbl("p3"); +set spark.sql.legacy.keepCommandOutputSchema=true; Review comment: yea, we can also remove changes in this file. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon opened a new pull request #31526: [SPARK-34157][BUILD][FOLLOW-UP] Fix Scala 2.13 compliation error via using deep
HyukjinKwon opened a new pull request #31526: URL: https://github.com/apache/spark/pull/31526 ### What changes were proposed in this pull request? This PR is a followup of https://github.com/apache/spark/pull/31245: ``` [error] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala:112:53: value deep is not a member of Array[String] [error] assert(sql("show tables").schema.fieldNames.deep == [error] ^ [error] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala:115:72: value deep is not a member of Array[String] [error] assert(sql("show table extended like 'tbl'").schema.fieldNames.deep == [error] ^ [error] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala:121:55: value deep is not a member of Array[String] [error] assert(sql("show tables").schema.fieldNames.deep == [error] ^ [error] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala:124:74: value deep is not a member of Array[String] [error] assert(sql("show table extended like 'tbl'").schema.fieldNames.deep == [error] ^ ``` It broke Scala 2.13 build. This PR works around by using `===` that can compare `Array`s safely. ### Why are the changes needed? To fix the build. ### Does this PR introduce _any_ user-facing change? No, dev-only. ### How was this patch tested? CI in this PR should test it out. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31485: [SPARK-34137][SQL] Update suquery's stats when build LogicalPlan's stats
cloud-fan commented on a change in pull request #31485: URL: https://github.com/apache/spark/pull/31485#discussion_r572014419 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlanVisitor.scala ## @@ -47,6 +49,16 @@ trait LogicalPlanVisitor[T] { def default(p: LogicalPlan): T + def visitSubqueryExpression(p: LogicalPlan): LogicalPlan = { +p.transformExpressionsDown { + case subqueryExpression: SubqueryExpression => +// trigger subquery's child plan stats propagation Review comment: Yea let's move it to `stringWithStats` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Ngone51 commented on pull request #31517: [WIP][SPARK-34309][CORE][SQL] Use Caffeine instead of Guava Cache
Ngone51 commented on pull request #31517: URL: https://github.com/apache/spark/pull/31517#issuecomment-775117864 Just out of curiosity, how much performance improvement we can get by using Caffeine? Do we have a rough number? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Ngone51 commented on a change in pull request #31521: [SPARK-34400][CORE] Check spark.job.interruptOnCancel value is illegal before submit job
Ngone51 commented on a change in pull request #31521: URL: https://github.com/apache/spark/pull/31521#discussion_r572010794 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -719,10 +719,23 @@ class SparkContext(config: SparkConf) extends Logging { if (value == null) { localProperties.get.remove(key) } else { + checkLocalProperty(key, value) Review comment: Do we expect users would set `spark.job.interruptOnCance` as I notice it's private? I think it may be better to prevent users from setting this kind of private property instead of doing the valid check. And we could recommend users to use `setJobGroup` instead. cc @tgravescs @mridulm any thoughts about this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31378: [SPARK-34240][SQL] Unify output of `SHOW TBLPROPERTIES` clause's output attribute's schema and ExprID
SparkQA commented on pull request #31378: URL: https://github.com/apache/spark/pull/31378#issuecomment-775116045 **[Test build #135027 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135027/testReport)** for PR 31378 at commit [`728fca4`](https://github.com/apache/spark/commit/728fca4979bc7251168e8a68d85576fd97cc48fe). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
SparkQA commented on pull request #31524: URL: https://github.com/apache/spark/pull/31524#issuecomment-775116106 **[Test build #135026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135026/testReport)** for PR 31524 at commit [`e514ada`](https://github.com/apache/spark/commit/e514adab4b9c76cc9deff1e3b640b65ea26a366d). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Ngone51 commented on a change in pull request #31521: [SPARK-34400][CORE] Check spark.job.interruptOnCancel value is illegal before submit job
Ngone51 commented on a change in pull request #31521: URL: https://github.com/apache/spark/pull/31521#discussion_r572010794 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -719,10 +719,23 @@ class SparkContext(config: SparkConf) extends Logging { if (value == null) { localProperties.get.remove(key) } else { + checkLocalProperty(key, value) Review comment: Do we expect users would set `spark.job.interruptOnCance` as I notice it's private? I think it may be better to prevent users from setting this kind of private property instead of doing check valid check. And we could recommend users to use `setJobGroup` instead. cc @tgravescs @mridulm any thoughts about this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Ngone51 commented on a change in pull request #31521: [SPARK-34400][CORE] Check spark.job.interruptOnCancel value is illegal before submit job
Ngone51 commented on a change in pull request #31521: URL: https://github.com/apache/spark/pull/31521#discussion_r572010794 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -719,10 +719,23 @@ class SparkContext(config: SparkConf) extends Logging { if (value == null) { localProperties.get.remove(key) } else { + checkLocalProperty(key, value) Review comment: Do we expect users would set `spark.job.interruptOnCance` as I notice it's private? I think it may be better to prevent users from setting this kind of private property instead of doing check valid check. And we could recommend users to use `setJobGroup` instead. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log commit time in all File Writer
SparkQA removed a comment on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-774960569 **[Test build #135013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135013/testReport)** for PR 31520 at commit [`171cefd`](https://github.com/apache/spark/commit/171cefd14f6936820a522bf5b49163428484b0d0). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gengliangwang opened a new pull request #31525: [3.1][INFRA][DOC] Change the facetFilters of Docsearch to 3.1.1
gengliangwang opened a new pull request #31525: URL: https://github.com/apache/spark/pull/31525 ### What changes were proposed in this pull request? As https://github.com/algolia/docsearch-configs/pull/3391 is merged, This PR changes the facetFilters of Docsearch as 3.1.1. ### Why are the changes needed? So that the search result of the published Spark site will points to https://spark.apache.org/docs/3.1.1 instead of https://spark.apache.org/docs/latest/. This is useful for searching the docs of 3.1.1 after there are new Spark releases. ### Does this PR introduce _any_ user-facing change? Yes, the search result of 3.1.1 Spark doc site is based on https://spark.apache.org/docs/3.1.1 instead of https://spark.apache.org/docs/latest/ ### How was this patch tested? Just configuration changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log commit time in all File Writer
SparkQA commented on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775113086 **[Test build #135013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135013/testReport)** for PR 31520 at commit [`171cefd`](https://github.com/apache/spark/commit/171cefd14f6936820a522bf5b49163428484b0d0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31508: [SPARK-34393][SQL] Unify output of SHOW VIEWS and pass output attributes properly
AmplabJenkins removed a comment on pull request #31508: URL: https://github.com/apache/spark/pull/31508#issuecomment-775112580 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135012/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #31508: [SPARK-34393][SQL] Unify output of SHOW VIEWS and pass output attributes properly
AmplabJenkins commented on pull request #31508: URL: https://github.com/apache/spark/pull/31508#issuecomment-775112580 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135012/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #31508: [SPARK-34393][SQL] Unify output of SHOW VIEWS and pass output attributes properly
SparkQA removed a comment on pull request #31508: URL: https://github.com/apache/spark/pull/31508#issuecomment-774939013 **[Test build #135012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135012/testReport)** for PR 31508 at commit [`d964a05`](https://github.com/apache/spark/commit/d964a059a4882cecddc3dbe2d4343cbf6298ff44). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31494: [SPARK-34380][SQL] Support ifExists for ALTER TABLE ... UNSET TBLPROPERTIES for v2 command
cloud-fan commented on a change in pull request #31494: URL: https://github.com/apache/spark/pull/31494#discussion_r572007541 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java ## @@ -64,7 +64,20 @@ static TableChange setProperty(String property, String value) { * @return a TableChange for the addition */ static TableChange removeProperty(String property) { Review comment: Another way is to do the work in Spark analyzer: if the key doesn't exist and `ifExists = true`, remove this table change. Then we don't need to change API. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31508: [SPARK-34393][SQL] Unify output of SHOW VIEWS and pass output attributes properly
SparkQA commented on pull request #31508: URL: https://github.com/apache/spark/pull/31508#issuecomment-775111325 **[Test build #135012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135012/testReport)** for PR 31508 at commit [`d964a05`](https://github.com/apache/spark/commit/d964a059a4882cecddc3dbe2d4343cbf6298ff44). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31494: [SPARK-34380][SQL] Support ifExists for ALTER TABLE ... UNSET TBLPROPERTIES for v2 command
cloud-fan commented on a change in pull request #31494: URL: https://github.com/apache/spark/pull/31494#discussion_r572005786 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java ## @@ -64,7 +64,20 @@ static TableChange setProperty(String property, String value) { * @return a TableChange for the addition */ static TableChange removeProperty(String property) { Review comment: Yea we need to keep backward compatibility This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31258: [SPARK-34168] [SQL] Support DPP in AQE when the join is Broadcast hash join at the beginning
cloud-fan commented on a change in pull request #31258: URL: https://github.com/apache/spark/pull/31258#discussion_r572004814 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/QueryStageExec.scala ## @@ -165,7 +175,8 @@ case class ShuffleQueryStageExec( override def newReuseInstance(newStageId: Int, newOutput: Seq[Attribute]): QueryStageExec = { val reuse = ShuffleQueryStageExec( newStageId, - ReusedExchangeExec(newOutput, shuffle)) + ReusedExchangeExec(newOutput, shuffle), + shuffle.canonicalized) Review comment: nit: this should be `_canonicalized` ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/QueryStageExec.scala ## @@ -229,7 +245,8 @@ case class BroadcastQueryStageExec( override def newReuseInstance(newStageId: Int, newOutput: Seq[Attribute]): QueryStageExec = { val reuse = BroadcastQueryStageExec( newStageId, - ReusedExchangeExec(newOutput, broadcast)) + ReusedExchangeExec(newOutput, broadcast), + broadcast.canonicalized) Review comment: ditto This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log commit time in all File Writer
AmplabJenkins removed a comment on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775104877 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135018/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join
AmplabJenkins removed a comment on pull request #31470: URL: https://github.com/apache/spark/pull/31470#issuecomment-775104873 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39605/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31519: [SPARK-34394][SQL] Unify output of SHOW FUNCTIONS and pass output attributes properly
AmplabJenkins removed a comment on pull request #31519: URL: https://github.com/apache/spark/pull/31519#issuecomment-775104874 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135010/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join
AmplabJenkins commented on pull request #31470: URL: https://github.com/apache/spark/pull/31470#issuecomment-775104873 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39605/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log commit time in all File Writer
AmplabJenkins commented on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775104877 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135018/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #31519: [SPARK-34394][SQL] Unify output of SHOW FUNCTIONS and pass output attributes properly
AmplabJenkins commented on pull request #31519: URL: https://github.com/apache/spark/pull/31519#issuecomment-775104874 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135010/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system
SparkQA commented on pull request #31466: URL: https://github.com/apache/spark/pull/31466#issuecomment-775104234 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39608/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #31485: [SPARK-34137][SQL] Update suquery's stats when build LogicalPlan's stats
AngersZh commented on a change in pull request #31485: URL: https://github.com/apache/spark/pull/31485#discussion_r57261 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlanVisitor.scala ## @@ -47,6 +49,16 @@ trait LogicalPlanVisitor[T] { def default(p: LogicalPlan): T + def visitSubqueryExpression(p: LogicalPlan): LogicalPlan = { +p.transformExpressionsDown { Review comment: > If we don't care about iteration order, `transformExpressions` should be better. Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk commented on pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
MaxGekk commented on pull request #31524: URL: https://github.com/apache/spark/pull/31524#issuecomment-775103312 @sunchao @cloud-fan @HyukjinKwon @dongjoon-hyun Could you review this PR, please. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk opened a new pull request #31524: [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
MaxGekk opened a new pull request #31524: URL: https://github.com/apache/spark/pull/31524 ### What changes were proposed in this pull request? Update public docs of SQL commands about altering cached tables/views. For instance: https://user-images.githubusercontent.com/1580697/107217940-fd3b8980-6a1f-11eb-98b9-9b2e3fe7f4ef.png";> ### Why are the changes needed? To inform users about commands behavior in altering cached tables or views. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By running the command below and manually checking the docs: ``` $ SKIP_API=1 SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 jekyll serve --watch ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #31378: [SPARK-34240][SQL] Unify output of `SHOW TBLPROPERTIES` clause's output attribute's schema and ExprID
AngersZh commented on a change in pull request #31378: URL: https://github.com/apache/spark/pull/31378#discussion_r571998057 ## File path: sql/core/src/test/resources/sql-tests/inputs/show-tblproperties.sql ## @@ -6,6 +6,13 @@ SHOW TBLPROPERTIES tbl; SHOW TBLPROPERTIES tbl("p1"); SHOW TBLPROPERTIES tbl("p3"); +set spark.sql.legacy.keepCommandOutputSchema=true; Review comment: > Can we follow other PRs and test the config in the scala test? Updated, should we merge `test("SPARK-34240 Unify output of SHOW TBLPROPERTIES and pass output attributes properly") ` and `test("show tblproperties for hive table") ` ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -481,8 +481,14 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager) throw QueryCompilationErrors.externalCatalogNotSupportShowViewsError(resolved) } -case ShowTableProperties(ResolvedV1TableOrViewIdentifier(ident), propertyKey) => - ShowTablePropertiesCommand(ident.asTableIdentifier, propertyKey) +case s @ ShowTableProperties(ResolvedV1TableOrViewIdentifier(ident), propertyKey, output) => + val newOutput = if (conf.getConf(SQLConf.LEGACY_KEEP_COMMAND_OUTPUT_SCHEMA)) { +assert(output.length == 2) +output.tail Review comment: > we should only do it if key is specified. Fixed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #31523: [SPARK-33354][DOC] Remove an unnecessary quote in doc
HyukjinKwon commented on pull request #31523: URL: https://github.com/apache/spark/pull/31523#issuecomment-775100166 Merged to master and branch-3.1. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon closed pull request #31523: [SPARK-33354][DOC] Remove an unnecessary quote in doc
HyukjinKwon closed pull request #31523: URL: https://github.com/apache/spark/pull/31523 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31500: [SPARK-34388][SQL] Propagate the registered UDF name to ScalaUDF, ScalaUDAF and ScalaAggregator
cloud-fan commented on a change in pull request #31500: URL: https://github.com/apache/spark/pull/31500#discussion_r571994908 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala ## @@ -1088,4 +1088,6 @@ trait ComplexTypeMergingExpression extends Expression { * Common base trait for user-defined functions, including UDF/UDAF/UDTF of different languages * and Hive function wrappers. */ -trait UserDefinedExpression +trait UserDefinedExpression { + def name: String Review comment: It's an internal trait, seems OK to require it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #31485: [SPARK-34137][SQL] Update suquery's stats when build LogicalPlan's stats
AngersZh commented on a change in pull request #31485: URL: https://github.com/apache/spark/pull/31485#discussion_r571993370 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlanVisitor.scala ## @@ -47,6 +49,16 @@ trait LogicalPlanVisitor[T] { def default(p: LogicalPlan): T + def visitSubqueryExpression(p: LogicalPlan): LogicalPlan = { +p.transformExpressionsDown { + case subqueryExpression: SubqueryExpression => +// trigger subquery's child plan stats propagation Review comment: If it looks too weird, we can move this to `stringWithStats` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gengliangwang opened a new pull request #31523: [SPARK-33354][DOC] Remove an unnecessary quote in doc
gengliangwang opened a new pull request #31523: URL: https://github.com/apache/spark/pull/31523 ### What changes were proposed in this pull request? Remove an unnecessary quote in the documentation. Super trivial. ### Why are the changes needed? Fix a mistake. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Just doc This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log FileBatchWrite's commit time too
SparkQA removed a comment on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775007297 **[Test build #135018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135018/testReport)** for PR 31520 at commit [`91d92f4`](https://github.com/apache/spark/commit/91d92f4609b7348938b69175e830452cafcb1134). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #31485: [SPARK-34137][SQL] Update suquery's stats when build LogicalPlan's stats
AngersZh commented on a change in pull request #31485: URL: https://github.com/apache/spark/pull/31485#discussion_r571992802 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlanVisitor.scala ## @@ -47,6 +49,16 @@ trait LogicalPlanVisitor[T] { def default(p: LogicalPlan): T + def visitSubqueryExpression(p: LogicalPlan): LogicalPlan = { +p.transformExpressionsDown { + case subqueryExpression: SubqueryExpression => +// trigger subquery's child plan stats propagation Review comment: > This is weird. Doesn't EXPLAIN trigger the plan stats propagation? Yea, EXPLAIN trigger plan stats before build string: ``` private def stringWithStats(maxFields: Int, append: String => Unit): Unit = { val maxFields = SQLConf.get.maxToStringFields // trigger to compute stats for logical plans try { optimizedPlan.stats } catch { case e: AnalysisException => append(e.toString + "\n") } // only show optimized logical plan and physical plan append("== Optimized Logical Plan ==\n") QueryPlan.append(optimizedPlan, append, verbose = true, addSuffix = true, maxFields) append("\n== Physical Plan ==\n") QueryPlan.append(executedPlan, append, verbose = true, addSuffix = false, maxFields) append("\n") } ``` Have tested that in subquery, all behavior about statistic is right since when it use statistcs,it will call `plan.stats`. We do it here just trigger it earlier. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log FileBatchWrite's commit time too
SparkQA commented on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775096290 **[Test build #135018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135018/testReport)** for PR 31520 at commit [`91d92f4`](https://github.com/apache/spark/commit/91d92f4609b7348938b69175e830452cafcb1134). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly
cloud-fan commented on pull request #31442: URL: https://github.com/apache/spark/pull/31442#issuecomment-775094701 LGTM except one comment about migration guide. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31442: [SPARK-34333][SQL] Fix PostgresDialect to handle money types properly
cloud-fan commented on a change in pull request #31442: URL: https://github.com/apache/spark/pull/31442#discussion_r571990570 ## File path: docs/sql-migration-guide.md ## @@ -24,6 +24,8 @@ license: | ## Upgrading from Spark SQL 3.1 to 3.2 + - In Spark 3.2, money type in PostgreSQL table is converted to `StringType` and money[] type is not supported due to the JDBC driver for PostgreSQL can't handle those types properly. Review comment: In the migration guide, we should also mention the previous behavior. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #31475: [SPARK-34360][SQL] Support table truncation by v2 Table Catalogs
cloud-fan commented on pull request #31475: URL: https://github.com/apache/spark/pull/31475#issuecomment-775093012 LGTM, also cc @rdblue @imback82 @dongjoon-hyun This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31485: [SPARK-34137][SQL] Update suquery's stats when build LogicalPlan's stats
cloud-fan commented on a change in pull request #31485: URL: https://github.com/apache/spark/pull/31485#discussion_r571988475 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlanVisitor.scala ## @@ -47,6 +49,16 @@ trait LogicalPlanVisitor[T] { def default(p: LogicalPlan): T + def visitSubqueryExpression(p: LogicalPlan): LogicalPlan = { +p.transformExpressionsDown { + case subqueryExpression: SubqueryExpression => +// trigger subquery's child plan stats propagation Review comment: This is weird. Doesn't EXPLAIN trigger the plan stats propagation? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31485: [SPARK-34137][SQL] Update suquery's stats when build LogicalPlan's stats
cloud-fan commented on a change in pull request #31485: URL: https://github.com/apache/spark/pull/31485#discussion_r571987869 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlanVisitor.scala ## @@ -47,6 +49,16 @@ trait LogicalPlanVisitor[T] { def default(p: LogicalPlan): T + def visitSubqueryExpression(p: LogicalPlan): LogicalPlan = { +p.transformExpressionsDown { Review comment: If we don't care about iteration order, `transformExpressions` should be better. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #31519: [SPARK-34394][SQL] Unify output of SHOW FUNCTIONS and pass output attributes properly
SparkQA removed a comment on pull request #31519: URL: https://github.com/apache/spark/pull/31519#issuecomment-774937492 **[Test build #135010 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135010/testReport)** for PR 31519 at commit [`a980eb4`](https://github.com/apache/spark/commit/a980eb417e7ecfd0569129c6809450d762e0bdb5). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work
cloud-fan commented on a change in pull request #31337: URL: https://github.com/apache/spark/pull/31337#discussion_r571986698 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala ## @@ -157,7 +156,7 @@ case class ExplainCommand( override def run(sparkSession: SparkSession): Seq[Row] = try { val outputString = sparkSession.sessionState.executePlan(logicalPlan).explainString(mode) Seq(Row(outputString)) - } catch { case cause: TreeNodeException[_] => + } catch { case cause: AnalysisException => Review comment: we should probably catch `NonFatal` now ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala ## @@ -179,7 +178,7 @@ case class StreamingExplainCommand( queryExecution.simpleString } Seq(Row(outputString)) - } catch { case cause: TreeNodeException[_] => + } catch { case cause: AnalysisException => Review comment: ditto This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31519: [SPARK-34394][SQL] Unify output of SHOW FUNCTIONS and pass output attributes properly
SparkQA commented on pull request #31519: URL: https://github.com/apache/spark/pull/31519#issuecomment-775090378 **[Test build #135010 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135010/testReport)** for PR 31519 at commit [`a980eb4`](https://github.com/apache/spark/commit/a980eb417e7ecfd0569129c6809450d762e0bdb5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work
cloud-fan commented on a change in pull request #31337: URL: https://github.com/apache/spark/pull/31337#discussion_r570991936 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala ## @@ -504,8 +504,7 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product { } } catch { case e: java.lang.IllegalArgumentException => -throw new TreeNodeException( - this, +throw new AnalysisException( s""" |Failed to copy node. Review comment: We should probably still include the tree string in the error message, like before. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work
cloud-fan commented on a change in pull request #31337: URL: https://github.com/apache/spark/pull/31337#discussion_r571985735 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala ## @@ -504,8 +504,7 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product { } } catch { case e: java.lang.IllegalArgumentException => -throw new TreeNodeException( - this, +throw new AnalysisException( Review comment: This is not analysis only. `IllegalStateException` should be fine as it indicates a bug. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31378: [SPARK-34240][SQL] Unify output of `SHOW TBLPROPERTIES` clause's output attribute's schema and ExprID
cloud-fan commented on a change in pull request #31378: URL: https://github.com/apache/spark/pull/31378#discussion_r571983429 ## File path: sql/core/src/test/resources/sql-tests/inputs/show-tblproperties.sql ## @@ -6,6 +6,13 @@ SHOW TBLPROPERTIES tbl; SHOW TBLPROPERTIES tbl("p1"); SHOW TBLPROPERTIES tbl("p3"); +set spark.sql.legacy.keepCommandOutputSchema=true; Review comment: Can we follow others and test the config in the scala test? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31378: [SPARK-34240][SQL] Unify output of `SHOW TBLPROPERTIES` clause's output attribute's schema and ExprID
cloud-fan commented on a change in pull request #31378: URL: https://github.com/apache/spark/pull/31378#discussion_r571983429 ## File path: sql/core/src/test/resources/sql-tests/inputs/show-tblproperties.sql ## @@ -6,6 +6,13 @@ SHOW TBLPROPERTIES tbl; SHOW TBLPROPERTIES tbl("p1"); SHOW TBLPROPERTIES tbl("p3"); +set spark.sql.legacy.keepCommandOutputSchema=true; Review comment: Can we follow other PRs and test the config in the scala test? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #31378: [SPARK-34240][SQL] Unify output of `SHOW TBLPROPERTIES` clause's output attribute's schema and ExprID
cloud-fan commented on a change in pull request #31378: URL: https://github.com/apache/spark/pull/31378#discussion_r571982639 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -481,8 +481,14 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager) throw QueryCompilationErrors.externalCatalogNotSupportShowViewsError(resolved) } -case ShowTableProperties(ResolvedV1TableOrViewIdentifier(ident), propertyKey) => - ShowTablePropertiesCommand(ident.asTableIdentifier, propertyKey) +case s @ ShowTableProperties(ResolvedV1TableOrViewIdentifier(ident), propertyKey, output) => + val newOutput = if (conf.getConf(SQLConf.LEGACY_KEEP_COMMAND_OUTPUT_SCHEMA)) { +assert(output.length == 2) +output.tail Review comment: we should only do it if key is specified. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan closed pull request #31518: [SPARK-34239][SQL][FOLLOW_UP] SHOW COLUMNS Keep consistence with other `SHOW` command
cloud-fan closed pull request #31518: URL: https://github.com/apache/spark/pull/31518 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #31518: [SPARK-34239][SQL][FOLLOW_UP] SHOW COLUMNS Keep consistence with other `SHOW` command
cloud-fan commented on pull request #31518: URL: https://github.com/apache/spark/pull/31518#issuecomment-775084376 thanks, merging to master! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system
SparkQA commented on pull request #31466: URL: https://github.com/apache/spark/pull/31466#issuecomment-775080723 **[Test build #135025 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135025/testReport)** for PR 31466 at commit [`01caf85`](https://github.com/apache/spark/commit/01caf85014a7733402a0abcb091fa739eeafa32d). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join
SparkQA commented on pull request #31470: URL: https://github.com/apache/spark/pull/31470#issuecomment-775080149 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39605/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on pull request #31516: [SPARK-34238][SQL][FOLLOW_UP] SHOW PARTITIONS Keep consistence with other `SHOW` command
AngersZh commented on pull request #31516: URL: https://github.com/apache/spark/pull/31516#issuecomment-775076483 FYI @cloud-fan This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on pull request #31518: [SPARK-34239][SQL][FOLLOW_UP] SHOW COLUMNS Keep consistence with other `SHOW` command
AngersZh commented on pull request #31518: URL: https://github.com/apache/spark/pull/31518#issuecomment-775076252 FYI @cloud-fan This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #31518: [SPARK-34239][SQL][FOLLOW_UP] SHOW COLUMNS Keep consistence with other `SHOW` command
SparkQA removed a comment on pull request #31518: URL: https://github.com/apache/spark/pull/31518#issuecomment-774906642 **[Test build #135005 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135005/testReport)** for PR 31518 at commit [`12e569b`](https://github.com/apache/spark/commit/12e569be80f3bb03daac2dfa15b507572cafbaaa). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31520: [SPARK-34355][CORE][SQL][FOLLOWUP] Log FileBatchWrite's commit time too
AmplabJenkins removed a comment on pull request #31520: URL: https://github.com/apache/spark/pull/31520#issuecomment-775071544 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system
AmplabJenkins removed a comment on pull request #31466: URL: https://github.com/apache/spark/pull/31466#issuecomment-775071593 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #31521: [SPARK-34400][CORE] Check spark.job.interruptOnCancel value is illegal before submit job
SparkQA removed a comment on pull request #31521: URL: https://github.com/apache/spark/pull/31521#issuecomment-774999620 **[Test build #135017 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135017/testReport)** for PR 31521 at commit [`17669d9`](https://github.com/apache/spark/commit/17669d900568635be2151870101f0d6b33885230). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31521: [SPARK-34400][CORE] Check spark.job.interruptOnCancel value is illegal before submit job
AmplabJenkins removed a comment on pull request #31521: URL: https://github.com/apache/spark/pull/31521#issuecomment-775071598 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31518: [SPARK-34239][SQL][FOLLOW_UP] SHOW COLUMNS Keep consistence with other `SHOW` command
AmplabJenkins removed a comment on pull request #31518: URL: https://github.com/apache/spark/pull/31518#issuecomment-775072743 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135005/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31516: [SPARK-34238][SQL][FOLLOW_UP] SHOW PARTITIONS Keep consistence with other `SHOW` command
AmplabJenkins removed a comment on pull request #31516: URL: https://github.com/apache/spark/pull/31516#issuecomment-775071591 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135007/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31522: [SPARK-34399][SQL] Add commit duration to SQL tab's graph node.
AmplabJenkins removed a comment on pull request #31522: URL: https://github.com/apache/spark/pull/31522#issuecomment-775071600 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39599/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org