[GitHub] [spark] cloud-fan commented on a change in pull request #26071: [SPARK-29412][SQL] refine the document of v2 session catalog config
cloud-fan commented on a change in pull request #26071: [SPARK-29412][SQL] refine the document of v2 session catalog config
URL: https://github.com/apache/spark/pull/26071#discussion_r334413116

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

@@ -1976,11 +1977,14 @@ object SQLConf {
     .stringConf
     .createOptional

-  val V2_SESSION_CATALOG = buildConf("spark.sql.catalog.session")
-    .doc("A catalog implementation that will be used in place of the Spark built-in session " +
-      "catalog for v2 operations. The implementation may extend `CatalogExtension` to be " +
-      "passed the Spark built-in session catalog, so that it may delegate calls to the " +
-      "built-in session catalog.")
+  val V2_SESSION_CATALOG_IMPLEMENTATION =
+    buildConf(s"spark.sql.catalog.$SESSION_CATALOG_NAME")
+      .doc("A catalog implementation that will be used as the v2 interface to Spark's built-in " +
+        s"v1 catalog: $SESSION_CATALOG_NAME. This catalog shares its identifier namespace with " +
+        s"the $SESSION_CATALOG_NAME and must be consistent with it; for example, if a table can " +
+        s"be loaded by the $SESSION_CATALOG_NAME, this catalog must also return the table " +
+        s"metadata. To delegate operations to the $SESSION_CATALOG_NAME, implementations can " +
+        "extend 'CatalogExtension'.")

Review comment:
It must be a v2 interface, but we can't fully control how it is implemented. We expect it to delegate to the SESSION_CATALOG_NAME, but there is no way to guarantee it. If the implementation doesn't delegate, then the behavior is undefined.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
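The delegation pattern discussed in this comment can be sketched with Spark's `DelegatingCatalogExtension` helper, which implements `CatalogExtension` and forwards every call to the built-in session catalog handed to it via `setDelegateCatalog`. The class name and the audit logic below are hypothetical, purely for illustration:

```scala
import java.util

import org.apache.spark.sql.connector.catalog.{DelegatingCatalogExtension, Identifier, Table}
import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.types.StructType

// Hypothetical example: override only createTable to inject custom v2 logic,
// while DelegatingCatalogExtension forwards everything else (and the actual
// table creation) to the built-in session catalog, so the underlying
// meta-store stays consistent with it.
class AuditingSessionCatalog extends DelegatingCatalogExtension {

  override def createTable(
      ident: Identifier,
      schema: StructType,
      partitions: Array[Transform],
      properties: util.Map[String, String]): Table = {
    println(s"creating table $ident") // custom logic before delegating
    super.createTable(ident, schema, partitions, properties)
  }
}
```

Such a class would then be registered under the config discussed in this thread, e.g. `spark.sql.catalog.spark_catalog=com.example.AuditingSessionCatalog`.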
cloud-fan commented on a change in pull request #26071: [SPARK-29412][SQL] refine the document of v2 session catalog config
URL: https://github.com/apache/spark/pull/26071#discussion_r334410058

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

@@ -1976,11 +1977,14 @@ object SQLConf {
     .stringConf
     .createOptional

-  val V2_SESSION_CATALOG = buildConf("spark.sql.catalog.session")
-    .doc("A catalog implementation that will be used in place of the Spark built-in session " +
-      "catalog for v2 operations. The implementation may extend `CatalogExtension` to be " +
-      "passed the Spark built-in session catalog, so that it may delegate calls to the " +
-      "built-in session catalog.")
+  val V2_SESSION_CATALOG_IMPLEMENTATION =
+    buildConf(s"spark.sql.catalog.$SESSION_CATALOG_NAME")

Review comment:
Yes, and it will be hard to change after 3.0 is released.
cloud-fan commented on a change in pull request #26071: [SPARK-29412][SQL] refine the document of v2 session catalog config
URL: https://github.com/apache/spark/pull/26071#discussion_r334409752

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala

@@ -146,5 +146,5 @@ class CatalogManager(
 }

 private[sql] object CatalogManager {
-  val SESSION_CATALOG_NAME: String = "session"
+  val SESSION_CATALOG_NAME: String = "spark_catalog"

Review comment:
The config name/document includes the catalog name, so I put them together. The new name was discussed in a DS v2 community meeting. I'm proposing it in this PR to get more visibility.
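Catalog config keys follow the `spark.sql.catalog.<name>` pattern, so renaming the session catalog also changes the key under which its v2 implementation is registered. A before/after sketch for `spark-defaults.conf` (the implementation class name is hypothetical):

```
# After this rename (catalog name "spark_catalog"):
spark.sql.catalog.spark_catalog=com.example.CustomSessionCatalog

# Before the rename (catalog name "session"), the same setting was:
# spark.sql.catalog.session=com.example.CustomSessionCatalog
```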
cloud-fan commented on a change in pull request #26071: [SPARK-29412][SQL] refine the document of v2 session catalog config
URL: https://github.com/apache/spark/pull/26071#discussion_r02564

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

@@ -1976,11 +1977,19 @@ object SQLConf {
     .stringConf
     .createOptional

-  val V2_SESSION_CATALOG = buildConf("spark.sql.catalog.session")
-    .doc("A catalog implementation that will be used in place of the Spark built-in session " +
-      "catalog for v2 operations. The implementation may extend `CatalogExtension` to be " +
-      "passed the Spark built-in session catalog, so that it may delegate calls to the " +
-      "built-in session catalog.")
+  val V2_SESSION_CATALOG_IMPLEMENTATION =
+    buildConf(s"spark.sql.catalog.${CatalogManager.SESSION_CATALOG_NAME}")
+      .doc("A catalog implementation that will be used in place of the Spark Catalog for v2 " +
+        "operations (e.g. create table using a v2 source, alter a v2 table). The Spark Catalog " +
+        "is the current catalog by default, and supports all kinds of catalog operations like " +
+        "CREATE TABLE USING v1/v2 source, VIEW/FUNCTION related operations, etc. This config is " +
+        "used to extend the Spark Catalog and inject custom logic to v2 operations, while other " +
+        "operations still go through the Spark Catalog. The catalog implementation specified " +
+        "by this config should extend `CatalogExtension` to be passed the Spark Catalog, " +
+        "so that it can delegate calls to the Spark Catalog. Otherwise, the implementation " +
+        "should figure out a way to access the Spark Catalog or its underlying meta-store " +
+        "by itself. It's important to make the implementation share the underlying meta-store " +
+        "of the Spark Catalog and act as an extension, instead of a separate catalog.")

Review comment:
Thanks for the suggestion!
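To make the doc text above concrete, here is a minimal sketch of wiring up such an implementation when building a session. The class name is hypothetical, and (as the doc says) v1 operations like VIEW/FUNCTION handling still go through the built-in Spark Catalog:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  // Hypothetical CatalogExtension implementation; v2 operations on the
  // session catalog are routed through it.
  .config("spark.sql.catalog.spark_catalog", "com.example.CustomSessionCatalog")
  .getOrCreate()

// Handled by the configured catalog on the v2 code path:
spark.sql("CREATE TABLE t (id INT) USING parquet")
```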
cloud-fan commented on a change in pull request #26071: [SPARK-29412][SQL] refine the document of v2 session catalog config
URL: https://github.com/apache/spark/pull/26071#discussion_r333051695

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

@@ -1976,11 +1977,19 @@ object SQLConf {
     .stringConf
     .createOptional

-  val V2_SESSION_CATALOG = buildConf("spark.sql.catalog.session")
-    .doc("A catalog implementation that will be used in place of the Spark built-in session " +
-      "catalog for v2 operations. The implementation may extend `CatalogExtension` to be " +
-      "passed the Spark built-in session catalog, so that it may delegate calls to the " +
-      "built-in session catalog.")
+  val V2_SESSION_CATALOG_IMPLEMENTATION =
+    buildConf(s"spark.sql.catalog.${CatalogManager.SESSION_CATALOG_NAME}")
+      .doc("A catalog implementation that will be used in place of the Spark Catalog for v2 " +
+        "operations (e.g. create table using a v2 source, alter a v2 table). The Spark Catalog " +
+        "is the current catalog by default, and supports all kinds of catalog operations like " +
+        "CREATE TABLE USING v1/v2 source, VIEW/FUNCTION related operations, etc. This config is " +
+        "used to extend the Spark Catalog and inject custom logic to v2 operations, while other " +
+        "operations still go through the Spark Catalog. The catalog implementation specified " +
+        "by this config should extend `CatalogExtension` to be passed the Spark Catalog, " +
+        "so that it can delegate calls to the Spark Catalog. Otherwise, the implementation " +
+        "should figure out a way to access the Spark Catalog or its underlying meta-store " +
+        "by itself. It's important to make the implementation share the underlying meta-store " +
+        "of the Spark Catalog and act as an extension, instead of a separate catalog.")

Review comment:
This is a complicated config and I'm trying my best to explain it. cc some SQL folks to get fresh eyes and see if they can understand it.
@dongjoon-hyun @viirya @maropu @HyukjinKwon