[
https://issues.apache.org/jira/browse/SPARK-30214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kent Yao updated SPARK-30214:
-----------------------------
Description:
Currently, we have a v2 adapter for the v1 catalog (V2SessionCatalog), so all the
table/namespace commands can be implemented via v2 APIs.
Usually, a command needs to know which catalog it operates on, but
different commands have different requirements about what to resolve. A few
examples:
- DROP NAMESPACE: only needs to know the name of the namespace.
- DESC NAMESPACE: needs to look up the namespace and get its metadata, but this
is done during execution.
- DROP TABLE: needs to do a lookup and make sure it's a table, not a (temp) view.
- DESC TABLE: needs to look up the table and get its metadata.
For namespaces, the analyzer only needs to find the catalog and the namespace
name. The command can do the lookup during execution if needed.
For tables, most commands need the analyzer to do the lookup.
Note that tables and namespaces differ: DESC NAMESPACE testcat works
and describes the root namespace under testcat, while DESC TABLE testcat fails
if there is no table testcat under the current catalog. This is because a
namespace can be named [], but a table can't. Each command should explicitly
specify whether it operates on a namespace or a table.
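The namespace/table difference above can be illustrated with a toy model. This is only a sketch, not Spark code: the catalog names, the stored tables, and the helpers split_catalog, resolve_namespace and resolve_table are hypothetical, and a single-part table name is simply looked up with an empty namespace in the current catalog.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy state: registered catalogs, the current catalog, and the
# tables each catalog holds, keyed by (namespace, table name).
CATALOGS: Set[str] = {"spark_catalog", "testcat"}
CURRENT_CATALOG = "spark_catalog"
TABLES: Dict[str, Set[Tuple[Tuple[str, ...], str]]] = {
    "spark_catalog": {(("default",), "t")},
    "testcat": {(("ns1",), "tbl")},
}


def split_catalog(parts: List[str]) -> Tuple[str, List[str]]:
    """Peel off a leading catalog name if present, else use the current catalog."""
    if parts and parts[0] in CATALOGS:
        return parts[0], parts[1:]
    return CURRENT_CATALOG, parts


def resolve_namespace(parts: List[str]) -> Tuple[str, List[str]]:
    """A namespace may be empty: [] denotes the root namespace of the catalog."""
    return split_catalog(parts)


def resolve_table(parts: List[str]) -> Tuple[str, List[str], str]:
    """A table needs a non-empty name, so a lone catalog name cannot be a table."""
    if not parts:
        raise ValueError("A table identifier needs at least one part")
    catalog, rest = split_catalog(parts)
    if not rest:
        # Nothing is left for the table name, so treat the single part as a
        # table name in the current catalog instead.
        catalog, rest = CURRENT_CATALOG, parts
    namespace, name = rest[:-1], rest[-1]
    if (tuple(namespace), name) not in TABLES[catalog]:
        raise LookupError(f"Table not found: {'.'.join(parts)}")
    return catalog, namespace, name
```

In this model resolve_namespace(["testcat"]) yields the root namespace of testcat, while resolve_table(["testcat"]) raises because the current catalog has no table with that name.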
In this pull request, we introduce a new framework to resolve v2 commands:
1. The parser creates logical plans or commands with
UnresolvedNamespace/UnresolvedTable/UnresolvedView/UnresolvedRelation. (CREATE
TABLE still keeps Seq[String], as it doesn't need to look up relations.)
2. The analyzer converts
- UnresolvedNamespace to ResolvedNamespace (contains catalog and namespace
identifier)
- UnresolvedTable to ResolvedTable (contains catalog, identifier and Table)
- UnresolvedView to ResolvedView (will be added later when we migrate view
commands)
- UnresolvedRelation to relation.
3. An extra analyzer rule matches commands with V1Table and converts them to the
corresponding v1 commands. This will be added later when we migrate existing
commands.
4. The planner matches commands and converts them to the corresponding physical nodes.
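The parser/analyzer/planner flow above can be sketched as a toy model. This is not Spark's implementation: ToyCatalog, analyze and plan are hypothetical names, the classes only borrow the node names from the description, the "physical node" is a plain callable, and the V1Table fallback rule is left out.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class ToyCatalog:
    """Stand-in for a v2 catalog; stores comments instead of real metadata."""
    name: str
    tables: Dict[Tuple[str, ...], dict] = field(default_factory=dict)
    namespace_comments: Dict[Tuple[str, ...], str] = field(default_factory=dict)


# Parser output: only the multi-part name is recorded.
@dataclass
class UnresolvedNamespace:
    parts: List[str]


@dataclass
class UnresolvedTable:
    parts: List[str]


# Analyzer output: the catalog is attached (plus the table object for tables).
@dataclass
class ResolvedNamespace:
    catalog: ToyCatalog
    namespace: List[str]


@dataclass
class ResolvedTable:
    catalog: ToyCatalog
    identifier: List[str]
    table: dict


Child = Union[UnresolvedNamespace, UnresolvedTable, ResolvedNamespace, ResolvedTable]


@dataclass
class CommentOnNamespace:
    child: Child
    comment: str


@dataclass
class CommentOnTable:
    child: Child
    comment: str


def analyze(cmd, catalogs: Dict[str, ToyCatalog], current: str):
    """Resolve the command's child; tables are looked up, namespaces are not."""
    child = cmd.child
    if not isinstance(child, (UnresolvedNamespace, UnresolvedTable)):
        return cmd  # already resolved
    parts = child.parts
    if parts and parts[0] in catalogs:
        catalog, rest = catalogs[parts[0]], parts[1:]
    else:
        catalog, rest = catalogs[current], parts
    if isinstance(child, UnresolvedNamespace):
        return replace(cmd, child=ResolvedNamespace(catalog, rest))
    if tuple(rest) not in catalog.tables:
        raise LookupError(f"Table not found: {'.'.join(parts)}")
    return replace(cmd, child=ResolvedTable(catalog, rest, catalog.tables[tuple(rest)]))


def plan(cmd) -> Callable[[], None]:
    """Map a resolved command to a zero-argument callable, the toy physical node."""
    child = cmd.child
    if isinstance(cmd, CommentOnNamespace) and isinstance(child, ResolvedNamespace):
        def run_ns() -> None:
            child.catalog.namespace_comments[tuple(child.namespace)] = cmd.comment
        return run_ns
    if isinstance(cmd, CommentOnTable) and isinstance(child, ResolvedTable):
        def run_tbl() -> None:
            child.table["comment"] = cmd.comment
        return run_tbl
    raise ValueError(f"Cannot plan unresolved command: {cmd!r}")
```

The point of the split is that each command declares, through its child node type, whether it wants a namespace or a table, so the analyzer can apply the right resolution rule without knowing the command itself.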
We also introduce brand-new v2 commands, the COMMENT ON syntaxes, to illustrate
how to work with the newly added framework.
{code:java}
COMMENT ON (DATABASE|SCHEMA|NAMESPACE) ... IS ...
COMMENT ON TABLE ... IS ...
{code}
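As a rough illustration of the parser step for these two statements, the toy function below turns a COMMENT ON string into an unresolved node. It is a sketch under assumptions, not Spark's grammar: UnresolvedComment and parse_comment_on are hypothetical names, names are plain dot-separated words, and only a single-quoted string literal is accepted after IS.

```python
import re
from typing import List, NamedTuple


class UnresolvedComment(NamedTuple):
    kind: str              # "NAMESPACE" or "TABLE"
    name_parts: List[str]  # multi-part name, still unresolved
    comment: str


# DATABASE, SCHEMA and NAMESPACE are treated as synonyms, as in the syntax above.
_COMMENT_ON = re.compile(
    r"^\s*COMMENT\s+ON\s+(DATABASE|SCHEMA|NAMESPACE|TABLE)\s+([\w.]+)"
    r"\s+IS\s+'([^']*)'\s*;?\s*$",
    re.IGNORECASE,
)


def parse_comment_on(sql: str) -> UnresolvedComment:
    """Parse one COMMENT ON statement without looking anything up."""
    match = _COMMENT_ON.match(sql)
    if match is None:
        raise ValueError(f"Not a supported COMMENT ON statement: {sql!r}")
    keyword, name, comment = match.group(1).upper(), match.group(2), match.group(3)
    kind = "TABLE" if keyword == "TABLE" else "NAMESPACE"
    return UnresolvedComment(kind, name.split("."), comment)
```

For example, parse_comment_on("COMMENT ON SCHEMA testcat.ns1 IS 'sales data'") produces a NAMESPACE node for the name parts testcat and ns1; deciding which part is the catalog is left to the analyzer.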
Details about the comment syntaxes:
Under the new catalog v2 design, some properties become reserved, e.g.
location and comment. We are going to disallow setting reserved properties
directly through dbproperties or tblproperties, to avoid conflicts with their
related sub-clauses or dedicated commands.
The COMMENT ON syntaxes follow the best practices of PostgreSQL and Presto:
https://www.postgresql.org/docs/12/sql-comment.html
https://prestosql.io/docs/current/sql/comment.html
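The reserved-property rule described above could be enforced with a check along these lines. This is a minimal sketch: check_properties is a hypothetical helper, the reserved set contains only the two keys named in the description, and case-insensitive matching is an assumption.

```python
from typing import Dict

# Assumption: only the two reserved keys mentioned in the description.
RESERVED_PROPERTIES = frozenset({"location", "comment"})


def check_properties(properties: Dict[str, str]) -> Dict[str, str]:
    """Reject property maps that try to set a reserved key directly."""
    reserved = sorted(key for key in properties if key.lower() in RESERVED_PROPERTIES)
    if reserved:
        raise ValueError(
            "Reserved properties must be set via their own clause or command: "
            + ", ".join(reserved)
        )
    return properties
```

With such a check, a comment can only be changed through the dedicated clause or through COMMENT ON, so the two paths cannot disagree.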
was:
Currently, we have a v2 adapter for the v1 catalog (V2SessionCatalog), so all the
table/namespace commands can be implemented via v2 APIs.
Usually, a command needs to know which catalog it operates on, but
different commands have different requirements about what to resolve. A few
examples:
- DROP NAMESPACE: only needs to know the name of the namespace.
- DESC NAMESPACE: needs to look up the namespace and get its metadata, but this
is done during execution.
- DROP TABLE: needs to do a lookup and make sure it's a table, not a (temp) view.
- DESC TABLE: needs to look up the table and get its metadata.
For namespaces, the analyzer only needs to find the catalog and the namespace
name. The command can do the lookup during execution if needed.
For tables, most commands need the analyzer to do the lookup.
Note that tables and namespaces differ: DESC NAMESPACE testcat works
and describes the root namespace under testcat, while DESC TABLE testcat fails
if there is no table testcat under the current catalog. This is because a
namespace can be named [], but a table can't. Each command should explicitly
specify whether it operates on a namespace or a table.
In this pull request, we introduce a new framework to resolve v2 commands:
1. The parser creates logical plans or commands with
UnresolvedNamespace/UnresolvedTable/UnresolvedView/UnresolvedRelation. (CREATE
TABLE still keeps Seq[String], as it doesn't need to look up relations.)
2. The analyzer converts
- 2.1 UnresolvedNamespace to ResolvedNamespace (contains catalog and namespace
identifier)
- 2.2 UnresolvedTable to ResolvedTable (contains catalog, identifier and Table)
- 2.3 UnresolvedView to ResolvedView (will be added later when we migrate view
commands)
- 2.4 UnresolvedRelation to relation.
3. An extra analyzer rule matches commands with V1Table and converts them to the
corresponding v1 commands. This will be added later when we migrate existing
commands.
4. The planner matches commands and converts them to the corresponding physical nodes.
We also introduce brand-new v2 commands, the COMMENT ON syntaxes, to illustrate
how to work with the newly added framework.
{code:java}
COMMENT ON (DATABASE|SCHEMA|NAMESPACE) ... IS ...
COMMENT ON TABLE ... IS ...
{code}
Details about the comment syntaxes:
Under the new catalog v2 design, some properties become reserved, e.g.
location and comment. We are going to disallow setting reserved properties
directly through dbproperties or tblproperties, to avoid conflicts with their
related sub-clauses or dedicated commands.
The COMMENT ON syntaxes follow the best practices of PostgreSQL and Presto:
https://www.postgresql.org/docs/12/sql-comment.html
https://prestosql.io/docs/current/sql/comment.html
> A new framework to resolve v2 commands with a case of COMMENT ON syntax
> implementation
> --------------------------------------------------------------------------------------
>
> Key: SPARK-30214
> URL: https://issues.apache.org/jira/browse/SPARK-30214
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Kent Yao
> Priority: Major