cloud-fan opened a new pull request, #55625:
URL: https://github.com/apache/spark/pull/55625
### What changes were proposed in this pull request?
Standardize the `# Detailed Table Information` / `# Detailed View
Information` block in `DESCRIBE TABLE EXTENDED` output for v2 tables
and views so that it emits three structured rows derived from the
resolved identifier. The existing rows change as follows:
- For tables (`DescribeTableExec`): the single `Name` row that came
from `Table.name()` is replaced by `Catalog`, `Namespace`, `Table`.
- For views (`DescribeV2ViewExec`): the `Catalog` + `Identifier` pair
(where `Identifier` was a single string concatenating namespace and
name with `.`) is replaced by `Catalog`, `Namespace`, `View`.
The catalog name and resolved `Identifier` are threaded from
`ResolvedTable` / `ResolvedPersistentView` through the v2 execs.
`DescribeTablePartitionExec` is updated to pass the catalog name to
the inner `DescribeTableExec` it constructs for the schema/partition
header.
The `Namespace` row uses `Identifier.namespace().quoted`, which is
dot-separated with back-tick quoting only on segments that need it.
This matches the existing Spark convention used elsewhere for
multi-segment namespaces. It keeps the row round-trip-safe for
namespaces with dots in segments while staying readable for the
common single-level case.
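To illustrate why quoting only the segments that need it stays round-trip-safe, here is a minimal Python sketch. It is not Spark's implementation: the helper names (`quote_if_needed`, `quoted_namespace`, `split_namespace`) and the rule for what counts as a "plain" segment are assumptions chosen for illustration.

```python
import re
from typing import List

# Assumption: a segment is "plain" if it is only ASCII letters, digits and
# underscores, and is not made of digits alone. Anything else gets quoted.
_PLAIN = re.compile(r"[A-Za-z0-9_]+")


def quote_if_needed(part: str) -> str:
    """Back-tick quote one segment unless it is a plain identifier."""
    if _PLAIN.fullmatch(part) and not part.isdigit():
        return part
    # A literal back-tick inside a quoted segment is escaped by doubling it.
    return "`" + part.replace("`", "``") + "`"


def quoted_namespace(segments: List[str]) -> str:
    """Join namespace segments with dots, quoting only where required."""
    return ".".join(quote_if_needed(s) for s in segments)


def split_namespace(text: str) -> List[str]:
    """Inverse of quoted_namespace: split on dots outside back-ticks."""
    if text == "":
        return []
    segments, buf, in_quote, i = [], [], False, 0
    while i < len(text):
        ch = text[i]
        if ch == "`":
            if in_quote and text[i + 1 : i + 2] == "`":
                buf.append("`")  # doubled back-tick: escaped literal
                i += 1
            else:
                in_quote = not in_quote  # opening or closing quote
        elif ch == "." and not in_quote:
            segments.append("".join(buf))  # unquoted dot ends a segment
            buf = []
        else:
            buf.append(ch)
        i += 1
    segments.append("".join(buf))
    return segments
```

Under these assumptions, `["ns"]` renders as `ns`, while `["a.b", "c"]` renders with back-ticks around the first segment only, and `split_namespace` recovers the original segments in both cases.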
### Why are the changes needed?
In a multi-catalog deployment, the catalog name is a first-class part
of a v2 table or view identifier. The previous output buried it
inside connector-controlled strings:
- `Table.name()` for tables is connector-defined; some connectors
return `catalog.namespace.name`, others just `namespace.name`,
others use a custom format. The result is that `DESCRIBE TABLE`
output looks different across catalogs even for the same logical
table shape.
- `Identifier` for v2 views collapsed namespace and name into a
single dotted string, so consumers had to parse the dot back out
and could not unambiguously round-trip multi-level namespaces with
dots in segments.
Splitting the components into `Catalog`, `Namespace`, and `Table` /
`View` rows:
- gives `DESCRIBE TABLE EXTENDED` a uniform shape across v2 connectors,
- makes the catalog name explicit and surfaceable when multiple v2
catalogs are configured,
- handles multi-level namespaces naturally via
`Identifier.namespace().quoted`,
- aligns the table and view sections so consumers can read the same
three rows from either, switching only on the section header
(`# Detailed Table Information` vs `# Detailed View Information`),
- is parseable programmatically without splitting strings.
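As a rough sketch of what "parseable without splitting strings" could look like on the consumer side, the Python below reads the three rows from either section. It assumes the DESCRIBE result has already been collected into `(col_name, data_type, comment)` tuples; the function name `read_identifier` and the shape of the returned dict are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# One DESCRIBE output row: (col_name, data_type, comment).
Row = Tuple[str, str, str]

# Section header -> label of the row that carries the object name.
HEADERS = {
    "# Detailed Table Information": "Table",
    "# Detailed View Information": "View",
}


def read_identifier(rows: List[Row]) -> Optional[Dict[str, str]]:
    """Return kind, catalog, namespace and name from the detailed block."""
    kind = None
    found: Dict[str, str] = {}
    for key, value, _ in rows:
        if key in HEADERS:
            kind = HEADERS[key]  # entering the detailed block
            continue
        if kind is None:
            continue  # still in the column listing above the block
        if key.startswith("#"):
            break  # a different section begins
        if key in ("Catalog", "Namespace", kind):
            found[key] = value
    if kind is None or len(found) < 3:
        return None  # block absent or not in the structured layout
    return {
        "kind": kind,
        "catalog": found["Catalog"],
        "namespace": found["Namespace"],
        "name": found[kind],
    }
```

With PySpark, the input could be built with something like `[tuple(r) for r in spark.sql("DESCRIBE TABLE EXTENDED testcat.ns.t").collect()]`. The only thing that differs between tables and views is the section header, which selects whether the `Table` or the `View` row is read.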
### Does this PR introduce any user-facing change?
Yes, slight output change in `DESCRIBE TABLE EXTENDED` for v2 tables
and v2 views.
For v2 tables:
- Before: a single `Name` row (e.g. `Name | testcat.ns.t | `)
- After: three rows (`Catalog | testcat | `, `Namespace | ns | `,
`Table | t | `).
For v2 views:
- Before: `Catalog | testcat | ` followed by `Identifier | ns.v | `
- After: `Catalog | testcat | `, `Namespace | ns | `, `View | v | `.
v1 paths (session-catalog tables and views via HMS) are unchanged.
Tools that read DESCRIBE output should switch from concatenating
`Name` / `Identifier` to reading the three structured rows.
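For tools that have to work against Spark versions both before and after this change, one possible migration shim is sketched below in Python. It prefers the structured rows and falls back to the old `Name` / `Identifier` rows, returning the legacy string unsplit because its format was connector-defined. The function name `describe_identity` and the returned dict layout are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# One DESCRIBE output row: (col_name, data_type, comment).
Row = Tuple[str, str, str]

_SECTION_HEADERS = ("# Detailed Table Information", "# Detailed View Information")


def describe_identity(rows: List[Row]) -> Optional[Dict[str, Optional[str]]]:
    """Prefer the structured rows; fall back to the legacy single-string rows."""
    detail: Dict[str, str] = {}
    in_detail = False
    for key, value, _ in rows:
        if key in _SECTION_HEADERS:
            in_detail = True
        elif in_detail and key.startswith("#"):
            break  # next section, stop collecting
        elif in_detail:
            detail.setdefault(key, value)  # keep the first value per label

    name_key = "Table" if "Table" in detail else "View"
    if all(k in detail for k in ("Catalog", "Namespace", name_key)):
        return {
            "layout": "structured",
            "catalog": detail["Catalog"],
            "namespace": detail["Namespace"],
            "name": detail[name_key],
        }

    # Legacy layout: `Name` for tables, `Catalog` + `Identifier` for views.
    legacy = detail.get("Name", detail.get("Identifier"))
    if legacy is not None:
        # The raw string is deliberately not split on dots.
        return {"layout": "legacy", "catalog": detail.get("Catalog"), "raw": legacy}
    return None
```

Callers can branch on `layout` and drop the legacy branch once they no longer need to support the older output.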
### How was this patch tested?
- Updated the affected golden assertion in `DescribeTableSuite`
(`DESCRIBE TABLE EXTENDED of a partitioned table`) to match the new
row layout.
- Added a focused test
`DESCRIBE TABLE EXTENDED emits structured Catalog/Namespace/Table
rows` in v2 `DescribeTableSuite` that pins the three structured
rows on a freshly created v2 table.
- Removed the now-redundant `DESCRIBE TABLE EXTENDED on a non-view
MetadataTable shows the real identifier` test in
`DataSourceV2MetadataTableSuite` (the structured-row layout is
what's pinned by the new test in `v2.DescribeTableSuite`; the
identifier-passthrough behavior is no longer tied to
`MetadataTable.name()`).
Ran:

    build/sbt 'sql/testOnly \
      org.apache.spark.sql.execution.command.v2.DescribeTableSuite \
      org.apache.spark.sql.connector.DataSourceV2MetadataTableSuite \
      org.apache.spark.sql.connector.DataSourceV2MetadataViewSuite'

### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]