[PR] [SPARK-XXXXX][SQL] Use structured Catalog/Namespace/Table rows in DESCRIBE TABLE EXTENDED for v2 tables and views [spark]

via GitHub Thu, 30 Apr 2026 07:08:02 -0700


cloud-fan opened a new pull request, #55625:
URL: https://github.com/apache/spark/pull/55625


   ### What changes were proposed in this pull request?
   
   Standardize the `# Detailed Table Information` / `# Detailed View
   Information` block in `DESCRIBE TABLE EXTENDED` output for v2 tables
   and views so that it emits three structured rows derived from the
   resolved identifier:
   
   - For tables (`DescribeTableExec`): the single `Name` row that came
     from `Table.name()` is replaced by `Catalog`, `Namespace`, `Table`.
   - For views (`DescribeV2ViewExec`): the `Catalog` + `Identifier` pair
     (where `Identifier` was a single string concatenating namespace and
     name with `.`) is replaced by `Catalog`, `Namespace`, `View`.
   
   The catalog name and resolved `Identifier` are threaded from
   `ResolvedTable` / `ResolvedPersistentView` through the v2 execs.
   `DescribeTablePartitionExec` is updated to pass the catalog name to
   the inner `DescribeTableExec` it constructs for the schema/partition
   header.
   
   The `Namespace` row uses `Identifier.namespace().quoted`: segments
   are dot-separated, with back-tick quoting applied only to segments
   that need it. This matches the existing Spark convention for
   multi-segment namespaces, keeps the row round-trip-safe for
   namespaces whose segments contain dots, and stays readable for the
   common single-level case.
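
To make the quoting rule concrete, here is a minimal Python sketch of the convention described above. It is an illustration, not Spark's implementation: the function names are hypothetical, and the "needs quoting" test (anything other than letters, digits, and underscores, or a purely numeric segment) is an assumption about how the convention behaves.

```python
import re

# Assumption: a segment is safe unquoted when it consists only of letters,
# digits, and underscores and is not purely numeric.
_SAFE_SEGMENT = re.compile(r"[A-Za-z0-9_]+")


def quote_if_needed(segment: str) -> str:
    """Back-tick quote a single namespace segment only when it needs it."""
    if _SAFE_SEGMENT.fullmatch(segment) and not segment.isdigit():
        return segment
    # Embedded back-ticks are escaped by doubling them.
    return "`" + segment.replace("`", "``") + "`"


def quoted_namespace(segments: list[str]) -> str:
    """Join segments with dots, quoting each segment independently."""
    return ".".join(quote_if_needed(s) for s in segments)


print(quoted_namespace(["ns"]))         # ns
print(quoted_namespace(["db", "a.b"]))  # db.`a.b`
```

Because each segment is quoted on its own, a dot inside a segment stays distinguishable from the separator between segments.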
   
   ### Why are the changes needed?
   
   In a multi-catalog deployment, the catalog name is a first-class part
   of a v2 table or view identifier. The previous output buried it
   inside connector-controlled strings:
   
   - `Table.name()` for tables is connector-defined; some connectors
     return `catalog.namespace.name`, others just `namespace.name`,
     others use a custom format. The result is that `DESCRIBE TABLE`
     output looks different across catalogs even for the same logical
     table shape.
   - `Identifier` for v2 views collapsed namespace and name into a
     single dotted string, so consumers had to parse the dot back out
     and could not unambiguously round-trip multi-level namespaces with
     dots in segments.
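
The ambiguity in the second bullet can be shown with a small Python sketch. The helper `legacy_identifier` is hypothetical and only mimics the plain dot-join described above; it is not code from Spark.

```python
def legacy_identifier(namespace: list[str], name: str) -> str:
    """Join namespace segments and name with '.', as the old row is described."""
    return ".".join([*namespace, name])


# A one-level namespace whose segment contains a dot ...
one_level = legacy_identifier(["a.b"], "v")
# ... and a two-level namespace produce the same string.
two_level = legacy_identifier(["a", "b"], "v")

print(one_level, two_level, one_level == two_level)  # a.b.v a.b.v True
```

A consumer that sees only `a.b.v` has no way to tell which of the two namespaces it came from.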
   
   Splitting the components into `Catalog`, `Namespace`, and `Table` /
   `View` rows:
   - gives `DESCRIBE TABLE EXTENDED` a uniform shape across v2 connectors,
   - makes the catalog name explicit and surfaceable when multiple v2
     catalogs are configured,
   - handles multi-level namespaces naturally via
     `Identifier.namespace().quoted`,
   - aligns the table and view sections so consumers can read the same
     three rows from either, switching only on the section header
     (`# Detailed Table Information` vs `# Detailed View Information`),
   - is parseable programmatically without splitting strings.
   
   ### Does this PR introduce any user-facing change?
   
   Yes. The `DESCRIBE TABLE EXTENDED` output changes slightly for v2
   tables and v2 views.
   
   For v2 tables:
   - Before: a single `Name` row (e.g. `Name | testcat.ns.t | `)
   - After: three rows (`Catalog | testcat | `, `Namespace | ns | `,
     `Table | t | `).
   
   For v2 views:
   - Before: `Catalog | testcat | ` followed by `Identifier | ns.v | `
   - After: `Catalog | testcat | `, `Namespace | ns | `, `View | v | `.
   
   v1 paths (session-catalog tables and views via HMS) are unchanged.
   Tools that read `DESCRIBE TABLE EXTENDED` output should switch from
   parsing the `Name` / `Identifier` strings to reading the three
   structured rows.
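
For tool authors, a consumer of the new layout might look like the following Python sketch. It assumes the output has already been collected as `(col_name, data_type, comment)` tuples, which is the three-column shape shown in the examples above; the function name `read_identifier` and the returned dictionary keys are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

Row = Tuple[str, str, str]  # (col_name, data_type, comment)

# Section header -> name of the row that carries the object name.
HEADERS = {
    "# Detailed Table Information": "Table",
    "# Detailed View Information": "View",
}


def read_identifier(rows: Iterable[Row]) -> Optional[Dict[str, str]]:
    """Return catalog, namespace, and name from the detailed-information block."""
    name_key = None
    found: Dict[str, str] = {}
    for col_name, value, _comment in rows:
        if name_key is None:
            # Skip schema rows until a section header is seen.
            name_key = HEADERS.get(col_name)
            continue
        if col_name in ("Catalog", "Namespace", name_key):
            found.setdefault(col_name, value)  # keep the first occurrence
        if len(found) == 3:
            break
    if name_key is None or len(found) < 3:
        return None
    return {
        "kind": name_key,
        "catalog": found["Catalog"],
        "namespace": found["Namespace"],
        "name": found[name_key],
    }


# Hypothetical sample rows mirroring the "After" layout for a v2 table.
sample = [
    ("id", "bigint", ""),
    ("", "", ""),
    ("# Detailed Table Information", "", ""),
    ("Catalog", "testcat", ""),
    ("Namespace", "ns", ""),
    ("Table", "t", ""),
]
print(read_identifier(sample))
```

The same function handles views, since only the section header and the name of the third row differ. Rows that appear before the header are ignored, so a column that happens to be called `Catalog` is not mistaken for the structured row.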
   
   ### How was this patch tested?
   
   - Updated the affected golden assertion in `DescribeTableSuite`
     (`DESCRIBE TABLE EXTENDED of a partitioned table`) to match the new
     row layout.
   - Added a focused test,
     `DESCRIBE TABLE EXTENDED emits structured Catalog/Namespace/Table
     rows`, in `v2.DescribeTableSuite` that pins the three structured
     rows on a freshly created v2 table.
   - Removed the now-redundant `DESCRIBE TABLE EXTENDED on a non-view
     MetadataTable shows the real identifier` test in
     `DataSourceV2MetadataTableSuite` (the structured-row layout is
     what's pinned by the new test in `v2.DescribeTableSuite`; the
     identifier-passthrough behavior is no longer tied to
     `MetadataTable.name()`).
   
   Ran:
   
     build/sbt 'sql/testOnly \
       org.apache.spark.sql.execution.command.v2.DescribeTableSuite \
       org.apache.spark.sql.connector.DataSourceV2MetadataTableSuite \
       org.apache.spark.sql.connector.DataSourceV2MetadataViewSuite'
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Opus 4.7


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



[PR] [SPARK-XXXXX][SQL] Use structured Catalog/Namespace/Table rows in DESCRIBE TABLE EXTENDED for v2 tables and views [spark]
