Amanda Liu created SPARK-50541:
----------------------------------
Summary: Describe Table As JSON
Key: SPARK-50541
URL: https://issues.apache.org/jira/browse/SPARK-50541
Project: Spark
Issue Type: Task
Components: SQL
Affects Versions: 4.0.0
Reporter: Amanda Liu
Support DESCRIBE TABLE ... [AS JSON] option to display table metadata in JSON
format.
*Context:*
The Spark SQL command DESCRIBE TABLE returns table metadata as a DataFrame
formatted for human readers. That output is hard to parse reliably, e.g. when
fields contain special characters or when the layout changes as new features
are added. [DBT|https://www.getdbt.com/] is an example customer that motivates
this proposal: a structured JSON format can help prevent breakages in
pipelines that depend on parsing table metadata.
The new AS JSON option would return the table metadata as a JSON string that
is machine-parseable and extensible with minimal risk of breaking changes. It
is not meant to be human-readable.
*SQL Ref Spec:*
{ DESC | DESCRIBE } [ TABLE ] [ EXTENDED | FORMATTED ] table_name { [
PARTITION clause ] | [ column_name ] } [ AS JSON ]
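To make the grammar concrete, the following Python sketch assembles a statement string from its optional parts. The helper name `build_describe` and its parameters are hypothetical and not part of Spark. It mirrors only the grammar as written in this proposal, does not check what a given Spark build accepts, and assumes string-valued partitions with no embedded quotes.

```python
from typing import Mapping, Optional


def build_describe(
    table_name: str,
    extended: bool = False,
    partition: Optional[Mapping[str, str]] = None,
    column_name: Optional[str] = None,
    as_json: bool = False,
) -> str:
    """Assemble a DESCRIBE TABLE statement following the proposed grammar.

    Hypothetical helper for illustration only; it does no identifier or
    literal escaping.
    """
    # The grammar allows a PARTITION clause or a column name, not both.
    if partition and column_name:
        raise ValueError("specify either a partition or a column, not both")

    parts = ["DESCRIBE", "TABLE"]
    if extended:
        parts.append("EXTENDED")
    parts.append(table_name)
    if partition:
        spec = ", ".join(f"{col} = '{val}'" for col, val in partition.items())
        parts.append(f"PARTITION ({spec})")
    elif column_name:
        parts.append(column_name)
    if as_json:
        parts.append("AS JSON")
    return " ".join(parts)


print(build_describe("sales.orders", extended=True, as_json=True))
# DESCRIBE TABLE EXTENDED sales.orders AS JSON
```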
*JSON Schema:*
```
{
"table_name": "<table_name>",
"catalog_name": [...],
"database_name": [...],
"qualified_name": "<qualified_name>",
"type": "<table_type>",
"provider": "<provider>",
"columns": [
{
"id": 1,
"name": "<name>",
"type": <type_json>,
"comment": "<comment>",
"default": "<default_val>"
}
],
"partition_values": {
"<col_name>": "<val>"
},
"location": "<path>",
"view_definition": "<view_defn>",
"owner": "<owner>",
"comment": "<comment>",
"table_properties": {
"property1": "<property1>",
"property2": "<property2>"
},
"storage_properties": {
"property1": "<property1>",
"property2": "<property2>"
},
"serde_library": "<serde_library>",
"inputformat": "<input_format>",
"outputformat": "<output_format>",
"bucket_columns": [<col_name>],
"sort_columns": [<col_name>],
"created_time": "<timestamp>",
"last_access": "<timestamp>",
"partition_provider": "<partition_provider>"
}
```
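A consumer could read this output with any JSON parser instead of scraping the tabular text. The sketch below uses Python's standard `json` module on a hand-written sample payload shaped like the schema above. The sample values and the helper `column_names` are invented for illustration, and the object form shown for each column's `type` is an assumption, since the proposal leaves `<type_json>` unspecified.

```python
import json
from typing import List

# Hand-written sample shaped like the proposed schema; all values are invented.
SAMPLE = """
{
  "table_name": "orders",
  "qualified_name": "spark_catalog.sales.orders",
  "type": "MANAGED",
  "provider": "parquet",
  "columns": [
    {"id": 1, "name": "order_id", "type": {"name": "bigint"}, "comment": "primary key"},
    {"id": 2, "name": "note; with, odd | chars", "type": {"name": "string"}}
  ],
  "table_properties": {"owner_team": "analytics"}
}
"""


def column_names(describe_json: str) -> List[str]:
    """Return column names in declared order from a DESCRIBE ... AS JSON string."""
    metadata = json.loads(describe_json)
    # Fields may be absent, so fall back to an empty list rather than raising.
    return [col["name"] for col in metadata.get("columns", [])]


print(column_names(SAMPLE))
# ['order_id', 'note; with, odd | chars']
```

Column names containing delimiters come through intact because the JSON parser handles quoting, which is the kind of special-character case the context section describes as hard to parse from the tabular output.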