Stamatis Zampetakis created HIVE-29357:
------------------------------------------
Summary: Change CBOPlan in EXPLAIN FORMATTED from plain string to JSON object
Key: HIVE-29357
URL: https://issues.apache.org/jira/browse/HIVE-29357
Project: Hive
Issue Type: Improvement
Components: SQL
Affects Versions: 4.2.0
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
Currently, the value of the CBOPlan attribute in the result of EXPLAIN FORMATTED
is a plain string.
{code:sql}
CREATE TABLE person (id INTEGER, country STRING);
EXPLAIN FORMATTED CBO SELECT country FROM person;
{code}
{code:json}
{"CBOPlan":"{\n \"rels\": [\n {\n \"id\": \"0\",\n \"relOp\":
\"org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan\",\n
\"table\": [\n \"default\",\n \"person\"\n ],\n
\"table:alias\": \"person\",\n \"inputs\": [],\n \"rowCount\": 1.0,\n
\"avgRowSize\": 233.0,\n \"rowType\": {\n \"fields\": [\n
{\n \"type\": \"INTEGER\",\n \"nullable\": true,\n
\"name\": \"id\"\n },\n {\n \"type\":
\"VARCHAR\",\n \"nullable\": true,\n \"precision\":
2147483647,\n \"name\": \"country\"\n },\n {\n
\"type\": \"BIGINT\",\n \"nullable\": true,\n
\"name\": \"BLOCK__OFFSET__INSIDE__FILE\"\n },\n {\n
\"type\": \"VARCHAR\",\n \"nullable\": true,\n
\"precision\": 2147483647,\n \"name\": \"INPUT__FILE__NAME\"\n
},\n {\n \"fields\": [\n {\n
\"type\": \"BIGINT\",\n \"nullable\": true,\n
\"name\": \"writeid\"\n },\n {\n
\"type\": \"INTEGER\",\n \"nullable\": true,\n
\"name\": \"bucketid\"\n },\n {\n
\"type\": \"BIGINT\",\n \"nullable\": true,\n
\"name\": \"rowid\"\n }\n ],\n \"nullable\":
true,\n \"name\": \"ROW__ID\"\n },\n {\n
\"type\": \"BOOLEAN\",\n \"nullable\": true,\n \"name\":
\"ROW__IS__DELETED\"\n }\n ],\n \"nullable\": false\n
},\n \"colStats\": [\n {\n \"name\": \"country\",\n
\"ndv\": 1\n },\n {\n \"name\": \"id\",\n
\"ndv\": 1,\n \"minValue\": -2147483648,\n \"maxValue\":
2147483647\n }\n ]\n },\n {\n \"id\": \"1\",\n
\"relOp\":
\"org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject\",\n
\"fields\": [\n \"country\"\n ],\n \"exprs\": [\n {\n
\"input\": 1,\n \"name\": \"$1\"\n }\n ],\n
\"rowCount\": 1.0\n }\n ]\n}"}
{code}
Observe that the value of CBOPlan is in fact a JSON object, so wrapping it in a
string has several drawbacks:
* Bigger size, due to lots of unnecessary whitespace and escaped characters
* Poor readability, since the value cannot be formatted by JSON processors
* Deserialization overhead, since consumers need to read the value of "CBOPlan"
and parse it a second time into a JSON object in order to process it further
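To make the last two points concrete, here is a minimal Python sketch of what a consumer has to do with the current output. The payload is a hand-shortened stand-in for the plan shown above, not real Hive output, and all variable names are illustrative.

```python
import json

# Hand-shortened stand-in for the current EXPLAIN FORMATTED CBO output:
# the value of "CBOPlan" is a string that itself contains pretty-printed JSON.
inner_plan = {"rels": [{"id": "0", "table": ["default", "person"], "inputs": []}]}
current_output = json.dumps({"CBOPlan": json.dumps(inner_plan, indent=2)})

# First parse: yields a dict whose "CBOPlan" value is still a str.
outer = json.loads(current_output)
assert isinstance(outer["CBOPlan"], str)

# Second parse: only now is the plan usable as a JSON object.
plan = json.loads(outer["CBOPlan"])
first_table = plan["rels"][0]["table"]

# Size comparison against embedding the plan directly as an object.
proposed_output = json.dumps({"CBOPlan": inner_plan}, separators=(",", ":"))
size_saving = len(current_output) - len(proposed_output)
```

With the string-wrapped form, every quote and newline of the inner document is escaped, so `size_saving` is positive even for this tiny plan, and the second `json.loads` call is unavoidable.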
The goal is to return the value of CBOPlan as a JSON object:
{code:json}
{"CBOPlan":{"rels":[{"id":"0","relOp":"org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan","table":["default","person"],"table:alias":"person","inputs":[],"rowCount":1,"avgRowSize":233,"rowType":{"fields":[{"type":"INTEGER","nullable":true,"name":"id"},{"type":"VARCHAR","nullable":true,"precision":2147483647,"name":"country"},{"type":"BIGINT","nullable":true,"name":"BLOCK__OFFSET__INSIDE__FILE"},{"type":"VARCHAR","nullable":true,"precision":2147483647,"name":"INPUT__FILE__NAME"},{"fields":[{"type":"BIGINT","nullable":true,"name":"writeid"},{"type":"INTEGER","nullable":true,"name":"bucketid"},{"type":"BIGINT","nullable":true,"name":"rowid"}],"nullable":true,"name":"ROW__ID"},{"type":"BOOLEAN","nullable":true,"name":"ROW__IS__DELETED"}],"nullable":false},"colStats":[{"name":"country","ndv":1},{"name":"id","ndv":1,"minValue":-2147483648,"maxValue":2147483647}]},{"id":"1","relOp":"org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject","fields":["country"],"exprs":[{"input":1,"name":"$1"}],"rowCount":1}]}}
{code}
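Consumers that must work against Hive versions both with and without this change could accept either shape. The following is only a sketch under that assumption: `load_cbo_plan` and `rel_operators` are hypothetical helper names, not part of Hive, and the sample payloads are abbreviated by hand from the examples above.

```python
import json


def load_cbo_plan(explain_output: str) -> dict:
    """Return the CBO plan as a dict, whether CBOPlan is an object or a JSON string."""
    plan = json.loads(explain_output)["CBOPlan"]
    if isinstance(plan, str):
        # Older shape: the plan is JSON wrapped in a string, so parse it again.
        plan = json.loads(plan)
    return plan


def rel_operators(plan: dict) -> list:
    """List the simple class names of the operators in the plan, in order."""
    return [rel["relOp"].rsplit(".", 1)[-1] for rel in plan["rels"]]


# Abbreviated sample plan (hypothetical, trimmed from the output above).
_pkg = "org.apache.hadoop.hive.ql.optimizer.calcite.reloperators."
_plan = {"rels": [{"id": "0", "relOp": _pkg + "HiveTableScan"},
                  {"id": "1", "relOp": _pkg + "HiveProject"}]}

as_string = json.dumps({"CBOPlan": json.dumps(_plan)})  # current shape
as_object = json.dumps({"CBOPlan": _plan})              # proposed shape

ops_from_string = rel_operators(load_cbo_plan(as_string))
ops_from_object = rel_operators(load_cbo_plan(as_object))
```

Both calls yield the same operator list; once the proposed shape is the only one in use, the `isinstance` branch can be dropped.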