Hari Sekhon created DRILL-3526:
----------------------------------
Summary: Drill proper DESCRIBE support for JSON
Key: DRILL-3526
URL: https://issues.apache.org/jira/browse/DRILL-3526
Project: Apache Drill
Issue Type: Bug
Components: Metadata, Storage - JSON
Affects Versions: 1.1.0
Reporter: Hari Sekhon
Assignee: Steven Phillips
Request to add full DESCRIBE support for JSON files.
Currently the DESCRIBE command prints a blank table instead of the schema,
which is unhelpful, so I run a SELECT * ... LIMIT 1 as a workaround.
Since describing large amounts of JSON data could be inefficient, I propose
the following solution:
Read JSON records until a threshold of a few thousand records, or a few tens
of thousands of fields, has been read without discovering any new fields, and
then assume that the fields seen so far are the schema.
Extend the DESCRIBE command with a user-configurable number of records /
fields to read (or rather, the number of records / fields to read without any
new fields being discovered) in order to present a merged schema for the data
source, as well as an ALL keyword to scan all JSON files and records to create
a true global schema.
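To make the proposal concrete, here is a minimal Python sketch of the sampling heuristic. It is an illustration only, not Drill code: the names infer_schema, iter_field_paths, max_stale_records, max_stale_fields and scan_all are hypothetical, the input is assumed to be newline-delimited JSON, and a new type for an already-known field is treated as a "new field" (an assumption beyond what is described above).

```python
import json
from typing import Dict, Iterable, Iterator, Set, Tuple


def iter_field_paths(value, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (dotted_path, type_name) for each leaf of a parsed JSON value.

    Nested objects are flattened into dotted paths; arrays are treated as
    leaves of type "list" to keep the sketch short.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from iter_field_paths(child, path)
    else:
        yield prefix, type(value).__name__


def infer_schema(
    lines: Iterable[str],
    max_stale_records: int = 5000,   # hypothetical default: "a few thousand records"
    max_stale_fields: int = 50000,   # hypothetical default: "tens of thousands of fields"
    scan_all: bool = False,          # stands in for the proposed ALL keyword
) -> Dict[str, Set[str]]:
    """Merge field paths from JSON records until nothing new has been seen
    for max_stale_records records or max_stale_fields fields."""
    schema: Dict[str, Set[str]] = {}
    stale_records = 0
    stale_fields = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        found_new = False
        fields_in_record = 0
        for path, type_name in iter_field_paths(record):
            fields_in_record += 1
            types = schema.setdefault(path, set())
            if type_name not in types:
                types.add(type_name)
                found_new = True
        if found_new:
            # Any discovery resets both counters.
            stale_records = 0
            stale_fields = 0
        else:
            stale_records += 1
            stale_fields += fields_in_record
        if not scan_all and (
            stale_records >= max_stale_records or stale_fields >= max_stale_fields
        ):
            break  # assume the schema seen so far is complete
    return schema


sample = ['{"a": 1, "b": {"c": "x"}}', '{"a": 2}', '{"a": 3}', '{"d": true}']
sampled_schema = infer_schema(sample, max_stale_records=2)
full_schema = infer_schema(sample, scan_all=True)
```

With the threshold set to 2, the sampled run stops after the third record and never sees field "d", while the scan_all run finds it. This shows the trade-off between the sampled schema and the true global schema that ALL would produce.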