[ https://issues.apache.org/jira/browse/DRILL-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hari Sekhon updated DRILL-3524:
-------------------------------
Component/s: Storage - MongoDB
> Drill proper DESCRIBE support for MongoDB
> -----------------------------------------
>
> Key: DRILL-3524
> URL: https://issues.apache.org/jira/browse/DRILL-3524
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata, Storage - MongoDB
> Affects Versions: 1.1.0
> Reporter: Hari Sekhon
> Assignee: Steven Phillips
>
> Request to add full DESCRIBE support for MongoDB collections.
> I understand this may be difficult or sub-optimal given the flexible schema
> of Mongo documents. However, Drill already reads the field names in order to
> tabulate query results from MongoDB, so it should also be possible to extract
> all field names for the DESCRIBE command, albeit via an inefficient scan.
> Currently DESCRIBE returns only placeholder metadata, which is inaccurate and
> unhelpful:
> {code}
> +--------------+------------+--------------+
> | COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
> +--------------+------------+--------------+
> | *            | ANY        | YES          |
> +--------------+------------+--------------+
> {code}
> Perhaps DESCRIBE could scan the first few dozen docs by default to build a
> merged schema. An optional argument could set a user-specified number of docs
> to scan, and an ALL keyword could scan every doc in the collection to return
> the complete schema for the collection.
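
The merged-schema idea requested above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Drill's implementation: `merge_schema` and its `limit` parameter are hypothetical names, plain Python dicts stand in for MongoDB documents, the default of 36 is one reading of "a few dozen", and the type names are Python's rather than Drill's SQL types.

```python
from itertools import islice


def merge_schema(docs, limit=36):
    """Merge the fields seen in the first `limit` docs (None scans all docs).

    Hypothetical helper: returns (COLUMN_NAME, DATA_TYPE, IS_NULLABLE) rows.
    """
    sampled = docs if limit is None else islice(docs, limit)
    types = {}  # field name -> set of Python type names observed
    seen = {}   # field name -> number of sampled docs containing the field
    total = 0
    for doc in sampled:
        total += 1
        for name, value in doc.items():
            types.setdefault(name, set()).add(type(value).__name__)
            seen[name] = seen.get(name, 0) + 1

    rows = []
    for name in sorted(types):
        observed = types[name] - {"NoneType"}
        # Fall back to ANY when the sampled docs disagree on the type.
        data_type = observed.pop() if len(observed) == 1 else "ANY"
        # A field is nullable if it was missing from a doc or held a null.
        nullable = seen[name] < total or "NoneType" in types[name]
        rows.append((name, data_type, "YES" if nullable else "NO"))
    return rows


# Assumed sample data, for illustration only.
docs = [
    {"_id": 1, "name": "a", "age": 30},
    {"_id": 2, "name": "b"},
    {"_id": 3, "name": None, "tags": ["x"]},
]
for row in merge_schema(docs):
    print(row)
```

Passing `limit=None` corresponds to the proposed ALL keyword, while a small `limit` trades completeness for speed: fields that first appear after the sampled docs are simply not reported.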