[
https://issues.apache.org/jira/browse/DRILL-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995645#comment-14995645
]
B Anil Kumar commented on DRILL-3524:
-------------------------------------
As Mongo collections don't have any schema, it doesn't make sense to
implement this feature.
> Drill proper DESCRIBE support for MongoDB
> -----------------------------------------
>
> Key: DRILL-3524
> URL: https://issues.apache.org/jira/browse/DRILL-3524
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata, Storage - MongoDB
> Affects Versions: 1.1.0
> Reporter: Hari Sekhon
> Fix For: Future
>
>
> Request to add full DESCRIBE support for MongoDB collections.
> I understand this may be difficult / sub-optimal due to the flexible schema
> nature of Mongo docs, but if you can tabulate results when reading directly
> from MongoDB, then you have already read the field names, so it's also
> possible to extract all field names to present for the DESCRIBE command,
> albeit with an inefficient scan.
> Currently DESCRIBE returns pseudo / inaccurate / unhelpful metadata:
> {code}
> +--------------+------------+--------------+
> | COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
> +--------------+------------+--------------+
> | *            | ANY        | YES          |
> +--------------+------------+--------------+
> {code}
> Perhaps you could extend DESCRIBE to scan the first few dozen docs by default
> to create a merged schema, and add an optional argument to the DESCRIBE
> command for scanning a user-specified number of docs, or an ALL keyword to
> scan all docs in a collection and get the complete global schema for the
> collection?
> In case of schema evolution it might be an interesting option to additionally
> read the newest and oldest records, maybe the first and last records by ID
> etc.
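
The sampling idea in the request above can be sketched roughly as follows. This is only an illustration, not Drill's implementation or any existing Drill API: `merged_schema` is a hypothetical helper, plain Python dicts stand in for MongoDB documents, Python type names stand in for Drill data types, and `sample_size=None` plays the role of the proposed ALL keyword.

```python
from collections import defaultdict
from itertools import islice


def merged_schema(docs, sample_size=None):
    """Merge field names and value types seen in the first `sample_size` docs.

    Hypothetical helper: `sample_size=None` scans every document (the ALL case).
    """
    types_seen = defaultdict(set)   # field name -> type names observed
    present_in = defaultdict(int)   # field name -> number of docs containing it
    scanned = 0
    for doc in islice(docs, sample_size):
        scanned += 1
        for name, value in doc.items():
            types_seen[name].add(type(value).__name__)
            present_in[name] += 1

    rows = []
    for name in sorted(types_seen):
        kinds = types_seen[name] - {"NoneType"}
        # Exactly one non-null type gives a concrete type; otherwise use ANY.
        data_type = next(iter(kinds)) if len(kinds) == 1 else "ANY"
        # Nullable if the field was missing from a scanned doc or held None.
        nullable = present_in[name] < scanned or "NoneType" in types_seen[name]
        rows.append((name, data_type, "YES" if nullable else "NO"))
    return rows


# Assumed sample data, standing in for documents read from a collection.
sample = [
    {"_id": 1, "name": "a", "qty": 3},
    {"_id": 2, "name": "b", "qty": 2.5},
    {"_id": 3, "name": None, "tags": ["x"]},
]
rows_all = merged_schema(sample)                 # scan everything
rows_two = merged_schema(sample, sample_size=2)  # scan only the first two docs
for row in rows_all:
    print(row)
```

Scanning only the first two docs misses the `tags` field and reports `name` as non-nullable, which is the trade-off between a default sample and a full scan that the request describes.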