[ https://issues.apache.org/jira/browse/DRILL-4710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pritesh Maker updated DRILL-4710:
---------------------------------
    Fix Version/s: Future

> Document Drill's JSON processing rules
> --------------------------------------
>
>                 Key: DRILL-4710
>                 URL: https://issues.apache.org/jira/browse/DRILL-4710
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Documentation
>            Reporter: Paul Rogers
>            Priority: Minor
>             Fix For: Future
>
>
> One of Drill's key benefits is the ability to query JSON-formatted data. Much
> great work has been done. But unless someone happens to be a Drill
> developer, the details of exactly how Drill handles various JSON formats can
> be hard to find.
> We should document how Drill handles various JSON scenarios:
> * SELECT * (schema inferred)
> * SELECT a, b, c (schema implied by query)
> And various JSON structures:
> * Top-level structure (list of maps. Can we handle an array of maps? A list
>   of scalars?)
> * Changes of the top-level map structure across rows.
> ** New field appears later in the file. (Was {a: 1, b: "s"}, now is
>    {a: 1, b: "s", c: 10})
> ** Fields disappear later in the file
> ** Fields change type
> ** Start of file has many nulls for a field, later in file has non-null
>    values.
> * How Drill handles array fields
> ** Array field is null: { a: [10, 20] }, { a: null }
> ** Array contains nulls: { a: [10, null, 20] }
> ** Array contains single scalar type (number or string)
> ** Array contains multiple scalar types (number and string)
> ** Array contains structured types (array, map)
> * How Drill handles nested maps
> ** Explicit select: a, b.c, b.d: {a: 1, b: { c: "s", d: 10 }}
> ** Implicit select: *
> ** How data is delivered to Drill client
> ** How data is delivered to JDBC/ODBC clients
> * Size issues
> ** Very large records (what is max size?)
> ** Very large strings
> ** Very large arrays
> * Naming
> ** Support for case-sensitive names: { a: 1, A: "foo" }
> The above is legal JSON, but causes problems with the case-insensitive naming
> rules of Drill.
> Along with any other detailed information not covered by the above list.



--
This message was sent by Atlassian JIRA (v7.6.3#76005)