They would end up as two different columns. You would have to merge them into a single column using a SQL statement
Select COALESCE(taste , type) as taste_or_type from ... With that said I avoid using JSON these days with Apache Drill (or any SQL engine). Drill appears to sample like 30,0000? Json records to figure out what the schema is. If you have "new columns" after the first 30,000 json records they either don't show up will bomb your query. Today I just load up JSON files into a python program using json.loads() and then convert it into a table format (pandas or list of dictionaries) before writing it out to a tabular parquet file using pyarrow. https://arrow.apache.org/docs/python/parquet.html The only nested "json" format that is bullet proof is nested parquet which includes a full schema definition in the header of the file so you don't have to rely on hit or miss data sampling.. -----Original Message----- From: Salil Sanghvi <[email protected]> Sent: Thursday, February 28, 2019 12:35 AM To: [email protected] Subject: How to write dynamic queries in Apache Drill to query Nested JSON ... External Email: Use caution with links and attachments Hi, We have a requirement to query HBASE table which has single column with nested json: 1. What will be the query to convert the JSON data into table format. 2. Also the query should be generic enough to pick any new columns that may vary between different json blocks. 3. In below example the 1st JSON block has attribute "taste" and second JSON block has attribute "type", but the query should be generic enough to return both type and taste 4. Query should be able to extract any new attribute introduced in future. { "id": "0001", "taste": "donut", "name": "Cake", "ppu": 0.55, "batters": { "batter": [ { "id": "1001", "type": "Regular" },}, { "id": "1004", "type": "Devil's Food" } ] }, "topping": [ { "id": "5001", "type": "None" }, { "id": "5002", "type": "Glazed" } ] } { "id": "0002", "type": "donut", "name": "Cake", "ppu": 0.55, "batters": { "batter": [ { "id": "1001", "type": "Regular" }, { "id": "1004", "type": "Devil's Food" } ] }, "topping": [ { "id": "5001", "type": "None" }, { "id": "5004", "type": "Maple" } ] } This message may contain information that is confidential or privileged. If you are not the intended recipient, please advise the sender immediately and delete this message. See http://www.blackrock.com/corporate/compliance/email-disclaimers for further information. Please refer to http://www.blackrock.com/corporate/compliance/privacy-policy for more information about BlackRock’s Privacy Policy. For a list of BlackRock's office addresses worldwide, see http://www.blackrock.com/corporate/about-us/contacts-locations. © 2019 BlackRock, Inc. All rights reserved.
