I've got some data with the following structure:
{
"SpotPriceHistory": [
{
"ProductDescription": "Linux/UNIX",
"AvailabilityZone": "us-east-1a",
"Timestamp": "2015-01-01T13:26:16.000Z",
"InstanceType": "cc2.8xlarge",
"SpotPrice": "0.256900"
},
{
"ProductDescription": "Linux/UNIX",
"AvailabilityZone": "us-east-1a",
"Timestamp": "2015-01-01T13:01:38.000Z",
"InstanceType": "cc2.8xlarge",
"SpotPrice": "0.256800"
},
...
]
}
In exploring it with Drill, I'm able to get as far as:
> select * from (select kvgen(flatten(SpotPriceHistory)) as t from
dfs.`/tmp/spot_price_history.json`);
Which yields rows that each contain a list of key-value pairs, which seems
like what I want:
+------------+
| t |
+------------+
|
[{"key":"ProductDescription","value":"Linux/UNIX"},{"key":"AvailabilityZone","value":"us-east-1a"},{"key":"Timestamp","value":"2014-12-20T11:48:46.000Z"},{"key":"InstanceType","value":"cc2.8xlarge"},{"key":"SpotPrice","value":"0.256900"}]
|
|
[{"key":"ProductDescription","value":"Linux/UNIX"},{"key":"AvailabilityZone","value":"us-east-1a"},{"key":"Timestamp","value":"2014-12-20T11:40:49.000Z"},{"key":"InstanceType","value":"cc2.8xlarge"},{"key":"SpotPrice","value":"0.257000"}]
|
|
[{"key":"ProductDescription","value":"Linux/UNIX"},{"key":"AvailabilityZone","value":"us-east-1a"},{"key":"Timestamp","value":"2014-12-20T11:36:46.000Z"},{"key":"InstanceType","value":"cc2.8xlarge"},{"key":"SpotPrice","value":"0.257100"}]
|
But I have not figured out how to pull those keys up into column names so
that I can do a query like:
> select SpotPrice,Timestamp from t
I'll appreciate any tips.
Thanks,
--vince