Is this what you're looking for? select a.`t`.`ProductDescription` as `ProductDescription`, a.`t`.`AvailabilityZone` as `AvailabilityZone`, a.`t`.`Timestamp` as `Timestamp`, a.`t`.`InstanceType` as `InstanceType`, a.`t`.`SpotPrice` as `SpotPrice` from (select flatten(SpotPriceHistory) as t from dfs.`/data/tmp_json/1.json`) a; +--------------------+------------------+------------+--------------+------------+ | ProductDescription | AvailabilityZone | Timestamp | InstanceType | SpotPrice | +--------------------+------------------+------------+--------------+------------+ | Linux/UNIX | us-east-1a | 2015-01-01T13:26:16.000Z | cc2.8xlarge | 0.256900 | | Linux/UNIX | us-east-1a | 2015-01-01T13:01:38.000Z | cc2.8xlarge | 0.256800 | +--------------------+------------------+------------+--------------+------------+
On Thu, Jan 1, 2015 at 11:28 AM, Vince Gonzalez <[email protected]> wrote: > Mufy, your suggestion didn't work, but this got me further. Still wondering > if this is the best way. > > 0: jdbc:drill:zk=local> create table spot_price_history as (select > flatten(SpotPriceHistory) as t from dfs.`/tmp/spot_price_history.json`); > +------------+---------------------------+ > | Fragment | Number of records written | > +------------+---------------------------+ > | 0_0 | 638 | > +------------+---------------------------+ > 1 row selected (0.353 seconds) > 0: jdbc:drill:zk=local> select * from spot_price_history limit 5; > +------------+ > | t | > +------------+ > | > > {"ProductDescription":"Linux/UNIX","AvailabilityZone":"us-east-1a","Timestamp":"2015-01-01T13:26:16.000Z","InstanceType":"cc2.8xlarge","SpotPrice":"0.256900"} > | > | > > {"ProductDescription":"Linux/UNIX","AvailabilityZone":"us-east-1a","Timestamp":"2015-01-01T13:01:38.000Z","InstanceType":"cc2.8xlarge","SpotPrice":"0.256800"} > | > | > > {"ProductDescription":"Linux/UNIX","AvailabilityZone":"us-east-1a","Timestamp":"2015-01-01T12:21:21.000Z","InstanceType":"cc2.8xlarge","SpotPrice":"0.256900"} > | > | > > {"ProductDescription":"Linux/UNIX","AvailabilityZone":"us-east-1a","Timestamp":"2015-01-01T12:13:21.000Z","InstanceType":"cc2.8xlarge","SpotPrice":"0.257000"} > | > | > > {"ProductDescription":"Linux/UNIX","AvailabilityZone":"us-east-1a","Timestamp":"2015-01-01T11:57:05.000Z","InstanceType":"cc2.8xlarge","SpotPrice":"0.256900"} > | > +------------+ > 5 rows selected (0.052 seconds) > 0: jdbc:drill:zk=local> select t['SpotPrice'] from spot_price_history limit > 5; > +------------+ > | EXPR$0 | > +------------+ > | 0.256900 | > | 0.256800 | > | 0.256900 | > | 0.257000 | > | 0.256900 | > +------------+ > 5 rows selected (0.072 seconds) > > --- > Vince Gonzalez > Systems Engineer > 212.694.3879 > > mapr.com > > On Thu, Jan 1, 2015 at 10:06 AM, mufy <[email protected]> wrote: > > > Have you tried something like - > > > > select SpotPriceHistory[0].Timestamp from ... > > > > > > --- > > Mufeed Usman > > My LinkedIn <http://www.linkedin.com/pub/mufeed-usman/28/254/400> | My > > Social Cause <http://www.vision2016.org.in/> | My Blogs : LiveJournal > > <http://mufeed.livejournal.com> > > > > > > > > > > On Thu, Jan 1, 2015 at 7:25 PM, Vince Gonzalez <[email protected]> > wrote: > > > > > I've got some data with the following structure: > > > > > > { > > > "SpotPriceHistory": [ > > > { > > > "ProductDescription": "Linux/UNIX", > > > "AvailabilityZone": "us-east-1a", > > > "Timestamp": "2015-01-01T13:26:16.000Z", > > > "InstanceType": "cc2.8xlarge", > > > "SpotPrice": "0.256900" > > > }, > > > { > > > "ProductDescription": "Linux/UNIX", > > > "AvailabilityZone": "us-east-1a", > > > "Timestamp": "2015-01-01T13:01:38.000Z", > > > "InstanceType": "cc2.8xlarge", > > > "SpotPrice": "0.256800" > > > }, > > > ... > > > ] > > > } > > > > > > In exploring it with Drill, I'm able to get as far as: > > > > > > > select * from (select kvgen(flatten(SpotPriceHistory)) as t from > > > dfs.`/tmp/spot_price_history.json`); > > > > > > Which yields rows that each contain a list of key-value pairs, which > > seems > > > like what I want: > > > > > > +------------+ > > > | t | > > > +------------+ > > > | > > > > > > > > > [{"key":"ProductDescription","value":"Linux/UNIX"},{"key":"AvailabilityZone","value":"us-east-1a"},{"key":"Timestamp","value":"2014-12-20T11:48:46.000Z"},{"key":"InstanceType","value":"cc2.8xlarge"},{"key":"SpotPrice","value":"0.256900"}] > > > | > > > | > > > > > > > > > [{"key":"ProductDescription","value":"Linux/UNIX"},{"key":"AvailabilityZone","value":"us-east-1a"},{"key":"Timestamp","value":"2014-12-20T11:40:49.000Z"},{"key":"InstanceType","value":"cc2.8xlarge"},{"key":"SpotPrice","value":"0.257000"}] > > > | > > > | > > > > > > > > > [{"key":"ProductDescription","value":"Linux/UNIX"},{"key":"AvailabilityZone","value":"us-east-1a"},{"key":"Timestamp","value":"2014-12-20T11:36:46.000Z"},{"key":"InstanceType","value":"cc2.8xlarge"},{"key":"SpotPrice","value":"0.257100"}] > > > | > > > > > > But I have not figured out how to pull those keys up into column names > so > > > that I can do a query like: > > > > > > > select SpotPrice,Timestamp from t > > > > > > I'll appreciate any tips. > > > > > > Thanks, > > > > > > --vince > > > > > >
