Tomer - I wrote a hacky Python script to get it all out and came up with 263 values.
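For anyone following along, a script of this kind might look roughly like the sketch below. It is not the exact script used for the count above. It assumes one JSON record per line and that _id and created_on are the only fixed top-level fields; FIXED_KEYS and dynamic_keys are illustrative names.

```python
import json

# Assumption: these are the only fixed top-level fields in each record.
# Every other top-level key is treated as a dynamic "myvalueX" key.
FIXED_KEYS = {"_id", "created_on"}


def dynamic_keys(lines):
    """Collect the distinct top-level keys that are not in FIXED_KEYS."""
    keys = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip records that are not valid JSON instead of aborting.
            continue
        if isinstance(record, dict):
            keys.update(k for k in record if k not in FIXED_KEYS)
    return keys


# Two records shaped like the ones in this thread; note the second one
# has created_on last, which does not matter when looking keys up by name.
sample = [
    '{ "_id" : "127.0.0.1", "created_on" : "2014-02-18 14:52:23", '
    '"myvalue1" : { "source" : "somestuff" } }',
    '{ "_id" : "127.0.0.2", "myvalue2" : { "source" : "otherstuff" }, '
    '"created_on" : "2014-02-18 14:52:08" }',
]

found = dynamic_keys(sample)
print(len(found), sorted(found))
```

Run against the real export (for example by passing an open file object instead of sample), len(found) would give the number of distinct dynamic keys.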
Chris - This is an interesting problem. If there were some semantics in the JSON plugin to handle this, it would be helpful. The challenge, as I was thinking it through, is that Mongo isn't consistent in how it outputs the data. In the three records I included in the thread the order is _id, created_on, and then the dynamic myvalue key. However, looking through the data, there are other records where created_on is the last field, so even parsing the record as an array and referencing the field by its array position doesn't help. This is a challenge!

On Mon, Sep 21, 2015 at 7:52 PM, Tomer Shiran wrote:

> How many different myvalueX are there in your dataset? (In the example
> below you have 3.) Is it a small, known set, or could it be anything?
>
> > On Sep 21, 2015, at 12:53 PM, John Omernik wrote:
> >
> > Sorry about that, premature send. Here are some records. As you can see,
> > the myvalue1-3 key is in the top level of the record. Ideally I'd run
> > kvgen on the myvalue records, but I have no way to address those. I tried
> > kvgen() on * and that failed. Not sure how to address this in JSON.
> > Yes, I know it's poorly formatted, but it's what I have been given.
> >
> > { "_id" : "127.0.0.1", "created_on" : "2014-02-18 14:52:23", "myvalue1" : { "source" : "somestuff", "context" : "Context here", "last_seen" : "2014-02-11 00:00:00", "refreshed" : "2014-03-12 18:14:23" } }
> > { "_id" : "127.0.0.2", "created_on" : "2014-02-18 14:52:08", "myvalue2" : { "source" : "otherstuff", "context" : "Special context", "last_seen" : "2014-02-26 18:14:05", "refreshed" : "2014-02-26 18:14:05" } }
> > { "_id" : "127.0.0.3", "created_on" : "2014-04-25 00:08:17", "myvalue3" : { "source" : "oops", "context" : "Other Context", "last_seen" : "2014-04-25 05:32:08", "refreshed" : "2014-04-25 05:32:08" } }
> >
> >> On Mon, Sep 21, 2015 at 2:52 PM, John Omernik wrote:
> >>
> >> The challenge I have is that the data was poorly formatted. Here are some records:
> >>
> >>> On Mon, Sep 21, 2015 at 2:14 PM, Jim Scott wrote:
> >>>
> >>> I do believe KVGEN will meet your needs:
> >>> https://drill.apache.org/docs/kvgen/
> >>>
> >>>> On Mon, Sep 21, 2015 at 2:11 PM, John Omernik wrote:
> >>>>
> >>>> I have some poorly developed JSON where the developer used data for
> >>>> key names:
> >>>>
> >>>> {"created":"2015-12-01", "ZYS":"BLAH"}
> >>>> {"created":"2015-12-01", "ZYX":"BLAH"}
> >>>> {"created":"2015-12-01", "ABC":"BLAH"}
> >>>> {"created":"2015-12-01", "ADS":"BLAH"}
> >>>>
> >>>> I'd like to somehow map the key name to a value and give it a generic
> >>>> name:
> >>>>
> >>>> select `created`, somemagic() as value1 from table
> >>>>
> >>>> Not sure how this would work, or if it's possible, or how I'd even
> >>>> reference that, but thought I would ask.
> >>>>
> >>>> John
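
On the field-order point raised at the top of this thread: one possible workaround outside of Drill is to rewrite each record so that the dynamic key name becomes data, looking fields up by name and not by position. The sketch below is only an illustration under stated assumptions: _id and created_on are taken to be the only fixed fields, and the names normalize, "entries", "key", and "value" are hypothetical, with the key/value list shape loosely modeled on what the kvgen documentation describes.

```python
import json

# Assumption: the only fixed top-level fields in each record.
FIXED_KEYS = ("_id", "created_on")


def normalize(record):
    """Return a copy of record with dynamic keys moved into a key/value list.

    Fields are selected by name, so it does not matter whether created_on
    appears before or after the dynamic key in the source record.
    """
    out = {k: record.get(k) for k in FIXED_KEYS}
    out["entries"] = [
        {"key": k, "value": v}
        for k, v in record.items()
        if k not in FIXED_KEYS
    ]
    return out


# A record shaped like the ones in this thread, with created_on last.
raw = (
    '{ "_id" : "127.0.0.2", "myvalue2" : { "source" : "otherstuff" }, '
    '"created_on" : "2014-02-18 14:52:08" }'
)
fixed = normalize(json.loads(raw))

# Emit one JSON object per line, the same layout as the input.
print(json.dumps(fixed))
```

After this rewrite every record has the same three top-level names, so a query can address the "entries" column directly instead of needing to know each myvalueX name in advance.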
