Sounds like you need to use lateral view with explode <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView>, which is supported in Spark SQL's HiveContext.
On Sat, Aug 9, 2014 at 6:43 PM, Sathish Kumaran Vairavelu < vsathishkuma...@gmail.com> wrote: > I have a simple JSON dataset as below. How do I query all parts.lock for > the id=1. > > JSON: { "id": 1, "name": "A green door", "price": 12.50, "tags": ["home", > "green"], "parts" : [ { "lock" : "One lock", "key" : "single key" }, { > "lock" : "2 lock", "key" : "2 key" } ] } > > Query: select id,name,price,parts.lockfrom product where id=1 > > The point is if I use parts[0].lock it will return 1 row as below: > > {u'price': 12.5, u'id': 1, u'.lock': {u'lock': u'One lock', u'key': > u'single key'}, u'name': u'A green door'} > > But I want to return all the locks the in the parts structure. It will > return multiple rows but thats the one I am looking for. This kind of a > relational join which I want to accomplish. > > Please help me with this >