Re: Spark SQL JSON dataset query nested datastructures

Michael Armbrust Sun, 10 Aug 2014 14:52:26 -0700

Sounds like you need to use lateral view with explode
<https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView>,
which is supported in Spark SQL's HiveContext.
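
As a rough illustration of the suggestion above, here is a minimal sketch. The query string is an untested guess at the HiveQL: it assumes the JSON has been registered as a table named product (as in the question), and the aliases p and part are made up for this example. The small plain-Python function only mimics how LATERAL VIEW explode turns one input row into one row per array element. It is not Spark API and does not need Spark to run.

```python
import json

# Hypothetical HiveQL to run through a HiveContext. "product" is the table
# name from the question; "p" and "part" are aliases invented here.
QUERY = """
SELECT id, name, price, part.lock
FROM product
LATERAL VIEW explode(parts) p AS part
WHERE id = 1
"""

# The sample record from the question.
RECORD = json.loads("""
{ "id": 1, "name": "A green door", "price": 12.50, "tags": ["home", "green"],
  "parts": [ { "lock": "One lock", "key": "single key" },
             { "lock": "2 lock", "key": "2 key" } ] }
""")


def lateral_view_explode(rows, array_field, alias):
    """Yield one output row per element of row[array_field].

    Plain-Python imitation of LATERAL VIEW explode: rows whose array is
    missing or empty produce no output rows.
    """
    for row in rows:
        for element in row.get(array_field) or []:
            out = dict(row)       # copy the parent columns
            out[alias] = element  # attach the exploded element
            yield out


def locks_for_id(rows, wanted_id):
    """Return (id, name, price, lock) tuples, one per part, for one id."""
    return [
        (r["id"], r["name"], r["price"], r["part"]["lock"])
        for r in lateral_view_explode(rows, "parts", "part")
        if r["id"] == wanted_id
    ]


if __name__ == "__main__":
    for row in locks_for_id([RECORD], 1):
        print(row)
```

The point of the sketch is the row multiplication: the single product record yields two result rows, one per entry in parts, which is the join-like behaviour asked for in the question.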




On Sat, Aug 9, 2014 at 6:43 PM, Sathish Kumaran Vairavelu <
vsathishkuma...@gmail.com> wrote:

> I have a simple JSON dataset as below. How do I query all parts.lock
> values for id=1?
>
> JSON: { "id": 1, "name": "A green door", "price": 12.50, "tags": ["home",
> "green"], "parts" : [ { "lock" : "One lock", "key" : "single key" }, {
> "lock" : "2 lock", "key" : "2 key" } ] }
>
> Query: select id,name,price,parts.lock from product where id=1
>
> The point is if I use parts[0].lock it will return 1 row as below:
>
> {u'price': 12.5, u'id': 1, u'.lock': {u'lock': u'One lock', u'key':
> u'single key'}, u'name': u'A green door'}
>
> But I want to return all the locks in the parts structure. It will
> return multiple rows, but that's what I am looking for. This is the kind
> of relational join that I want to accomplish.
>
> Please help me with this
>
