This reminds me of
https://github.com/databricks/spark-xml/issues/141#issuecomment-234835577

Maybe using explode() would be helpful.
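
To make the idea concrete, here is a minimal plain-Python sketch (not Spark code) of the reshaping that explode() performs on data shaped like the file in the quoted message: each element of the "bizs" array becomes its own row with flat bid and code fields. The names SAMPLE_LINES and explode_bizs are made up for this illustration. In PySpark the equivalent would presumably be something along the lines of df.select(explode("bizs").alias("biz")).select("biz.bid", "biz.code"), with explode imported from pyspark.sql.functions; I have not run that against the actual file.

```python
import json

# Two sample lines shaped like the file described in the quoted message:
# one JSON object per line, each holding a "bizs" array of structs.
SAMPLE_LINES = [
    '{"bizs":[{"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"},'
    '{"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}]}',
    '{"bizs":[{"bid":"MzI4MTI5Mzcy00==","code":"罗甸网警"},'
    '{"bid":"MzI3MzQ5Nzc201==","code":"西早君"}]}',
]


def explode_bizs(lines):
    """Yield one flat {bid, code} dict per element of each record's "bizs" array."""
    for line in lines:
        record = json.loads(line)
        # A missing or null array contributes no rows, which mirrors
        # how explode() treats null and empty arrays in Spark.
        for biz in record.get("bizs") or []:
            yield {"bid": biz.get("bid"), "code": biz.get("code")}


rows = list(explode_bizs(SAMPLE_LINES))
for row in rows:
    print(row["bid"], row["code"])
```

Each input line carries two array elements, so the two sample lines turn into four ordinary rows that can then be queried by bid or code.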

Thanks!

2016-10-19 14:05 GMT+09:00 Divya Gehlot <divya.htco...@gmail.com>:

> http://stackoverflow.com/questions/33864389/how-can-i-create-a-spark-dataframe-from-a-nested-array-of-struct-element
>
> Hope this helps
>
>
> Thanks,
> Divya
>
> On 19 October 2016 at 11:35, lk_spark <lk_sp...@163.com> wrote:
>
>> Hi all,
>> I want to read a JSON file and query it with SQL.
>> The data structure should be:
>>
>> bid: string (nullable = true)
>> code: string (nullable = true)
>>
>> and the JSON file data should look like:
>>      {"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"}
>>      {"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}
>> but in fact my JSON file data is:
>>     {"bizs":[{"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}]}
>>     {"bizs":[{"bid":"MzI4MTI5Mzcy00==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc201==","code":"西早君"}]}
>> I loaded it with Spark, and the data schema looks like this:
>>
>> root
>>  |-- bizs: array (nullable = true)
>>  |    |-- element: struct (containsNull = true)
>>  |    |    |-- bid: string (nullable = true)
>>  |    |    |-- code: string (nullable = true)
>>
>>
>> I can select columns with: df.select("bizs.id","bizs.name")
>> but the column values are of array type:
>> +--------------------+--------------------+
>> |                  id|                code|
>> +--------------------+--------------------+
>> |[4938200, 4938201...|[罗甸网警, 室内设计师杨焰红, ...|
>> |[4938300, 4938301...|[SDCS十全九美, 旅梦长大, ...|
>> |[4938400, 4938401...|[日重重工液压行走回转, 氧老家,...|
>> |[4938500, 4938501...|[PABXSLZ, 陈少燕, 笑蜜...|
>> |[4938600, 4938601...|[税海微云, 西域美农云家店, 福...|
>> +--------------------+--------------------+
>>
>> What I want is to read the columns as ordinary rows. How can I do that?
>> 2016-10-19
>> ------------------------------
>> lk_spark
>>
>
>
