This reminds me of https://github.com/databricks/spark-xml/issues/141#issuecomment-234835577
Maybe using explode() would be helpful.

Thanks!

2016-10-19 14:05 GMT+09:00 Divya Gehlot <divya.htco...@gmail.com>:

> http://stackoverflow.com/questions/33864389/how-can-i-create-a-spark-dataframe-from-a-nested-array-of-struct-element
>
> Hope this helps
>
> Thanks,
> Divya
>
> On 19 October 2016 at 11:35, lk_spark <lk_sp...@163.com> wrote:
>
>> Hi all,
>> I want to read a JSON file and query it with SQL.
>> The data structure should be:
>>
>> bid: string (nullable = true)
>> code: string (nullable = true)
>>
>> and the JSON file data should look like:
>> {"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"}
>> {"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}
>> but in fact my JSON file data is:
>> {"bizs":[ {"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}]}
>> {"bizs":[ {"bid":"MzI4MTI5Mzcy00==","code":"罗甸网警"},{"bid":"MzI3MzQ5Nzc201==","code":"西早君"}]}
>> When I load it with Spark, the schema looks like this:
>>
>> root
>>  |-- bizs: array (nullable = true)
>>  |    |-- element: struct (containsNull = true)
>>  |    |    |-- bid: string (nullable = true)
>>  |    |    |-- code: string (nullable = true)
>>
>> I can select columns with: df.select("bizs.id","bizs.name")
>> but the column values are of array type:
>> +--------------------+--------------------+
>> |                  id|                code|
>> +--------------------+--------------------+
>> |[4938200, 4938201...|[罗甸网警, 室内设计师杨焰红, ...|
>> |[4938300, 4938301...|[SDCS十全九美, 旅梦长大, ...|
>> |[4938400, 4938401...|[日重重工液压行走回转, 氧老家,...|
>> |[4938500, 4938501...|[PABXSLZ, 陈少燕, 笑蜜...|
>> |[4938600, 4938601...|[税海微云, 西域美农云家店, 福...|
>> +--------------------+--------------------+
>>
>> What I want is to read the columns as normal rows. How can I do that?
>>
>> 2016-10-19
>> ------------------------------
>> lk_spark
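
To make the explode() suggestion at the top of this reply more concrete, here is a rough plain-Python sketch of the reshaping it is meant to perform: each element of the "bizs" array becomes its own (bid, code) row. It is not Spark code. The helper name explode_bizs is made up for illustration, and the two input lines are the sample records from the question with the missing quote before bid filled in. In PySpark the corresponding call would presumably be something like df.select(explode("bizs").alias("biz")).select("biz.bid", "biz.code"), with explode imported from pyspark.sql.functions, but that has not been run against this file.

```python
import json

# Sample input shaped like the poster's file: one JSON object per line,
# each holding a "bizs" array of {bid, code} structs.
lines = [
    '{"bizs":[{"bid":"MzI4MTI5MzcyNw==","code":"罗甸网警"},'
    '{"bid":"MzI3MzQ5Nzc2Nw==","code":"西早君"}]}',
    '{"bizs":[{"bid":"MzI4MTI5Mzcy00==","code":"罗甸网警"},'
    '{"bid":"MzI3MzQ5Nzc201==","code":"西早君"}]}',
]


def explode_bizs(json_lines):
    """Yield one (bid, code) row per element of each record's "bizs" array.

    Hypothetical helper: records whose "bizs" is missing, null, or empty
    produce no rows, which is how explode() treats null or empty arrays.
    """
    for line in json_lines:
        record = json.loads(line)
        for biz in record.get("bizs") or []:
            yield biz["bid"], biz["code"]


rows = list(explode_bizs(lines))
for bid, code in rows:
    print(bid, code)
```

With the two sample lines this yields four rows, one per array element, which is the "normal row" layout asked for in the question.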