I am trying to work with nested Parquet data. Reading and writing the Parquet file works now, but when I try to query a nested field with SQLContext I get an exception:
RuntimeException: "Can't access nested field in type ArrayType(StructType(List(StructField(..."

I generate the Parquet file by parsing the data into the following case class structure:

case class areas(area: String, dates: Seq[Int])
case class dataset(userid: Long, source: Int, days: Seq[Int], areas: Seq[areas])

Automatically generated schema:

root
 |-- userid: long (nullable = false)
 |-- source: integer (nullable = false)
 |-- days: array (nullable = true)
 |    |-- element: integer (containsNull = false)
 |-- areas: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- area: string (nullable = true)
 |    |    |-- dates: array (nullable = true)
 |    |    |    |-- element: integer (containsNull = false)

After writing the Parquet file I load the data again, create a SQLContext, and execute a SQL command as follows:

parquetFile.registerTempTable("testtable")
val result = sqlContext.sql("SELECT areas.area FROM testtable where userid > 500000")
result.map(t => t(0)).collect().foreach(println) // throws the exception

If I execute this command instead:

val result = sqlContext.sql("SELECT areas[0].area FROM testtable where userid > 500000")

I get only the values at the first position in the array, but I need every value, so that doesn't work either. I also saw the function t.getAs[...], but everything I tried with it failed.

I hope somebody can help me access the nested field so that I can read all values of the nested array, or tell me whether this simply isn't supported.

I use spark-sql_2.10 (v1.2.0), spark-core_2.10 (v1.2.0) and parquet 1.6.0rc4.
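For concreteness, here is a minimal sketch of the getAs-based direction I mean: select the whole areas column in SQL and flatten it on the Scala side. This assumes each element of the areas array comes back as a Row (with the area string at position 0, matching the schema above), and the Parquet path and SparkContext setup below are just placeholders for my real ones. Is something along these lines supposed to work?

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}

val sc = new SparkContext(new SparkConf().setAppName("nested-parquet-test")) // placeholder setup
val sqlContext = new SQLContext(sc)

// Load the previously written Parquet file (path is a placeholder).
val parquetFile = sqlContext.parquetFile("hdfs:///tmp/testdata.parquet")
parquetFile.registerTempTable("testtable")

// Select the whole nested array instead of areas.area ...
val rows = sqlContext.sql("SELECT areas FROM testtable WHERE userid > 500000")

// ... and unpack every struct element in Scala.
// Assumption: each element of the areas array is a Row whose field 0 is the area string.
val allAreas = rows.flatMap { r =>
  val areas = r.getAs[Seq[Row]](0)
  if (areas == null) Nil else areas.map(_.getString(0))
}

allAreas.collect().foreach(println)

Or would a HiveContext with LATERAL VIEW explode(areas) be the more idiomatic way to flatten the array inside the query itself?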