Matthes,

Ah, gotcha! Repeated items in Parquet seem to correspond to the ArrayType in Spark-SQL. I only use Spark, but it does look like that should be supported in Spark-SQL 1.1.0. I'm not sure, though, whether you can apply predicates on repeated items from Spark-SQL.
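I haven't tried it myself, but something along these lines might work against your schema if 1.1.0 does map the repeated groups to ArrayType columns. This is just an untested sketch: it assumes a HiveContext (for LATERAL VIEW / explode in HiveQL), and the path and filter values are hypothetical:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("nested-parquet-test"))
// HiveContext is needed here; the plain SQLContext parser may not accept LATERAL VIEW
val hiveContext = new HiveContext(sc)

// Repeated groups in the Parquet schema should come back as ArrayType columns
val rows = hiveContext.parquetFile("/tmp/nested.parquet")  // hypothetical path
rows.registerTempTable("nested")

// Explode the repeated level1 group so predicates can be applied to its fields
val filtered = hiveContext.sql(
  """SELECT firstRepeatedid, l1.secoundRepeatedid
    |FROM nested
    |LATERAL VIEW explode(level1) lv AS l1
    |WHERE firstRepeatedid = 42 AND l1.secoundRepeatedid = 7""".stripMargin)

filtered.collect().foreach(println)

If that parses, the same LATERAL VIEW trick should extend to the inner level2 group with a second explode.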
Regards,

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466

On Sep 26, 2014, at 8:48 AM, matthes <mdiekst...@sensenetworks.com> wrote:

> Hi Frank,
>
> Thanks a lot for your response; this is very helpful!
>
> Actually, I'm trying to figure out whether the current Spark version
> supports repetition levels
> (https://blog.twitter.com/2013/dremel-made-simple-with-parquet), and now it
> looks good to me. It is very hard to find good information about that. Now
> I have found this as well:
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=blob;f=sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTestData.scala;h=1dc58633a2a68cd910c1bab01c3d5ee1eb4f8709;hb=f479cf37
>
> I wasn't sure about it because nested data can mean many different things!
> If it works with SQL, being able to filter on firstRepeatedid or
> secoundRepeatedid would be awesome. But if it only works with a kind of
> map/reduce job, that is fine too. The most important thing is to be able to
> filter on the first or second repeated value as fast as possible, both
> individually and in combination. I'm now starting to play with these things
> to get the best search results!
>
> My schema looks like this:
>
> val nestedSchema =
>   """message nestedRowSchema
>     {
>       int32 firstRepeatedid;
>       repeated group level1
>       {
>         int64 secoundRepeatedid;
>         repeated group level2
>         {
>           int64 value1;
>           int32 value2;
>         }
>       }
>     }
>   """
>
> Best,
> Matthes