Matthes,

Ah, gotcha! Repeated items in Parquet seem to correspond to the ArrayType in 
Spark-SQL. I only use Spark, but it does look like that should be supported in 
Spark-SQL 1.1.0. I'm not sure, though, whether you can apply predicates to 
repeated items from Spark-SQL.

Regards,

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466

On Sep 26, 2014, at 8:48 AM, matthes <mdiekst...@sensenetworks.com> wrote:

> Hi Frank,
> 
> thanks a lot for your response, this is very helpful!
> 
> Actually I'm trying to figure out whether the current Spark version supports
> repetition levels
> (https://blog.twitter.com/2013/dremel-made-simple-with-parquet), but now it
> looks good to me.
> It is very hard to find good information about that. Now I found this as
> well: 
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=blob;f=sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTestData.scala;h=1dc58633a2a68cd910c1bab01c3d5ee1eb4f8709;hb=f479cf37
> 
> I wasn't sure about that because nested data can mean many different things!
> If it works with SQL, being able to find the firstRepeatedid or
> secoundRepeatedid would be awesome. But if it only works with a kind of
> map/reduce job, that is also fine. The most important thing is to filter on
> the first or second repeated value as fast as possible, and in combination
> as well. I'm now starting to play with these things to get the best search
> results!
> 
> My schema looks like this:
> 
> val nestedSchema =
>   """message nestedRowSchema {
>        required int32 firstRepeatedid;
>        repeated group level1 {
>          required int64 secoundRepeatedid;
>          repeated group level2 {
>            required int64 value1;
>            required int32 value2;
>          }
>        }
>      }
>   """
> 
> Best,
> Matthes
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-possible-to-use-Parquet-with-Dremel-encoding-tp15186p15239.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> 

