Dmitry,

Case classes are needed only for Spark's sqlContext.createDataFrame method,
because its type parameter must be a subtype of scala.Product. Ignite doesn't
impose such limitations on user types.

I have no idea how to use dynamic structures here, though.

Alexey Goncharuk,

Can we use BinaryObject for this case? Maybe you have some ideas?
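
To make the question concrete, here is a rough, untested sketch of the
runtime-schema step that a BinaryObject approach would need: turning column
names and types known only at runtime into the ordered map of field name to
Java type name that QueryEntity.setFields accepts. The class RuntimeSchema,
its methods, and the type-name strings are hypothetical, made up for
illustration. The Ignite calls appear only in comments, the type mapping
covers just a few cases, and whether such values can go through IgniteRDD is
still the open question.

```java
import java.util.LinkedHashMap;

// Hypothetical helper, not part of Ignite or Spark.
class RuntimeSchema {
    // Maps an illustrative column type name to a Java class name.
    // Only a handful of types are covered in this sketch.
    static String javaTypeFor(String columnType) {
        switch (columnType) {
            case "string":  return "java.lang.String";
            case "integer": return "java.lang.Integer";
            case "long":    return "java.lang.Long";
            case "double":  return "java.lang.Double";
            case "boolean": return "java.lang.Boolean";
            default:
                throw new IllegalArgumentException("Unmapped type: " + columnType);
        }
    }

    // Each schema entry is {columnName, columnType}; column order is preserved.
    static LinkedHashMap<String, String> queryFields(String[][] schema) {
        LinkedHashMap<String, String> fields = new LinkedHashMap<>();
        for (String[] column : schema) {
            fields.put(column[0], javaTypeFor(column[1]));
        }
        // Assumption: the result would then be passed to QueryEntity.setFields(...),
        // and each row would be built field by field with
        // ignite.binary().builder(typeName).setField(name, value).
        return fields;
    }
}
```

For example, queryFields(new String[][] {{"id", "long"}, {"name", "string"}})
gives a map with "id" and "name" in that order, so no case class is needed
to describe the fields.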

On Fri, Mar 4, 2016 at 6:47 AM, DmitryB <[email protected]> wrote:

> Hi Andrey,
>
> Thanks a lot for your help.
> Unfortunately, I cannot use case classes, because the schema information is
> only available at runtime.
>
> To make it clearer, let me add more details. Suppose that I have a very
> big data set (~500 TB) stored in AWS S3 in Parquet format. Using
> Spark, I can process (filter + join) it and reduce its size down to
> ~200-500 GB. I would like to save the resulting dataset in an Ignite cache
> using IgniteRDD and create indexes for a particular set of fields, which
> will be used later for running queries (filter, join, aggregations). My
> assumption is that having this result dataset in Ignite with indexes would
> improve performance compared to using a persisted Spark DataFrame.
>
> Unfortunately, the schema of the resulting dataset can vary widely, so it
> seems impossible to describe all the variations with case classes.
> This is why an approach that stores spark.sql.Row values and describes
> query fields and indexes using QueryEntity would be preferable.
> Thanks to your explanation, I see that this approach doesn't work.
>
> Another solution that is spinning in my head is to generate case classes
> dynamically (at runtime) based on the Spark DataFrame schema, then map
> sql.Row values to RDD[generated_case_class], describe Ignite query and
> index fields using QueryEntity, and create an IgniteContext for the
> generated case class. I'm not sure that this approach is even possible,
> so I would like to ask for your opinion before I go deeper.
>
> I will be very grateful for any advice.
>
> Best regards,
> Dmitry



-- 
Andrey Gura
GridGain Systems, Inc.
www.gridgain.com
