Dmitry, case classes are needed only for the Spark sqlContext.createDataFrame method, because its type parameter must be a scala.Product. Ignite doesn't impose such limitations on user types.
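
To illustrate the scala.Product requirement mentioned above, here is a minimal sketch in plain Scala, with no Spark or Ignite dependency. The names ProductBoundSketch, Person, PlainPerson and fieldValues are hypothetical and exist only for this example; the point is just that a case class satisfies an `A <: Product` bound while an ordinary class does not.

```scala
object ProductBoundSketch {
  // Hypothetical record type: every case class extends scala.Product.
  case class Person(name: String, age: Int)

  // Hypothetical ordinary class: it does not extend scala.Product.
  class PlainPerson(val name: String, val age: Int)

  // A method with the same kind of upper bound as discussed above:
  // the argument type must be a subtype of scala.Product.
  def fieldValues[A <: Product](a: A): List[Any] = a.productIterator.toList

  def main(args: Array[String]): Unit = {
    // Compiles, because Person is a Product.
    println(fieldValues(Person("Ann", 30)))

    // Would not compile, because PlainPerson is not a Product:
    // fieldValues(new PlainPerson("Ann", 30))
  }
}
```

Tuples are Products as well, so the bound is not strictly limited to case classes, but either way the shape of the type has to be known at compile time.
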
I have no idea how to use dynamic structures here. Alexey Goncharuk, can we use BinaryObject for this case? Maybe you have some ideas?

On Fri, Mar 4, 2016 at 6:47 AM, DmitryB <[email protected]> wrote:

> Hi Andrey,
>
> Thanks a lot for your help.
> Unfortunately, I cannot use case classes, because the schema information is
> only available at runtime.
>
> To make it clearer, let me add more details. Suppose that I have a very
> big data set (~500 TB) stored in AWS S3 in Parquet format. Using
> Spark, I can process (filter + join) it and reduce its size to
> ~200-500 GB. I would like to save the resulting dataset in an Ignite cache
> using IgniteRDD and create indexes for a particular set of fields, which
> will be used later for running queries (filter, join, aggregations). My
> assumption is that having this resulting dataset in Ignite, with indexes,
> would improve performance compared to using a persisted Spark DataFrame.
>
> Unfortunately, the schema of the resulting dataset can vary in a great
> number of ways, so it seems impossible to describe all the variations with
> case classes.
> This is why an approach that stores spark.sql.Row and describes query
> fields and indexes using QueryEntity would be preferable.
> Thanks to your explanation, I see that this approach doesn't work.
>
> Another solution that is spinning in my head is to generate case classes
> dynamically (at runtime) based on the Spark DataFrame schema, then map
> sql.Rows to RDD[generated_case_class], describe the Ignite query and index
> fields using QueryEntity, and create an IgniteContext for the generated
> case class. I'm not sure that this approach is even possible, so I would
> like to ask for your opinion before I go deeper.
>
> I will be very grateful for any advice.
>
> Best regards,
> Dmitry
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/index-and-query-org-apache-ignite-spark-IgniteRDD-String-org-apache-spark-sql-Row-tp3343p3363.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

--
Andrey Gura
GridGain Systems, Inc.
www.gridgain.com
