Re: Spark data frames integration merged

2018-01-05 Thread Nikolay Izhikov
Hello, guys. Currently `getPreferredLocations` is implemented in `IgniteRDD -> IgniteAbstractRDD`. But the DataFrame implementation uses `IgniteSQLDataFrameRDD -> IgniteSqlRDD -> IgniteAbstractRDD`, where `->` means "extends". So, for now, `getPreferredLocations` isn't implemented for an IgniteDataFrame.
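To illustrate the point about the class hierarchy, here is a minimal toy model in Python. It is only a sketch, not the Ignite source: the class names come from the message above, but the method bodies, the `partitions` dictionary, and the host names are hypothetical stand-ins. The base class returns an empty list, mirroring the default of Spark's `RDD.getPreferredLocations`, which gives no locality hint.

```python
class IgniteAbstractRDD:
    """Toy stand-in for the shared base class (not the real Ignite code)."""

    def __init__(self, partitions):
        # Hypothetical data: partition id -> list of host names holding it.
        self.partitions = partitions

    def get_preferred_locations(self, partition):
        # Default behaviour: no locality hint for the scheduler.
        return []


class IgniteRDD(IgniteAbstractRDD):
    """Overrides the method, so it can report where a partition lives."""

    def get_preferred_locations(self, partition):
        return list(self.partitions.get(partition, []))


class IgniteSqlRDD(IgniteAbstractRDD):
    """Extends the base class directly, so it does not see IgniteRDD's override."""


class IgniteSQLDataFrameRDD(IgniteSqlRDD):
    """RDD behind the data frame in this toy model; inherits the default."""


hosts = {0: ["node-a"], 1: ["node-b"]}  # hypothetical host names
rdd_hint = IgniteRDD(hosts).get_preferred_locations(0)
df_hint = IgniteSQLDataFrameRDD(hosts).get_preferred_locations(0)
print(rdd_hint)  # ['node-a']
print(df_hint)   # []
```

Because `IgniteSQLDataFrameRDD` and `IgniteRDD` are siblings under `IgniteAbstractRDD` rather than parent and child, the override in one branch is invisible to the other, so the data frame path falls back to the default with no locality information.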

Re: Spark data frames integration merged

2018-01-03 Thread Valentin Kulichenko
Revin, I doubt IgniteRDD#getPreferredLocations has any effect on data frames, but this is an interesting point. Nikolay, as the developer of this functionality, can you please comment on this? -Val On Wed, Jan 3, 2018 at 1:22 PM, Revin Chalil wrote: > Thanks Val for the

Re: Spark data frames integration merged

2018-01-03 Thread Revin Chalil
Thanks, Val, for the info on indexes with DF. Do you know if adding indexes / affinity keys on the cache helps with the join when the IgniteRDD is joined with a Spark DF? The excerpt below from the docs says that “IgniteRDD also provides affinity information to Spark via getPreferredLocations method so that

Re: Spark data frames integration merged

2018-01-03 Thread vkulichenko
Indexes would not be used during joins, at least in the current implementation. The current integration is implemented as a regular Spark data source, which provides each relation separately. Spark then performs the join by itself, so Ignite indexes do not help. The easiest way to get binaries would be to use

Re: Spark data frames integration merged

2018-01-03 Thread Revin Chalil
Thank you, this is great news. We currently use the Ignite cache as a reference dataset RDD in Spark, convert it into a Spark DataFrame, and then join this DF with the incoming-data DF. I hope we can change this three-step process to a single step with the Spark DF integration. If so, would

Re: Spark data frames integration merged

2017-12-29 Thread Nikolay Izhikov
Thank you, guys. Val, thanks for all the reviews, advice, and patience. Anton, thanks for the Ignite wisdom you shared with me. Looking forward to the next issues :) P.S. Happy New Year to the whole Ignite community! On Fri, 29/12/2017 at 13:22 -0800, Valentin Kulichenko wrote: > Igniters, > > Great news! We

Re: Spark data frames integration merged

2017-12-29 Thread Denis Magda
Great news, thanks Nikolay and Val! Nikolay, could you document the feature before the release [1]? I’ve granted you the required permissions. More on the doc process can be found here [2]. [1] https://issues.apache.org/jira/browse/IGNITE-7345

Spark data frames integration merged

2017-12-29 Thread Valentin Kulichenko
Igniters, Great news! We completed and merged the first part of the integration with Spark data frames [1]. It contains an implementation of a Spark data source, which allows using the DataFrame API to query Ignite data, as well as joining it with other data frames originating from different sources. Next planned