Revin,

Excellent, please keep me in the loop and let me know once you reach the next milestone and are ready for production. Use cases like this help spread the word about Ignite, which is really, really helpful!
--
Denis

> On Jan 5, 2018, at 12:27 AM, Revin Chalil <rcha...@expedia.com> wrote:
>
> Thanks Denis. I watched your two recent webinars and they were very helpful.
>
> I can definitely create a page explaining how (currently three) Ignite
> shared-RDD caches are shared across multiple Spark Streaming apps for data
> enrichment here at Expedia, once the solution is stabilized. We are not in
> production yet. I have enabled native persistence and had some hiccups during
> our testing, but it is looking better today.
>
> We are currently working to optimize the join between the incremental data
> and the shared-RDD DataFrame in Spark, as there are several Spark apps and
> the total memory is limited. This part does not have much to do with Ignite;
> it is mostly Spark optimization, I believe. We do load the entire Ignite
> cache (~50GB each) into the Spark executors, and the cache is trimmed daily
> based on the business rules.
>
> We will keep in touch, and thanks again to everyone for all the great work
> and help.
>
> Revin
>
> From: Denis Magda <dma...@apache.org>
> Date: Thursday, January 4, 2018 at 12:34 PM
> To: Revin Chalil <rcha...@expedia.com>
> Cc: "dev@ignite.apache.org" <dev@ignite.apache.org>
> Subject: Re: Spark data frames integration merged
>
> Revin,
>
> As a side note, do you have a public article or any other relevant
> material that explains how Ignite is used at Expedia?
>
> You would help the community a lot if such information were referenced from
> this page:
> https://ignite.apache.org/provenusecases.html
>
> --
> Denis
>
> On Jan 3, 2018, at 11:24 AM, Revin Chalil <rcha...@expedia.com> wrote:
>
> Thank you, this is great news.
>
> We currently use the Ignite cache as a reference dataset RDD in Spark,
> convert it into a Spark DataFrame and then join this DF with the
> incoming-data DF. I hope we can change this three-step process to a single
> step with the Spark DF integration. If so, would indexes / affinity keys on
> the join columns help with performance? We currently do not have them
> defined on the reference dataset. Are there examples available of joining an
> Ignite DF with a Spark DF? Also, what is the best way to get the latest
> executables with IGNITE-3084 included? Thanks again.
>
>
> On 12/29/17, 10:34 PM, "Nikolay Izhikov" <nizhikov....@gmail.com> wrote:
>
> Thank you, guys.
>
> Val, thanks for all the reviews, advice and patience.
>
> Anton, thanks for the Ignite wisdom you share with me.
>
> Looking forward to the next issues :)
>
> P.S. Happy New Year to the whole Ignite community!
>
> On Fri, 29/12/2017 at 13:22 -0800, Valentin Kulichenko wrote:
>
> Igniters,
>
> Great news! We completed and merged the first part of the integration with
> Spark data frames [1]. It contains an implementation of a Spark data
> source which allows using the DataFrame API to query Ignite data, as
> well as joining it with other data frames originating from different
> sources.
>
> Next planned steps are the following:
> - Implement a custom execution strategy to avoid transferring data from
>   Ignite to Spark when possible [2]. This should give a serious
>   performance improvement in cases when only Ignite tables participate
>   in a query.
> - Implement the ability to save a data frame into Ignite via the
>   DataFrameWriter API [3].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-3084
> [2] https://issues.apache.org/jira/browse/IGNITE-7077
> [3] https://issues.apache.org/jira/browse/IGNITE-7337
>
> Nikolay Izhikov, thanks for the contribution and for all the hard
> work!
>
> -Val