Re: Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-30 Thread Valentin Kulichenko
Great! Let me know if you need any assistance and/or intermediate review.
-Val
On Thu, Nov 30, 2017 at 12:05 AM, Николай Ижиков wrote:
> Valentin,
>
> > Can you please create a separate ticket for the strategy implementation then?
>
> Done.

Re: Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-30 Thread Николай Ижиков
Valentin,
> Can you please create a separate ticket for the strategy implementation then?
Done. https://issues.apache.org/jira/browse/IGNITE-7077
> Any idea on how long will it take?
I think it will take 2-4 weeks to implement such a strategy. I will try my best to make a ready-to-review PR

Re: Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-29 Thread Valentin Kulichenko
Nikolay, Can you please create a separate ticket for the strategy implementation then? Any idea how long it will take? As for querying a partition, both SqlQuery and SqlFieldsQuery allow you to specify the set of partitions to work with (see the setPartitions method). I think that should be enough. -Val
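To illustrate the point about restricting a query to a set of partitions, here is a toy sketch in plain Java. It is not Ignite code: the class PartitionScanSketch, its put and query methods, and the modulo partitioning are all hypothetical, chosen only to show why limiting the partitions limits the data that gets scanned. In Ignite, the setPartitions method mentioned above plays this role.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: NOT the Ignite API. All names here are hypothetical.
public class PartitionScanSketch {
    // Hypothetical partition count for the toy data set.
    static final int PARTITIONS = 4;

    // One list of values per partition.
    private final List<List<String>> store = new ArrayList<>();

    PartitionScanSketch() {
        for (int i = 0; i < PARTITIONS; i++) {
            store.add(new ArrayList<>());
        }
    }

    // Place a value into the partition derived from its key
    // (simple modulo here; a real system uses its own affinity function).
    void put(int key, String value) {
        store.get(Math.floorMod(key, PARTITIONS)).add(value);
    }

    // Visit only the requested partitions and apply the filter to their values.
    List<String> query(Predicate<String> filter, int... parts) {
        List<String> result = new ArrayList<>();
        for (int p : parts) {
            for (String v : store.get(p)) {
                if (filter.test(v)) {
                    result.add(v);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        PartitionScanSketch s = new PartitionScanSketch();
        s.put(0, "alice"); // partition 0
        s.put(1, "bob");   // partition 1
        s.put(2, "anna");  // partition 2
        s.put(5, "adam");  // partition 1
        // Only partitions 0 and 2 are visited, so "adam" is never examined.
        System.out.println(s.query(v -> v.startsWith("a"), 0, 2));
    }
}
```

A consumer that already knows which partitions it is responsible for can pass exactly those partition numbers and never touch the rest of the data.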

Re: Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-29 Thread Vladimir Ozerov
Hi Nikolay, No, it is not possible to get this info from the public API, nor have we planned to expose it. See IGNITE-4509 and commit *fbf0e353* to get a better understanding of how this was implemented. Vladimir.
On Wed, Nov 29, 2017 at 2:01 PM, Николай Ижиков wrote:

Re: Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-29 Thread Николай Ижиков
Valentin,
> process the AST generated by Spark and convert it to Ignite SQL...
> Does it make sense to you?
Yes. I think it is a great approach. Let's implement such a feature as the second step of the Data Frame integration.
2017-11-29 3:23 GMT+03:00 Valentin Kulichenko

Re: Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-29 Thread Николай Ижиков
Hello, Vladimir.
> partition pruning is already implemented in Ignite, so there is no need to do this on your own.
Spark works with partitioned data sets. It is required to provide data partition information to Spark from a custom Data Source (Ignite). Can I get information about pruned partitions

Re: Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-28 Thread Vladimir Ozerov
Nikolay, Regarding p.3: partition pruning is already implemented in Ignite, so there is no need to do this on your own.
On Wed, Nov 29, 2017 at 3:23 AM, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:
> Nikolay,
>
> Custom strategy allows to fully process the AST generated by Spark

Optimization of SQL queries from Spark Data Frame to Ignite

2017-11-28 Thread Николай Ижиков
Hello, guys. I have implemented basic support for the Spark Data Frame API [1], [2] for Ignite. Spark provides an API for a custom strategy to optimize queries from Spark to the underlying data source (Ignite). The goal of the optimization (obvious, just to be on the same page): Minimize data transfer between