Re: Calcite based SQL query engine. Local queries

2019-11-12 Thread Ivan Pavlukhin
Dmitriy, It would be great if you could describe your use case in more detail; sharing the code might be the best option here. Denis, Yep, the idea of mixing Compute, SQL, and KV APIs into a super weapon sounds like a killer feature. But I have a great deal of doubt that it is not over-complex to use

Re: Calcite based SQL query engine. Local queries

2019-11-08 Thread Denis Magda
Take the amount of cashback calculation or payments authorization as examples of compute tasks with local SQL. In the first case, all transactions are collocated per account and a bank needs to calculate the cashback monthly by broadcasting the task that executes special logic across all accounts

Re: Calcite based SQL query engine. Local queries

2019-11-08 Thread Dmitriy Pavlov
Yes, I understand that it is a straightforward and, maybe, naive approach. Which is why I'm asking how to do map-reduce on cache C data in Ignite with proper partition pinning. About predefined/implemented aggregates: I'm not sure I agree that we can predict everything. It is a real perk of Ignite

Re: Calcite based SQL query engine. Local queries

2019-11-08 Thread Ivan Pavlukhin
Dmitriy, First, what kind of cumulative metric can it be? A lot of cumulative metrics can be computed using SQL. MIN, MAX, AVG are simple ones. For more complex ones I can think of user-defined aggregate functions (UDAF). We do not have them in Ignite so far, but can introduce them. Second,
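To make the UDAF idea above concrete, here is a minimal sketch in plain Java. It is not an Ignite API (the message says Ignite has no UDAFs so far); the `Udaf` interface, `AvgUdaf`, and `UdafDemo` are hypothetical names invented for illustration. The assumption is that each node folds its local values into a partial state and the partial states are then merged, which is what lets a custom aggregate run next to the data.

```java
import java.util.List;

/** Hypothetical UDAF contract; illustrative only, not part of Ignite. */
interface Udaf<S, R> {
    S init();                          // empty partial state
    S accumulate(S state, double v);   // fold one local value into the state
    S merge(S a, S b);                 // combine partial states from two nodes
    R finish(S state);                 // turn the merged state into the result
}

/** Average kept as a {sum, count} pair so that partial states can be merged. */
final class AvgUdaf implements Udaf<double[], Double> {
    public double[] init() { return new double[] {0.0, 0.0}; }
    public double[] accumulate(double[] s, double v) { return new double[] {s[0] + v, s[1] + 1}; }
    public double[] merge(double[] a, double[] b) { return new double[] {a[0] + b[0], a[1] + b[1]}; }
    public Double finish(double[] s) { return s[1] == 0 ? Double.NaN : s[0] / s[1]; }
}

final class UdafDemo {
    /** Each inner list stands in for the values stored on one node. */
    static <S, R> R aggregate(Udaf<S, R> f, List<List<Double>> perNodeValues) {
        S total = f.init();
        for (List<Double> local : perNodeValues) {
            S partial = f.init();
            for (double v : local) partial = f.accumulate(partial, v); // "map" step on a node
            total = f.merge(total, partial);                           // "reduce" step
        }
        return f.finish(total);
    }
}
```

Keeping the state as a sum and a count, rather than a running average, is what makes `merge` correct when nodes hold different numbers of rows.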

Re: Calcite based SQL query engine. Local queries

2019-11-08 Thread Dmitriy Pavlov
Hi Ivan, Igniters, Imagine you need to scan all entities in the cluster. Ideally, you don't want to deserialize all of the entries, so you can use withKeepBinary(). E.g., you need a couple of fields and want to get some cumulative metric on this data. You can send compute to all cluster nodes and run there

Re: Calcite based SQL query engine. Local queries

2019-11-07 Thread Ivan Pavlukhin
Denis, To make things really clear we need to provide a concrete example of Compute + LocalSQL and reason about it to figure out whether a "smart" SQL engine can deliver the same (or better) results or not. Fri, 8 Nov 2019 at 01:48, Denis Magda : > > Folks, > > See our compute tasks as an

Re: Calcite based SQL query engine. Local queries

2019-11-07 Thread Denis Magda
Folks, See our compute tasks as an advanced version of stored procedures that let the users code the logic of various complexity with Java, .NET or C++ (and not with PL/SQL). The logic can use a combination of APIs (key-value, SQL, etc.) to access data both locally and remotely while being

Re: Calcite based SQL query engine. Local queries

2019-11-07 Thread Ivan Pavlukhin
Stephen, In my understanding we need to do a better job of understanding the use cases of Compute + LocalSQL ourselves. Ideally, a smart optimizer should do the best job of query deployment. Thu, 7 Nov 2019 at 13:04, Stephen Darlington : > > I made a (bad) assumption that this would also affect queries

Re: Calcite based SQL query engine. Local queries

2019-11-07 Thread Stephen Darlington
I made a (bad) assumption that this would also affect queries against partitions. If “setLocal()” goes away but “setPartitions()” remains I’m happy. What I would say is that the “broadcast / local” method is one I see fairly often. Do we need to do a better job educating people of the “correct”

Re: Calcite based SQL query engine. Local queries

2019-11-07 Thread Andrey Mashenkov
+1 to Alexey's concerns. Local SQL query mode is error-prone, as a query executes over an unpredictable set of partitions. Using local mode without a deep understanding of the SQL execution model will lead to inconsistent results. Just imagine if we add a note to documentation that "in case of local SQL user

Re: Calcite based SQL query engine. Local queries

2019-11-07 Thread Alexey Goncharuk
Denis, Stephen, Running a local query in a broadcast closure won't work on changing topology. We specifically added an affinityCall method to the compute API in order to pin a partition to prevent its moving and eviction throughout the task execution. Therefore, the query inside an affinityCall
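The topology concern above can be illustrated with a toy model. This is only a sketch in plain Java: ordinary sets and maps stand in for nodes and partitions, no Ignite API is used, and the class name, partition layout, and values are all made up. It assumes one partition is rebalanced between the moments the two broadcast jobs run.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only: collections stand in for nodes and partitions. */
final class TopologyToy {
    // Partition id -> values stored in that partition (made-up data).
    static final Map<Integer, List<Integer>> DATA = Map.of(
        0, List.of(1), 1, List.of(10), 2, List.of(100), 3, List.of(1000));

    static int sumPartitions(Collection<Integer> parts) {
        int s = 0;
        for (int p : parts) for (int v : DATA.get(p)) s += v;
        return s;
    }

    /** Broadcast + local query: each node sums whatever it owns when its job runs. */
    static int broadcastLocal() {
        Set<Integer> nodeA = new HashSet<>(List.of(0, 1));
        Set<Integer> nodeB = new HashSet<>(List.of(2, 3));
        int total = sumPartitions(nodeA);   // job on node A runs first
        nodeA.remove(1);                    // rebalancing moves partition 1
        nodeB.add(1);                       // from node A to node B
        total += sumPartitions(nodeB);      // job on node B counts partition 1 again
        return total;
    }

    /** Per-partition tasks: every partition id is processed exactly once. */
    static int perPartition() {
        int total = 0;
        for (int p = 0; p < 4; p++) total += sumPartitions(List.of(p));
        return total;
    }
}
```

In this model `broadcastLocal()` counts partition 1 twice, while `perPartition()` does not depend on which node owns a partition at any moment. A move in the opposite direction would instead cause a partition to be skipped.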

Re: Calcite based SQL query engine. Local queries

2019-11-04 Thread Stephen Darlington
A common use case is where you want to work on many rows of data across the grid. You’d broadcast a closure, running the same code on every node with just the local data. SQL doesn’t work in isolation — it’s often used as a filter for future computations. Regards, Stephen > On 1 Nov 2019, at

Re: Calcite based SQL query engine. Local queries

2019-11-01 Thread Ivan Pavlukhin
Denis, I am mostly concerned about gathering use cases. It would be great to critically assess such cases to identify why they cannot be solved by using distributed SQL. Also, it sounds similar to some kind of "hints", but very limited and with all the drawbacks of hints (impossibility to use full strength

Re: Calcite based SQL query engine. Local queries

2019-11-01 Thread Denis Magda
Ivan, I was involved in a couple of such use cases personally, so that's not my imagination ;) Even more, as far as I remember, the primary reason why we improved our affinityRuns, ensuring no partition is purged from a node until a task is completed, is that many users were running local SQL

Re: Calcite based SQL query engine. Local queries

2019-11-01 Thread Ivan Pavlukhin
Denis, It would be nice to see real use cases of the affinity call + local SQL combination. Generally, the new engine will be able to infer collocation, resulting in the same collocated execution automatically. Fri, 1 Nov 2019 at 19:11, Denis Magda : > > Hi Igor, > > Local queries feature is broadly used

Re: Calcite based SQL query engine. Local queries

2019-11-01 Thread Denis Magda
Hi Igor, The local queries feature is widely used together with affinity-based compute tasks: https://apacheignite.readme.io/docs/collocate-compute-and-data#section-affinity-call-and-run-methods The use case is as follows. The user knows that all required data needed for computation is collocated,

Re: Calcite based SQL query engine. Local queries

2019-11-01 Thread Roman Kondakov
Hi Igor! IMO we need to maintain the backward compatibility between old and new query engines as much as possible. And therefore we shouldn't change the behavior of local queries. So, for local queries Calcite's planner shouldn't consider the distribution trait at all. -- Kind Regards