A common use case is processing many rows of data across the grid: you broadcast a closure that runs the same code on every node against only that node's local data. SQL doesn't work in isolation; it's often used as a filter for later computations.
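
As a rough illustration of that pattern, here is a plain-Java sketch with no Ignite dependency. `Node`, `broadcast` and `sumOfLocalMatches` are hypothetical names made up for this example, and the predicate only stands in for a local SQL filter; none of this is Ignite's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

public class BroadcastSketch {
    /** Hypothetical stand-in for a grid node: it holds only its own rows. */
    static final class Node {
        final List<Integer> localRows;

        Node(List<Integer> localRows) {
            this.localRows = localRows;
        }
    }

    /** Runs the same closure on every node and collects one result per node. */
    static <R> List<R> broadcast(List<Node> grid, Function<Node, R> closure) {
        List<R> results = new ArrayList<>();
        for (Node node : grid) {
            results.add(closure.apply(node));
        }
        return results;
    }

    /**
     * Closure body: filter the node's local rows first (the role local SQL
     * plays in the real use case), then compute on the rows that survive.
     */
    static long sumOfLocalMatches(Node node, Predicate<Integer> filter) {
        long sum = 0;
        for (int row : node.localRows) {
            if (filter.test(row)) {
                sum += row;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Two nodes, each owning a disjoint slice of the data.
        List<Node> grid = List.of(
            new Node(List.of(1, 2, 3)),
            new Node(List.of(4, 5, 6)));

        // Each node filters and aggregates its own rows; only the small
        // per-node partial results are combined by the caller.
        List<Long> partials =
            broadcast(grid, n -> sumOfLocalMatches(n, row -> row % 2 == 0));
        long total = partials.stream().mapToLong(Long::longValue).sum();

        System.out.println(partials + " -> " + total);
    }
}
```

The point of the sketch is only that the filter runs next to the data, so no rows cross the network before the computation starts.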

Regards,
Stephen

> On 1 Nov 2019, at 17:53, Ivan Pavlukhin <vololo...@gmail.com> wrote:
>
> Denis,
>
> I am mostly concerned about gathering use cases. It would be great to
> critically assess such cases to identify why it cannot be solved by
> using distributed SQL. Also it sounds similar to some kind of "hints",
> but very limited and with all hints drawbacks (impossibility to use
> full strength of CBO). We can provide better "hints" support with new
> engine as well.
>
> Fri, 1 Nov 2019 at 20:14, Denis Magda <dma...@apache.org>:
>>
>> Ivan,
>>
>> I was involved in a couple of such use cases personally, so, that's not my
>> imagination ;) Even more, as far as I remember, the primary reason why we
>> improved our affinityRuns ensuring no partition is purged from a node until
>> a task is completed is because many users were running local SQL from
>> compute tasks and needed a guarantee that SQL will always return a correct
>> result set.
>>
>> -
>> Denis
>>
>>
>> On Fri, Nov 1, 2019 at 10:01 AM Ivan Pavlukhin <vololo...@gmail.com> wrote:
>>
>>> Denis,
>>>
>>> Would be nice to see real use-cases of affinity call + local SQL
>>> combination. Generally, new engine will be able to infer collocation
>>> resulting in the same collocated execution automatically.
>>>
>>> Fri, 1 Nov 2019 at 19:11, Denis Magda <dma...@apache.org>:
>>>>
>>>> Hi Igor,
>>>>
>>>> Local queries feature is broadly used together with affinity-based
>>>> compute tasks:
>>>> https://apacheignite.readme.io/docs/collocate-compute-and-data#section-affinity-call-and-run-methods
>>>>
>>>> The use case is as follows. The user knows that all required data needed
>>>> for computation is collocated, and SQL is used as an advanced API for
>>>> data retrieval from the computation code. The affinity task ensures that
>>>> partitions won't be discarded from the node(s) if the topology changes
>>>> during the task execution and, thus, it's safe to run SQL locally
>>>> skipping distributed phases.
>>>>
>>>> The combination of affinity compute tasks with local SQL is a real and
>>>> valuable use case, and this is what we need to support with Calcite. Do
>>>> you see any challenges?
>>>>
>>>> -
>>>> Denis
>>>>
>>>>
>>>> On Fri, Nov 1, 2019 at 8:46 AM Roman Kondakov <kondako...@mail.ru.invalid>
>>>> wrote:
>>>>
>>>>> Hi Igor!
>>>>>
>>>>> IMO we need to maintain the backward compatibility between old and new
>>>>> query engines as much as possible. And therefore we shouldn't change
>>>>> the behavior of local queries.
>>>>>
>>>>> So, for local queries Calcite's planner shouldn't consider the
>>>>> distribution trait at all.
>>>>>
>>>>>
>>>>> --
>>>>> Kind Regards
>>>>> Roman Kondakov
>>>>>
>>>>> On 01.11.2019 17:07, Seliverstov Igor wrote:
>>>>>> Hi Igniters,
>>>>>>
>>>>>> Working on new generation of Ignite SQL I faced a question: «Do we
>>>>>> need local queries at all and, if so, what semantics should they have?».
>>>>>>
>>>>>> Current planning flow consists of the following steps:
>>>>>>
>>>>>> 1) Parsing SQL to AST
>>>>>> 2) Validating AST (against Schema)
>>>>>> 3) Optimizing (Building execution graph)
>>>>>> 4) Splitting (into query fragments which execute on target nodes)
>>>>>> 5) Mapping (query fragments to nodes/partitions)
>>>>>>
>>>>>> At the last step we check that all Fragment sources (a table or result)
>>>>>> have the same distribution (in other words all sources have to be
>>>>>> co-located).
>>>>>>
>>>>>> Planner and Splitter guarantee that all caches in a Fragment are
>>>>>> co-located, an Exchange is produced otherwise. But if we force local
>>>>>> execution we cannot produce Exchanges, that means we may face two
>>>>>> non-co-located caches inside a single query fragment (result of local
>>>>>> query planning is a single query fragment). So, we cannot pass the check.
>>>>>>
>>>>>> Should we throw an exception or omit the check for local query
>>>>>> planning or prohibit local queries at all?
>>>>>>
>>>>>> Your thoughts?
>>>>>>
>>>>>> Regards,
>>>>>> Igor
>>>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Ivan Pavlukhin
>
>
>
> --
> Best regards,
> Ivan Pavlukhin