ASF GitHub Bot commented on IGNITE-6019:

GitHub user alexpaschenko opened a pull request:




You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gridgain/apache-ignite ignite-6019

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2430

commit 12a479fe7d1973bdba5dcdb488d3ff109667d107
Author: Alexander Paschenko <alexander.a.pasche...@gmail.com>
Date:   2017-08-10T12:48:38Z

    IGNITE-6019 Merge indexes iterator

commit 51a62134345731c5464c854c62e8be59fb79a9fd
Author: Alexander Paschenko <alexander.a.pasche...@gmail.com>
Date:   2017-08-10T14:06:08Z


commit 652713cba2498cfa85ad89bf9f0aaf96784cae76
Author: Alexander Paschenko <alexander.a.pasche...@gmail.com>
Date:   2017-08-10T14:19:08Z


commit 76c9e2ad6d2e7d0ab4e73b85fa6c2cda9fb913b3
Author: Alexander Paschenko <alexander.a.pasche...@gmail.com>
Date:   2017-08-10T17:31:59Z

    Added a test.


> SQL: client node should not hold the whole data set in-memory when possible
> ---------------------------------------------------------------------------
>                 Key: IGNITE-6019
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6019
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>    Affects Versions: 2.1
>            Reporter: Vladimir Ozerov
>            Assignee: Alexander Paschenko
>            Priority: Critical
>              Labels: performance
>             Fix For: 2.2
> Our SQL engine requests data from server nodes in pieces called
> "pages". This allows us to control memory consumption on the client side.
> However, currently our client code is designed in such a way that all pages
> are requested from all servers before a single cursor row is returned to the
> user. This defeats the whole idea of "cursor" and "page", and could easily
> crash the client node with an OOME.
> We need to fix that and request further pages in a kind of sliding window,
> keeping no more than "N" pages in memory simultaneously. Note that sometimes
> this is not possible, e.g. in case of {{DISTINCT}} or non-collocated
> {{GROUP BY}}. In those cases we would have to build the whole result set
> first anyway. So let's focus on scenarios where the whole result set is not
> needed.
> As currently everything is requested synchronously page-by-page, in the first 
> version it would be enough to distribute synchronous page requests between 
> cursor reads, without any prefetch. 
> Implementation details:
> 1) The optimization should be applied only to {{skipMergeTbl=true}} cases,
> when the complete result set of map queries is not needed.
> 2) The starting point is {{GridReduceQueryExecutor#query}}, see the
> {{skipMergeTbl=true}} branch - this is where we get all pages eagerly.
> 3) Get no more than one page from the server at a time. We request the page, 
> iterate over it, then request another page.
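
For illustration only, here is a minimal Java sketch of item 3 above: an iterator that keeps at most one page in memory and requests the next page only when the current one has been fully consumed. The names PageSource, LazyPageIterator and ListPageSource are hypothetical and are not part of Ignite or of this pull request; a real implementation would issue the page request to a server node where the sketch calls nextPage().

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical page source; stands in for a synchronous page request to one server. */
interface PageSource<T> {
    /** Returns the next page, or an empty list once the source is exhausted. */
    List<T> nextPage();
}

/** Iterator that holds at most one page and fetches the next one only on demand. */
final class LazyPageIterator<T> implements Iterator<T> {
    private final PageSource<T> src;
    private Iterator<T> cur = Collections.emptyIterator();
    private boolean done;

    LazyPageIterator(PageSource<T> src) {
        this.src = src;
    }

    @Override public boolean hasNext() {
        // Request a new page only when the current one is fully consumed.
        while (!cur.hasNext() && !done) {
            List<T> page = src.nextPage();

            if (page.isEmpty())
                done = true;
            else
                cur = page.iterator(); // The previous page is no longer referenced.
        }

        return cur.hasNext();
    }

    @Override public T next() {
        if (!hasNext())
            throw new NoSuchElementException();

        return cur.next();
    }
}

/** In-memory stand-in for a server; counts how many pages were requested. */
final class ListPageSource<T> implements PageSource<T> {
    private final Iterator<List<T>> pages;
    int requested;

    ListPageSource(List<List<T>> pages) {
        this.pages = pages.iterator();
    }

    @Override public List<T> nextPage() {
        requested++;

        return pages.hasNext() ? pages.next() : Collections.<T>emptyList();
    }
}
```

With a ListPageSource built from the pages [1, 2] and [3], no page is requested until the first call to next(), and the second page is requested only after rows 1 and 2 have been returned. This matches the "no prefetch" first version described in the issue; a sliding window of N pages would need a bounded queue of pages instead of the single cur iterator.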

This message was sent by Atlassian JIRA
