[jira] [Updated] (IGNITE-6019) SQL: client node should not hold the whole data set in-memory when possible

Vladimir Ozerov (JIRA) Thu, 10 Aug 2017 01:25:50 -0700

     [ 
https://issues.apache.org/jira/browse/IGNITE-6019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Vladimir Ozerov updated IGNITE-6019:
------------------------------------
    Issue Type: Improvement  (was: Bug)

> SQL: client node should not hold the whole data set in-memory when possible
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-6019
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6019
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>    Affects Versions: 2.1
>            Reporter: Vladimir Ozerov
>            Assignee: Alexander Paschenko
>            Priority: Critical
>              Labels: performance
>             Fix For: 2.2
>
>
> Our SQL engine requests data from server nodes in pieces called
> "pages". This allows us to control memory consumption on the client side.
> However, our client code is currently designed in such a way that all pages
> are requested from all servers before a single cursor row is returned to the
> user. This defeats the whole idea of "cursor" and "page", and could easily
> crash the client node with an OOME.
> We need to fix that and request further pages in a kind of sliding window,
> keeping no more than "N" pages in memory simultaneously. Note that this is
> not always possible, e.g. in the case of {{DISTINCT}} or non-collocated
> {{GROUP BY}}. In these cases we would have to build the whole result set
> first anyway. So let's focus on the scenario where the whole result set is
> not needed.
> As currently everything is requested synchronously page-by-page, in the first 
> version it would be enough to distribute synchronous page requests between 
> cursor reads, without any prefetch. 
> Implementation details:
> 1) The optimization should be applied only to {{skipMergeTbl=true}} cases,
> when the complete result set of map queries is not needed.
> 2) Starting point is {{GridReduceQueryExecutor#query}}, see 
> {{skipMergeTbl=true}} branch - this is where we get all pages eagerly.
> 3) Get no more than one page from the server at a time. We request the page, 
> iterate over it, then request another page.
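
The approach in item 3 can be sketched roughly as follows. This is only an illustration of "request a page, iterate over it, then request another page" and not the actual Ignite code: LazyPaging, PageSource, Cursor and InMemoryPageSource are hypothetical names, a single in-memory source stands in for the server nodes, and an empty page is assumed to signal the end of the result.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch; none of these types exist in Ignite. */
final class LazyPaging {
    /** Stand-in for a server node serving a map query result page by page. */
    interface PageSource<T> {
        /** Returns the next page; an empty list means the result is exhausted (assumption). */
        List<T> nextPage();
    }

    /** Cursor that keeps at most one page in memory and fetches the next one on demand. */
    static final class Cursor<T> implements Iterator<T> {
        private final PageSource<T> src;
        private Iterator<T> cur = Collections.emptyIterator();
        private boolean exhausted;

        Cursor(PageSource<T> src) {
            this.src = src;
        }

        @Override public boolean hasNext() {
            // Request a new page only after the current one is fully consumed.
            while (!cur.hasNext() && !exhausted) {
                List<T> page = src.nextPage(); // synchronous request, no prefetch

                if (page.isEmpty())
                    exhausted = true;
                else
                    cur = page.iterator(); // the previous page becomes garbage
            }

            return cur.hasNext();
        }

        @Override public T next() {
            if (!hasNext())
                throw new NoSuchElementException();

            return cur.next();
        }
    }

    /** Test double: serves the integers 0..total-1 in pages and counts page requests. */
    static final class InMemoryPageSource implements PageSource<Integer> {
        private final int total;
        private final int pageSize;
        private int sent;
        int requests;

        InMemoryPageSource(int total, int pageSize) {
            this.total = total;
            this.pageSize = pageSize;
        }

        @Override public List<Integer> nextPage() {
            requests++;

            List<Integer> page = new ArrayList<>();

            for (int i = 0; i < pageSize && sent < total; i++)
                page.add(sent++);

            return page;
        }
    }
}
```

With this shape no page is requested until the first cursor read, and the client never holds more than one page per source. Extending it to several servers or to a window of "N" prefetched pages is left out of the sketch.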



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
