Vladimir Ozerov created IGNITE-6019:
---------------------------------------
Summary: SQL: client node should not hold the whole data set
in-memory when possible
Key: IGNITE-6019
URL: https://issues.apache.org/jira/browse/IGNITE-6019
Project: Ignite
Issue Type: Bug
Components: sql
Affects Versions: 2.1
Reporter: Vladimir Ozerov
Assignee: Alexander Paschenko
Fix For: 2.2
Our SQL engine requests data from server nodes in pieces called "pages".
This allows us to control memory consumption on the client side. However, our
client code currently requests all pages from all servers before a single
cursor row is returned to the user. This defeats the whole idea of "cursor"
and "page", and can easily crash the client node with an OOME.
We need to fix that and request further pages in a kind of sliding window,
keeping no more than "N" pages in memory simultaneously. Note that this is
sometimes not possible, e.g. in the case of {{DISTINCT}} or non-collocated
{{GROUP BY}}. In these cases we have to build the whole result set first
anyway. So let's focus on the scenario where the whole result set is not needed.
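To make the "no more than N pages" bound concrete, here is a minimal sketch of a row iterator over a bounded page window. It is not Ignite code: {{WindowedRowIterator}}, the {{fetcher}} supplier and {{maxPages}} are hypothetical names, the fetcher is assumed to return an empty list once the server has no more pages, and the window is refilled synchronously for simplicity.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Illustrative only: iterates rows while holding at most maxPages pages. */
final class WindowedRowIterator<R> implements Iterator<R> {
    /** Hypothetical page request: returns the next page, or an empty list when exhausted. */
    private final Supplier<List<R>> fetcher;
    private final int maxPages;
    /** Pages fetched but not yet iterated. */
    private final Deque<List<R>> window = new ArrayDeque<>();
    private Iterator<R> cur = Collections.emptyIterator();
    private boolean exhausted;

    WindowedRowIterator(Supplier<List<R>> fetcher, int maxPages) {
        if (maxPages < 1)
            throw new IllegalArgumentException("maxPages must be >= 1");
        this.fetcher = fetcher;
        this.maxPages = maxPages;
    }

    @Override public boolean hasNext() {
        while (!cur.hasNext()) {
            // Drop the reference to the finished page before fetching new ones.
            cur = Collections.emptyIterator();
            if (window.isEmpty() && !refill())
                return false;
            cur = window.poll().iterator();
        }
        return true;
    }

    @Override public R next() {
        if (!hasNext())
            throw new NoSuchElementException();
        return cur.next();
    }

    /** Fetches pages until the window holds maxPages pages or the source is exhausted. */
    private boolean refill() {
        while (!exhausted && window.size() < maxPages) {
            List<R> page = fetcher.get();
            if (page.isEmpty())
                exhausted = true;
            else
                window.add(page);
        }
        return !window.isEmpty();
    }
}
```

Nothing is fetched until the first {{hasNext()}} or {{next()}} call, and a refill happens only after the current page and the window are both drained, so the page being iterated plus the buffered pages never exceed {{maxPages}}. With {{maxPages = 1}} this reduces to plain page-by-page reads without prefetch.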
As everything is currently requested synchronously, page by page, in the first
version it is enough to distribute the synchronous page requests between
cursor reads, without any prefetch.
Implementation details:
1) The optimization should be applied only to {{skipMergeTbl=true}} cases, when
the complete result set of the map queries is not needed.
2) The starting point is {{GridReduceQueryExecutor#query}}; see the
{{skipMergeTbl=true}} branch, which is where we currently get all pages eagerly.
3) Get no more than one page from the server at a time: request a page,
iterate over it, then request the next page.
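The request-iterate-request loop from item 3 could look roughly like the sketch below. It does not use Ignite internals: {{PageSource}}, {{InMemoryPageSource}} and {{LazyPageCursor}} are hypothetical stand-ins, servers are assumed to be drained one after another, and an empty page is assumed to mean that a server has no more rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a server node that serves a map query result page by page. */
interface PageSource<R> {
    /** Synchronously fetches the next page; an empty list means no more rows. */
    List<R> nextPage();
}

/** Test double: splits an in-memory list into pages and counts page requests. */
final class InMemoryPageSource<R> implements PageSource<R> {
    private final List<R> rows;
    private final int pageSize;
    private int pos;
    int requests;

    InMemoryPageSource(List<R> rows, int pageSize) {
        this.rows = rows;
        this.pageSize = pageSize;
    }

    @Override public List<R> nextPage() {
        requests++;
        int end = Math.min(pos + pageSize, rows.size());
        List<R> page = new ArrayList<>(rows.subList(pos, end));
        pos = end;
        return page;
    }
}

/** Cursor that holds one page at a time and requests the next page only on demand. */
final class LazyPageCursor<R> implements Iterator<R> {
    private final Iterator<? extends PageSource<R>> sources;
    private PageSource<R> curSrc;
    private Iterator<R> curPage = Collections.emptyIterator();

    LazyPageCursor(List<? extends PageSource<R>> sources) {
        this.sources = sources.iterator();
    }

    @Override public boolean hasNext() {
        while (!curPage.hasNext()) {
            if (curSrc == null) {
                if (!sources.hasNext())
                    return false; // Every source has been drained.
                curSrc = sources.next();
            }
            List<R> page = curSrc.nextPage(); // The only place a page is requested.
            if (page.isEmpty())
                curSrc = null; // Move on to the next source.
            else
                curPage = page.iterator();
        }
        return true;
    }

    @Override public R next() {
        if (!hasNext())
            throw new NoSuchElementException();
        return curPage.next();
    }
}
```

A page request is issued only when the user reads past the end of the current page, so the synchronous requests end up spread between cursor reads instead of all happening up front.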