Sahil Takiar created IMPALA-8818:
------------------------------------

             Summary: Replace deque queue with spillable queue in 
BufferedPlanRootSink
                 Key: IMPALA-8818
                 URL: https://issues.apache.org/jira/browse/IMPALA-8818
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Sahil Takiar
            Assignee: Sahil Takiar


Add a {{SpillableRowBatchQueue}} to replace the {{DequeRowBatchQueue}} in 
{{BufferedPlanRootSink}}. The {{SpillableRowBatchQueue}} will wrap a 
{{BufferedTupleStream}} and take in a {{TBackendResourceProfile}} created by 
{{PlanRootSink#computeResourceProfile}}.

*BufferedTupleStream Usage*:

The wrapped {{BufferedTupleStream}} should be created in 'attach_on_read' mode 
so that pages are attached to the output {{RowBatch}} in 
{{BufferedTupleStream::GetNext}}. The BTS should start off as pinned (i.e. all 
pages are pinned). If a call to {{BufferedTupleStream::AddRow}} returns false 
(it returns false if "the unused reservation was not sufficient to add a new 
page to the stream large enough to fit 'row' and the stream could not increase 
the reservation to get enough unused reservation"), the queue should unpin the 
stream ({{BufferedTupleStream::UnpinStream}}) and then try to add the row again. 
If the row still cannot be added, an error must have occurred (perhaps an IO 
error), in which case the queue should return the error and fail the query.
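
The add path described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: {{Status}}, {{MockTupleStream}} and {{AddRowWithSpill}} are hypothetical stand-ins, and the mock's {{AddRow}} / {{UnpinStream}} signatures are simplified compared to the real {{BufferedTupleStream}}.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a status type; it only carries ok/error state.
struct Status {
  bool ok;
  std::string msg;
  static Status OK() { return Status{true, ""}; }
  static Status Error(const std::string& m) { return Status{false, m}; }
};

// Hypothetical mock of a tuple stream. It models only the behaviour described
// above: while pinned, AddRow() returns false (with an OK status) once the
// pinned capacity is used up; after UnpinStream() rows can always be added.
class MockTupleStream {
 public:
  explicit MockTupleStream(std::size_t pinned_capacity_rows)
    : pinned_capacity_rows_(pinned_capacity_rows) {}

  bool AddRow(int /*row*/, Status* status) {
    if (io_error_) {
      *status = Status::Error("simulated IO error");
      return false;
    }
    *status = Status::OK();
    if (pinned_ && num_rows_ >= pinned_capacity_rows_) return false;
    ++num_rows_;
    return true;
  }

  void UnpinStream() { pinned_ = false; }
  void set_io_error(bool v) { io_error_ = v; }
  bool pinned() const { return pinned_; }
  std::size_t num_rows() const { return num_rows_; }

 private:
  std::size_t pinned_capacity_rows_;
  std::size_t num_rows_ = 0;
  bool pinned_ = true;
  bool io_error_ = false;
};

// Sketch of the queue's add path: try pinned first, unpin on failure, retry once.
Status AddRowWithSpill(MockTupleStream* stream, int row) {
  Status status = Status::OK();
  if (stream->AddRow(row, &status)) return Status::OK();
  if (!status.ok) return status;  // A real error, not a lack of reservation.

  // Not enough unused reservation: unpin the stream so pages can spill.
  stream->UnpinStream();
  if (stream->AddRow(row, &status)) return Status::OK();

  // Still failing after unpinning: treat it as an error and fail the query.
  if (!status.ok) return status;
  return Status::Error("could not add row even after unpinning the stream");
}
```

In this sketch the stream is unpinned at most once and stays unpinned afterwards; whether the real queue should ever re-pin the stream is left open here.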

*Constraining Resources*:

When result spooling is disabled, a user can run a {{select * from 
[massive-fact-table]}} and scroll through the results without affecting the 
health of the Impala cluster (assuming they close their query promptly). Impala 
will stream the results one batch at a time to the user.

With result spooling, a naive implementation might try to buffer the entire 
fact table, and end up spilling all the contents to disk, which can potentially 
take up a large amount of space. So there need to be restrictions on the 
memory and disk space used by the {{BufferedTupleStream}} in order to ensure a 
scan of a massive table does not consume all the memory or disk space of the 
Impala coordinator.

This problem can be solved by placing a max size on the amount of unpinned 
memory, perhaps through a new config option 
{{MAX_PINNED_RESULT_SPOOLING_MEMORY}} (maybe set to a few GBs by default). The 
max amount of pinned memory should already be constrained by the reservation 
(see next paragraph). {{NUM_ROWS_PRODUCED_LIMIT}} already limits the number of 
rows returned by a query, and so it should limit the number of rows buffered by 
the BTS as well (although it is set to 0 by default). {{SCRATCH_LIMIT}} already 
limits the amount of disk space used for spilling (although it is set to -1 by 
default).
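
As a rough illustration of how such caps might be checked before buffering another batch, consider the sketch below. Everything in it is an assumption made for the example: the struct and function names are hypothetical, and a value of 0 is treated as "no limit" for both fields purely to keep the sketch small. What the queue does once a cap is hit (for example, making the producer wait until the client fetches rows) is not decided here.

```cpp
#include <cstdint>

using std::int64_t;

// Hypothetical limits; field names are illustrative, not Impala identifiers.
struct SpoolingLimits {
  int64_t max_unpinned_bytes;       // Proposed cap on unpinned memory; 0 = no cap in this sketch.
  int64_t num_rows_produced_limit;  // Row cap; 0 = no cap in this sketch.
};

// Hypothetical running totals tracked by the queue.
struct SpoolingUsage {
  int64_t unpinned_bytes;
  int64_t rows_buffered;
};

// Returns true if buffering the next batch would exceed either cap.
bool WouldExceedLimits(const SpoolingLimits& limits, const SpoolingUsage& usage,
    int64_t batch_rows, int64_t batch_unpinned_bytes) {
  if (limits.num_rows_produced_limit > 0
      && usage.rows_buffered + batch_rows > limits.num_rows_produced_limit) {
    return true;
  }
  if (limits.max_unpinned_bytes > 0
      && usage.unpinned_bytes + batch_unpinned_bytes > limits.max_unpinned_bytes) {
    return true;
  }
  return false;
}
```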

The {{PlanRootSink}} should attempt to accurately estimate how much memory it 
needs to buffer all results in memory. This requires setting an accurate value 
of {{ResourceProfile#memEstimateBytes_}} in 
{{PlanRootSink#computeResourceProfile}}. If statistics are available, the 
estimate can be based on the number of estimated rows returned multiplied by 
the size of the rows returned. The min reservation should account for a read 
and write page for the {{BufferedTupleStream}}.
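
The arithmetic behind the estimate and the min reservation can be sketched as below. The function names and the -1 "unknown" convention are hypothetical, and the sketch is written in C++ only for illustration; the actual estimate would be set by the planner in {{PlanRootSink#computeResourceProfile}}.

```cpp
#include <cstdint>
#include <limits>

using std::int64_t;

// Estimated bytes needed to buffer all results: rows * average row size.
// Returns -1 when statistics are unavailable (signalled here by a negative
// row count or a non-positive row size). Saturates instead of overflowing.
int64_t EstimateBufferedResultBytes(int64_t estimated_rows, int64_t avg_row_bytes) {
  if (estimated_rows < 0 || avg_row_bytes <= 0) return -1;
  const int64_t max = std::numeric_limits<int64_t>::max();
  if (estimated_rows > max / avg_row_bytes) return max;
  return estimated_rows * avg_row_bytes;
}

// Min reservation: one read page plus one write page for the stream.
int64_t MinReservationBytes(int64_t page_bytes) {
  return 2 * page_bytes;
}
```

For example, 1000 estimated rows of 64 bytes each give an estimate of 64000 bytes, and a 64 KB page size gives a min reservation of 128 KB.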



