[
https://issues.apache.org/jira/browse/IMPALA-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong resolved IMPALA-6679.
-----------------------------------
Resolution: Fixed
Fix Version/s: Impala 3.1.0, Impala 2.13.0
> Don't claim ideal reservation in scanner until actually processing scan ranges
> ------------------------------------------------------------------------------
>
> Key: IMPALA-6679
> URL: https://issues.apache.org/jira/browse/IMPALA-6679
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> One cause of the memory regression in IMPALA-4835 was that scans,
> particularly Parquet, claimed memory very aggressively in
> HdfsScanNode::Open() based on the planner's estimated ideal memory. There
> were two problems here:
> # In many cases the ideal memory increase was not needed at all, because the
> total amount of data scanned in the Parquet file was actually less than the
> minimum reservation. We don't know this for sure until we read the Parquet
> footer and check the column sizes.
> # HdfsScanNode::Open() may happen a long time before the first
> HdfsScanNode::GetNext(), particularly if the scan node is on the left side of
> a broadcast join. The scan node may grab a lot of reservation before other
> plan nodes have started running, eventually resulting in starving a lot of
> nodes.
> It would make sense to wait until we are actually processing a scan range to
> increase a scanner thread's reservation from the minimum. There are various
> more complex things we could do here, but I suspect the simplest possible
> thing would help a lot. E.g. we could extend this by releasing reservation
> after processing each scan range.
> I believe the second effect may be fairly significant when running many
> queries with high concurrency. In that case the async join build will tend to
> be disabled, which means that HdfsScanNode::GetNext() on the left child of a
> broadcast join will only be called after the join build has completed, so
> overlap between that scan and the nodes on the right side of the join would
> be minimal (there's not enough synchronisation to guarantee no overlap).
> See
> https://docs.google.com/document/d/1kR0zfevNNUJom3sH1XmposacVZ-QALan7NSwnR5CkSA/edit#
> for some more analysis.
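
The proposal in the description can be sketched roughly as follows. This is a
minimal illustration only, not Impala's code: ReservationPool, SketchScanner
and all of their methods are hypothetical names made up for this example. It
assumes the scanner claims only the minimum reservation when opened, sizes any
increase from the column sizes found in the footer (capped at the planner's
ideal), and hands the extra reservation back after each scan range, which the
description mentions only as a possible extension.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a shared reservation pool (not Impala's API).
class ReservationPool {
 public:
  explicit ReservationPool(int64_t limit) : limit_(limit) {}

  // Claims `bytes` if they fit under the limit; claims nothing on failure.
  bool TryIncrease(int64_t bytes) {
    if (bytes < 0 || used_ + bytes > limit_) return false;
    used_ += bytes;
    return true;
  }

  void Release(int64_t bytes) {
    assert(bytes >= 0 && bytes <= used_);
    used_ -= bytes;
  }

  int64_t used() const { return used_; }

 private:
  int64_t limit_;
  int64_t used_ = 0;
};

// Hypothetical scanner that defers the "ideal" claim until a range is processed.
class SketchScanner {
 public:
  SketchScanner(ReservationPool* pool, int64_t min_bytes, int64_t ideal_bytes)
      : pool_(pool), min_bytes_(min_bytes), ideal_bytes_(ideal_bytes) {}

  // Open() claims only the minimum, so an idle scan cannot starve other nodes.
  bool Open() {
    if (!pool_->TryIncrease(min_bytes_)) return false;
    held_ = min_bytes_;
    return true;
  }

  // Call only after a successful Open(). `column_bytes` stands for the column
  // sizes read from the file footer. Returns the reservation held while the
  // range was processed.
  int64_t ProcessScanRange(const std::vector<int64_t>& column_bytes) {
    assert(held_ >= min_bytes_);
    int64_t needed =
        std::accumulate(column_bytes.begin(), column_bytes.end(), int64_t{0});
    // Never ask for more than the ideal, never drop below the minimum.
    int64_t target = std::max(min_bytes_, std::min(needed, ideal_bytes_));
    // If the increase is refused, keep going with what is already held.
    if (target > held_ && pool_->TryIncrease(target - held_)) held_ = target;
    int64_t held_for_range = held_;
    // ... the actual reads for this scan range would happen here ...
    // Possible extension: return the extra reservation after each range.
    pool_->Release(held_ - min_bytes_);
    held_ = min_bytes_;
    return held_for_range;
  }

  void Close() {
    pool_->Release(held_);
    held_ = 0;
  }

 private:
  ReservationPool* pool_;
  int64_t min_bytes_;
  int64_t ideal_bytes_;
  int64_t held_ = 0;
};
```

With a pool limit of 100, a minimum of 10 and an ideal of 64, a range whose
columns total 5 bytes is processed at the minimum, while a range whose columns
total 80 bytes is processed at 64 and the pool drops back to 10 afterwards.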