[ https://issues.apache.org/jira/browse/IMPALA-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong resolved IMPALA-3265.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.10.0

IMPALA-4674: Part 2: port backend exec to BufferPool

Always create global BufferPool at startup using 80% of memory and
limit reservations to 80% of query memory (same as BufferedBlockMgr).
The query's initial reservation is computed in the planner, claimed
centrally (managed by the InitialReservations class) and distributed
to query operators from there.

min_spillable_buffer_size and default_spillable_buffer_size query
options control the buffer size that the planner selects for
spilling operators.

Port ExecNodes to use BufferPool:
  * Each ExecNode has to claim its reservation during Open()
  * Port Sorter to use BufferPool.
  * Switch from BufferedTupleStream to BufferedTupleStreamV2
  * Port HashTable to use BufferPool via a Suballocator.

This also makes PAGG memory consumption more efficient (avoids wasting
buffers) and improves the spilling algorithm:
  * Allow preaggs to execute with 0 reservation - if streams and hash
    tables cannot be allocated, it will pass through rows.
  * Halve the buffer requirement for spilling aggs - avoid allocating
    buffers for aggregated and unaggregated streams simultaneously.
  * Rebuild spilled partitions instead of repartitioning (IMPALA-2708)

TODO in follow-up patches:
  * Rename BufferedTupleStreamV2 to BufferedTupleStream
  * Implement max_row_size query option.

Testing:
  * Updated tests to reflect new memory requirements

Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Reviewed-on: http://gerrit.cloudera.org:8080/5801
Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>
Tested-by: Impala Public Jenkins

> Create a metrics to track spilling per operator
> -----------------------------------------------
>
>                 Key: IMPALA-3265
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3265
>             Project: IMPALA
>          Issue Type: Task
>          Components: Backend
>    Affects Versions: Impala 2.2.4
>            Reporter: Alan Choi
>            Assignee: Mostafa Mokhtar
>            Priority: Minor
>             Fix For: Impala 2.10.0
>
>
> There are no per-operator metrics that track spilling. Need the following
> metrics (per operator):
> 1. Bytes spilled
> 2. Time taken to read/write the spilled data

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
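
For illustration only, the two metrics requested in the issue (bytes spilled and time spent reading/writing spilled data, tracked per operator) could be modelled roughly as below. This is a minimal sketch in Python, not Impala code: the SpillMetrics class, its field names, and the operator labels are all hypothetical and do not correspond to Impala's actual runtime profile counters.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class SpillMetrics:
    """Hypothetical per-operator spill counters (not Impala's real counters)."""
    bytes_spilled: int = 0
    write_time_ns: int = 0
    read_time_ns: int = 0

    @contextmanager
    def timed_write(self, num_bytes):
        """Time a spill write; count num_bytes only if the write completes."""
        start = time.perf_counter_ns()
        try:
            yield
            self.bytes_spilled += num_bytes
        finally:
            self.write_time_ns += time.perf_counter_ns() - start

    @contextmanager
    def timed_read(self):
        """Time the read-back of previously spilled data."""
        start = time.perf_counter_ns()
        try:
            yield
        finally:
            self.read_time_ns += time.perf_counter_ns() - start


# One metrics object per operator, keyed by a made-up operator label.
metrics = {
    "SORT_NODE (id=1)": SpillMetrics(),
    "AGGREGATION_NODE (id=2)": SpillMetrics(),
}

sort_metrics = metrics["SORT_NODE (id=1)"]
with sort_metrics.timed_write(num_bytes=2 * 1024 * 1024):
    pass  # the write of a 2 MB buffer to scratch disk would happen here
with sort_metrics.timed_read():
    pass  # reading the spilled buffer back would happen here
```

Keeping one counter object per operator, rather than a single query-wide total, is what lets a profile show which operator spilled and how much of its time went to spill I/O.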