[
https://issues.apache.org/jira/browse/IMPALA-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fu Lili updated IMPALA-4539:
----------------------------
Comment: was deleted
(was: @Tim Armstrong
Thank you very much, and sorry for my poor English.)
> Parquet scanner memory bug: I/O buffer is attached to output batch while
> scratch batch rows still reference it
> --------------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-4539
> URL: https://issues.apache.org/jira/browse/IMPALA-4539
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0
> Reporter: Fu Lili
> Assignee: Tim Armstrong
> Priority: Blocker
> Labels: crash, resource-management
> Fix For: Impala 2.8.0
>
>
> In HdfsScanner, the RowBatch takes ownership of some io_buffers once the scanner
> has completed some I/O reads:
> {code:java}
> // We need to pass the row batch to the scan node if there is too much
> // memory attached, which can happen if the query is very selective. We
> // need to release memory even if no rows passed predicates.
> if (batch_->AtCapacity() || context_->num_completed_io_buffers() > 0) {
>   context_->ReleaseCompletedResources(batch_, /* done */ false);
> }
> {code}
> When the row batch is reset, the io_buffers are either freed or returned to the
> mem_pool.
> {code:java}
> if (!FLAGS_disable_mem_pools &&
>     free_buffers_[idx].size() < FLAGS_max_free_io_buffers) {
>   free_buffers_[idx].push_back(buffer);
>   if (ImpaladMetrics::IO_MGR_NUM_UNUSED_BUFFERS != NULL) {
>     ImpaladMetrics::IO_MGR_NUM_UNUSED_BUFFERS->Increment(1L);
>   }
> } else {
>   process_mem_tracker_->Release(buffer_size);
>   num_allocated_buffers_.Add(-1);
>   delete[] buffer;
>   if (ImpaladMetrics::IO_MGR_NUM_BUFFERS != NULL) {
>     ImpaladMetrics::IO_MGR_NUM_BUFFERS->Increment(-1L);
>   }
>   if (ImpaladMetrics::IO_MGR_TOTAL_BYTES != NULL) {
>     ImpaladMetrics::IO_MGR_TOTAL_BYTES->Increment(-buffer_size);
>   }
> }
> {code}
> Here is the bug: the io_buffers owned by row batch A may also be referenced by
> row batch B, which is returned by the next ScanNode::GetNext call. By the time
> row batch B is read, those io_buffers may already have been released because
> row batch A has been reset. For example:
> {code:java}
> 1. In the scanner, row batch A is produced and takes ownership of io_buffer O1.
> 2. Row batch A is consumed and reset, and io_buffer O1 is released.
> 3. In the scanner, row batch B is produced, but some tuples in row batch B point
> into io_buffer O1, for example string tuples. Especially when row batch A is
> AtCapacity(), the io_buffer is very likely used by more than just row batch A.
> 4. When row batch B is consumed, those tuples yield corrupt data, because
> io_buffer O1 has already been released.
> {code}
> This bug is easy to reproduce with the startup option
> "disable_mem_pools=true", because in that case the io_buffers are
> actually freed instead of being returned to the mem_pool.
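
The four-step scenario in the description can be sketched as a small, self-contained C++ model. This is only an illustration under stated assumptions: IoBuffer, Row, RowBatch and RowsDangle are hypothetical stand-ins invented for the sketch, not Impala's real classes, and a "released" flag replaces the actual delete[] so that the example never touches freed memory. The second mode (attaching the buffer to the later batch) is one way to avoid the hazard in this model and is not necessarily how the issue was fixed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an I/O buffer (not Impala's real class).
struct IoBuffer {
  std::string data;
  bool released = false;  // Flag instead of a real free, so the sketch is safe.
};

// A row whose payload lives inside an I/O buffer (e.g. a string slot).
struct Row {
  const IoBuffer* buffer;
  std::size_t offset;
  std::size_t len;
};

// Hypothetical stand-in for a row batch that owns attached buffers.
struct RowBatch {
  std::vector<Row> rows;
  std::vector<IoBuffer*> attached;

  // Resetting the batch releases every buffer attached to it.
  void Reset() {
    for (IoBuffer* b : attached) b->released = true;
    attached.clear();
    rows.clear();
  }

  // True if any row still points at a buffer that has been released.
  bool HasDanglingRow() const {
    for (const Row& r : rows) {
      if (r.buffer->released) return true;
    }
    return false;
  }
};

// Replays steps 1-4. If attach_to_first_batch is true, O1 is attached to
// batch A (the buggy ordering); otherwise it is attached to batch B, the
// last batch that references it.
bool RowsDangle(bool attach_to_first_batch) {
  IoBuffer o1;
  o1.data = "hello world";
  RowBatch a, b;

  a.rows.push_back({&o1, 0, 5});                            // Step 1
  (attach_to_first_batch ? a : b).attached.push_back(&o1);
  a.Reset();                                                // Step 2
  b.rows.push_back({&o1, 6, 5});                            // Step 3
  return b.HasDanglingRow();                                // Step 4
}
```

In this model RowsDangle(true) reports a dangling row, while RowsDangle(false) does not, because O1 stays alive until batch B itself is reset.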