[
https://issues.apache.org/jira/browse/DRILL-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jacques Nadeau resolved DRILL-2957.
-----------------------------------
Resolution: Not A Problem
> Netty Memory Manager doesn't move empty chunks between lists
> ------------------------------------------------------------
>
> Key: DRILL-2957
> URL: https://issues.apache.org/jira/browse/DRILL-2957
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Reporter: Jacques Nadeau
> Assignee: Jacques Nadeau
> Priority: Critical
> Fix For: 1.0.0
>
>
> I'm seeing a pattern in the memory allocator and I need you to take a look at
> it. Here are the basic concepts:
> 1) We use an extension of PooledByteBufAllocator [1] called
> PooledByteBufAllocatorL.
> 2) We use many Direct Arenas (generally one per core)
> 3) Each arena has chunk lists for different occupancies (chunks that are
> empty, chunks 25% full, chunks 50% full, etc) [2]
> 4) Each of these chunk lists maintains a list of chunks. The chunks move
> from list to list as they get more or less full.
> 5) When no memory is being used, chunks move back to the empty list.
> 6) If there are excessive empty chunks, they are released back to the OS. (I
> don't remember the exact trigger here and I'm only seeing this sometimes
> right now.)
> We're running on Netty 4.0.27.
> What I'm seeing is that we don't seem to be moving the chunks back to the
> empty list as they are vacated. You can see an example output from my memory
> logging [3] that is enabled by [4]. I haven't replicated this at small scale
> but at large scale I see it consistently (30 node cluster, large group by
> query [5]).
> I want to understand this behavior better, determine if it is a bug or not
> and determine whether or not this hurts memory for subsequent queries.
> One other note, Netty will cache small amounts of memory that is allocated
> and released on the same thread for that thread. I don't believe this is a
> large amount of memory but be aware of it. It should be possible to control
> this using these settings [6].
> [1]
> https://github.com/netty/netty/blob/master/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java
> [2]
> https://github.com/netty/netty/blob/master/buffer/src/main/java/io/netty/buffer/PoolArena.java#L67
> [3] Memory log output at idle after large query (one example arena out of 32
> on perf cluster, see logs on those nodes for more info):
> ::snip::
> Chunk(s) at 0~25%:
> none
> Chunk(s) at 0~50%:
> Chunk(62194b16: 0%, 0/16777216)
> Chunk(35983868: 1%, 8192/16777216)
> Chunk(5bbfb16a: 1%, 163840/16777216)
> Chunk(1c6d277e: 1%, 8192/16777216)
> Chunk(2897b6bf: 2%, 204800/16777216)
> Chunk(287d5c71: 0%, 0/16777216)
> Chunk(s) at 25~75%:
> Chunk(61bad0ee: 0%, 0/16777216)
> Chunk(s) at 50~100%:
> Chunk(2d79a032: 0%, 0/16777216)
> Chunk(42415f4e: 0%, 0/16777216)
> Chunk(33a3bade: 0%, 0/16777216)
> Chunk(1ce7ca63: 0%, 0/16777216)
> Chunk(531e1888: 0%, 0/16777216)
> Chunk(54786a09: 0%, 0/16777216)
> Chunk(5cdcb359: 0%, 0/16777216)
> Chunk(3e40137b: 0%, 0/16777216)
> Chunk(534f0fb3: 0%, 0/16777216)
> Chunk(6301ee8a: 0%, 0/16777216)
> Chunk(6a90c3aa: 0%, 0/16777216)
> Chunk(s) at 75~100%:
> none
> Chunk(s) at 100%:
> none
> ::snip::
> [4] Enable the memory logger by enabling trace level debugging for the
> "drill.allocator" logger like this:
> <logger name="drill.allocator" additivity="false">
> <level value="trace" />
> <appender-ref ref="FILE" />
> <appender-ref ref="SOCKET" />
> </logger>
> [5] On perf cluster
> # sqllineTPCDS
> ALTER SESSION SET `exec.errors.verbose` = true;
> ALTER SESSION SET `planner.enable_multiphase_agg` = false;
> ALTER SESSION SET `store.parquet.block-size` = 134217728;
> ALTER SESSION SET `planner.enable_mux_exchange` = false;
> ALTER SESSION SET `exec.min_hash_table_size` = 67108864;
> ALTER SESSION SET `planner.enable_hashagg` = true;
> ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 29205777612;
> ALTER SESSION SET `planner.width.max_per_node` = 23;
> create table dfs.tmp.agg33 as
> select ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk ,
> ss_cdemo_sk, count(*) from `store_sales_dri30000`
> group by ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk ,
> ss_cdemo_sk;
> [6]
> https://github.com/netty/netty/blob/master/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java#L98
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)