[ https://issues.apache.org/jira/browse/DRILL-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hanifi Gunes updated DRILL-2957: -------------------------------- Assignee: Jacques Nadeau (was: Hanifi Gunes) > Netty Memory Manager doesn't move empty chunks between lists > ------------------------------------------------------------ > > Key: DRILL-2957 > URL: https://issues.apache.org/jira/browse/DRILL-2957 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow > Reporter: Jacques Nadeau > Assignee: Jacques Nadeau > Priority: Critical > Fix For: 1.0.0 > > > I'm seeing a pattern in the memory allocator and I need you to take a look at > it. Here are the basic concepts: > 1) We use an extension of PooledByteBufAllocator [1] called > PooledByteBufAllocatorL. > 2) We use many Direct Arenas (generally one per core) > 3) Each arena has chunk lists for different occupancies (chunks that are > empty, chunks 25% full, chunks 50% full, etc) [2] > 4) Each of these chunk lists maintains a list of chunks. The chunks move > from list to list as they get more or less full. > 5) When no memory is being used, chunks move back to the empty list. > 6) If there are excessive empty chunks, they are released back to the OS. (I > don't remember the exact trigger here and I'm only seeing this sometimes > right now.) > We're running on Netty 4.0.27. > What I'm seeing is that we don't seem to be moving the chunks back to the > empty list as they are vacated. You can see an example output from my memory > logging [3] that is enabled by [4]. I haven't replicated this at small scale > but at large scale I see it consistently (30 node cluster, large group by > query [5]). > I want to understand this behavior better, determine if it is a bug or not > and determine whether or not this hurts memory for subsequent queries. > One other note, Netty will cache small amounts of memory that is allocated > and released on the same thread for that thread. I don't believe this is a > large amount of memory but be aware of it. It should be possible to control > this using these settings [6]. > [1] > https://github.com/netty/netty/blob/master/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java > [2] > https://github.com/netty/netty/blob/master/buffer/src/main/java/io/netty/buffer/PoolArena.java#L67 > [3] Memory log output at idle after large query (one example arena out of 32 > on perf cluster, see logs on those nodes for more info): > ::snip:: > Chunk(s) at 0~25%: > none > Chunk(s) at 0~50%: > Chunk(62194b16: 0%, 0/16777216) > Chunk(35983868: 1%, 8192/16777216) > Chunk(5bbfb16a: 1%, 163840/16777216) > Chunk(1c6d277e: 1%, 8192/16777216) > Chunk(2897b6bf: 2%, 204800/16777216) > Chunk(287d5c71: 0%, 0/16777216) > Chunk(s) at 25~75%: > Chunk(61bad0ee: 0%, 0/16777216) > Chunk(s) at 50~100%: > Chunk(2d79a032: 0%, 0/16777216) > Chunk(42415f4e: 0%, 0/16777216) > Chunk(33a3bade: 0%, 0/16777216) > Chunk(1ce7ca63: 0%, 0/16777216) > Chunk(531e1888: 0%, 0/16777216) > Chunk(54786a09: 0%, 0/16777216) > Chunk(5cdcb359: 0%, 0/16777216) > Chunk(3e40137b: 0%, 0/16777216) > Chunk(534f0fb3: 0%, 0/16777216) > Chunk(6301ee8a: 0%, 0/16777216) > Chunk(6a90c3aa: 0%, 0/16777216) > Chunk(s) at 75~100%: > none > Chunk(s) at 100%: > none > ::snip:: > [4] Enable the memory logger by enabling trace level debugging for the > "drill.allocator" logger like this: > <logger name="drill.allocator" additivity="false"> > <level value="trace" /> > <appender-ref ref="FILE" /> > <appender-ref ref="SOCKET" /> > </logger> > [5] On perf cluster > # sqllineTPCDS > ALTER SESSION SET `exec.errors.verbose` = true; > ALTER SESSION SET `planner.enable_multiphase_agg` = false; > ALTER SESSION SET `store.parquet.block-size` = 134217728; > ALTER SESSION SET `planner.enable_mux_exchange` = false; > ALTER SESSION SET `exec.min_hash_table_size` = 67108864; > ALTER SESSION SET `planner.enable_hashagg` = true; > ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 29205777612; > ALTER SESSION SET `planner.width.max_per_node` = 23; > create table dfs.tmp.agg33 as > select ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk , > ss_cdemo_sk, count(*) from `store_sales_dri30000` > group by ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk , > ss_cdemo_sk; > [6] > https://github.com/netty/netty/blob/master/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java#L98 -- This message was sent by Atlassian JIRA (v6.3.4#6332)