[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935232#comment-15935232 ]

Prasanth Jayachandran commented on HIVE-16180:
----------------------------------------------

lgtm, +1

> LLAP: Native memory leak in EncodedReader
> -----------------------------------------
>
>                 Key: HIVE-16180
>                 URL: https://issues.apache.org/jira/browse/HIVE-16180
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Sergey Shelukhin
>            Priority: Critical
>         Attachments: DirectCleaner.java, FullGC-15GB-cleanup.png,
> Full-gc-native-mem-cleanup.png, HIVE-16180.03.patch, HIVE-16180.04.patch,
> HIVE-16180.1.patch, HIVE-16180.2.patch, Native-mem-spike.png
>
>
> Observed this in an internal test run. There is a native memory leak in Orc
> EncodedReaderImpl that can cause the YARN pmem monitor to kill the container
> running the daemon. Direct byte buffers are null'ed out, but their native
> memory is not guaranteed to be cleaned until the next Full GC. To show this
> issue, attaching a small test program that allocates 3x256MB direct byte
> buffers. The first buffer is null'ed out, but its native memory is still in
> use. The second buffer uses Cleaner to clean up the native allocation. The
> third buffer is also null'ed, but this time System.gc() is invoked, which
> cleans up all native memory. Output from the test program is below:
> {code}
> Allocating 3x256MB direct memory..
> Native memory used: 786432000
> Native memory used after data1=null: 786432000
> Native memory used after data2.clean(): 524288000
> Native memory used after data3=null: 524288000
> Native memory used without gc: 524288000
> Native memory used after gc: 0
> {code}
> Longer term improvements/solutions:
> 1) Use DirectBufferPool from hadoop or netty's
> https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as
> direct byte buffer allocations are expensive (System.gc() + 100ms thread
> sleep).
> 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
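The behavior described in the issue (native memory stays reserved after the reference to a direct buffer is dropped, until a GC runs) can be observed with standard JDK APIs. The following is a minimal sketch, not the attached DirectCleaner.java: the class name `DirectMemoryProbe`, the 64MB buffer size, and the 100ms sleep are assumptions chosen for illustration. It reads the JVM's "direct" buffer pool through `BufferPoolMXBean`. The printed numbers depend on the JVM and on when the collector actually runs, so no expected output is given.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

final class DirectMemoryProbe {
    private DirectMemoryProbe() {}

    /** Returns the bytes currently held by the JVM's "direct" buffer pool, or -1 if not reported. */
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: 64MB instead of the 256MB buffers used in the issue.
        final int size = 64 * 1024 * 1024;

        ByteBuffer data = ByteBuffer.allocateDirect(size);
        System.out.println("Direct memory after allocation:  " + directMemoryUsed());

        // Dropping the only reference does not release the native memory by itself.
        data = null;
        System.out.println("Direct memory after data=null:   " + directMemoryUsed());

        // System.gc() is only a hint; the short sleep gives reference processing time to run.
        System.gc();
        Thread.sleep(100);
        System.out.println("Direct memory after System.gc(): " + directMemoryUsed());
    }
}
```

Running `main` with a small heap and a large direct buffer makes the gap between "reference dropped" and "native memory released" easier to see, since low heap occupancy is exactly the case where a Full GC may not be triggered for a long time.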
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935180#comment-15935180 ]

Sergey Shelukhin commented on HIVE-16180:
-----------------------------------------

Failure is unrelated. [~prasanth_j] can you take a look?
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934126#comment-15934126 ]

Hive QA commented on HIVE-16180:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12859687/HIVE-16180.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10480 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4260/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4260/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4260/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12859687 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933299#comment-15933299 ]

Sergey Shelukhin commented on HIVE-16180:
-----------------------------------------

Tests at https://builds.apache.org/job/PreCommit-HIVE-Build/4234/testReport/ (HiveQA still fails to post). Failures need to be looked at.

> Assignee: Prasanth Jayachandran
> Attachments: DirectCleaner.java, FullGC-15GB-cleanup.png, Full-gc-native-mem-cleanup.png, HIVE-16180.03.patch, HIVE-16180.1.patch, HIVE-16180.2.patch, Native-mem-spike.png
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923305#comment-15923305 ]

Prasanth Jayachandran commented on HIVE-16180:
----------------------------------------------

This needs some more changes. In some cases, this tries to clean a ByteBuffer slice (a view of the original ByteBuffer), which throws an NPE in the reflection get and disables the cleaner for the rest of the buffers. Will post a new patch with the fix.

> Assignee: Prasanth Jayachandran
> Attachments: DirectCleaner.java, FullGC-15GB-cleanup.png, Full-gc-native-mem-cleanup.png, HIVE-16180.1.patch, HIVE-16180.2.patch, Native-mem-spike.png
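The slice problem mentioned in the comment above (one view buffer making the cleaner fail and then disabling it for every later buffer) can be avoided by treating each cleanup attempt as independent and reporting failure instead of throwing. The sketch below is not the HIVE-16180 patch; the class name `SafeDirectCleaner` and the method `tryClean` are hypothetical. It assumes a JDK 9+ runtime where `sun.misc.Unsafe.invokeCleaner(ByteBuffer)` exists, looks it up reflectively, and simply returns `false` on JVMs where that hook is missing or inaccessible.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

final class SafeDirectCleaner {
    private SafeDirectCleaner() {}

    /**
     * Tries to release the native memory behind buf immediately.
     * Returns false, rather than throwing, for null or heap buffers, for
     * slices and duplicates, and when the cleanup hook is unavailable, so
     * that one rejected buffer never blocks cleanup of later buffers.
     * The caller must not touch buf again after a successful call.
     */
    static boolean tryClean(ByteBuffer buf) {
        if (buf == null || !buf.isDirect()) {
            return false;
        }
        try {
            // Assumption: JDK 9+; older JVMs fail the method lookup below.
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Object unsafe = theUnsafe.get(null);
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(unsafe, buf);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Views are rejected with IllegalArgumentException, which arrives
            // here wrapped in InvocationTargetException; treat it as "skipped".
            return false;
        }
    }
}
```

A caller would invoke `SafeDirectCleaner.tryClean(buffer)` on the original allocation only, and fall back to waiting for GC whenever it returns `false`. The important design point is that the failure is scoped to the one buffer, not cached as a global "cleaner disabled" state.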
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15915843#comment-15915843 ]

Sergey Shelukhin commented on HIVE-16180:
-----------------------------------------

+1 assuming test is unrelated
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906144#comment-15906144 ]

Hive QA commented on HIVE-16180:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12857456/HIVE-16180.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10339 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=141)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4088/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4088/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4088/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12857456 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906116#comment-15906116 ]

Hive QA commented on HIVE-16180:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12857456/HIVE-16180.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10339 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=153)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4086/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4086/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4086/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12857456 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905976#comment-15905976 ]

Prasanth Jayachandran commented on HIVE-16180:
----------------------------------------------

Agreed. Assuming ZCR already takes care of it.

> Assignee: Prasanth Jayachandran
> Attachments: DirectCleaner.java, FullGC-15GB-cleanup.png, Full-gc-native-mem-cleanup.png, HIVE-16180.1.patch, Native-mem-spike.png
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905972#comment-15905972 ]

Sergey Shelukhin commented on HIVE-16180:
-----------------------------------------

What I mean for ZCR is that we should not call this when ZCR is on.
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905966#comment-15905966 ]

Prasanth Jayachandran commented on HIVE-16180:
----------------------------------------------

Well, if a Full GC is not triggered, this can really be problematic. Worst case, no Full GC (very low heap occupancy) plus lots of off-heap BB allocations can take down the system. By nullifying the buffer, we lose the reference and the opportunity for cleanup.
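The worst case raised in the comment above (many off-heap allocations with no Full GC to reclaim them) is also what the pooling suggestion in the issue description is aimed at: if buffers are reused, the native footprint is bounded by the pool rather than by GC timing. The sketch below is a deliberately small fixed-size pool, not Hadoop's DirectBufferPool or netty's PooledByteBufAllocator; the class name `SimpleDirectBufferPool` and its parameters are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

final class SimpleDirectBufferPool {
    private final int bufferSize;
    private final int maxPooled;
    private final Deque<ByteBuffer> free = new ArrayDeque<>();

    SimpleDirectBufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    /** Hands out a pooled buffer if one is available, otherwise allocates a new direct buffer. */
    synchronized ByteBuffer acquire() {
        ByteBuffer buf = free.pollFirst();
        if (buf == null) {
            buf = ByteBuffer.allocateDirect(bufferSize);
        }
        buf.clear(); // reset position and limit; old contents are not zeroed
        return buf;
    }

    /** Returns a buffer for reuse; buffers of the wrong kind or beyond the cap are left to the GC. */
    synchronized void release(ByteBuffer buf) {
        if (buf != null && buf.isDirect() && buf.capacity() == bufferSize
                && free.size() < maxPooled) {
            free.addFirst(buf);
        }
    }

    /** Number of idle buffers currently held by the pool. */
    synchronized int pooled() {
        return free.size();
    }
}
```

With this shape, a reader would call `acquire()` before decoding and `release(buf)` when done, so steady-state work reuses the same native memory. Buffers that exceed `maxPooled` still rely on GC (or an explicit cleaner) for release, so the pool reduces the problem but does not remove it.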
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905965#comment-15905965 ]

Prasanth Jayachandran commented on HIVE-16180:
----------------------------------------------

Yes. Orc-side changes for ZCR might also be required. I haven't looked into it yet. Will do in a follow-up.
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905958#comment-15905958 ]

Sergey Shelukhin commented on HIVE-16180:
-----------------------------------------

Also, it's not really a leak; it's just a delayed cleanup.
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905957#comment-15905957 ]

Prasanth Jayachandran commented on HIVE-16180:
----------------------------------------------

[~sershe] can you please take a look?
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905956#comment-15905956 ]

Sergey Shelukhin commented on HIVE-16180:
-----------------------------------------

+1 pending tests. Should this also not be done when the zero-copy reader is enabled?