[ https://issues.apache.org/jira/browse/HBASE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15329566#comment-15329566 ]
Eshcar Hillel edited comment on HBASE-15999 at 6/14/16 2:17 PM:
----------------------------------------------------------------

[~anoop.hbase] is right. The NPE is just the symptom. Only one thread at a time should run the memstore compactor.
I believe the problem is in CompactingMemStore. Replacing lines 281-284
{code}
    if (allowCompaction.get()) {
      inMemoryFlushInProgress.set(true);
      compactor.startCompaction();
    }
{code}
with
{code}
    if (allowCompaction.get() && inMemoryFlushInProgress.compareAndSet(false, true)) {
      compactor.startCompaction();
    }
{code}
should solve the problem.
Can you verify this in the tests you are running?
Also keep in mind that the implementation of MemStoreCompactor is being massively changed by HBASE-14921.


was (Author: eshcar):
[~anoop.hbase] is right. The NPE is just the symptom. Only one thread at a time should run the memstore compactor.
I believe the problem is in CompactingMemStore. Replacing lines 281-284
{code}
    if (allowCompaction.get()) {
      inMemoryFlushInProgress.set(true);
      compactor.startCompaction();
    }
{code}
with
{code}
    if (allowCompaction.get() && inMemoryFlushInProgress.compareAndSet(false, true)) {
      compactor.startCompaction();
    }
{code}
should solve the problem.
Can you verify this in the tests you are running?

> NPE in MemstoreCompactor
> ------------------------
>
>                 Key: HBASE-15999
>                 URL: https://issues.apache.org/jira/browse/HBASE-15999
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15999.patch, HBASE-15999_1.patch
>
>
> In the INMEMORY_COMPACTION_POOL ThreadPoolExecutor for every thread the
> current thread name is also appended.
> Since we are using a pool, the name keeps getting appended and we end up
> with names like this:
> {code}
> "B.defaultRpcServer.handler=89,queue=9,port=16041-inmemoryCompactions-1465492533442-inmemoryCompactions-1465492548754-inmemoryCompactions-1465492548913-inmemoryCompactions-1465492549625-inmemoryCompactions-1465492549956-inmemoryCompactions-1465492567040-inmemoryCompactions-1465492567160-inmemoryCompactions-1465492578465-inmemoryCompactions-1465492578707-inmemoryCompactions-1465492579292-inmemoryCompactions-1465492579357-inmemoryCompactions-1465492579786-inmemoryCompactions-1465492580059-inmemoryCompactions-1465492589975-inmemoryCompactions-1465492590192-inmemoryCompactions-1465492590484-inmemoryCompactions-1465492591144-inmemoryCompactions-1465492592603-inmemoryCompactions-1465492592799-inmemoryCompactions-1465492597106-inmemoryCompactions-1465492602925-inmemoryCompactions-1465492606620-inmemoryCompactions-1465492651478-inmemoryCompactions-1465492653460-inmemoryCompactions-1465492677020-inmemoryCompactions-1465492680857-inmemoryCompactions-1465492681989-inmemoryCompactions-1465492721818-inmemoryCompactions-1465492723562-inmemoryCompactions-1465492724801-inmemoryCompactions-1465492726665-inmemoryCompactions-1465492745750-inmemoryCompactions-1465492745964-inmemoryCompactions-1465492746578-inmemoryCompactions-1465492756867-inmemoryCompactions-1465492764727-inmemoryCompactions-1465492766944-inmemoryCompactions-1465492767098-inmemoryCompactions-1465492785298-inmemoryCompactions-1465492788334-inmemoryCompactions-1465492795954-inmemoryCompactions-1465493047265-inmemoryCompactions-1465493091530-inmemoryCompactions-1465493185684"
> #6006 daemon prio=5 os_prio=0 tid=0x000000000daa6800 nid=0x454a runnable
> [0x00007f50fd0b9000]
> {code}
> We were surprised to see so many threads being created since, as Anoop
> pointed out, the pool size is 10 and there is no setting for the threads to
> die. The reason is that we have a multi-threading issue in
> MemStoreCompactor. The MemStoreCompactor keeps the StoreScanner and
> MemstoreScanner as state variables, and every time a new in-memory flush
> request comes in we just instantiate new ones. At the end we try to release
> the resources, where the scanner is closed and nullified. But when there are
> multiple requests, the instance may already have been updated or nullified
> by another thread. This causes an NPE in releaseResources.
> {code}
> Exception in thread "B.defaultRpcServer.handler=76,queue=6,port=16041-inmemoryCompactions-1465906554314" java.lang.NullPointerException
>         at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.releaseResources(MemStoreCompactor.java:108)
>         at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:144)
>         at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.startCompaction(MemStoreCompactor.java:88)
>         at org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:287)
>         at org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:356)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> {code}
> So every time this happens the thread dies and a new one is created, which I
> confirmed from the stack traces by adding some logs. This is why we were
> creating a lot of threads and the name simply kept getting appended. From
> this we can see that appending the handler name is fine, but the main issue
> is the NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
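
The compareAndSet guard proposed in the comment can be tried outside HBase. What follows is a minimal sketch, not HBase code: the class InMemoryFlushGuardSketch, its counters, the sleep that stands in for compaction work, and the reset of the flag in a finally block are all assumptions made for illustration. Only the field names allowCompaction and inMemoryFlushInProgress and the shape of the if condition are taken from the comment. The sketch counts how many threads are inside the stand-in compaction at the same time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for CompactingMemStore; illustrative only.
final class InMemoryFlushGuardSketch {
  private final AtomicBoolean allowCompaction = new AtomicBoolean(true);
  private final AtomicBoolean inMemoryFlushInProgress = new AtomicBoolean(false);
  private final AtomicInteger running = new AtomicInteger();    // threads compacting right now
  private final AtomicInteger maxRunning = new AtomicInteger(); // highest value of 'running' seen

  // Returns true only for the caller that won the compareAndSet.
  boolean flushInMemory() {
    // Check and set happen in one atomic step, so two threads cannot both
    // observe "false" and both proceed, which get()/set(true) allowed.
    if (allowCompaction.get() && inMemoryFlushInProgress.compareAndSet(false, true)) {
      try {
        startCompaction();
      } finally {
        // Assumption of this sketch: the winner clears the flag when done.
        inMemoryFlushInProgress.set(false);
      }
      return true;
    }
    return false;
  }

  // Stand-in for compactor.startCompaction(): records concurrency, then "works".
  private void startCompaction() {
    int now = running.incrementAndGet();
    maxRunning.accumulateAndGet(now, Math::max);
    try {
      Thread.sleep(5);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      running.decrementAndGet();
    }
  }

  // Fires many concurrent flush requests and reports the peak number of
  // threads that were inside startCompaction() at once.
  static int maxConcurrentCompactions(int threads, int requestsPerThread) {
    InMemoryFlushGuardSketch store = new InMemoryFlushGuardSketch();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch go = new CountDownLatch(1);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        try {
          go.await(); // release all workers together to maximize contention
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        for (int i = 0; i < requestsPerThread; i++) {
          store.flushInMemory();
        }
      });
    }
    go.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return store.maxRunning.get();
  }

  public static void main(String[] args) {
    System.out.println(maxConcurrentCompactions(8, 20));
  }
}
```

With the guard in place, the reported peak stays at one no matter how many threads request a flush. Replacing the condition with the original get() followed by set(true) lets two threads pass the check before either sets the flag, which is the window in which one thread can nullify state that another is still using.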