Yes. The below looks viable (though strange we have not seen it up to this). The profiler may have slowed things to bring on the deadlock -- or the run up to the high water mark -- but its still a deadlock. Please file a critical priority issue.
If you have a patch, that'd be excellent. Thanks for digging in on this, St.Ack 2011/4/28 Zhoushuaifeng <[email protected]>: > Thanks, I will do more test. > Maybe the deadlock hapened like this? Please point it out if it's wrong. > > 1,One handler is handling put op, and reclaimMemStoreMemory, but the memory > is isAboveHighWaterMark, so this handler locked the memstoreflusher until > global mem is lower: > > public synchronized void reclaimMemStoreMemory() { > if (isAboveHighWaterMark()) { > lock.lock(); > try { > while (isAboveHighWaterMark() && !server.isStopped()) { > wakeupFlushThread(); > try { > // we should be able to wait forever, but we've seen a bug where > // we miss a notify, so put a 5 second bound on it at least. > flushOccurred.await(5, TimeUnit.SECONDS); > } catch (InterruptedException ie) { > Thread.currentThread().interrupt(); > } > } > } finally { > lock.unlock(); > > 2, flushforGlobalPressure is trigered, but to flush the memstore, it needed > to lock the memstoreflusher: > > private boolean flushRegion(final HRegion region, final boolean > emergencyFlush) { > synchronized (this.regionsInQueue) { > FlushRegionEntry fqe = this.regionsInQueue.remove(region); > if (fqe != null && emergencyFlush) { > // Need to remove from region from delay queue. When NOT an > // emergencyFlush, then item was removed via a flushQueue.poll. > flushQueue.remove(fqe); > } > lock.lock(); > } > > 3, because lock is locked by the ipchandler of put op, the flushRegion will > never get the lock and flush will never happen. > 4, no flush, memory stay in AboveHighWaterMark state, and never unlock, so, > deadlock happend. > > Is it right? > > Zhou Shuaifeng(Frank) > > > -----邮件原件----- > 发件人: [email protected] [mailto:[email protected]] 代表 Jean-Daniel Cryans > 发送时间: 2011年4月29日 3:09 > 收件人: [email protected] > 主题: Re: found one deadlock on hbase? > > Like I said in the previous thread you made about this issue, it seems > that the YourKit profiler is doing something unexpected from the HBase > POV. Can you try running without it and see if it still happens? > > J-D > > 2011/4/28 Zhoushuaifeng <[email protected]>: >> Thanks, version is 0.90.1 >> >> Zhou Shuaifeng(Frank) >> >> -----邮件原件----- >> 发件人: [email protected] [mailto:[email protected]] 代表 Stack >> 发送时间: 2011年4月28日 13:10 >> 收件人: [email protected] >> 抄送: Yanlijun >> 主题: Re: found one deadlock on hbase? >> >> Must be a deadlock if the dumb JVM can figure it out. What version of >> hbase please so I can dig into source code? >> Thanks, >> St.Ack >> >> >
