Hmm, maybe a deadlock between a flush and a Put, depending on the order in which a Put acquires its locks.
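To make the question concrete, here is a rough Java sketch, not HBase code. It uses two plain ReentrantReadWriteLock objects as stand-ins for newScannerLock and splitsAndClosesLock, with the lock modes listed in the original message quoted below. The class name, the latches, and the assumption that a Put takes newScannerLock before splitsAndClosesLock are all mine. It also ignores any writer queued on splitsAndClosesLock (for example a region close), which could change the picture.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the flush/Put interaction; none of this is HBase code.
class FlushPutOrdering {

    /** Returns {flushWaitedWhilePutActive, putGotReadLock, flushProceededAfterPut}. */
    static boolean[] run() {
        ReentrantReadWriteLock newScannerLock = new ReentrantReadWriteLock();
        ReentrantReadWriteLock splitsAndClosesLock = new ReentrantReadWriteLock();
        CountDownLatch flushHoldsRead = new CountDownLatch(1);
        CountDownLatch firstAttemptDone = new CountDownLatch(1);
        CountDownLatch putFinished = new CountDownLatch(1);
        boolean[] result = new boolean[3];

        // "Put" (this thread) is assumed to take the newScannerLock write lock first.
        newScannerLock.writeLock().lock();

        Thread flush = new Thread(() -> {
            // "Flush" takes splitsAndClosesLock (read), then wants newScannerLock (write).
            splitsAndClosesLock.readLock().lock();
            flushHoldsRead.countDown();
            try {
                boolean early = newScannerLock.writeLock().tryLock(200, TimeUnit.MILLISECONDS);
                if (early) newScannerLock.writeLock().unlock();
                result[0] = !early;               // expected: flush has to wait for the Put
                firstAttemptDone.countDown();
                putFinished.await();
                boolean late = newScannerLock.writeLock().tryLock(2, TimeUnit.SECONDS);
                if (late) newScannerLock.writeLock().unlock();
                result[2] = late;                 // expected: flush proceeds once the Put is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                firstAttemptDone.countDown();     // never leave the Put thread waiting
                splitsAndClosesLock.readLock().unlock();
            }
        });
        flush.start();

        try {
            flushHoldsRead.await();
            // The Put now asks for the splitsAndClosesLock read lock while flush holds it in read mode.
            boolean putGotRead = splitsAndClosesLock.readLock().tryLock();
            result[1] = putGotRead;
            firstAttemptDone.await();
            if (putGotRead) splitsAndClosesLock.readLock().unlock();
            newScannerLock.writeLock().unlock();
            putFinished.countDown();
            flush.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

Under these assumptions the flush only waits for the Put: both sides take splitsAndClosesLock in read mode, so the opposite acquisition order alone does not produce a cycle. If a Put took an exclusive lock that flush also needs, in the opposite order, the result would be different.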
-----Original Message-----
From: Jonathan Gray [mailto:jl...@streamy.com]
Sent: Friday, March 05, 2010 1:49 AM
To: hbase-dev@hadoop.apache.org
Subject: RE: locking hierarchy in HBase

Good stuff!

Regionservers "quiesce" when they shut down. Shutting down implies flushing out all your MemStores and then closing all your Regions.

For potential deadlocks... Flush gets a readLock on splitsAndCloses, a scanner opens and gets a readLock on newScannerLock, and the flush now needs a writeLock on newScannerLock. Should be okay?

Need to look more at the code but this is really helpful, thanks Dhruba.

JG

-----Original Message-----
From: Dhruba Borthakur [mailto:dhr...@gmail.com]
Sent: Friday, March 05, 2010 12:41 AM
To: hbase-dev@hadoop.apache.org
Subject: locking hierarchy in HBase

I am trying to come up with a locking hierarchy document for all the locks used in HBase. Any details/feedback anybody can share to make this document better would be much appreciated.

This hierarchy can be interpreted as: a thread that has a lock C cannot acquire A or B, and so on.

A. HRegionServer.lock
   This lock protects the list of regions maintained by a region server. The lock protects a data structure called onlineRegions that maintains a map of region names to HRegion.
   Question: what is region quiesce?

B. HRegion.newScannerLock
   This lock ensures that puts and deletes are atomic. Also, one cannot close a region unless all puts, deletes and scans are completed.
   put gets a writelock
   delete gets a writelock
   scanner gets a readlock
   regionclose gets a writelock
   flushcache gets a writelock to switch between memstore and new store files

C. HRegion.splitsAndClosesLock
   This lock ensures that compactions and flushCaches are completed before a region close is successful. I do not yet understand why put and delete have to acquire this lock.
   put gets a readlock to check if record already exists
   delete gets a readlock
   compaction gets a readlock
   flushCache gets a readlock
   regionclose gets a writelock

D. RowLock
   This is for atomic updates of a row.
   put gets a rowlock

E. HRegion.updatesLock
   put gets a readlock to insert entire key into memstore and hlog
   delete gets a readlock
   internalFlushCache gets a writelock to snapshot memstore and hlog

F. HRegion.synchronized_splitLock

G. HRegion.synchronized_writestate

H. MemStore.lock

I. Store.compactLock
   compaction gets a writelock: allows only 1 compaction at a time

J. Store.lock
   add(KeyValue) gets a readlock
   delete(KeyValue) gets a readlock
   get(KeyValue) gets a readlock
   close gets a writelock
   completeCompaction gets a writelock

---------------------------------------------

One possible lock hierarchy violation: HRegion.flushcache holds the splitsAndClosesLock.readlock and then invokes internalFlushCache. internalFlushCache acquires the newScannerLock to switch between memstore and new store files. Could this cause a problem?

--
Connect to me at http://www.facebook.com/dhruba
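
The rule stated in the original message ("a thread that has a lock C cannot acquire A or B") could be checked mechanically. The following is only a sketch of that idea in Java, not anything that exists in HBase: the LockRankChecker class, the Rank enum, and its constant names are hypothetical, and only the first five locks of the list are modeled. Each thread records the ranks it holds and rejects a request for a lock that ranks above one it already holds.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical per-thread checker for the A..J ordering; not part of HBase.
class LockRankChecker {

    // Declaration order mirrors the letters in the list: A is outermost.
    enum Rank {
        A_REGION_SERVER_LOCK,
        B_NEW_SCANNER_LOCK,
        C_SPLITS_AND_CLOSES_LOCK,
        D_ROW_LOCK,
        E_UPDATES_LOCK
    }

    private static final ThreadLocal<Deque<Rank>> HELD =
            ThreadLocal.withInitial(ArrayDeque::new);

    /** Call just before taking a lock; throws if the hierarchy would be violated. */
    static void acquired(Rank rank) {
        Deque<Rank> held = HELD.get();
        for (Rank h : held) {
            if (rank.compareTo(h) < 0) {
                throw new IllegalStateException(
                        "hierarchy violation: " + rank + " requested while holding " + h);
            }
        }
        held.push(rank);
    }

    /** Call after releasing a lock. */
    static void released(Rank rank) {
        HELD.get().remove(rank); // removes the first (most recent) matching entry
    }

    /** Convenience for experiments: does taking 'second' while holding 'first' violate the order? */
    static boolean violates(Rank first, Rank second) {
        acquired(first);
        try {
            acquired(second);
            released(second);
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            released(first);
        }
    }

    public static void main(String[] args) {
        // B then C follows the hierarchy.
        System.out.println(violates(Rank.B_NEW_SCANNER_LOCK, Rank.C_SPLITS_AND_CLOSES_LOCK));
        // C then B is the flushcache -> internalFlushCache order questioned above.
        System.out.println(violates(Rank.C_SPLITS_AND_CLOSES_LOCK, Rank.B_NEW_SCANNER_LOCK));
    }
}
```

With this ordering the checker flags the flushcache path (C held, then B requested), which matches the possible violation raised at the end of the original message. Whether that order can actually deadlock still depends on the lock modes involved, which this sketch does not model.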