I dropped the patched jar (from http://people.apache.org/~stack/hbase-0.20.4-2524.jar) into my HBase installation. All reducers run to completion (for multiple consecutive jobs) and everything appears to be fixed now. Also, from the thread dumps, I think it is safe to say that mine is the same problem as hbase-2545.
I am going back to 0.20.3 for now and will wait for 0.20.5 to become available and upgrade to that version. Thanks a lot for all the help!

Best regards,
Friso

On May 14, 2010, at 8:33 PM, Stack wrote:
> The below looks same as blockage seen in logs posted to hbase-2545 by Kris Jirapinyo.
>
> Todd has made a fix and added a test.
>
> We'll roll a 0.20.5 hbase with this fix and a fix for hbase-2541 (missing licenses from the head of some source files) soon as we get confirmation that Todd's fix works for Kris Jirapinyo's seemingly similar issue.
>
> Thanks,
> St.Ack
>
> On Fri, May 14, 2010 at 9:07 AM, Todd Lipcon <t...@cloudera.com> wrote:
>> It appears like we might be stuck in an infinite loop here:
>>
>> "IPC Server handler 9 on 60020" daemon prio=10 tid=0x00002aaeb42f7800 nid=0x6508 runnable [0x00000000445bb000]
>>    java.lang.Thread.State: RUNNABLE
>>      at org.apache.hadoop.hbase.regionserver.ExplicitColumnTracker.checkColumn(ExplicitColumnTracker.java:128)
>>      at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:165)
>>      at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:176)
>>
>> It's holding a lock that some other threads are blocked on. In both of your pastes, there are some threads stuck here.
>>
>> JD, any thoughts? Looks like you made some changes to this code for 0.20.4.
>>
>> -Todd
>>
>> On Fri, May 14, 2010 at 6:56 AM, Friso van Vollenhoven <fvanvollenho...@xebia.com> wrote:
>>> Hi Todd,
>>>
>>> The row counting works fine. It is quite slow, I have to say. But I have never used row counting from the shell before, so I don't know what performance to expect from it or how it is implemented. It's just that a regular scan from software is way faster.
>>>
>>> Also, from our application we do a full table scan to populate an in-memory index of row keys, because we need to be able to quickly determine if a certain key exists or not. I triggered this scan from our UI while there were hanging reducers. This also works fine. There are close to 5 million records in the table and I checked in the web interface that the table is divided across all 4 region servers, so this process should hit them all.
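A rough sketch of that kind of key-index scan against the 0.20-era client API (the table name, the column layout implied by FirstKeyOnlyFilter, and the use of a String set here are illustrative assumptions, not the application's actual code; error handling is omitted):

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class RowKeyIndexLoader {

  // Scans the whole table and collects every row key into an in-memory set,
  // so that key existence can later be checked without going back to HBase.
  public static Set<String> loadKeys(String tableName) throws IOException {
    HTable table = new HTable(new HBaseConfiguration(), tableName);
    Scan scan = new Scan();
    scan.setFilter(new FirstKeyOnlyFilter()); // only row keys are needed, not the row contents
    scan.setCaching(1000);                    // fetch rows from the region servers in batches

    Set<String> keys = new HashSet<String>();
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result result : scanner) {
        keys.add(Bytes.toString(result.getRow()));
      }
    } finally {
      scanner.close();
    }
    return keys;
  }
}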
>>> The earlier jstacks of the region servers were taken when the reducers (clients) were hanging, before the shutdown was requested. Some 3 to 5 reduce tasks hang, not all of them, but surely more than just one.
>>>
>>> Because of your question about what is locked up (client or region server), I SSH'ed into each of the worker machines after giving HBase the shutdown signal (so the shutdown sequence started) and tried to see if the region server was still running and if so, shut down each individual region server manually (doing 'hbase-daemon.sh stop regionserver' on each, I'm glad there are only 4 nodes). I found that:
>>> - one of the region servers actually shut down normally (worker3)
>>> - two region servers shut down normally after the hbase-daemon.sh command (worker4 and worker1)
>>> - one region server does not shut down (worker2)
>>>
>>> I put some additional info on pastebin.
>>>
>>> Here is the jstack of worker2 (the hanging one): http://pastebin.com/5V0UZi7N
>>> There are two jstack outputs, one from before the shutdown command was given and one (starting at line 946) from after the shutdown command was given.
>>>
>>> Here are the logs of that region server: http://pastebin.com/qCXSKR2A
>>> I set the log level for org.apache.hadoop.hbase to DEBUG before doing all this, so it's more verbose (I don't know if this helps).
>>>
>>> So, it appears that it is one of the region servers that is locked up, but only for some connections while it can still serve other connections normally. From the locked up region server logs, it looks like the shutdown sequence runs completely, but the server just won't die afterwards (because of running non-daemon threads; maybe it should just do a System.exit() if all cleanup is successful). At least nothing gets corrupted, which is nice. Of course I am still trying to find out why things get locked up in the first place.
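The "won't die" behaviour is consistent with how the JVM exits: it only terminates on its own once every non-daemon thread has returned, so a single leftover thread keeps the process alive even after the shutdown sequence has finished. A contrived stand-alone illustration (not HBase code; the thread name is made up):

public class NonDaemonThreadHang {

  public static void main(String[] args) {
    Thread chore = new Thread(new Runnable() {
      public void run() {
        // A loop that is never interrupted, standing in for a leftover worker thread.
        while (true) {
          try {
            Thread.sleep(1000);
          } catch (InterruptedException e) {
            return;
          }
        }
      }
    }, "leftover-chore");
    // chore.setDaemon(true); // as a daemon thread it would not keep the JVM alive
    chore.start();

    System.out.println("shutdown sequence finished");
    // main() returns here, but the process keeps running because "leftover-chore"
    // is a non-daemon thread. An explicit System.exit(0) at this point would
    // terminate the JVM regardless of any remaining non-daemon threads.
  }
}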
>>> I did this test twice today. During the first run it was a different region server that was hanging, so I think it has nothing to do with a problem related to that specific machine.
>>>
>>> My next step is to go through code (including HBase's, so it will take me some time...) and see what exactly happens in our scenario, because from my current knowledge the jstack outputs don't mean enough to me.
>>>
>>> Friso
>>>
>>> On May 13, 2010, at 7:09 PM, Todd Lipcon wrote:
>>>
>>>> Hi Friso,
>>>>
>>>> When did you take the jstack dumps of the region servers? Was it when the reduce tasks were still hanging?
>>>>
>>>> Do all of the reduce tasks hang or is it just one that gets stuck?
>>>>
>>>> If, once the reduce tasks are hung, you open the hbase shell and run "count 'mytable', 100000" does it successfully count the rows?
>>>>
>>>> (I'm trying to determine if the client is locked up, or one of the RSes is locked up)
>>>>
>>>> Enjoy your holiday!
>>>>
>>>> -Todd
>>>>
>>>> On Thu, May 13, 2010 at 12:38 AM, Friso van Vollenhoven <fvanvollenho...@xebia.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I was kind of hoping that this was a known thing and I was just overlooking something. Apparently it requires more investigation.
>>>>>
>>>>> I reproduced the situation and gathered some more info. I still have to look through everything myself a bit more (today is a national holiday in The Netherlands, so not much working), but I posted some info to pastebin.
>>>>>
>>>>> I did the following sequence (with HBase 0.20.4):
>>>>> - startup HBase (waited for all the regions to come online and let it settle)
>>>>> - startup our application
>>>>> - wait for the importer job to hang (it only happened on the second run, which started 15 reducers; the first run was really small and only one key was generated, so just one reducer)
>>>>> - kill the hanging importer job (hadoop job -kill)
>>>>> - try to shutdown HBase (as I type this it is still producing dots on my console)
>>>>>
>>>>> The HBase master logs are here (includes shutdown attempt): http://pastebin.com/PYpPVcyK
>>>>> The jstacks are here:
>>>>> - HMaster: http://pastebin.com/Da6jCAuA (this includes two thread dumps, one during operation with the hanging clients and one during hanging shutdown)
>>>>> - RegionServer 1: http://pastebin.com/5dQXfxCn
>>>>> - RegionServer 2: http://pastebin.com/XWwBGXYC
>>>>> - RegionServer 3: http://pastebin.com/mDgWbYGV
>>>>> - RegionServer 4: http://pastebin.com/XDR14bth
>>>>>
>>>>> As you can see in the master logs, the shutdown cannot get a thread called Thread-10 to stop running. The trace for that thread looks like this:
>>>>>
>>>>> "Thread-10" prio=10 tid=0x000000004d218800 nid=0x1e73 in Object.wait() [0x00000000427a7000]
>>>>>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>>>      at java.lang.Object.wait(Native Method)
>>>>>      - waiting on <0x00002aaab364c9d0> (a java.lang.Object)
>>>>>      at org.apache.hadoop.hbase.util.Sleeper.sleep(Sleeper.java:89)
>>>>>      - locked <0x00002aaab364c9d0> (a java.lang.Object)
>>>>>      at org.apache.hadoop.hbase.Chore.run(Chore.java:76)
>>>>>
>>>>> I still have no clue what happened, but I will investigate a bit more tomorrow.
>>>>>
>>>>> Thanks for the responses.
>>>>>
>>>>> Friso
>>>>>
>>>>> On May 12, 2010, at 9:02 PM, Todd Lipcon wrote:
>>>>>
>>>>>> Hi Friso,
>>>>>>
>>>>>> Also, if you can capture a jstack of the regionservers at the time that would be great.
>>>>>>
>>>>>> -Todd
>>>>>>
>>>>>> On Wed, May 12, 2010 at 9:26 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>>>>>>> Friso,
>>>>>>>
>>>>>>> Unfortunately it's hard to determine the cause with the provided information, the client call you pasted is pretty much normal, i.e. the client is waiting to receive a result from a region server.
>>>>>>>
>>>>>>> The fact that you can't shut down the master when this happens is very concerning. Do you still have those logs around? Same for the region servers? Can you post this in pastebin or on a web server?
>>>>>>>
>>>>>>> Also, feel free to come chat with us on IRC, it's always easier to debug when live. #hbase on freenode
>>>>>>>
>>>>>>> J-D
>>>>>>>
>>>>>>> On Wed, May 12, 2010 at 8:31 AM, Friso van Vollenhoven <fvanvollenho...@xebia.com> wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I am using Hadoop (0.20.2) and HBase to periodically import data (every 15 minutes). There are a number of import processes, but generally they all create a sequence file on HDFS, which is then run through a MapReduce job. The MapReduce uses the identity mapper (the input file is a Hadoop sequence file) and a specialized reducer that does the following:
>>>>>>>> - Combine the values for a key into one value
>>>>>>>> - Do a Get from HBase to retrieve existing values for the same key
>>>>>>>> - Combine the existing value from HBase and the new one into one value again
>>>>>>>> - Put the final value into HBase under the same key (thus 'overwrite' the existing row; I keep only one version)
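Friso's reducer itself is not shown anywhere in the thread, but the read-merge-write cycle in those four steps amounts to roughly the following sketch against the 0.20-era client API (the table, family and qualifier names, the key/value types and the combine/merge logic are placeholders; error handling and HTable lifecycle are omitted):

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class ReadMergeWriteReducer
    extends Reducer<Text, BytesWritable, NullWritable, NullWritable> {

  private static final byte[] FAMILY = Bytes.toBytes("data");
  private static final byte[] QUALIFIER = Bytes.toBytes("value");

  private HTable table;

  @Override
  protected void setup(Context context) throws IOException {
    table = new HTable(new HBaseConfiguration(context.getConfiguration()), "records");
  }

  @Override
  protected void reduce(Text key, Iterable<BytesWritable> values, Context context)
      throws IOException {
    byte[] row = Bytes.toBytes(key.toString());

    // 1. Combine the new values for this key into one value (application-specific).
    byte[] combined = combine(values);

    // 2. Get the existing value for the same row key, if any.
    Result existing = table.get(new Get(row));
    byte[] current = existing.getValue(FAMILY, QUALIFIER);

    // 3. Combine the existing value and the new one into one value again.
    byte[] merged = (current == null) ? combined : merge(current, combined);

    // 4. Put the merged value back under the same row key (only one version is kept).
    Put put = new Put(row);
    put.add(FAMILY, QUALIFIER, merged);
    table.put(put);
  }

  // Application-specific merge logic, left out here.
  private byte[] combine(Iterable<BytesWritable> values) { return new byte[0]; }
  private byte[] merge(byte[] existing, byte[] update) { return existing; }
}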
>>>>>>>> After I upgraded HBase to the 0.20.4 release, the reducers sometimes start hanging on a Get. When the jobs start, some reducers run to completion fine, but after a while the last reducers will start to hang. Eventually the reducers are killed off by Hadoop (after 600 secs).
>>>>>>>>
>>>>>>>> I did a thread dump for one of the hanging reducers. It looks like this:
>>>>>>>>
>>>>>>>> "main" prio=10 tid=0x0000000048083800 nid=0x4c93 in Object.wait() [0x00000000420ca000]
>>>>>>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>>      at java.lang.Object.wait(Native Method)
>>>>>>>>      - waiting on <0x00002aaaaeb50d70> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>>>>>>>>      at java.lang.Object.wait(Object.java:485)
>>>>>>>>      at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:721)
>>>>>>>>      - locked <0x00002aaaaeb50d70> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>>>>>>>>      at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:333)
>>>>>>>>      at $Proxy2.get(Unknown Source)
>>>>>>>>      at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:450)
>>>>>>>>      at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:448)
>>>>>>>>      at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1050)
>>>>>>>>      at org.apache.hadoop.hbase.client.HTable.get(HTable.java:447)
>>>>>>>>      at net.ripe.inrdb.hbase.accessor.real.HBaseTableAccessor.get(HBaseTableAccessor.java:36)
>>>>>>>>      at net.ripe.inrdb.hbase.store.HBaseStoreUpdater.getExistingRecords(HBaseStoreUpdater.java:101)
>>>>>>>>      at net.ripe.inrdb.hbase.store.HBaseStoreUpdater.mergeTimelinesWithExistingRecords(HBaseStoreUpdater.java:60)
>>>>>>>>      at net.ripe.inrdb.hbase.store.HBaseStoreUpdater.doInsert(HBaseStoreUpdater.java:40)
>>>>>>>>      at net.ripe.inrdb.core.store.SinglePartitionStore$Updater.insert(SinglePartitionStore.java:92)
>>>>>>>>      at net.ripe.inrdb.core.store.CompositeStore$CompositeStoreUpdater.insert(CompositeStore.java:142)
>>>>>>>>      at net.ripe.inrdb.importer.StoreInsertReducer.reduce(StoreInsertReducer.java:70)
>>>>>>>>      at net.ripe.inrdb.importer.StoreInsertReducer.reduce(StoreInsertReducer.java:17)
>>>>>>>>      at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>>>>>>>>      at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>>>>>>>>      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>>>>>>>      at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>>>>>
>>>>>>>> So the client hangs in a wait() call, waiting on a HBaseClient$Call object. I looked at the code. The wait is in a while() loop and has no timeout, so it figures that it never gets out of there if no notify() gets called on the object. I am not sure exactly what condition it is waiting for, however.
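In simplified form, the pattern being described is something like the sketch below (an illustration only, not the actual HBaseClient source):

public class Call {

  private Object value;    // the RPC response, once it arrives
  private boolean done;    // set when the response (or an error) has been received

  // Called by the client thread after the request has been sent.
  public synchronized Object awaitResult() throws InterruptedException {
    while (!done) {
      wait(); // no timeout: only a notify() on this object can release the caller
    }
    return value;
  }

  // Called by the connection/reader thread when a response arrives.
  public synchronized void complete(Object result) {
    this.value = result;
    this.done = true;
    notify(); // wake up the client thread blocked in awaitResult()
  }
}

If complete() is never invoked, for instance because the region server never sends a reply for that call, the client thread stays parked in wait() indefinitely, which matches the WAITING state in the dump above.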
>>>>>>>> Meanwhile, after this has happened, I cannot shut down the master server normally. I have to kill -9 it to make it shut down. Normally, before this problem occurs, the master server shuts down just fine. (Sorry, I didn't do a thread dump of the master and now I have downgraded to 0.20.3 again.)
>>>>>>>>
>>>>>>>> I cannot reproduce this error on my local setup (developer machine). It only occurs on our (currently modest) cluster of one machine running master+NN+Zookeeper and four datanodes, which are all task trackers and region servers as well. The inputs to the periodic MapReduce jobs are very small (ranging from some Kb to several Mb) and thus contain not so many records. I know this is not very efficient to do in MapReduce, and that it would be faster to insert in-process from the importer itself because of the job startup overhead, but we are setting up this architecture of importers and insertion for anticipated larger loads (up to 80 million records per day).
>>>>>>>>
>>>>>>>> Does anyone have a clue about what happens? Or where to look for further investigation?
>>>>>>>>
>>>>>>>> Thanks a lot!
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Friso
>>>>>>
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera