I have captured some logs from what is happening during one of these pauses:

http://pastebin.com/K162Einz

Can someone help me figure out what's actually going on from these logs?

--- My interpretation of the logs ---

As you can see at the start of the logs, my coprocessor for updating the
data is executing rapidly until 10:17:06. At that time the coprocessor for
querying is invoked. This query should take only moments to return, but
doesn't return until 10:44:52.

At 10:18:53 there appear to be some compaction-related messages (though
they don't appear to be the cause, since they start over a minute after the
server stops functioning). Compaction appears to run until 10:42:25. The
next two minutes contain just LRU eviction messages.

At 10:44:52, the query from earlier appears to complete, after having
summarized only 863 rows. A few other queued requests are attempted, but
fail with exceptions (ClosedChannelException). Eventually the exceptions are
being thrown from "openScanner", which really doesn't sound good to me.

--Tom

On Mon, Sep 10, 2012 at 11:32 AM, Tom Brown <[email protected]> wrote:
> Hi,
>
> We have our system set up such that all interaction is done through
> coprocessors. We update the database via a coprocessor (it has the
> appropriate logic for dealing with concurrent access to rows), and we
> also query/aggregate via coprocessor (since we don't want to send all
> the data over the network).
>
> This generally works very well. However, sometimes one of the region
> servers will "pause". This doesn't appear to be a GC pause since it
> still serves up the UI, and adds occasional messages to the log
> regarding the LRU. The only thing I've found is that when I check the
> server that's causing the problem (easy to tell, since all the
> "working" servers have a low load, and the problem server has a higher
> load), I can see that there are a number of execCoprocessor requests
> that have been executing for much longer than they should.
>
> I want to know more details about the specifics of those requests. Is
> there an API I can use that will allow my coprocessor requests to be
> tracked more functionally? Is there a way to hook into the UI so I can
> provide my own list of running processes? Or would I have to write
> that all myself?
>
> I am using HBase 0.92.1, but will be upgrading to 0.94.1 soon.
>
> Thanks in advance!
>
> --Tom
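
For reference, the intervals implied by the timestamps in the log interpretation above can be worked out with a few lines of Python. This is only a sketch: it assumes all four timestamps fall on the same day, and the helper name `elapsed` and the variable names are made up for illustration. Nothing here reads the actual pastebin logs.

```python
from datetime import datetime

FMT = "%H:%M:%S"

def elapsed(start: str, end: str):
    """Return the time between two same-day HH:MM:SS timestamps."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)

# Timestamps copied from the log interpretation in this message.
query_stall = elapsed("10:17:06", "10:44:52")      # query invoked -> query returns
before_compact = elapsed("10:17:06", "10:18:53")   # stall begins -> first compaction message
compaction = elapsed("10:18:53", "10:42:25")       # compaction messages
eviction_only = elapsed("10:42:25", "10:44:52")    # only LRU eviction messages

for label, delta in [
    ("query stalled for", query_stall),
    ("stall preceded compaction by", before_compact),
    ("compaction messages spanned", compaction),
    ("LRU-only period lasted", eviction_only),
]:
    print(f"{label}: {delta}")
```

So the query was stuck for a little under 28 minutes, and the compaction messages cover about 23.5 minutes of that window, starting roughly 1 minute 47 seconds after the stall began.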
