For your first question: the region server web UI (rs-status#regionRequestStats) shows the Write Request Count for each region.
You can monitor that value for the region in question to see whether it is receiving an above-normal volume of writes.
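
If you would rather sample that counter programmatically than watch the web UI, below is a rough, untested sketch against the 0.98 client API. It simply dumps the counter for every region, so you would grep the output for your table:

import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.ClusterStatus;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.RegionLoad;
import org.apache.hadoop.hbase.ServerLoad;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class WriteRequestDump {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      // ClusterStatus carries the same per-region counters the web UI renders.
      ClusterStatus status = admin.getClusterStatus();
      for (ServerName server : status.getServers()) {
        ServerLoad load = status.getLoad(server);
        for (Map.Entry<byte[], RegionLoad> entry : load.getRegionsLoad().entrySet()) {
          RegionLoad region = entry.getValue();
          System.out.println(server.getServerName() + " " + region.getNameAsString()
              + " writeRequestCount=" + region.getWriteRequestsCount());
        }
      }
    } finally {
      admin.close();
    }
  }
}

Running it a few times while the job approaches the failing task should show whether the write request count for that one region climbs much faster than its neighbors'.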
Cheers

On Mon, Nov 10, 2014 at 4:06 PM, Brian Jeltema <[email protected]> wrote:

> > Was the region containing this row hot around the time of failure ?
>
> How do I measure that?
>
> > Can you check region server log (along with monitoring tool) what
> > memstore pressure was ?
>
> I didn't see anything in the region server logs to indicate a problem. And
> given the reproducibility of the behavior, it's hard to see how dynamic
> parameters such as memory pressure could be at the root of the problem.
>
> Brian
>
> On Nov 10, 2014, at 3:22 PM, Ted Yu <[email protected]> wrote:
>
> > Was the region containing this row hot around the time of failure ?
> >
> > Can you check region server log (along with monitoring tool) what
> > memstore pressure was ?
> >
> > Thanks
> >
> > On Nov 10, 2014, at 11:34 AM, Brian Jeltema <[email protected]> wrote:
> >
> >>> How many tasks may write to this row concurrently ?
> >>
> >> Only 1 mapper should be writing to this row. Is there a way to check
> >> which locks are being held?
> >>
> >>> Which 0.98 release are you using ?
> >>
> >> 0.98.0.2.1.2.1-471-hadoop2
> >>
> >> Thanks
> >> Brian
> >>
> >> On Nov 10, 2014, at 2:21 PM, Ted Yu <[email protected]> wrote:
> >>
> >>> There could be more than one reason why RegionTooBusyException is
> >>> thrown. Below are two (from HRegion):
> >>>
> >>> /**
> >>>  * We throw RegionTooBusyException if above memstore limit
> >>>  * and expect client to retry using some kind of backoff
> >>>  */
> >>> private void checkResources()
> >>>
> >>> /**
> >>>  * Try to acquire a lock. Throw RegionTooBusyException
> >>>  * if failed to get the lock in time. Throw InterruptedIOException
> >>>  * if interrupted while waiting for the lock.
> >>>  */
> >>> private void lock(final Lock lock, final int multiplier)
> >>>
> >>> How many tasks may write to this row concurrently ?
> >>>
> >>> Which 0.98 release are you using ?
> >>>
> >>> Cheers
> >>>
> >>> On Mon, Nov 10, 2014 at 11:10 AM, Brian Jeltema <[email protected]> wrote:
> >>>
> >>>> I’m running a map/reduce job against a table that is performing a
> >>>> large number of writes (probably updating every row). The job is
> >>>> failing with the exception below. This is a solid failure; it dies at
> >>>> the same point in the application, and at the same row in the table.
> >>>> So I doubt it’s a conflict with compaction (and the UI shows no
> >>>> compaction in progress), or that there is a load-related cause.
> >>>>
> >>>> ‘hbase hbck’ does not report any inconsistencies. The
> >>>> ‘waitForAllPreviousOpsAndReset’ in the stack leads me to suspect that
> >>>> there is an operation in progress that is hung and blocking the
> >>>> update. I don’t see anything suspicious in the HBase logs. The data
> >>>> at the point of failure is not unusual, and is identical to many
> >>>> preceding rows. Does anybody have any ideas of what I should look for
> >>>> to find the cause of this RegionTooBusyException?
> >>>>
> >>>> This is Hadoop 2.4 and HBase 0.98.
> >>>>
> >>>> 14/11/10 13:46:13 INFO mapreduce.Job: Task Id : attempt_1415210751318_0010_m_000314_1, Status : FAILED
> >>>> Error: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1744 actions: RegionTooBusyException: 1744 times,
> >>>>   at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:207)
> >>>>   at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1700(AsyncProcess.java:187)
> >>>>   at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1568)
> >>>>   at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:1023)
> >>>>   at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:995)
> >>>>   at org.apache.hadoop.hbase.client.HTable.put(HTable.java:953)
> >>>>
> >>>> Brian
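
One more thought on the stack trace above: RetriesExhaustedWithDetailsException carries per-action detail, so if the mapper catches it at the point of the put (rather than letting the background flush surface it), you can log exactly which row and region server each failed action was headed for. A rough, untested sketch; the HTable / Put setup is whatever your mapper already uses:

import java.util.List;

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;
import org.apache.hadoop.hbase.util.Bytes;

public class PutDiagnostics {
  // Flush a batch explicitly and, on failure, report where each failed action was going.
  public static void putWithDiagnostics(HTable table, List<Put> puts) throws Exception {
    try {
      table.put(puts);
      table.flushCommits();
    } catch (RetriesExhaustedWithDetailsException e) {
      for (int i = 0; i < e.getNumExceptions(); i++) {
        System.err.println("cause=" + e.getCause(i)
            + " row=" + Bytes.toStringBinary(e.getRow(i).getRow())
            + " server=" + e.getHostnamePort(i));
      }
      throw e; // rethrow so the task still fails visibly and gets retried
    }
  }
}

Since all 1744 actions failed with RegionTooBusyException, the server name and row keys printed should point straight at the region to inspect on the rs-status page.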

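Also, on the two HRegion code paths quoted further up: if I remember the 0.98 code correctly, checkResources() throws once the region's memstore exceeds hbase.hregion.memstore.flush.size times hbase.hregion.memstore.block.multiplier, while lock() gives up after a wait derived from hbase.busy.wait.duration. The sketch below just prints the thresholds your configuration implies, assuming the client reads the same hbase-site.xml as the region servers; the fallback values in the getLong calls are placeholders for the sketch, not authoritative defaults:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class BusyThresholds {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Memstore path: checkResources() throws once the memstore grows past flushSize * blockMultiplier.
    long flushSize = conf.getLong("hbase.hregion.memstore.flush.size", 128L * 1024 * 1024);
    long blockMultiplier = conf.getLong("hbase.hregion.memstore.block.multiplier", 2);
    // Lock path: lock() waits up to roughly busyWait * multiplier before throwing.
    long busyWaitMs = conf.getLong("hbase.busy.wait.duration", 60000);
    System.out.println("blocking memstore size ~= " + (flushSize * blockMultiplier) + " bytes");
    System.out.println("busy wait before RegionTooBusyException ~= " + busyWaitMs + " ms (per multiplier step)");
  }
}

Comparing the blocking memstore size against the memstore charts for that region around 13:46 should tell you whether the memstore path or the lock path is the one firing.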