Ryan,

All regions seem to be online. hbck reports the state as inconsistent:

Region webtable,XXXXXXXXXXXXXXXX,1295614226878.00465edbdfd73ad89de4a7cd6c0dc4ff. is listed in META on region server <HOSTX>:60020 but is multiply assigned to region servers <HOSTX>:60020, <HOSTX>:60020

It shows the region as assigned more than once to the same region server. I am hoping this is related to the reverse DNS requirement, since I am running into intermittent reverse DNS lookup issues.

I don't see anything interesting related to this in the master log. The following is the complete log from the region server around that time frame.

2011-01-25 19:05:59,717 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region webtable,XXXXXXXXXXXXXXXXXXXXXXX., size=739.7k
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store: Major compaction triggered on store c; time since last major compaction 104139818msfc25d432e079d45/qa/2628506464708923281, keycount=3597, bloomtype=NONE, size=84.9k
2011-01-25 19:05:59,719 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 4 file(s) in cf=c into hdfs://XX.XX.XX:8020/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/.tmp, seqid=9186434, totalSize=275.0m
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/9205102805597630312, keycount=42464, bloomtype=NONE, size=253.9m size=824.1k; total size for store is 824.1k
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/6125196014617064476, keycount=444, bloomtype=NONE, size=2.7m
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/601650198232659486, keycount=2630, bloomtype=NONE, size=15.7m
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/837285093771597419, keycount=456, bloomtype=NONE, size=2.7m

************************ for almost 1 minute region server is idle..

2011-01-25 19:06:53,693 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 4 file(s), new file=hdfs://XX.XX.XX:8020/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/3351637132834457815, size=274.9m; total size for store is 274.9m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store: Major compaction triggered on store mtdt; time since last major compaction 104108312msed6b73705d16/c/9205102805597630312, keycount=42464, bloomtype=NONE, size=253.9m
2011-01-25 19:06:53,695 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 4 file(s) in cf=mtdt into hdfs://XX.XX.XX:8020/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/.tmp, seqid=9186434, totalSize=1.3m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/mtdt/1448570682542813090, keycount=21232, bloomtype=NONE, size=1.2m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/mtdt/9176424692900683560, keycount=222, bloomtype=NONE, size=13.7ksize=274.9m; total size for store is 274.9m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/mtdt/2626470500007036267, keycount=1315, bloomtype=NONE, size=77.9klSize=1.3m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/mtdt/6494685694636691529, keycount=228, bloomtype=NONE, size=13.8k

On Tue, Jan 25, 2011 at 9:03 PM, Ryan Rawson <[email protected]> wrote:

> The exception text:
>
> Failed 1 action: NotServingRegionException: 1 time, servers with issues:
> XXXXXXX:60020,
>
> is attempting to summarize potentially dozens if not hundreds of
> exceptions, and '1 time' means the exception NSRE only appeared once.
> The client did try multiple times.
>
> Are you sure every region is online? Try hbck?
>
> -ryan
>
> On Tue, Jan 25, 2011 at 8:51 PM, Charan K <[email protected]> wrote:
> > Hi Ryan,
> >
> > The table is online, since other mapred tasks continue to run without
> > failing.
> >
> > There was a major compaction running on the region server which took
> > almost a minute. I am assuming one minute since there was no log entry
> > for one minute before it completed the compaction.
> >
> > And from the exception it looks like the client tried only once,
> > because it says 1 time.
> >
> > Thanks
> > Charan
> >
> > Sent from my iPhone
> >
> > On Jan 25, 2011, at 7:42 PM, Ryan Rawson <[email protected]> wrote:
> >
> >> The problem is the client was talking to the given regionserver, and
> >> that regionserver kept on rejecting the requests - NSRE. Are you sure
> >> your table is online? Are all regions online? Anything interesting
> >> in the master log?
> >>
> >> -ryan
> >>
> >> On Tue, Jan 25, 2011 at 7:32 PM, charan kumar <[email protected]> wrote:
> >>> Hi,
> >>>
> >>> Map Reduce tasks are failing with the following exception. There was a
> >>> major compaction running on the region server around the same time.
> >>>
> >>> The number of retries is not customized, so it is 10 by default. But
> >>> the task fails the first time it gets this exception. Any suggestions?
> >>>
> >>> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> >>> Failed 1 action: NotServingRegionException: 1 time, servers with issues:
> >>> XXXXXXX:60020,
> >>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1220)
> >>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
> >>>   at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
> >>>   at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
> >>>   at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
> >>>   at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:126)
> >>>   at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:81)
> >>>   at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
> >>>   at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> >>>   at com.ask.af.segscan.SegmentScanner$WebTableReducer.reduce(SegmentScanner.java:284)
> >>>   at com.ask.af.segscan.SegmentScanner$WebTableReducer.reduce(SegmentScanner.java:91)
> >>>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> >>>   at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
> >>>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> >>>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >>>
> >>> Thanks,
> >>> Charan
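
Ryan's reading of the summary line in the stack trace (the "1 time" counts how often NotServingRegionException shows up in the summary, not how many attempts the client made) can be illustrated by pulling the line apart. This is only a sketch in Python: the function and pattern names are made up for illustration, and the message layout is assumed from the single line quoted in this thread, not taken from the HBase source.

```python
import re

# Layout assumed from the one summary line quoted in this thread:
# "Failed 1 action: NotServingRegionException: 1 time, servers with issues: XXXXXXX:60020,"
SUMMARY_RE = re.compile(
    r"Failed (?P<actions>\d+) actions?: (?P<causes>.*?), "
    r"servers with issues: (?P<servers>.*)"
)
CAUSE_RE = re.compile(r"(?P<name>\w+): (?P<count>\d+) times?")


def parse_summary(message):
    """Split a summary line into (failed actions, {exception: count}, [servers])."""
    match = SUMMARY_RE.search(message)
    if match is None:
        raise ValueError("not a recognised summary line")
    # Each "Name: N time(s)" entry says how often that exception was recorded.
    causes = {
        m.group("name"): int(m.group("count"))
        for m in CAUSE_RE.finditer(match.group("causes"))
    }
    # The server list ends with a trailing comma, so drop empty entries.
    servers = [s.strip() for s in match.group("servers").split(",") if s.strip()]
    return int(match.group("actions")), causes, servers


line = ("Failed 1 action: NotServingRegionException: 1 time, "
        "servers with issues: XXXXXXX:60020,")
actions, causes, servers = parse_summary(line)
print(actions, causes, servers)
```

The three fields answer different questions: how many Puts in the batch failed, which exceptions were recorded for them and how often, and which region servers were involved. None of them is a retry counter.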

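
The "almost 1 minute" pause was read off the region server log by eye. A small script can do the same check over a whole log file. This is a sketch in Python that assumes the timestamp prefix seen in the log lines quoted in this thread; find_gaps and the 30-second threshold are illustrative choices, and the two sample lines are abbreviated copies of the entries on either side of the pause.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix as it appears in the quoted log, e.g. "2011-01-25 19:05:59,719".
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")


def find_gaps(lines, threshold=timedelta(seconds=30)):
    """Return (start, end, gap) for consecutive entries further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        match = TS_RE.match(line)
        if match is None:
            continue  # wrapped continuation lines and notes carry no timestamp
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        current += timedelta(milliseconds=int(match.group(2)))
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps


# Abbreviated copies of the last entry before and the first entry after the pause.
log = [
    "2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting ...",
    "2011-01-25 19:06:53,693 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction ...",
]
for start, end, gap in find_gaps(log):
    print(f"{start} -> {end}: no log entries for {gap.total_seconds():.3f}s")
```

For these two entries the gap is 53.974 seconds. A silent stretch only shows that nothing was logged; it is consistent with the compaction running during that time, but by itself it does not show that the server stopped serving requests.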