Hi,

We have a Map/Reduce job that threw a NotServingRegionException when a reducer was about to insert data into an HBase table. The error message is below, followed by the corresponding region server log. We also looked at the table's page in the HBase administrative web UI, but it did not show the list of table regions (the table is pre-split). Does anybody know what is happening?

Thanks.
Ey-Chih Chow

========= log from the map/reduce job =========================
12:59:06 [dba@dba@h01 1-exec][INFO] 12/07/22 00:50:35 INFO mapred.JobClient: Task Id : attempt_201206142240_19696_r_000005_0, Status : FAILED
12:59:06 [dba@dba@h01 1-exec][INFO] org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 110 actions: NotServingRegionException: 110 times, servers with issues: h07.mtv.byah.net:60020,
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1227)
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1241)
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:826)
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:682)
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
12:59:06 [dba@dba@h01 1-exec][INFO] at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapreduce.AvroHbaseTableOutputFormat$AvroHbaseTableRecordWriter.write(AvroHbaseTableOutputFormat.java:75)
12:59:06 [dba@dba@h01 1-exec][INFO] at com.booyah.analytics.mapred.AdaptAvroHbaseTableOu
12:59:06 [dba@dba@h01 1-exec][INFO] attempt_201206142240_19696_r_000005_0: INFO 22-07 00:45:54,927 - Loaded the native-hadoop library
=================================================================

======== log from the corresponding region server ==========================
2012-07-22T09:08:25.833-0700: [GC [ParNew: 136739K->138K(153344K), 0.0027890 secs] 316329K->179757K(1621376K) icms_dc=0 , 0.0028470 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2012-07-22T09:25:00.822-0700: [GC [ParNew: 136458K->82K(153344K), 0.0028930 secs] 316077K->179701K(1621376K) icms_dc=0 , 0.0029500 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2012-07-22T09:41:35.638-0700: [GC [ParNew: 136402K->49K(153344K), 0.0030770 secs] 316021K->179668K(1621376K) icms_dc=0 , 0.0031310 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2012-07-22T09:58:10.796-0700: [GC [ParNew: 136351K->44K(153344K), 0.0028190 secs] 315970K->179663K(1621376K) icms_dc=0 , 0.0028750 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2012-07-22T10:14:45.638-0700: [GC [ParNew: 136364K->66K(153344K), 0.0031410 secs] 315983K->179694K(1621376K) icms_dc=0 , 0.0031960 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2012-07-22T10:31:15.761-0700: [GC [ParNew: 136346K->52K(153344K), 0.0029310 secs] 315974K->179680K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2012-07-22T10:47:48.745-0700: [GC [ParNew: 136341K->37K(153344K), 0.0031490 secs] 315969K->179665K(1621376K) icms_dc=0 , 0.0032070 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2012-07-22T11:04:25.008-0700: [GC [ParNew: 136357K->39K(153344K), 0.0027710 secs] 315985K->179667K(1621376K) icms_dc=0 , 0.0028260 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2012-07-22T11:20:55.715-0700: [GC [ParNew: 136337K->39K(153344K), 0.0032670 secs] 315965K->179667K(1621376K) icms_dc=0 , 0.0033210 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
2012-07-22T11:37:28.701-0700: [GC [ParNew: 136327K->39K(153344K), 0.0027510 secs] 315955K->179667K(1621376K) icms_dc=0 , 0.0028070 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2012-07-22T11:54:02.688-0700: [GC [ParNew: 136342K->39K(153344K), 0.0033410 secs] 315971K->179667K(1621376K) icms_dc=0 , 0.0033980 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
2012-07-22T12:10:35.639-0700: [GC [ParNew: 136359K->39K(153344K), 0.0026440 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027000 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2012-07-22T12:27:05.649-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027960 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0028520 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2012-07-22T12:43:35.627-0700: [GC [ParNew: 136340K->39K(153344K), 0.0030750 secs] 315968K->179667K(1621376K) icms_dc=0 , 0.0031320 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2012-07-22T13:00:05.607-0700: [GC [ParNew: 136359K->39K(153344K), 0.0032030 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0032770 secs] [Times: user=0.03 sys=0.01, real=0.00 secs]
2012-07-22T13:16:35.587-0700: [GC [ParNew: 136359K->39K(153344K), 0.0027270 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0027820 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2012-07-22T13:33:05.565-0700: [GC [ParNew: 136325K->39K(153344K), 0.0028510 secs] 315953K->179667K(1621376K) icms_dc=0 , 0.0029060 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2012-07-22T13:49:35.545-0700: [GC [ParNew: 136359K->39K(153344K), 0.0030620 secs] 315987K->179667K(1621376K) icms_dc=0 , 0.0031170 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2012-07-22T14:06:05.524-0700: [GC [ParNew: 136359K->56K(153344K), 0.0029310 secs] 315987K->179684K(1621376K) icms_dc=0 , 0.0029870 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2012-07-22T14:22:35.502-0700: [GC [ParNew: 136376K->55K(153344K), 0.0028840 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0029400 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2012-07-22T14:39:06.248-0700: [GC [ParNew: 136375K->55K(153344K), 0.0032820 secs] 316004K->179684K(1621376K) icms_dc=0 , 0.0033360 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/7069292654066145383 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476
12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/M/8185820606610414476, entries=91568, sequenceid=15360597, memsize=22.4m, filesize=1.4m
2012-07-22T14:47:33.177-0700: [GC [ParNew: 136324K->12390K(153344K), 0.0061720 secs] 315953K->192019K(1621376K) icms_dc=0 , 0.0062440 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
12/07/22 14:47:33 INFO regionserver.Store: Renaming flushed file at hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/.tmp/4192914148964281571 to hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317
12/07/22 14:47:33 INFO regionserver.Store: Added hdfs://h01.mtv.byah.net/hbase/session_prod2/010f43c1636e86b8546bd480158df536/P/5830576607855695317, entries=101249, sequenceid=15360597, memsize=46.1m, filesize=4.2m
12/07/22 14:47:33 INFO regionserver.HRegion: Finished memstore flush of ~68.5m for region session_prod2,2EEEEEEC-0000-0000-0000-000000000000,1342965495011.010f43c1636e86b8546bd480158df536. in 275ms, sequenceid=15360597, compaction requested=false
2012-07-22T14:47:35.138-0700: [GC [ParNew: 148634K->8478K(153344K), 0.0116530 secs] 328263K->198054K(1621376K) icms_dc=0 , 0.0117260 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
2012-07-22T14:47:43.631-0700: [GC [ParNew: 144640K->17024K(153344K), 0.0302560 secs] 334216K->238447K(1621376K) icms_dc=0 , 0.0303350 secs] [Times: user=0.20 sys=0.00, real=0.03 secs]
=================================================================
