Was the master very busy? How many regions were in transition during that period?
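
One way to check how long each region sat in a given state is to diff the `ts=` values in the master's RegionStates "Transitioned" lines. Below is a minimal sketch, assuming the log lines are unwrapped (one entry per line) and free of the mail client's emphasis markers; the helper name `time_in_state` and the regular expression are illustrative, not part of HBase. The sample lines are taken from the log quoted below.

```python
import re

# Matches RegionStates lines of the form:
#   Transitioned {<region> state=<old>, ts=<ms>, server=...} to {<region> state=<new>, ts=<ms>, server=...}
TRANSITION = re.compile(
    r"Transitioned \{(?P<region>\w+) state=(?P<old>\w+), ts=(?P<old_ts>\d+), server=[^}]*\} "
    r"to \{\w+ state=(?P<new>\w+), ts=(?P<new_ts>\d+), server=[^}]*\}"
)


def time_in_state(lines, state):
    """Return {region: milliseconds spent in `state`} for regions seen both entering and leaving it."""
    entered, spent = {}, {}
    for line in lines:
        m = TRANSITION.search(line)
        if not m:
            continue  # not a transition line (e.g. "Onlined ...")
        region = m.group("region")
        if m.group("new") == state:
            # Region enters the state: remember when.
            entered[region] = int(m.group("new_ts"))
        elif m.group("old") == state and region in entered:
            # Region leaves the state: record the elapsed time.
            spent[region] = int(m.group("new_ts")) - entered.pop(region)
    return spent


SERVER = "hdtest208.svl.ibm.com,60020,1391887547473"
SAMPLE = [
    "2014-02-08 20:43:57,382 INFO org.apache.hadoop.hbase.master.RegionStates: "
    "Transitioned {010c1981882d1a59201af5e2dc589d44 state=OFFLINE, ts=1391921037382, server=null} "
    "to {010c1981882d1a59201af5e2dc589d44 state=SPLITTING_NEW, ts=1391921037382, server=" + SERVER + "}",
    "2014-02-08 20:43:57,382 INFO org.apache.hadoop.hbase.master.RegionStates: "
    "Transitioned {c2eb9b7971ca7f3fed3da86df5b788e7 state=OFFLINE, ts=1391921037382, server=null} "
    "to {c2eb9b7971ca7f3fed3da86df5b788e7 state=SPLITTING_NEW, ts=1391921037382, server=" + SERVER + "}",
    "2014-02-08 21:41:14,093 INFO org.apache.hadoop.hbase.master.RegionStates: "
    "Transitioned {010c1981882d1a59201af5e2dc589d44 state=SPLITTING_NEW, ts=1391924474093, server=" + SERVER + "} "
    "to {010c1981882d1a59201af5e2dc589d44 state=OPEN, ts=1391924474093, server=" + SERVER + "}",
    "2014-02-08 21:41:14,093 INFO org.apache.hadoop.hbase.master.RegionStates: "
    "Onlined 010c1981882d1a59201af5e2dc589d44 on " + SERVER,
    "2014-02-08 21:41:14,093 INFO org.apache.hadoop.hbase.master.RegionStates: "
    "Transitioned {c2eb9b7971ca7f3fed3da86df5b788e7 state=SPLITTING_NEW, ts=1391924474093, server=" + SERVER + "} "
    "to {c2eb9b7971ca7f3fed3da86df5b788e7 state=OPEN, ts=1391924474093, server=" + SERVER + "}",
]

for region, ms in time_in_state(SAMPLE, "SPLITTING_NEW").items():
    print(f"{region}: {ms / 60000:.1f} min in SPLITTING_NEW")
```

On the two daughter regions in the quoted log this gives roughly 57 minutes each, matching the 20:43 to 21:41 gap reported. Running the same helper over the full master log would show whether other regions were stuck for a similar length of time.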

On Sun, Feb 9, 2014 at 2:31 PM, Ted Yu <[email protected]> wrote:

> Can you pastebin master log from 20:40:00 to 21:42:00 so that we get more
> context on what happened to
> tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. ?
>
> Cheers
>
>
> On Sun, Feb 9, 2014 at 2:12 PM, Jerry He <[email protected]> wrote:
>
> > Hi, folks
> >
> > This is what I am seeing when running in a stress env. I am getting
> > "RetriesExhaustedWithDetailsException: Failed 748 actions:
> > NotServingRegionException"
> > On the master log, *2014-02-08 20:43 is the timestamp from OFFLINE to
> > SPLITTING_NEW*, *2014-02-08 21:41 is the timestamp from SPLITTING_NEW to
> > OPEN.*
> > Am I seeing anything wrong here?
> >
> > 2014-02-08 20:45:53,215 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
> > *2014-02-08 20:50:44,824* WARN org.apache.hadoop.hbase.client.AsyncProcess: Attempt #35/35 failed for 748 ops on hdtest208.svl.ibm.com,60020,1391887547473 NOT resubmitting. region=tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44., hostname=hdtest208.svl.ibm.com,60020,1391887547473, seqNum=1
> > 2014-02-08 20:50:44,839 INFO org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
> > 2014-02-08 20:50:44,842 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> > 2014-02-08 20:50:44,858 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hive (auth:SIMPLE) cause:org.apache.hadoop.hbase.client.*RetriesExhaustedWithDetailsException: Failed 748 actions: NotServingRegionException*: 748 times,
> > 2014-02-08 20:50:44,858 WARN org.apache.hadoop.mapred.Child: Error running child
> > org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 748 actions: NotServingRegionException: 748 times,
> >     at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:185)
> >     at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:169)
> >     at org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:782)
> >     at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:934)
> >
> >
> > *2014-02-08 20:43:57,382 INFO org.apache.hadoop.hbase.master.RegionStates: Transitioned {010c1981882d1a59201af5e2dc589d44 state=OFFLINE, ts=1391921037382, server=null} to {010c1981882d1a59201af5e2dc589d44 state=SPLITTING_NEW*, ts=1391921037382, server=hdtest208.svl.ibm.com,60020,1391887547473}
> > 2014-02-08 20:43:57,382 INFO org.apache.hadoop.hbase.master.RegionStates: Transitioned {c2eb9b7971ca7f3fed3da86df5b788e7 state=OFFLINE, ts=1391921037382, server=null} to {c2eb9b7971ca7f3fed3da86df5b788e7 state=SPLITTING_NEW, ts=1391921037382, server=hdtest208.svl.ibm.com,60020,1391887547473}
> > 2014-02-08 20:43:57,382 INFO org.apache.hadoop.hbase.master.RegionStates: Transitioned {b576e8db65d56ec08db5ca900587c28d state=OPEN, ts=1391918522959, server=hdtest208.svl.ibm.com,60020,1391887547473} to {b576e8db65d56ec08db5ca900587c28d state=SPLITTING, ts=1391921037382, server=hdtest208.svl.ibm.com,60020,1391887547473}
> >
> >
> > *2014-02-08 21:41:14,093 INFO org.apache.hadoop.hbase.master.RegionStates: Transitioned {010c1981882d1a59201af5e2dc589d44 state=SPLITTING_NEW,* ts=1391924474093, server=hdtest208.svl.ibm.com,60020,1391887547473} to {*010c1981882d1a59201af5e2dc589d44 state=OPEN*, ts=1391924474093, server=hdtest208.svl.ibm.com,60020,1391887547473}
> > 2014-02-08 21:41:14,093 INFO org.apache.hadoop.hbase.master.RegionStates: Onlined 010c1981882d1a59201af5e2dc589d44 on hdtest208.svl.ibm.com,60020,1391887547473
> > 2014-02-08 21:41:14,093 INFO org.apache.hadoop.hbase.master.RegionStates: Transitioned {c2eb9b7971ca7f3fed3da86df5b788e7 state=SPLITTING_NEW, ts=1391924474093, server=hdtest208.svl.ibm.com,60020,1391887547473} to {c2eb9b7971ca7f3fed3da86df5b788e7 state=OPEN, ts=1391924474093, server=hdtest208.svl.ibm.com,60020,1391887547473}
> > 2014-02-08 21:41:14,094 INFO org.apache.hadoop.hbase.master.RegionStates: Onlined c2eb9b7971ca7f3fed3da86df5b788e7 on hdtest208.svl.ibm.com,60020,1391887547473
> > 2014-02-08 21:41:14,100 INFO org.apache.hadoop.hbase.master.AssignmentManager: Handled SPLIT event; parent=tpch_hb_1000_2.lineitem,,1391918508561.b576e8db65d56ec08db5ca900587c28d., daughter a=tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44., daughter b=tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7., on hdtest208.svl.ibm.com,60020,1391887547473
