[ https://issues.apache.org/jira/browse/HBASE-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1603: ------------------------- Fix Version/s: (was: 0.20.0) Moving out of 0.20.0. Root cause seems to have been fixed by 1615. Meantime have better debugging in case this arises again. > MR failed "RetriesExhaustedException: Trying to contact region server Some > server for region TestTable..." > ---------------------------------------------------------------------------------------------------------- > > Key: HBASE-1603 > URL: https://issues.apache.org/jira/browse/HBASE-1603 > Project: Hadoop HBase > Issue Type: Bug > Reporter: stack > Attachments: debugging-v4.patch > > > Here is the master. Region > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246462685358 was split at > 16:11:42,865. My MR job failed at 18:12:26,462 with this: > {code} > 2009-07-01 18:12:26,462 WARN org.apache.hadoop.mapred.TaskTracker: Error > running child > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact > region server Some server for region TestTable,�,1246464670313, row '�� > ', but failed after 10 attempts. > Exceptions: > ... > {code} > Why after ten attempts did the client not find the region? > {code} > 2009-07-01 16:11:42,865 [IPC Server handler 2 on 60001] INFO > org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246462685358: Daughters; > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313, > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from > aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3 > 2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO > org.apache.hadoop.hbase.master.RegionManager: Assigning region > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 to > aa0-000-15.u.powerset.com,60020,1246461673026 > 2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO > org.apache.hadoop.hbase.master.RegionManager: Assigning region > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 to > aa0-000-15.u.powerset.com,60020,1246461673026 > 2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO > org.apache.hadoop.hbase.master.ServerManager: Received > MSG_REPORT_PROCESS_OPEN: > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from > aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3 > 2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO > org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 from > aa0-000-15.u.powerset.com,60020,1246461673026; 2 of 3 > 2009-07-01 16:11:45,906 [IPC Server handler 8 on 60001] INFO > org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from > aa0-000-15.u.powerset.com,60020,1246461673026; 3 of 3 > 2009-07-01 16:11:45,906 [HMaster] INFO > org.apache.hadoop.hbase.master.RegionServerOperation: > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 open on > 208.76.44.142:60020 > 2009-07-01 16:11:45,906 [HMaster] INFO > org.apache.hadoop.hbase.master.RegionServerOperation: updating row > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 in region > .META.,,1 with startcode 1246461673026 and server 208.76.44.142:60020 > 2009-07-01 16:11:45,908 [HMaster] INFO > org.apache.hadoop.hbase.master.RegionServerOperation: > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 open on > 208.76.44.142:60020 > 2009-07-01 16:11:45,908 [HMaster] INFO > org.apache.hadoop.hbase.master.RegionServerOperation: updating row > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 in region > .META.,,1 with startcode 1246461673026 and server 208.76.44.142:60020 > 2009-07-01 17:46:42,670 [IPC Server handler 0 on 60001] INFO > org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313: Daughters; > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246470379467, > TestTable,\x00\x00\x08\x04\x05\x07\x02\x05\x04\x08,1246470379467 from > aa0-000-15.u.powerset.com,60020,1246461673026; 5 of 7 > {code} > Here is over on the regionserver: > {code} > 2009-07-01 16:11:42,865 [IPC Server handler 2 on 60001] INFO > org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246462685358: Daughters; > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313, > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from > aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3 > 2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO > org.apache.hadoop.hbase.master.RegionManager: Assigning region > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 to > aa0-000-15.u.powerset.com,60020,1246461673026 > 2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO > org.apache.hadoop.hbase.master.RegionManager: Assigning region > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 to > aa0-000-15.u.powerset.com,60020,1246461673026 > 2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO > org.apache.hadoop.hbase.master.ServerManager: Received > MSG_REPORT_PROCESS_OPEN: > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from > aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3 > 2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO > org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 from > aa0-000-15.u.powerset.com,60020,1246461673026; 2 of 3 > 2009-07-01 16:11:45,906 [IPC Server handler 8 on 60001] INFO > org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from > aa0-000-15.u.powerset.com,60020,1246461673026; 3 of 3 > 2009-07-01 16:11:45,906 [HMaster] INFO > org.apache.hadoop.hbase.master.RegionServerOperation: > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 open on > X.X.X.142:60020 > 2009-07-01 16:11:45,906 [HMaster] INFO > org.apache.hadoop.hbase.master.RegionServerOperation: updating row > TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 in region > .META.,,1 with startcode 1246461673026 and server X.X.X4.142:60020 > 2009-07-01 16:11:45,908 [HMaster] INFO > org.apache.hadoop.hbase.master.RegionServerOperation: > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 open on > X.X.X.142:60020 > 2009-07-01 16:11:45,908 [HMaster] INFO > org.apache.hadoop.hbase.master.RegionServerOperation: updating row > TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 in region > .META.,,1 with startcode 1246461673026 and server X.X.X.142:60020 > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.