[ https://issues.apache.org/jira/browse/HBASE-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-1573.
--------------------------

    Resolution: Fixed
    Hadoop Flags: [Reviewed]

Committed a workaround. Not sure it will work. Will open a new issue if it doesn't. Real fix is the master rewrite.

At root, the issue is that the serverinfo map in the master has had the server removed because it crashed. The catalog scanner next finds the old server, though logs show the catalog table was updated just before with the region's new server location; because the catalog scanner doesn't see the update (timing), it thinks the region needs assigning and does so though it has just been assigned. Double-assignment mess.

> Holes in master state change; updated startcode and server go into .META. but
> catalog scanner just got old values
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-1573
>                 URL: https://issues.apache.org/jira/browse/HBASE-1573
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.19.4
>
>         Attachments: 1573-v2.patch, 1573-v3.patch, 1573.patch
>
>
> Here is example of a scan that takes a while because 6k regions acting on
> stale data resulting in double assignment of region:
> {code}
> hbase-Powerset-master-aa0-000-8.u.powerset.com.log.2009-06-22:2009-06-22
> 10:56:06,220 INFO org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_OPEN: enwikibase,Cwj1sehVeEbnrUDR_j0xok==,1245604540748:
> safeMode=false from XX.XX.45.121:20020
> hbase-Powerset-master-aa0-000-8.u.powerset.com.log.2009-06-22:2009-06-22
> 10:56:06,220 INFO org.apache.hadoop.hbase.master.RegionServerOperation:
> enwikibase,Cwj1sehVeEbnrUDR_j0xok==,1245604540748 open on XX.XX.45.121:20020
> hbase-Powerset-master-aa0-000-8.u.powerset.com.log.2009-06-22:2009-06-22
> 10:56:06,220 INFO org.apache.hadoop.hbase.master.RegionServerOperation:
> updating row enwikibase,Cwj1sehVeEbnrUDR_j0xok==,1245604540748 in region
> .META.,,1 with with startcode !?B and server XX.XX.45.121:20020
> hbase-Powerset-master-aa0-000-8.u.powerset.com.log.2009-06-22:2009-06-22
> 10:56:06,397 DEBUG org.apache.hadoop.hbase.master.BaseScanner: Current
> assignment of enwikibase,Cwj0sehVeEbnrUDR_j0xok==,1245604540748 is not valid;
> Server 'XX.XX.44.95:20020' unknown.
> hbase-Powerset-master-aa0-000-8.u.powerset.com.log.2009-06-22:2009-06-22
> 10:56:06,582 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning
> region enwikibase,Cwj1sehVeEbnrUDR_j0xok==,1245604540748 to XX.XX.45.97:20020
> hbase-Powerset-master-aa0-000-8.u.powerset.com.log.2009-06-22:2009-06-22
> 10:56:09,587 INFO org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_PROCESS_OPEN: enwikibase,Cwj1sehVeEbnrUDR_j0xok==,1245604540748:
> safeMode=false from XX.XX.45.97:20020
> hbase-Powerset-master-aa0-000-8.u.powerset.com.log.2009-06-22:2009-06-22
> 10:56:12,614 INFO org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_OPEN: enwikibase,Cwj1sehVeEbnrUDR_j0xok==,1245604540748:
> safeMode=false from XX.XX.45.97:20020
> hbase-Powerset-master-aa0-000-8.u.powerset.com.log.2009-06-22:2009-06-22
> 10:56:13,549 INFO org.apache.hadoop.hbase.master.RegionServerOperation:
> enwikibase,Cwj1sehVeEbnrUDR_j0xok==,1245604540748 open on XX.XX.45.97:20020
> hbase-Powerset-master-aa0-000-8.u.powerset.com.log.2009-06-22:2009-06-22
> 10:56:13,549 INFO org.apache.hadoop.hbase.master.RegionServerOperation:
> updating row enwikibase,Cwj1sehVeEbnrUDR_j0xok==,1245604540748 in region
> .META.,,1 with with startcode !?? and server XX.XX.45.97:20020
> {code}
> We've just updated the server info in the master because of the region open
> message but the scan sees old info in the .META. table though .META. was just
> updated.
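
For anyone following along, the race in this issue can be sketched roughly as below. This is only an illustration and not the committed workaround. The class `StaleScanSketch`, the `recentlyOpened` map, and both method names are made up for the example. It assumes the master could keep an in-memory record of which server last reported a region open and consult it before trusting a stale catalog row.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the stale-scan race; not HBase code.
final class StaleScanSketch {
    // Stand-in for the master's serverinfo map: servers currently known to be alive.
    final Set<String> liveServers = new HashSet<>();
    // Assumed in-memory record: region name -> server that last reported it open.
    final Map<String, String> recentlyOpened = new HashMap<>();

    // Called when a region server reports a region open (MSG_REPORT_OPEN in the log).
    void reportOpen(String region, String server) {
        recentlyOpened.put(region, server);
    }

    // What the scan effectively did: trust the server column read from the catalog
    // row, even if that row was read before the update landed.
    boolean naiveNeedsAssignment(String serverFromScan) {
        return !liveServers.contains(serverFromScan);
    }

    // Guarded variant: if the scanned server is unknown, check whether a live
    // server has already reported this region open before reassigning it.
    boolean guardedNeedsAssignment(String region, String serverFromScan) {
        if (liveServers.contains(serverFromScan)) {
            return false;
        }
        String opener = recentlyOpened.get(region);
        return opener == null || !liveServers.contains(opener);
    }

    public static void main(String[] args) {
        StaleScanSketch master = new StaleScanSketch();
        String region = "enwikibase,Cwj1sehVeEbnrUDR_j0xok==,1245604540748";

        // The new server is alive and has reported the region open.
        master.liveServers.add("XX.XX.45.121:20020");
        master.reportOpen(region, "XX.XX.45.121:20020");

        // The scanner still holds the old row, naming the crashed server.
        String stale = "XX.XX.44.95:20020";
        System.out.println("naive reassigns:   " + master.naiveNeedsAssignment(stale));
        System.out.println("guarded reassigns: " + master.guardedNeedsAssignment(region, stale));
    }
}
```

With the values from the log, the naive check decides to reassign while the guarded one does not. A guard like this only narrows the window; it does not replace making the scan and the update consistent, which is what the master rewrite would have to address.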