[ https://issues.apache.org/jira/browse/HADOOP-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546267 ]
stack commented on HADOOP-2295: ------------------------------- The sense is that this issue has been around for a long time and that it has been responsible for profusion of unexplainable NotServerRegionExceptions. Whats happening is that the master double-assigns a region. Because of the context at the time of assignment hasn't changed when the second assignment happens, the two assignments go to the same host. When the first assignment reports back to the master that region is open, because of the state in the master, it thinks it a rogue regionserver coming up deploying regions it shouldn't have. So the master sends a 'you-shouldn't-have-opened-that-region-shut-it-down' message. Regionserver dutifully closes the region. As far as the master is concerned, its accounting shows the region properly assigned though no regionserver is now carrying the region. Clients come in looking for a row from the problematic region. Master sends them to where it thinks the region has been deployed. When client goes to identified regionserver, it throws the NotServingRegionException. > [hbase] Still multi-assigning regions > ------------------------------------- > > Key: HADOOP-2295 > URL: https://issues.apache.org/jira/browse/HADOOP-2295 > Project: Hadoop > Issue Type: Bug > Components: contrib/hbase > Reporter: stack > Priority: Minor > > Start key is img216/1725/22236401lvh6.gif. We doubly assign. In between > assignments, master rules 'should not have opened region' and sends a > shutdown region to remote server -- which it obeys. Meantime, master thinks > regions are deployed. > Here is master log: > {code} > ... > 2007-11-27 04:26:21,818 INFO hbase.HMaster - region > postlog,img216/1725/22236401lvh6.gif,1196164441230 split. New regions are: > postlog,img216/1725/22236401lvh6.gif,1196166373235, > postlog,img216/3961/aasaxonlivene6.jpg,1196166373241 > 2007-11-27 04:26:23,261 INFO hbase.HMaster - assigning region > postlog,img216/1725/22236401lvh6.gif,1196166373235 to server XX.XX.XX.33:60020 > 2007-11-27 04:26:27,462 DEBUG hbase.HMaster - HMaster.metaScanner scanner: > -400573094926830852 regioninfo: {regionname: > postlog,img216/1725/22236401lvh6.gif,1196164441230, startKey: > <img216/1725/22236401lvh6.gif>, offline: true, split: true, tableDesc: {name: > postlog, families: {cookie:={name: cookie, max versions: 1, compression: > NONE, in memory: false, max length: 2147483647, bloom filter: none}, > ip:={name: ip, max versions: 1, compression: NONE, in memory: false, max > length: 2147483647, bloom filter: none}}}}, server: XX.XX.XX.25:60020, > startCode: 1196155489036 > 2007-11-27 04:26:27,464 DEBUG hbase.HMaster - HMaster.metaScanner scanner: > -400573094926830852 regioninfo: {regionname: > postlog,img216/1725/22236401lvh6.gif,1196166373235, startKey: > <img216/1725/22236401lvh6.gif>, tableDesc: {name: postlog, families: > {cookie:={name: cookie, max versions: 1, compression: NONE, in memory: false, > max length: 2147483647, bloom filter: none}, ip:={name: ip, max versions: 1, > compression: NONE, in memory: false, max length: 2147483647, bloom filter: > none}}}}, server: , startCode: -1 > 2007-11-27 04:26:27,464 DEBUG hbase.HMaster - Checking > postlog,img216/1725/22236401lvh6.gif,1196166373235 is assigned > 2007-11-27 04:26:27,671 DEBUG hbase.HMaster - Checked > postlog,img216/1725/22236401lvh6.gif,1196164441230 for references: splitA: > true, splitB: true > 2007-11-27 04:26:53,328 INFO hbase.HMaster - assigning region > postlog,img216/1725/22236401lvh6.gif,1196166373235 to server XX.XX.XX.33:60020 > 2007-11-27 04:27:05,364 INFO hbase.HMaster - XX.XX.XX.33:60020 serving > postlog,img216/1725/22236401lvh6.gif,1196166373235 > 2007-11-27 04:27:05,364 DEBUG hbase.HMaster - region server XX.XX.XX.33:60020 > should not have opened region > postlog,img216/1725/22236401lvh6.gif,1196166373235 > 2007-11-27 04:27:05,367 INFO hbase.HMaster - regionname: > postlog,img216/1725/22236401lvh6.gif,1196166373235, startKey: > <img216/1725/22236401lvh6.gif>, tableDesc: {name: postlog, families: > {cookie:={name: cookie, max versions: 1, compression: NONE, in memory: false, > max length: 2147483647, bloom filter: none}, ip:={name: ip, max versions: 1, > compression: NONE, in memory: false, max length: 2147483647, bloom filter: > none}}} open on XX.XX.XX.33:60020 > 2007-11-27 04:27:05,368 INFO hbase.HMaster - updating row > postlog,img216/1725/22236401lvh6.gif,1196166373235 in table .META.,,1 with > startcode 1196155493271 and server XX.XX.XX.33:60020 > 2007-11-27 04:27:26,380 DEBUG hbase.HMaster - HMaster.metaScanner scanner: > -4876467680083821318 regioninfo: {regionname: > postlog,img216/1725/22236401lvh6.gif,1196164441230, startKey: > <img216/1725/22236401lvh6.gif>, offline: true, split: true, tableDesc: {name: > postlog, families: {cookie:={name: cookie, max versions: 1, compression: > NONE, in memory: false, max length: 2147483647, bloom filter: none}, > ip:={name: ip, max versions: 1, compression: NONE, in memory: false, max > length: 2147483647, bloom filter: none}}}}, server: XX.XX.XX.25:60020, > startCode: 1196155489036 > 2007-11-27 04:27:26,381 DEBUG hbase.HMaster - HMaster.metaScanner scanner: > -4876467680083821318 regioninfo: {regionname: > postlog,img216/1725/22236401lvh6.gif,1196166373235, startKey: > <img216/1725/22236401lvh6.gif>, tableDesc: {name: postlog, families: > {cookie:={name: cookie, max versions: 1, compression: NONE, in memory: false, > max length: 2147483647, bloom filter: none}, ip:={name: ip, max versions: 1, > compression: NONE, in memory: false, max length: 2147483647, bloom filter: > none}}}}, server: XX.XX.XX.33:60020, startCode: 1196155493271 > 2007-11-27 04:27:26,381 DEBUG hbase.HMaster - Checking > postlog,img216/1725/22236401lvh6.gif,1196166373235 is assigned > 2007-11-27 04:27:26,511 DEBUG hbase.HMaster - Checked > postlog,img216/1725/22236401lvh6.gif,1196164441230 for references: splitA: > true, splitB: true > 2007-11-27 04:28:26,444 DEBUG hbase.HMaster - HMaster.metaScanner scanner: > -7699450087241029374 regioninfo: {regionname: > postlog,img216/1725/22236401lvh6.gif,1196164441230, startKey: > <img216/1725/22236401lvh6.gif>, offline: true, split: true, tableDesc: {name: > postlog, families: {cookie:={name: cookie, max versions: 1, compression: > NONE, in memory: false, max length: 2147483647, bloom filter: none}, > ip:={name: ip, max versions: 1, compression: NONE, in memory: false, max > length: 2147483647, bloom filter: none}}}}, server: XX.XX.XX.25:60020, > startCode: 1196155489036 > 2007-11-27 04:28:26,446 DEBUG hbase.HMaster - HMaster.metaScanner scanner: > -7699450087241029374 regioninfo: {regionname: > postlog,img216/1725/22236401lvh6.gif,1196166373235, startKey: > <img216/1725/22236401lvh6.gif>, tableDesc: {name: postlog, families: > {cookie:={name: cookie, max versions: 1, compression: NONE, in memory: false, > max length: 2147483647, bloom filter: none}, ip:={name: ip, max versions: 1, > compression: NONE, in memory: false, max length: 2147483647, bloom filter: > none}}}}, server: XX.XX.XX.33:60020, startCode: 1196155493271 > 2007-11-27 04:28:26,446 DEBUG hbase.HMaster - Checking > postlog,img216/1725/22236401lvh6.gif,1196166373235 is assigned > 2007-11-27 04:28:26,590 DEBUG hbase.HMaster - Checked > postlog,img216/1725/22236401lvh6.gif,1196164441230 for references: splitA: > true, splitB: true > 2007-11-27 04:29:26,459 DEBUG hbase.HMaster - HMaster.metaScanner scanner: > 5846860608957952540 regioninfo: {regionname: > postlog,img216/1725/22236401lvh6.gif,1196164441230, startKey: > <img216/1401lvh6.gif>, offline: true, split: true, tableDesc: {name: po: 1, > compression: NONE, in memory: false, max length: 21474836s: 1, compression: > NONE, in memory: false, max length: 21474836er: none}}}}, server: > XX.XX.XX.25:60020, startCode: 11961554890ase.HMaster - HMaster.metaScanner > scanner: 5846860608957952540 /1725/22236401lvh6.gif,1196166373235, startKey: > <img216/1725/22okie:={name: cookie, max versions: 1, compression: NONE, in > mem NONE, in memory: false, max length: 2147483647, bloom filter: artCode: > 1196155493271 > 2007-11-27 04:29:26,461 DEBUG hbase.HMas235 is assigned > 2007-11-27 04:29:26,622 DEBUG hbase.HMaster - Cces: splitA: true, splitB: true > ... > {code} > Here is regionserver log on node XX.XX.XX.33: > {code} > ... > 2007-11-27 04:26:29,318 INFO hbase.HRegionServer - MSG_REGION_OPEN : > regionname: postlog,img216/1725/22236401lvh6.gif,1196166373235, startKey: > <img216/1725/22236401lvh6.gif>, tableDesc: {name: postlog, families: > {cookie:={name: cookie, max versions: 1, compression: NONE, in memory: false, > max length: 2147483647, bloom filter: none}, ip:={name: ip, max versions: 1, > compression: NONE, in memory: false, max length: 2147483647, bloom filter: > none}}} > 2007-11-27 04:26:29,325 DEBUG hbase.HStore - starting 51396220/cookie (no > reconstruction log) > 2007-11-27 04:26:30,420 DEBUG hbase.HRegionServer - flushing region > postlog,img100/941/brandyfn6.gif,1196158363481 > 2007-11-27 04:26:30,420 DEBUG hbase.HRegion - Not flushing cache for region > postlog,img100/941/brandyfn6.gif,1196158363481: snapshotMemcaches() > determined that there was nothing to do > 2007-11-27 04:26:31,752 DEBUG hbase.HRegionServer - flushing region > postlog,img149/2019/liyanaum6.gif,1196164996321 > 2007-11-27 04:26:31,753 DEBUG hbase.HRegion - Not flushing cache for region > postlog,img149/2019/liyanaum6.gif,1196164996321: snapshotMemcaches() > determined that there was nothing to do > 2007-11-27 04:26:31,753 DEBUG hbase.HRegionServer - flushing region > postlog,img148/6048/p9290339kb0.jpg,1196162613173 > 2007-11-27 04:26:31,753 DEBUG hbase.HRegion - Not flushing cache for region > postlog,img148/6048/p9290339kb0.jpg,1196162613173: snapshotMemcaches() > determined that there was nothing to do > 2007-11-27 04:26:31,754 DEBUG hbase.HRegionServer - flushing region > postlog,img132/5512/timeline8bx.png,1196159744555 > 2007-11-27 04:26:31,754 DEBUG hbase.HRegion - Not flushing cache for region > postlog,img132/5512/timeline8bx.png,1196159744555: snapshotMemcaches() > determined that there was nothing to do > 2007-11-27 04:26:37,915 DEBUG hbase.HRegionServer - flushing region > postlog,img149/3724/whosherne4.jpg,1196164996322 > 2007-11-27 04:26:37,915 DEBUG hbase.HRegion - Not flushing cache for region > postlog,img149/3724/whosherne4.jpg,1196164996322: snapshotMemcaches() > determined that there was nothing to do > 2007-11-27 04:26:57,059 DEBUG hbase.HRegionServer - flushing region > postlog,img148/8339/361de8578d356flip0hd.png,1196165445439 > 2007-11-27 04:26:57,060 DEBUG hbase.HRegion - Not flushing cache for region > postlog,img148/8339/361de8578d356flip0hd.png,1196165445439: > snapshotMemcaches() determined that there was nothing to do > 2007-11-27 04:27:03,604 DEBUG hbase.HStore - maximum sequence id for hstore > 51396220/cookie is 30625781 > 2007-11-27 04:27:03,761 DEBUG hbase.HStore - starting 51396220/ip (no > reconstruction log) > 2007-11-27 04:27:03,859 DEBUG hbase.HStore - maximum sequence id for hstore > 51396220/ip is 30625781 > 2007-11-27 04:27:03,965 DEBUG hbase.HRegion - Next sequence id for region > postlog,img216/1725/22236401lvh6.gif,1196166373235 is 30625782 > 2007-11-27 04:27:03,969 INFO hbase.HRegion - region > postlog,img216/1725/22236401lvh6.gif,1196166373235 available > 2007-11-27 04:27:03,969 INFO hbase.HRegionServer - MSG_REGION_OPEN : > regionname: postlog,img216/1725/22236401lvh6.gif,1196166373235, startKey: > <img216/1725/22236401lvh6.gif>, tableDesc: {name: postlog, families: > {cookie:={name: cookie, max versions: 1, compression: NONE, in memory: false, > max length: 2147483647, bloom filter: none}, ip:={name: ip, max versions: 1, > compression: NONE, in memory: false, max length: 2147483647, bloom filter: > none}}} > 2007-11-27 04:27:04,578 INFO hbase.HRegionServer - > MSG_REGION_CLOSE_WITHOUT_REPORT : regionname: > postlog,img216/1725/22236401lvh6.gif,1196166373235, startKey: > <img216/1725/22236401lvh6.gif>, tableDesc: {name: postlog, families: > {cookie:={name: cookie, max versions: 1, compression: NONE, in memory: false, > max length: 2147483647, bloom filter: none}, ip:={name: ip, max versions: 1, > compression: NONE, in memory: false, max length: 2147483647, bloom filter: > none}}} > 2007-11-27 04:27:04,578 DEBUG hbase.HRegion - Not flushing cache for region > postlog,img216/1725/22236401lvh6.gif,1196166373235: snapshotMemcaches() > determined that there was nothing to do > 2007-11-27 04:27:04,579 DEBUG hbase.HStore - closed 51396220/ip > 2007-11-27 04:27:04,580 DEBUG hbase.HStore - closed 51396220/cookie > 2007-11-27 04:27:04,580 INFO hbase.HRegion - closed > postlog,img216/1725/22236401lvh6.gif,1196166373235 > .. > {code} > It looks like the assigned region hasn't been cleared from the unassigned > regions list when the report comes in. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.