Region is left unassigned after a split/rebalancing, throws NSRE
----------------------------------------------------------------

                 Key: HBASE-851
                 URL: https://issues.apache.org/jira/browse/HBASE-851
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.2.0
            Reporter: Jean-Daniel Cryans
             Fix For: 0.19.0


Master log:
{code}
2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: 
Received MSG_REPORT_PROCESS_OPEN: 
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 
from 192.168.1.95:60020
2008-08-28 12:12:27,174 INFO
org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794
from 192.168.1.95:60020
2008-08-28 12:12:27,174 INFO
org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
from 192.168.1.95:60020
2008-08-28 12:12:27,174 DEBUG
org.apache.hadoop.hbase.master.RegionManager: Server 192.168.1.95:60020 is
overloaded. Server load: 8 avg: 7.0
2008-08-28 12:12:27,174 DEBUG
org.apache.hadoop.hbase.master.RegionManager: Choosing to reassign 1 regions.
mostLoadedRegions has 8 regions in it.
2008-08-28 12:12:27,174 DEBUG
org.apache.hadoop.hbase.master.RegionManager: Going to close region
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,174 DEBUG
org.apache.hadoop.hbase.master.HMaster: Main processing loop:
PendingOpenOperation from 192.168.1.95:60020
2008-08-28 12:12:27,175 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794
open on 192.168.1.95:60020
2008-08-28 12:12:27,175 DEBUG
org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1,
onlineMetaRegions.size(): 1
2008-08-28 12:12:27,175 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794
in region .META.,,1 with startcode 1219931259154 and server 192.168.1.95:60020
2008-08-28 12:12:30,352 INFO
org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE:
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
from 192.168.1.95:60020
2008-08-28 12:1
2008-08-28 12:12:32,557 DEBUG
org.apache.hadoop.hbase.master.ServerManager: Total Load: 103, Num Servers: 15,
Avg Load: 7.0
2008-08-28 12:12:34,093 DEBUG
org.apache.hadoop.hbase.master.HMaster: Main processing loop:
PendingOpenOperation from 192.168.1.95:60020
2008-08-28 12:12:34,093 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
open on 192.168.1.95:60020
2008-08-28 12:12:34,093 DEBUG
org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1,
onlineMetaRegions.size(): 1
2008-08-28 12:12:34,093 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
in region .META.,,1 with startcode 1219931259154 and server 192.168.1.95:60020
{code}

HRS 192.168.1.95 log:
{code}
2008-08-28 12:12:24,953 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested
for region:
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,307 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794:
 [EMAIL PROTECTED]
2008-08-28 12:12:27,307 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794:
 [EMAIL PROTECTED]
2008-08-28 12:12:27,308 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Compactions and cache flushes
disabled for region
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Scanners disabled for region
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: No more active scanners for
region
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: No more row locks outstanding on
region
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG
org.apache.hadoop.hbase.regionserver.HStore: closed 1860667227/attribute
2008-08-28 12:12:27,308 INFO
org.apache.hadoop.hbase.regionserver.HRegion: closed
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:34,246 INFO org.apache.hadoop.ipc.Server: IPC
Server handler 1 on 60020, call batchUpdate([EMAIL PROTECTED], row =>
http://www.simplewebengines.com/, {column => attribute:traveliness, value =>
'...', column => attribute:processed_at, value => '...', column =>
attribute:content, value => '...', column => attribute:refs, value => '...',
column => attribute:crawled_at, value => '...', column => attribute:html,
value => '...', column => attribute:crawled, value =>
'...'}) from 192.168.1.96:50102: error:
org.apache.hadoop.hbase.NotServingRegionException:
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
org.apache.hadoop.hbase.NotServingRegionException:
web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794

[the same NSRE repeats a hundred times]
{code}

Restarting the cluster cleared the issue, but this is a nasty bug. A proposed
band-aid: if a client still gets an NSRE after exhausting its retries, ask the
master to scan the region servers (HRS) to see whether the region is being
served somewhere else. If it is not, assign it somewhere. Finally, update META.
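
To make the proposed band-aid concrete, here is a minimal sketch of the three steps (scan the region servers, assign the region if nobody serves it, update META). It is a toy in-memory model, not HBase code: the classes RegionServer and Master, the method recoverRegion, the server addresses, and the "least loaded server" assignment policy are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative model only; none of these types exist in HBase. */
public class NsreRecoverySketch {

    /** Hypothetical stand-in for a region server (HRS) and the regions it really serves. */
    static final class RegionServer {
        final String address;
        final Set<String> onlineRegions = new HashSet<>();

        RegionServer(String address) {
            this.address = address;
        }
    }

    /** Hypothetical stand-in for the master, with META reduced to a region -> server map. */
    static final class Master {
        final List<RegionServer> servers = new ArrayList<>();
        final Map<String, String> meta = new HashMap<>();

        /** Meant to be called once a client has exhausted its NSRE retries. */
        String recoverRegion(String regionName) {
            if (servers.isEmpty()) {
                throw new IllegalStateException("no region servers available");
            }
            RegionServer target = null;
            // Step 1: scan every region server to see whether the region is open somewhere.
            for (RegionServer rs : servers) {
                if (rs.onlineRegions.contains(regionName)) {
                    target = rs;
                    break;
                }
            }
            // Step 2: if nobody serves it, assign it somewhere.
            // Picking the least loaded server is an assumption of this sketch.
            if (target == null) {
                target = servers.get(0);
                for (RegionServer rs : servers) {
                    if (rs.onlineRegions.size() < target.onlineRegions.size()) {
                        target = rs;
                    }
                }
                target.onlineRegions.add(regionName);
            }
            // Step 3: update META so that later lookups find the real location.
            meta.put(regionName, target.address);
            return target.address;
        }
    }

    public static void main(String[] args) {
        Master master = new Master();
        RegionServer a = new RegionServer("rs-a:60020"); // hypothetical addresses
        RegionServer b = new RegionServer("rs-b:60020");
        a.onlineRegions.add("r1");
        a.onlineRegions.add("r2");
        b.onlineRegions.add("r3");
        master.servers.add(a);
        master.servers.add(b);

        // META claims rs-a serves the region, but rs-a has already closed it.
        master.meta.put("lost-region", "rs-a:60020");
        System.out.println("recovered on " + master.recoverRegion("lost-region"));
    }
}
```

In this model the stale META entry is simply overwritten. A real fix would also have to deal with concurrent open/close messages like the ones in the master log above, which this sketch does not attempt.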

