[ 
https://issues.apache.org/jira/browse/HBASE-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991094#comment-12991094
 ] 

stack commented on HBASE-3431:
------------------------------

If the RS passes 127.0.0.1, then that's what it's bound to and no (remote) 
client will be able to connect.  It's broken.

The fixup in master would let this (broken) server successfully register.  The 
master would call remoteIP on the connected socket to get the RS's address and 
it would then know the RS by that address.  This would happen only on startup, 
in reportForDuty, not subsequently during heartbeating; we only do the lookup 
of the remote IP on reportForDuty.
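
A minimal sketch of that startup-time substitution, not the actual HBase 
code.  The class and method names are hypothetical, and it assumes the master 
keeps the port the RS reported and swaps in only the IP it sees on the socket:

```java
public class MasterAddressSketch {
    /**
     * Hypothetical helper: build the address the master records at
     * reportForDuty time. The host the RS reported is discarded and replaced
     * by the remote IP of the connected socket; the reported port is kept.
     */
    static String addressSeenByMaster(String reportedHostPort, String socketRemoteIp) {
        int colon = reportedHostPort.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected host:port, got " + reportedHostPort);
        }
        String port = reportedHostPort.substring(colon + 1);
        return socketRemoteIp + ":" + port;
    }
}
```

Under these assumptions an RS that reports 127.0.0.1:60020 still ends up 
recorded under the address the master sees, which is why it registers even 
though no remote client can reach it.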

When heartbeating, the RS was supposed to volunteer the HServerInfo that the 
Master had passed back to it in response to reportForDuty.

Since 0.90.0, servers can register at heartbeat time.  This is because masters 
can join an already running cluster.  The RSs do not rerun the reportForDuty 
step.  They just start heartbeating the new Master.
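
To make the failure concrete, here is a rough sketch of how a double entry 
can arise.  It is an illustration only: the class, its methods and the 
set-based registry are hypothetical stand-ins, and the server-name format 
(host,port,startcode) is taken from the log lines in the issue description.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class DoubleRegistrationSketch {
    // Hypothetical stand-in for the master's list of online servers.
    private final Set<String> onlineServers = new LinkedHashSet<>();

    /** Startup path: the master names the RS after the socket's remote IP. */
    String reportForDuty(String reportedHost, int port, long startCode, String socketRemoteIp) {
        String name = socketRemoteIp + "," + port + "," + startCode;
        onlineServers.add(name);
        return name; // the name the RS is expected to use from now on
    }

    /** Heartbeat path: an unknown server name is simply registered. */
    void heartbeat(String serverName) {
        onlineServers.add(serverName);
    }

    Set<String> servers() {
        return onlineServers;
    }
}
```

If the RS keeps heartbeating as perfnode11,60020,1294443935414 instead of the 
name returned by reportForDuty, this registry ends up holding two entries for 
the same process; heartbeating with the returned name leaves just one.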

We could, I suppose, add a lookup of the socket's remote IP, with reverse 
lookup, to heartbeating too.

I'm thinking it's better to just strip all this crap out.

> Regionserver is not using the name given it by the master; double entry in 
> master listing of servers
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3431
>                 URL: https://issues.apache.org/jira/browse/HBASE-3431
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.90.1
>
>         Attachments: 3431-v2.txt, 3431-v3.txt, 3431-v3.txt, 3431.txt
>
>
> Our man Ted Dunning found the following where RS checks in with one name, the 
> master tells it to use another name but we seem to go ahead and continue with 
> our original name.
> In RS logs I see:
> {code}
> 2011-01-07 15:45:50,757 INFO  
> org.apache.hadoop.hbase.regionserver.HRegionServer [regionserver60020]: 
> Master passed us address to use. Was=perfnode11:60020, Now=10.10.30.11:60020
> {code}
> On master I see
> {code}
> 2011-01-07 15:45:38,613 INFO  org.apache.hadoop.hbase.master.ServerManager 
> [IPC Server handler 0 on 60000]: Registering 
> server=10.10.30.11,60020,1294443935414, regionCount=0, userLoad=false
> {code}
> ....
> then later
> {code}
> 2011-01-07 15:45:44,247 INFO  org.apache.hadoop.hbase.master.ServerManager 
> [IPC Server handler 2 on 60000]: Registering 
> server=perfnode11,60020,1294443935414, regionCount=0, userLoad=true
> {code}
> This might be because we started letting servers register by means other 
> than reportStartup.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
