[
https://issues.apache.org/jira/browse/HBASE-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991610#comment-12991610
]
stack commented on HBASE-3431:
------------------------------
Chatted w/ Jon and J-D on this. Jon suggests the EnvironmentEdgeManager utility as
a means of intercepting lookups so we can write tests that return different
answers. Let me try it out. J-D rehearsed the issues we have had in here over
time and noted that this 'mess' was 'working' in 0.20.x and even up to 0.89.x (he
also remembers that an RS can volunteer its address as 127.0.0.1 but actually
bind to a real, non-localhost address somehow). He's wary about stripping it all
out as the patch does. Let me try to put up unit tests that mock the
various scenarios.
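A rough sketch of what such an interception point could look like, so a test can hand back whatever name it wants. The class and method names here (LookupEdgeManager, LookupEdge, injectEdge) are hypothetical and only borrow the injectable-delegate pattern; they are not existing HBase classes.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical sketch of an injectable hostname lookup; not an HBase class. */
public final class LookupEdgeManager {

  /** Pluggable source of answers for "what is the canonical name of this host?". */
  public interface LookupEdge {
    String canonicalHostName(String host);
  }

  /** Default edge: asks the JVM resolver, which may hit DNS. */
  private static final LookupEdge DEFAULT = host -> {
    try {
      return InetAddress.getByName(host).getCanonicalHostName();
    } catch (UnknownHostException e) {
      throw new UncheckedIOException(e);
    }
  };

  private static volatile LookupEdge delegate = DEFAULT;

  private LookupEdgeManager() {}

  /** Tests call this to substitute canned answers; null restores the default. */
  public static void injectEdge(LookupEdge edge) {
    delegate = (edge == null) ? DEFAULT : edge;
  }

  /** Restores the real resolver. */
  public static void reset() {
    delegate = DEFAULT;
  }

  /** Production code would call this instead of touching InetAddress directly. */
  public static String canonicalHostName(String host) {
    return delegate.canonicalHostName(host);
  }
}
```

A test could then inject one edge that answers with the IP and another that answers with the hostname, and check that the server ends up registered once either way.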
Looking at code w/ J-D, we turned up one problematic bit of code -- HSA
(HServerAddress) creates a new InetSocketAddress on deserialization, which can
result in a lookup.
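For reference, the JDK constructor InetSocketAddress(String, int) attempts to resolve the hostname, while InetSocketAddress.createUnresolved(String, int) does not. The sketch below only illustrates that difference; the helper name rebuildWithoutLookup is made up, and whether it suits the deserialization path is an open question, not something decided here.

```java
import java.net.InetSocketAddress;

/** Illustration only: rebuilding an address without triggering name resolution. */
public final class UnresolvedAddressDemo {

  private UnresolvedAddressDemo() {}

  /**
   * Hypothetical helper: builds the address from an already-known host string
   * and port. createUnresolved performs no lookup, so the host string is kept
   * exactly as it was passed in.
   */
  static InetSocketAddress rebuildWithoutLookup(String host, int port) {
    return InetSocketAddress.createUnresolved(host, port);
  }

  public static void main(String[] args) {
    InetSocketAddress addr = rebuildWithoutLookup("perfnode11", 60020);
    // getHostString() returns the stored string without a reverse lookup.
    System.out.println(addr.getHostString() + ":" + addr.getPort()
        + " unresolved=" + addr.isUnresolved());
  }
}
```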
Looking in HDFS, the datanode generates a registration name -- e.g.
DS-198919343-10.20.20.187-10010-1291133524722 -- and this is how it identifies
itself to the NN regardless. No messing w/ the NN telling it what name to use.
The TT does something similar.
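A minimal sketch of a self-generated name of that shape. Reading the example as prefix, random number, IP, port and a millisecond timestamp is an assumption drawn from the sample string above; the class and method names are hypothetical and this is not the HDFS code.

```java
/** Hypothetical sketch of a self-assigned registration name; not the HDFS implementation. */
public final class RegistrationName {

  private RegistrationName() {}

  /**
   * Joins the parts in the order seen in the example name:
   * DS-<random>-<ip>-<port>-<timestamp>. The field meanings are assumed.
   */
  static String generate(int random, String ip, int port, long timestampMillis) {
    return "DS-" + random + "-" + ip + "-" + port + "-" + timestampMillis;
  }
}
```

The point is that the server picks this name once on its own and keeps presenting it, so nothing depends on what the other side resolves its address to.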
> Regionserver is not using the name given it by the master; double entry in
> master listing of servers
> ----------------------------------------------------------------------------------------------------
>
> Key: HBASE-3431
> URL: https://issues.apache.org/jira/browse/HBASE-3431
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.0
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Fix For: 0.90.1
>
> Attachments: 3431-v2.txt, 3431-v3.txt, 3431-v3.txt, 3431-v4.txt,
> 3431.txt
>
>
> Our man Ted Dunning found the following, where the RS checks in with one name,
> the master tells it to use another name, but we seem to go ahead and continue
> with our original name.
> In RS logs I see:
> {code}
> 2011-01-07 15:45:50,757 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer [regionserver60020]:
> Master passed us address to use. Was=perfnode11:60020, Now=10.10.30.11:60020
> {code}
> On master I see
> {code}
> 2011-01-07 15:45:38,613 INFO org.apache.hadoop.hbase.master.ServerManager
> [IPC Server handler 0 on 60000]: Registering
> server=10.10.30.11,60020,1294443935414, regionCount=0, userLoad=false
> {code}
> ....
> then later
> {code}
> 2011-01-07 15:45:44,247 INFO org.apache.hadoop.hbase.master.ServerManager
> [IPC Server handler 2 on 60000]: Registering
> server=perfnode11,60020,1294443935414, regionCount=0, userLoad=true
> {code}
> This might be because we started letting servers register by means other
> than reportStartup.