[ 
https://issues.apache.org/jira/browse/HBASE-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173079#comment-13173079
 ] 

Eran Hirsch commented on HBASE-5067:
------------------------------------

To the best of my understanding, the problem is fixed in the trunk, but only to 
some extent.
The flow appears to work correctly, but it relies on the underlying VM 
implementation and assumes things which are not guaranteed.

=====

I'll explain...
1. The hostname is computed based on the reverse DNS, like before.
2. An InetSocketAddress is built from this hostname and stored locally as 
'initialIsa'.
3. The RPC server is now created using, among others, 
'initialIsa.getHostName()'.
4. The address which the RPC server was bound to is stored as the HMaster 
field 'isa'.
5. The server name is initialized with the 'isa' field's hostname.
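
Steps 3 to 5 boil down to a bind followed by reading the address back from the 
socket. Below is a minimal sketch of that round trip, using a plain 
ServerSocket as a stand-in for the RPC server. The class and method names are 
mine and are not taken from the HBase code; the loopback address and port 0 
are only there so the sketch runs anywhere.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindRoundTripSketch {

    /**
     * Binds a socket to initialIsa and returns whatever address the socket
     * reports afterwards (the value that would end up in the 'isa' field).
     */
    static InetSocketAddress bindAndReport(InetSocketAddress initialIsa) {
        try (ServerSocket rpcStandIn = new ServerSocket()) {
            rpcStandIn.bind(initialIsa);
            // The API only promises a SocketAddress describing the bound endpoint.
            return (InetSocketAddress) rpcStandIn.getLocalSocketAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Port 0 asks the OS for an ephemeral port (assumption for the demo).
        InetSocketAddress initialIsa =
                new InetSocketAddress(InetAddress.getLoopbackAddress(), 0);
        InetSocketAddress isa = bindAndReport(initialIsa);
        System.out.println("same object:   " + (isa == initialIsa));
        System.out.println("requested:     " + initialIsa);
        System.out.println("reported back: " + isa);
    }
}
```

With an ephemeral port the reported address cannot be the object that was 
passed in, since the port differs. Whether the host name survives the trip is 
a separate question that the socket API does not answer.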

=========

Why is this problematic?
------------------------
Because it assumes things about the socket implementation which are not 
strictly enforced:

We first call the 'bind' method of a ServerSocket object, with an 
InetSocketAddress instance.
Later on we call ServerSocket's 'getLocalSocketAddress' to get this address 
instance back.
There is no way to know whether the same object is returned, or whether a new 
object is built based on the IP, or in whatever other way the implementation 
chooses. 
Specifically in our case, you can't tell whether this would still hold the 
'hostname' field we gave it, with our fully qualified DNS name.
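
To illustrate why this matters, here is a small sketch of two 
InetSocketAddress instances for the same IP and port, one carrying a host name 
and one rebuilt from the raw IP only. The 10.0.0.5 address and the class and 
method names are made up for the example; InetAddress.getByAddress is used so 
that no DNS lookup is needed to build either instance.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class HostnameRetentionSketch {

    static final String FQDN = "machine1.our-dev-network.my-corp.com";
    static final byte[] IP = {10, 0, 0, 5}; // hypothetical address

    /** An address that carries the fully qualified name we computed. */
    static InetSocketAddress withName(int port) {
        try {
            return new InetSocketAddress(InetAddress.getByAddress(FQDN, IP), port);
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    /** What an implementation could hand back: same IP and port, no name. */
    static InetSocketAddress rebuiltFromIp(InetSocketAddress original) {
        try {
            byte[] raw = original.getAddress().getAddress();
            return new InetSocketAddress(InetAddress.getByAddress(raw), original.getPort());
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress original = withName(60000);
        InetSocketAddress rebuilt = rebuiltFromIp(original);
        System.out.println("equal:        " + original.equals(rebuilt));
        System.out.println("original name: " + original.getHostName());
        // getHostString() does not trigger a reverse lookup; getHostName() on
        // 'rebuilt' would, and the result then depends on the local resolver.
        System.out.println("rebuilt name:  " + rebuilt.getHostString());
    }
}
```

The two instances compare as equal, yet only the first one still knows the 
fully qualified name. Calling getHostName() on the second one falls back to a 
reverse lookup, which is where the local machine name can sneak in.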

====

To conclude,
I think there is a semantic problem with the way the HMaster is initialized in 
its c'tor:
1. When creating the rpcServer, we should call the method with 
'initialIsa.getAddress().getHostAddress()' (instead of 
'initialIsa.getHostName()').
This would also be consistent with the comment written next to this parameter, 
saying that we are sending an IP (because now we are sending a DNS name).
2. When setting the 'serverName' field, we need to use the local field 
'hostname' computed earlier (instead of 'this.isa.getHostName()').
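
A rough sketch of the two changes, written against plain java.net types rather 
than the actual HMaster code. The helper names and the "host:port" string are 
hypothetical and only show which value comes from where; this is not the real 
server name format.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class ServerNameSketch {

    /** Change 1: hand the RPC server an IP literal as its bind address. */
    static String bindAddress(InetSocketAddress initialIsa) {
        InetAddress addr = initialIsa.getAddress();
        if (addr == null) {
            // getAddress() is null for unresolved socket addresses.
            throw new IllegalArgumentException("unresolved address: " + initialIsa);
        }
        return addr.getHostAddress();
    }

    /**
     * Change 2: take the host part from the 'hostname' computed earlier and
     * only the port from the address the server actually bound to.
     */
    static String advertisedName(String hostname, InetSocketAddress boundIsa) {
        return hostname + ":" + boundIsa.getPort();
    }

    public static void main(String[] args) {
        // Loopback and port 60000 are placeholders for the demo.
        InetSocketAddress bound =
                new InetSocketAddress(InetAddress.getLoopbackAddress(), 60000);
        System.out.println(bindAddress(bound));
        System.out.println(advertisedName("machine1.our-dev-network.my-corp.com", bound));
    }
}
```

This way the advertised name never depends on what the socket implementation 
decides to return for the host part.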

======

Notes:
1. The same problem applies to HRegionServer which uses almost the same 
initialization code in its c'tor. 
2. I am not an HBase developer, so I don't really know how to make these 
changes myself.


                
> HMaster uses wrong name for address (in stand-alone mode)
> ---------------------------------------------------------
>
>                 Key: HBASE-5067
>                 URL: https://issues.apache.org/jira/browse/HBASE-5067
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Eran Hirsch
>
> In STANDALONE mode:
> When setting the configuration option "hbase.master.dns.interface" (and 
> optional "hbase.master.dns.nameserver") to non-default values,
> it is EXPECTED that the master node would report its fully qualified dns name 
> when registering in ZooKeeper,
> BUT INSTEAD, the machine's hostname is taken.
> For example, my machine is called (aka "its hostname is...") "machine1" but 
> its name in the network is "machine1.our-dev-network.my-corp.com", so to 
> find this machine's IP anywhere on the network I would need to query for the 
> whole name (because trying to find "machine1" is ambiguous on a network).
> Why is this a bug? Because when trying to connect to this stand-alone hbase 
> installation from outside the machine it is running on, when querying ZK for 
> /hbase/master we get only the "machine1" part, and then fail with an 
> unresolvable address for the master (which later even gives a null pointer 
> because of a missing null check).
> This is the stack trace when calling HTable's c'tor:
> java.lang.IllegalArgumentException: hostname can't be null
>       at java.net.InetSocketAddress.<init>(InetSocketAddress.java:139) ~[na:1.7.0_02]
>       at org.apache.hadoop.hbase.HServerAddress.getResolvedAddress(HServerAddress.java:108) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:64) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.dataToHServerAddress(RootRegionTracker.java:82) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:73) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:579) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:594) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:173) ~[hbase-0.90.4.jar:0.90.4]
>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:147) ~[hbase-0.90.4.jar:0.90.4]
> ==============
> Why does this happen?
> 1. When building the HMaster object we correctly use the static 
> 'getMyAddress(conf)' to read the configuration options, and then to try and 
> resolve the machine's IP. This method returns the fully qualified name 
> correctly, and this is then used to construct an 'HServerAddress' object 
> which is locally stored as 'a'.
> 2. So far so good, but now, instead of using this object as the value for the 
> master's 'address' field, the code goes on to initialize the 'rpcServer' 
> field. As part of this, the static 'HBaseRPC.getServer' method is called 
> with, among others, the HServerAddress's BIND ADDRESS (aka the IP) that we 
> have just built.
> 3. But now, when we finally get to setting the value for HMaster's 'address' 
> field, we create a NEW HServerAddress initialized with 
> rpcServer.getListenerAddress() (which is basically the IP we just gave it, 
> with a new listening port).
> 4. HServerAddress calls 'getAddress().getHostName()' on this address object, 
> which would return the local hostname of the machine, because the IP would be 
> resolved locally by the machine, and not using a nameserver.
> So eventually, the fully qualified name computed in step 1 is NOT USED in any 
> way, instead, all further processing is done on the IP address of the host 
> (and its local resolving to the hostname).
> =======
> What should happen?
> The 'HMaster.address' field should be set to an address which is made of the 
> fully qualified name retrieved in step 1, combined with the port retrieved 
> from the rpcServer computed at step 2.
> ====
> Notes:
> 1. It seems that the 'HBaseServer' c'tor (which is called when the 
> 'HBaseRPC.getServer()' static method is called) is faulty, as it doesn't 
> actually use the port number sent to it (it sets the local 'port' field to 
> it, but then overrides it, without ever reading it, with the port returned 
> from the new 'Listener' object). This might be a bug, but I have not 
> checked it enough.
> 2. The same bug with the master node could repeat itself in the region server 
> code, but I haven't checked that at all.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
