[ 
https://issues.apache.org/jira/browse/HBASE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978594#action_12978594
 ] 

Matt Corgan commented on HBASE-3425:
------------------------------------

Hmm - it's not happening anymore.  We had just changed the DNS entry that 
pointed to that IP address to give the regionserver a friendlier name.  
While the problem was occurring, the regionserver logged this line:

2011-01-06 15:55:48,910 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.regionserver.address=HadoopNode98.hotpads.srv:60020

But now that is gone and it logs this one:

2011-01-06 18:29:10,903 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us address to use. Was=HadoopNode98.hotpads.srv:60020, Now=HadoopNode98.hotpads.srv:60020

Looking through HMaster.regionServerStartup, it calls 
HBaseServer.getRemoteIp().  That asks the currently open socket for its 
InetAddress, and then things get hairy with PlainSocketImpl, InetAddress, 
etc.  Something strange probably happened here because of the DNS modifications.

If any of those external networking classes returns the hostname with the port 
number already appended, HBase currently does nothing to catch it.  HBase 
instantiates an InetSocketAddress, which doesn't validate the string, and 
passes it to an HServerAddress, which doesn't validate it either.  Maybe the 
solution is not to recover from the error, but at least to throw an exception 
earlier by validating the hostname string in HMaster.regionServerStartup.
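
As a rough illustration of the kind of early check suggested above, here is a 
minimal sketch in Java.  HostnameCheck, hasTrailingPort and requireNoPort are 
hypothetical names, not existing HBase code, and the check assumes the input is 
a DNS name or an IPv4 literal (an IPv6 literal contains colons too and would 
need separate handling).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of HBase.
final class HostnameCheck {
    // Matches a trailing ":<digits>", e.g. "host.example:60020".
    // Assumption: callers pass DNS names or IPv4 literals. IPv6 literals
    // such as "::1" also end in ":<digits>" and are not handled here.
    private static final Pattern TRAILING_PORT = Pattern.compile(":\\d+$");

    private HostnameCheck() {}

    // True if the string looks like "hostname:port" rather than a bare hostname.
    static boolean hasTrailingPort(String hostname) {
        return TRAILING_PORT.matcher(hostname).find();
    }

    // Returns the hostname unchanged, or fails fast with a clear message.
    static String requireNoPort(String hostname) {
        if (hostname == null || hostname.isEmpty()) {
            throw new IllegalArgumentException("Hostname is null or empty");
        }
        if (hasTrailingPort(hostname)) {
            throw new IllegalArgumentException(
                "Hostname already carries a port: " + hostname);
        }
        return hostname;
    }
}
```

If something like requireNoPort were called on the hostname in 
HMaster.regionServerStartup before the InetSocketAddress is built, a value such 
as HadoopNode98.hotpads.srv:60020 would presumably fail on the master with an 
explicit message, instead of surfacing later on the regionserver as a DNS 
resolution error.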

> HMaster sends duplicate ports to regionserver in HServerAddress
> ---------------------------------------------------------------
>
>                 Key: HBASE-3425
>                 URL: https://issues.apache.org/jira/browse/HBASE-3425
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.0
>            Reporter: Matt Corgan
>             Fix For: 0.90.1
>
>         Attachments: HBASE-3425[0.90.0].patch
>
>
> On regionserver startup, the regionserver receives an HServerAddress from the 
> master as a Writable.  It's a string hostname and an integer port.  Our 
> master is also appending the port to the string, so when they are 
> concatenated it becomes hadoopnode98:60020:60020 and the HServerAddress 
> cannot be instantiated.  
> This should probably be fixed in the master as well, but I don't know where 
> it happens.  The attached patch handles it in the regionserver.
> Regionserver startup log:
> 2011-01-06 15:55:48,813 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at hadoopmaster.hotpads.srv:60000
> 2011-01-06 15:55:48,857 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at hadoopmaster.hotpads.srv:60000 that we are up
> 2011-01-06 15:55:48,910 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.regionserver.address=HadoopNode98.hotpads.srv:60020
> 2011-01-06 15:55:48,910 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://hadoopmaster.hotpads.srv:54310/hbase
> 2011-01-06 15:55:48,910 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://hadoopmaster.hotpads.srv:54310/hbase
> 2011-01-06 15:55:48,945 ERROR org.apache.hadoop.hbase.HServerAddress: Could not resolve the DNS name of HadoopNode98.hotpads.srv:60020:60020
> 2011-01-06 15:55:48,945 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Failed initialization
> 2011-01-06 15:55:48,947 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init
> java.lang.IllegalArgumentException: Could not resolve the DNS name of HadoopNode98.hotpads.srv:60020:60020
>         at org.apache.hadoop.hbase.HServerAddress.checkBindAddressCanBeResolved(HServerAddress.java:105)
>         at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:76)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:798)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1394)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:522)
>         at java.lang.Thread.run(Thread.java:619)

