[ 
https://issues.apache.org/jira/browse/GEODE-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15377423#comment-15377423
 ] 

Kevin Duling edited comment on GEODE-746 at 7/14/16 10:45 PM:
--------------------------------------------------------------

Grace and I tracked the first part of this down to a problem in 
{{LauncherLifecycleCommands}}:
{{String locatorHostName = 
StringUtils.defaultIfBlank(locatorLauncher.getHostnameForClients(), 
getLocalHost());}}
We've changed this to look instead at the bind address first:
{code}
        String locatorHostName;
        InetAddress bindAddr = locatorLauncher.getBindAddress();
        if (bindAddr != null) {
          locatorHostName = bindAddr.getCanonicalHostName();
        } else {
          locatorHostName = StringUtils.defaultIfBlank(
              locatorLauncher.getHostnameForClients(), getLocalHost());
        }
{code}
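
For anyone who wants to try the fallback order outside of gfsh, here is a minimal standalone sketch. The class name {{LocatorHostNameSketch}} and the helper {{resolveLocatorHostName}} are hypothetical and not part of Geode. It also returns {{getHostAddress()}} instead of {{getCanonicalHostName()}} so that the result does not depend on reverse DNS, which is a deliberate difference from the change above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class LocatorHostNameSketch {

  /**
   * Hypothetical helper mirroring the order described above:
   * bind address first, then hostname-for-clients, then the local host name.
   */
  public static String resolveLocatorHostName(InetAddress bindAddress,
                                              String hostnameForClients,
                                              String localHost) {
    if (bindAddress != null) {
      // Assumption: use the literal address to keep the sketch deterministic;
      // the actual change calls getCanonicalHostName().
      return bindAddress.getHostAddress();
    }
    boolean blank = hostnameForClients == null || hostnameForClients.trim().isEmpty();
    return blank ? localHost : hostnameForClients;
  }

  public static void main(String[] args) throws UnknownHostException {
    // An IP literal is parsed directly, so no DNS lookup happens here.
    InetAddress bind = InetAddress.getByName("192.168.1.187");
    System.out.println(resolveLocatorHostName(bind, null, "localhost"));
    System.out.println(resolveLocatorHostName(null, "  ", "localhost"));
  }
}
```

The point of the ordering is that an explicit {{--bind-address}} always wins over whatever the local host name happens to resolve to.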

This resolved the problem.  The system will now connect: {{gfsh start locator 
--name=locator1 --port=19991 --bind-address=192.168.1.187}}
{noformat}
Listening for transport dt_socket at address: 30000
...............
Locator in /gemfire/open/locator1 on 192.168.1.187[19991] as locator1 is 
currently online.
Process ID: 2765
Uptime: 1 minute 23 seconds
GemFire Version: 1.0.0-incubating-SNAPSHOT
Java Version: 1.8.0_92
Log File: /gemfire/open/locator1/locator1.log
JVM Arguments: -Dgemfire.enable-cluster-configuration=true 
-Dgemfire.load-cluster-configuration-from-dir=false 
-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=29999 
-Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true 
-Dsun.rmi.dgc.server.gcInterval=9223372036854775806
Class-Path: 
/gemfire/open/geode-assembly/build/install/apache-geode/lib/geode-core-1.0.0-incubating-SNAPSHOT.jar:/gemfire/open/geode-assembly/build/install/apache-geode/lib/geode-dependencies.jar

Successfully connected to: [host=pdx2-office-dhcp9.eng.vmware.com, port=1099]

Cluster configuration service is up and running.
{noformat}

The "Successfully connected" message, however, appears to be showing the wrong 
address.  Looking at netstat, we can see that the listener is correctly bound 
to the IP address specified:
{noformat}
$ netstat -an | grep 19991
tcp4       0      0  192.168.1.187.19991    *.*                    LISTEN 
{noformat}
The hostname in the "successfully connected" message resolves to an address on 
a different NIC: {{ping pdx2-office-dhcp9.eng.vmware.com}}
{noformat}
PING pdx2-office-dhcp9.eng.vmware.com (10.118.33.209): 56 data bytes
{noformat}
Both NICs exist on this machine: {{netstat -rn}}
{noformat}
Routing tables

Internet:
Destination        Gateway            Flags        Refs      Use   Netif Expire
default            10.118.33.253      UGSc          360        0     en4
default            192.168.1.253      UGScI          35        0     en0
{noformat}
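
The same mismatch can be checked without ping. This is only a rough sketch: the class and method names are made up for illustration, and {{InetAddress.getByName}} returns just the first address a name resolves to, so a multi-homed hostname may need {{getAllByName}} instead.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressMismatchSketch {

  /**
   * Returns true when both names (or IP literals) resolve to the same address.
   * Returns false if either one cannot be resolved.
   */
  public static boolean resolvesToSameAddress(String bound, String reported) {
    try {
      InetAddress a = InetAddress.getByName(bound);
      InetAddress b = InetAddress.getByName(reported);
      return a.equals(b);
    } catch (UnknownHostException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // The bind address versus the address the reported hostname resolved to above.
    System.out.println(resolvesToSameAddress("192.168.1.187", "10.118.33.209"));
  }
}
```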

Tracing this down, the address is coming from this line in 
{{ShellCommands.connectToLocator(String host, int port, int timeout, 
Map<String, String> props)}}

{code}
JmxManagerLocatorResponse locatorResponse =
    JmxManagerLocatorRequest.send(host, port, timeout, props);
// locatorResponse: "JmxManagerLocatorResponse [host=10.118.33.209, port=1099, ssl=false, ex=null]"
// host: "192.168.1.187"
// port: 19991
// timeout: 15000
// props: size = 0
{code}

So the confusion here is that this is the JMX manager's address, not the 
locator's address.  The formatting of this message leads one to believe it's 
supposed to be the locator.  Yet, if you look at the original response from the 
system, it correctly reports the Locator's address:
{noformat}
Locator in /gemfire/open/locator1 on 192.168.1.187[19991] as locator1 is 
currently online.
{noformat}

I've added JMX to the "successfully connected" message to reduce confusion.
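
To illustrate the idea, a message along these lines would make it clear which endpoint is being reported. The exact wording and the {{ConnectMessageSketch}} class are assumptions for illustration only, not the text that was committed.

```java
class ConnectMessageSketch {

  /**
   * Hypothetical formatter: labels the endpoint as the JMX manager so the
   * host and port are not mistaken for the locator's.
   */
  public static String jmxConnectedMessage(String jmxHost, int jmxPort) {
    return String.format("Successfully connected to: JMX Manager [host=%s, port=%d]",
        jmxHost, jmxPort);
  }

  public static void main(String[] args) {
    System.out.println(jmxConnectedMessage("10.118.33.209", 1099));
  }
}
```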




was (Author: kduling):
Grace and I tracked the first part of this down to a problem in 
{{LauncherLifecycleCommands}}:
{{String locatorHostName = 
StringUtils.defaultIfBlank(locatorLauncher.getHostnameForClients(), 
getLocalHost());}}
We've changed this to look instead at the bind address first:
{code}
        String locatorHostName;
        InetAddress bindAddr = locatorLauncher.getBindAddress();
        if (bindAddr != null) {
          locatorHostName = bindAddr.getCanonicalHostName();
        } else {
          locatorHostName = StringUtils.defaultIfBlank(
              locatorLauncher.getHostnameForClients(), getLocalHost());
        }
{code}

This improved things a little.  The system will now connect: {{gfsh start 
locator --name=locator1 --port=19991 --bind-address=192.168.1.187}}
{noformat}
Listening for transport dt_socket at address: 30000
...............
Locator in /gemfire/open/locator1 on 192.168.1.187[19991] as locator1 is 
currently online.
Process ID: 2765
Uptime: 1 minute 23 seconds
GemFire Version: 1.0.0-incubating-SNAPSHOT
Java Version: 1.8.0_92
Log File: /gemfire/open/locator1/locator1.log
JVM Arguments: -Dgemfire.enable-cluster-configuration=true 
-Dgemfire.load-cluster-configuration-from-dir=false 
-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=29999 
-Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true 
-Dsun.rmi.dgc.server.gcInterval=9223372036854775806
Class-Path: 
/gemfire/open/geode-assembly/build/install/apache-geode/lib/geode-core-1.0.0-incubating-SNAPSHOT.jar:/gemfire/open/geode-assembly/build/install/apache-geode/lib/geode-dependencies.jar

Successfully connected to: [host=pdx2-office-dhcp9.eng.vmware.com, port=1099]

Cluster configuration service is up and running.
{noformat}

But now the successfully connected message is showing the wrong IP address.  
Looking at netstat, we can see that the listener is correctly bound to the IP 
address specified:
{noformat}
$ netstat -an | grep 19991
tcp4       0      0  192.168.1.187.19991    *.*                    LISTEN 
{noformat}
Yet the hostname actually resolves to a different NIC: {{ping 
pdx2-office-dhcp9.eng.vmware.com}}
{noformat}
PING pdx2-office-dhcp9.eng.vmware.com (10.118.33.209): 56 data bytes
{noformat}
Both NICs exist on this machine, just one is being erroneously reported: 
{{netstat -rn}}
{noformat}
Routing tables

Internet:
Destination        Gateway            Flags        Refs      Use   Netif Expire
default            10.118.33.253      UGSc          360        0     en4
default            192.168.1.253      UGScI          35        0     en0
{noformat}

Tracing this down, it appears to be an incorrect response from the locator in 
{{ShellCommands.connectToLocator(String host, int port, int timeout, 
Map<String, String> props)}}

{code}
JmxManagerLocatorResponse locatorResponse = JmxManagerLocatorRequest.send(host, 
port, timeout, props);  
// locatorResponse: “JmxManagerLocatorResponse [host=10.118.33.209, port=1099, 
ssl=false, ex=null]”
// host: “192.168.1.187”
// port: 19991
// timeout: 15000
// props: size = 0
{code}


> When starting a locator using --bind-address, gfsh prints incorrect connect 
> message
> -----------------------------------------------------------------------------------
>
>                 Key: GEODE-746
>                 URL: https://issues.apache.org/jira/browse/GEODE-746
>             Project: Geode
>          Issue Type: Improvement
>          Components: gfsh
>            Reporter: Jens Deppe
>            Assignee: Kevin Duling
>
> When starting my locator with {{gfsh start locator --name=locator1 
> --port=19991 --bind-address=192.168.103.1}}, the output from gfsh looks like 
> this:
> {noformat}
> ..............................
> Locator in /Users/jdeppe/debug/locator1 on 192.168.103.1[19991] as locator1 
> is currently online.
> Process ID: 2666
> Uptime: 15 seconds
> GemFire Version: 8.2.0.Beta
> Java Version: 1.7.0_72
> Log File: /Users/jdeppe/debug/locator1/locator1.log
> JVM Arguments: -Dgemfire.enable-cluster-configuration=true 
> -Dgemfire.load-cluster-configuration-from-dir=false 
> -Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true 
> -Dsun.rmi.dgc.server.gcInterval=9223372036854775806
> Class-Path: 
> /Users/jdeppe/gemfire/82/lib/gemfire.jar:/Users/jdeppe/gemfire/82/lib/locator-dependencies.jar
> Please use "connect --locator=192.168.1.10[19991]" to connect Gfsh to the 
> locator.
> Failed to connect; unknown cause: Connection refused
> {noformat}
> The connect string shown is just displaying my host address and not the bind 
> address.



