[jira] [Created] (HBASE-14958) regionserver.HRegionServer: Master passed us a different hostname to use; was=n04docker2, but now=192.168.3.114

2015-12-09 Thread Yong Zheng (JIRA)
Yong Zheng created HBASE-14958:
--

 Summary: regionserver.HRegionServer: Master passed us a different 
hostname to use; was=n04docker2, but now=192.168.3.114
 Key: HBASE-14958
 URL: https://issues.apache.org/jira/browse/HBASE-14958
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: physical machines: redhat7.1
docker version: 1.9.1
Reporter: Yong Zheng


I have two physical machines: c3m3n03docker and c3m3n04docker.
I started two Docker containers on each physical node. The topology is:

n03docker1(172.17.1.2) -\
                         | br0(172.17.1.1)  +  c3m3n03
n03docker2(172.17.1.3) -/


n04docker1(172.17.2.2) -\
                         | br0(172.17.2.1)  +  c3m3n04
n04docker2(172.17.2.3) -/

On the physical machines, c3m3n03 uses physical adapter enp11s0f0 with IP
192.168.3.113/16, and c3m3n04 uses physical adapter enp11s0f0 with IP
192.168.3.114/16. Both physical adapters are connected to the same switch.

Note: br0 is not attached to the physical adapter enp11s0f0 on either node, so
all requests from 172.17.2.x are source-NATed to 192.168.3.114 (c3m3n04) and
forwarded to c3m3n03.
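
The NAT behaviour described in the note can be modelled with a small C sketch.
It assumes /24 bridge subnets (the report does not state the mask) and uses a
hypothetical helper, snat_source. It only illustrates the rule and is not
Docker's or iptables' actual code.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of Docker or HBase.
 * Writes into out the source address a peer would see for a packet sent
 * from container_ip to dest_ip. Assumption: the bridge subnet is /24 and
 * traffic leaving that subnet is source-NATed to host_ip.
 * Returns 0 on success, -1 if an address cannot be parsed. */
static int snat_source(const char *container_ip, const char *dest_ip,
                       const char *host_ip, char *out, size_t out_len)
{
    struct in_addr src, dst;
    const uint32_t mask = 0xffffff00u; /* assumed /24, host byte order */

    assert(out != NULL && out_len >= INET_ADDRSTRLEN);
    if (inet_pton(AF_INET, container_ip, &src) != 1 ||
        inet_pton(AF_INET, dest_ip, &dst) != 1)
        return -1;

    /* Same bridge subnet: the packet stays on br0, so no rewrite. */
    if ((ntohl(src.s_addr) & mask) == (ntohl(dst.s_addr) & mask))
        strncpy(out, container_ip, out_len - 1);
    else
        strncpy(out, host_ip, out_len - 1);
    out[out_len - 1] = '\0';
    return 0;
}
```

With the addresses from this report, a packet from 172.17.2.3 to the master at
172.17.1.2 would carry 192.168.3.114 as its source, while traffic between
n04docker1 and n04docker2 would keep the container address.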

n03docker1: hbase(1.1.2) master
n03docker2: region server
n04docker1: region server
n04docker2: region server

I first start n03docker1 and n03docker2, and that works. After that, I start 
n04docker1, and it reports:

2015-12-09 08:01:58,259 ERROR 
[regionserver/n04docker2.gpfs.net/172.17.2.3:16020] regionserver.HRegionServer: 
Master passed us a different hostname to use; was=n04docker2.gpfs.net, but 
now=192.168.3.114

In the master log:
2015-12-09 08:11:12,234 INFO  [PriorityRpcServer.handler=0,queue=0,port=16000] 
master.ServerManager: Registering server=192.168.3.114,16020,144970721

So when the HBase master receives requests from n04docker1, those requests
have already been source-NATed to 192.168.3.114 (not 172.17.2.2). The master
then passes 192.168.3.114 back to 172.17.2.2 (n04docker1), and n04docker1
(172.17.2.2) logs the error shown above.

Does HBase not support running in a virtualized cluster? SNAT is widely used
in virtualization. If the HBase master takes the remote hostname/IP of the
connection (here 192.168.3.114) and passes it back to the region server, it
will hit this issue.

HBASE-8667 doesn't fix this issue, because that fix is already in HBase 0.98
(I'm using HBase 1.1.2).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14958) regionserver.HRegionServer: Master passed us a different hostname to use; was=n04docker2, but now=192.168.3.114

2015-12-09 Thread Yong Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Zheng updated HBASE-14958:
---
Description: 
I have two physical machines: c3m3n03docker and c3m3n04docker.
I started two Docker containers on each physical node. The topology is:

n03docker1(172.17.1.2) -\
                         | br0(172.17.1.1)  +  c3m3n03
n03docker2(172.17.1.3) -/


n04docker1(172.17.2.2) -\
                         | br0(172.17.2.1)  +  c3m3n04
n04docker2(172.17.2.3) -/

On the physical machines, c3m3n03 uses physical adapter enp11s0f0 with IP
192.168.3.113/16, and c3m3n04 uses physical adapter enp11s0f0 with IP
192.168.3.114/16. Both physical adapters are connected to the same switch.

Note: br0 is not attached to the physical adapter enp11s0f0 on either node, so
all requests from 172.17.2.x are source-NATed to 192.168.3.114 (c3m3n04) and
forwarded to c3m3n03.

n03docker1: hbase(1.1.2) master
n03docker2: region server
n04docker1: region server
n04docker2: region server

I first start n03docker1 and n03docker2, and that works. After that, I start 
n04docker2, and it reports:

2015-12-09 08:01:58,259 ERROR 
[regionserver/n04docker2.gpfs.net/172.17.2.3:16020] regionserver.HRegionServer: 
Master passed us a different hostname to use; was=n04docker2.gpfs.net, but 
now=192.168.3.114

In the master log:
2015-12-09 08:11:12,234 INFO  [PriorityRpcServer.handler=0,queue=0,port=16000] 
master.ServerManager: Registering server=192.168.3.114,16020,144970721

So when the HBase master receives requests from n04docker2, those requests
have already been source-NATed to 192.168.3.114 (not 172.17.2.3). The master
then passes 192.168.3.114 back to 172.17.2.3 (n04docker2), and n04docker2
(172.17.2.3) logs the error shown above.

Does HBase not support running in a virtualized cluster? SNAT is widely used
in virtualization. If the HBase master takes the remote hostname/IP of the
connection (here 192.168.3.114) and passes it back to the region server, it
will hit this issue.

HBASE-8667 doesn't fix this issue, because that fix is already in HBase 0.98
(I'm using HBase 1.1.2).


[jira] [Commented] (HBASE-14958) regionserver.HRegionServer: Master passed us a different hostname to use; was=n04docker2, but now=192.168.3.114

2015-12-09 Thread Yong Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049753#comment-15049753
 ] 

Yong Zheng commented on HBASE-14958:


Thanks to Nick for the prompt response.

After checking the prerequisites, DNS cannot solve this issue.

My virtualized HBase cluster has only 4 nodes:
n03docker1(172.17.1.2)
n03docker2(172.17.1.3)

n04docker1(172.17.2.2)
n04docker2(172.17.2.3)

DNS is not configured but I configured /etc/hosts:
# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

172.17.1.1   c3m3n03docker.gpfs.net c3m3n03docker   <== the br0 on the physical node c3m3n03
172.17.2.1   c3m3n04docker.gpfs.net c3m3n04docker   <== the br0 on the physical node c3m3n04

172.17.1.2   n03docker1.gpfs.net n03docker1
172.17.1.3   n03docker2.gpfs.net n03docker2
172.17.2.2   n04docker1.gpfs.net n04docker1
172.17.2.3   n04docker2.gpfs.net n04docker2

So name resolution works (I do see the correct names for n03docker1 and
n03docker2). However, for region servers located on the other physical
machine, all network packets from those region servers are source-NATed to the
IP of c3m3n04 (192.168.3.114). That is, the source IP of every packet is
rewritten to 192.168.3.114 so that the packets can be delivered to the
physical node c3m3n03.

To the HBase master, 192.168.3.113 and 192.168.3.114 are invisible, so DNS
resolution for 192.168.3.114 inside the VM doesn't help. For example, the
hostname for 192.168.3.114 should be c3m3n04, not n04docker1 or n04docker2.
If we configure DNS inside the VM to map 192.168.3.114 to n04docker1 or
n04docker2, it will mess up the IP-to-hostname mapping inside the VM. Also, if
we map 192.168.3.114 to n04docker1, we can't start a second region server on
the same physical node, because both region servers would be identified by the
physical node's IP address/hostname.



[jira] [Commented] (HBASE-14958) regionserver.HRegionServer: Master passed us a different hostname to use; was=n04docker2, but now=192.168.3.114

2015-12-09 Thread Yong Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049930#comment-15049930
 ] 

Yong Zheng commented on HBASE-14958:


I ran a simple test with tcp_server on n03docker2 (172.17.1.3) and tcp_client
on n04docker2 (172.17.2.3).

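In tcp_server:
...
    sin_size = sizeof(struct sockaddr_in);
    if ((new_fd = accept(sockfd, (struct sockaddr *)(&client_addr),
                         &sin_size)) == -1)
    {
        fprintf(stderr, "Accept   error:%s\n\a", strerror(errno));
        exit(1);
    }
    /* s_addr is in network byte order; %x prints it as a raw hex word */
    fprintf(stderr, "Server   get   connection   from   %x\n",
            client_addr.sin_addr.s_addr);

    /* Query the connected socket (new_fd), not the listening socket.
     * The prefix of the variable name was lost in the archive; client_
     * is assumed. */
    sin_size = sizeof(struct sockaddr_in);
    ret = getpeername(new_fd, (struct sockaddr *)(&client_peer_addr),
                      &sin_size);
...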

tcp_client just connects to the server and sends one message.

bash-4.1# hostname
n04docker2
bash-4.1# ./tcp_client 172.17.1.3 8030

on tcp_server,
bash-4.1# hostname
n03docker2.gpfs.net
bash-4.1# ./tcp_server 8030
will accepting...
Server   get   connection ...
Server   get   connection   from   7203a8c0 <== this IP address is 
192.168.3.114 (bytes c0 a8 03 72 in network order).
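
The conversion of the printed word can be checked with a short sketch. It
assumes the value was printed with %x on a little-endian host, so the lowest
byte of the word is the first octet of the address. hex_word_to_dotted is a
hypothetical helper name, not something from tcp_server.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats an s_addr value, as printed with %x on a
 * little-endian host, as a dotted-quad string. On such a host the least
 * significant byte of the word is the first octet of the address. */
static void hex_word_to_dotted(uint32_t word, char *out, size_t out_len)
{
    assert(out != NULL && out_len >= 16); /* "255.255.255.255" plus NUL */
    snprintf(out, out_len, "%u.%u.%u.%u",
             (unsigned)(word & 0xffu),
             (unsigned)((word >> 8) & 0xffu),
             (unsigned)((word >> 16) & 0xffu),
             (unsigned)((word >> 24) & 0xffu));
}
```

Calling hex_word_to_dotted(0x7203a8c0u, buf, sizeof buf) with a 16-byte buf
fills buf with the address the server saw.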

So when source NAT is involved in virtualization, it looks to me like the
current HBase master/region server mechanism doesn't work. Maybe the region
server and master could exchange the hostname explicitly, rather than
depending on the socket API to get the client IP address.
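
One way to picture this proposal: the region server reports its own hostname
when it registers, and the master prefers that name over the address seen on
the socket. The sketch below is only a model under that assumption.
choose_registered_name and hostname_accepted are hypothetical names, not HBase
APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not HBase code.
 * Master side: pick the name to register a region server under. Prefer the
 * hostname the server reported about itself; fall back to the peer address
 * taken from the socket, which SNAT may have rewritten. */
static const char *choose_registered_name(const char *reported_hostname,
                                          const char *socket_peer_addr)
{
    assert(socket_peer_addr != NULL);
    if (reported_hostname != NULL && reported_hostname[0] != '\0')
        return reported_hostname;
    return socket_peer_addr;
}

/* Region server side: models the comparison behind the "Master passed us a
 * different hostname to use" error. Returns 1 when the name handed back by
 * the master matches the server's own hostname, 0 otherwise. */
static int hostname_accepted(const char *own_hostname,
                             const char *name_from_master)
{
    return strcmp(own_hostname, name_from_master) == 0;
}
```

In this model, a region server that reports n04docker2.gpfs.net is registered
under that name even though its packets arrive from 192.168.3.114. Without a
reported name, the master falls back to the NATed address and the check fails
as in the log above.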

