Re:Re: [Linux-ha-dev] Why heartbeat(2.0.4) binds to all interface?

Shuxun Cao Wed, 05 Apr 2006 02:33:43 -0700

 >From: Alan Robertson 
 >To: High-Availability Linux Development List 
 >Subject: Re: [Linux-ha-dev] Why heartbeat(2.0.4) binds to all interface?
 >Sent: Wed Apr 05 04:49:44 CST 2006

Many thanks.

>
 >Cao Shuxun wrote:
 >> Here is my puzzle.
 >> 
 >> I installed hb2.0.4 on my Slackware 10, and it worked well.
 >> But I found that its master process was bound to all of my host's interfaces.
 >> 
 >> like:
 >> netstat -npl | grep heartbeat
 >> udp 0 0 172.30.31.75:695 0.0.0.0:* 29007/heartbeat: ma 
 >> udp 0 0 0.0.0.0:33906 0.0.0.0:* 29007/heartbeat: ma 
 >
 >Heartbeat uses the SO_BINDTODEVICE option to keep it from binding to all
 >interfaces. Perhaps netstat doesn't read out that option correctly?

So, I used nmap to verify this:

[EMAIL PROTECTED]:~# nmap -sU -v -p 32800,694  172.30.31.79 172.30.31.77

Starting nmap 3.75 ( http://www.insecure.org/nmap/ ) at 2006-04-05 11:40 CST
Initiating UDP Scan against 2 hosts [2 ports/host] at 11:41
Completed UDP Scan against 172.30.31.79 in 1.24s (1 host left)
The UDP Scan took 1.24s to scan 4 total ports.
Host 172.30.31.79 appears to be up ... good.
Interesting ports on 172.30.31.79:
PORT      STATE         SERVICE
694/udp   open|filtered unknown
32800/udp open|filtered unknown

Host 172.30.31.77 appears to be up ... good.
Interesting ports on 172.30.31.77:
PORT      STATE         SERVICE
694/udp   closed        unknown
32800/udp open|filtered unknown

Nmap run completed -- 2 IP addresses (2 hosts up) scanned in 21.384 seconds

Here is "netstat -npl | grep heartbeat" on this node:
[EMAIL PROTECTED]:/home/p2pnetwork/heartbeat-2.0.2/lib/plugins/HBcomm# netstat -npl | grep heartbeat
udp    14016      0 0.0.0.0:32800           0.0.0.0:*                           20430/heartbeat: ma 
                ~~~~~~~~~~~~~~~~~~~
udp        0      0 172.30.31.79:694        0.0.0.0:*                           20430/heartbeat: ma 
(by the way, I changed "INADDR_ANY" to "172.30.31.79" in ucast.c)

Here is "ip addr show" on this node:
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:07:e9:24:a4:7b brd ff:ff:ff:ff:ff:ff
    inet 172.30.31.79/24 brd 172.30.31.255 scope global eth0
        ~~~~~~~~~~~~~~~~
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:07:e9:24:a4:7a brd ff:ff:ff:ff:ff:ff
    inet 172.30.31.77/24 brd 172.30.31.255 scope global eth1:0
        ~~~~~~~~~~~~~~~~~~~~
The netstat version is netstat 1.42 (2001-04-15).
Everything seems to be OK, with one exception: the heartbeat master process
still holds a socket bound to all interfaces (the 0.0.0.0:32800 entry above).
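
For what it's worth, I believe this is what the kernel does for any UDP socket
that sends without a prior bind(): it picks a free port at the first sendto()
and leaves the local address as the wildcard. Below is a minimal sketch of that
behaviour, not heartbeat code. The function name autobound_local_addr and the
destination 127.0.0.1 port 9 are only my assumptions for the demonstration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send one datagram on a UDP socket that was never bind()ed, then ask the
 * kernel which local address it assigned. Returns 0 on success, -1 on error.
 * The destination (127.0.0.1, port 9) is an arbitrary choice for the demo.
 */
static int autobound_local_addr(struct sockaddr_in *local)
{
    struct sockaddr_in dst;
    socklen_t len = sizeof(*local);
    int fd, rc = -1;

    assert(local != NULL);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9);
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* No bind() before this call: the kernel picks the source port now. */
    if (sendto(fd, "x", 1, 0, (struct sockaddr *)&dst, sizeof(dst)) == 1
        && getsockname(fd, (struct sockaddr *)local, &len) == 0)
        rc = 0;

    close(fd);
    return rc;
}
```

On Linux I would expect the reported address to be 0.0.0.0 with a
kernel-chosen port, which has the same shape as the 0.0.0.0:32800 line in the
netstat output.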

 >
 >The code in question is below. All Linux kernels since 2.2.x have
 >supported this option. You can even activate the debug shown below to
 >see if it's really being executed...
 >
 >#if defined(SO_BINDTODEVICE)
 >    {
 >        /*
 >         * We want to send out this particular interface
 >         *
 >         * This is so we can have redundant NICs, and heartbeat on both
 >         */
 >        struct ifreq i;
 >        strcpy(i.ifr_name, mp->name);
 >
 >        if (setsockopt(sockfd, SOL_SOCKET, SO_BINDTODEVICE
 >        ,   (const void *) &i, sizeof(i)) == -1) {
 >            PILCallLog(LOG, PIL_CRIT
 >            ,   "Error setting socket option SO_BINDTODEVICE"
 >                ": %s"
 >            ,   strerror(errno));
 >            close(sockfd);
 >            return(-1);
 >        }
 >
 >        if (DEBUGPKT) {
 >            PILCallLog(LOG, PIL_DEBUG
 >            ,   "bcast_make_send_sock: Modified %d"
 >                " Added option SO_BINDTODEVICE."
 >            ,   sockfd);
 >        }
 >
 >    }
 >#endif
 >
So I dug into the code again, and this time, with some luck, I found the cause.
In ucast.c, in the function static int HB_make_send_sock(struct hb_media *mp)
(line 497), we call setsockopt(sockfd, SOL_SOCKET, SO_BINDTODEVICE, &i,
sizeof(i)), then fcntl(sockfd, F_SETFD, FD_CLOEXEC), and then "return sockfd".
There is no bind().

But the next function, static int HB_make_receive_sock(struct hb_media *mp)
(line 587), calls setsockopt(sockfd, SOL_SOCKET, SO_BINDTODEVICE, &i, sizeof(i))
and then makes an important call: bind(sockfd, (struct sockaddr *)&my_addr,
sizeof(struct sockaddr)). (This is what lets us specify the port!)
After that it calls fcntl(sockfd, F_SETFD, FD_CLOEXEC) and then "return sockfd".
Here there is a bind().

So I am confused. What is the purpose of this? We bind to an interface, but we
also need at least some control over that interface, such as the IP and port.
I know that if you want to bind to an IP address that is not up when the
process starts, you have to bind to INADDR_ANY. However, if we could do better
than that, it would be great!
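
To make the idea concrete, here is a minimal sketch of a send socket that is
bound to one local address before it is used. This is not the code in ucast.c;
the function name make_bound_send_sock and its parameters are hypothetical, and
I leave out SO_BINDTODEVICE because it normally needs extra privileges.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UDP socket bound to one local IPv4 address.
 * local_port == 0 lets the kernel choose the port, but the address is fixed.
 * Returns the socket descriptor, or -1 on error.
 */
static int make_bound_send_sock(const char *local_ip, unsigned short local_port)
{
    struct sockaddr_in my_addr;
    int fd;

    assert(local_ip != NULL);

    memset(&my_addr, 0, sizeof(my_addr));
    my_addr.sin_family = AF_INET;
    my_addr.sin_port = htons(local_port);
    if (inet_pton(AF_INET, local_ip, &my_addr.sin_addr) != 1)
        return -1;          /* not a valid dotted-quad address */

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)&my_addr, sizeof(my_addr)) == -1) {
        close(fd);          /* e.g. the address is not configured on this host */
        return -1;
    }
    return fd;
}
```

With something like make_bound_send_sock("172.30.31.79", 0), I would expect
netstat to list the socket under 172.30.31.79 instead of 0.0.0.0. The price is
the one mentioned above: the bind() fails if that address is not up yet.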

I could not find any hint about this in the documentation shipped with the
source package.

My guess is that whoever last worked on this file forgot to finish it. (Maybe
I am wrong, and there is a good reason for it that I don't know about yet.)

If we could add an option to ha.cf, or simply use udpport+1 for sending
packets, that would be friendlier to people like me. ^_^
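
As a rough sketch of the udpport+1 variant (only my suggestion, nothing that
exists in heartbeat today), the send socket could be bound like this. The
function name bind_to_next_port and its parameters are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Bind an existing UDP socket to local_ip on port udpport + 1.
 * Returns 0 on success, -1 on error (bad address, no room for +1,
 * or the port is unavailable).
 */
static int bind_to_next_port(int fd, const char *local_ip, unsigned short udpport)
{
    struct sockaddr_in my_addr;

    assert(local_ip != NULL);

    if (udpport == 65535)
        return -1;          /* udpport + 1 would not fit in a 16-bit port */

    memset(&my_addr, 0, sizeof(my_addr));
    my_addr.sin_family = AF_INET;
    my_addr.sin_port = htons((unsigned short)(udpport + 1));
    if (inet_pton(AF_INET, local_ip, &my_addr.sin_addr) != 1)
        return -1;          /* not a valid dotted-quad address */

    return bind(fd, (struct sockaddr *)&my_addr, sizeof(my_addr));
}
```

With udpport 694 the send socket would then sit on port 695, so a firewall
rule or a port scan would see one predictable extra port instead of a random
one.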

Thanks a lot!

 >
 >-- 
 >Alan Robertson 
 >
 >"Openness is the foundation and preservative of friendship... Let me
 >claim from you at all times your undisguised opinions." - William
 >Wilberforce
 >_______________________________________________________
 >Linux-HA-Dev: [email protected]
 >http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
 >Home Page: http://linux-ha.org/
