>From: Alan Robertson
>To: High-Availability Linux Development List
>Subject: Re: [Linux-ha-dev] Why heartbeat(2.0.4) binds to all interface?
>Sent: Wed Apr 05 04:49:44 CST 2006
Many thanks.

>
>Cao Shuxun wrote:
>> Here is my puzzledom.
>>
>> I installed hb2.0.4 on my slackware 10. And it worked well.
>> But I found its master process bound to all my host's interface.
>>
>> like:
>> netstat -npl | grep heartbeat
>> udp        0      0 172.30.31.75:695        0.0.0.0:*    29007/heartbeat: ma
>> udp        0      0 0.0.0.0:33906           0.0.0.0:*    29007/heartbeat: ma
>
>Heartbeat uses the SO_BINDTODEVICE option to keep it from binding to all
>interfaces. Perhaps netstat doesn't read out that option correctly?

So, I used nmap to verify this:

[EMAIL PROTECTED]:~# nmap -sU -v -p 32800,694 172.30.31.79 172.30.31.77

Starting nmap 3.75 ( http://www.insecure.org/nmap/ ) at 2006-04-05 11:40 CST
Initiating UDP Scan against 2 hosts [2 ports/host] at 11:41
Completed UDP Scan against 172.30.31.79 in 1.24s (1 host left)
The UDP Scan took 1.24s to scan 4 total ports.
Host 172.30.31.79 appears to be up ... good.
Interesting ports on 172.30.31.79:
PORT      STATE         SERVICE
694/udp   open|filtered unknown
32800/udp open|filtered unknown

Host 172.30.31.77 appears to be up ... good.
Interesting ports on 172.30.31.77:
PORT      STATE         SERVICE
694/udp   closed        unknown
32800/udp open|filtered unknown

Nmap run completed -- 2 IP addresses (2 hosts up) scanned in 21.384 seconds

Here is "netstat -npl | grep heartbeat" on this node:

[EMAIL PROTECTED]:/home/p2pnetwork/heartbeat-2.0.2/lib/plugins/HBcomm# netstat -npl | grep heartbeat
udp    14016      0 0.0.0.0:32800           0.0.0.0:*    20430/heartbeat: ma
                    ~~~~~~~~~~~~~~~~~~~
udp        0      0 172.30.31.79:694        0.0.0.0:*    20430/heartbeat: ma

(By the way, I changed "INADDR_ANY" to "172.30.31.79" in ucast.c.)

Here is "ip addr show" on this node:

1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:07:e9:24:a4:7b brd ff:ff:ff:ff:ff:ff
    inet 172.30.31.79/24 brd 172.30.31.255 scope global eth0
         ~~~~~~~~~~~~~~~~
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:07:e9:24:a4:7a brd ff:ff:ff:ff:ff:ff
    inet 172.30.31.77/24 brd 172.30.31.255 scope global eth1:0
         ~~~~~~~~~~~~~~~~~~~~

The netstat version is: netstat 1.42 (2001-04-15).

Everything else seems to be fine, with one exception: the hb master still holds all interfaces.

>
>The code in question is below. All Linux kernels since 2.2.x have
>supported this option. You can even activate the debug shown below to
>see if it's really being executed...
>
>#if defined(SO_BINDTODEVICE)
>    {
>        /*
>         * We want to send out this particular interface
>         *
>         * This is so we can have redundant NICs, and heartbeat on both
>         */
>        struct ifreq i;
>        strcpy(i.ifr_name, mp->name);
>
>        if (setsockopt(sockfd, SOL_SOCKET, SO_BINDTODEVICE
>        ,   (const void *) &i, sizeof(i)) == -1) {
>            PILCallLog(LOG, PIL_CRIT
>            ,   "Error setting socket option SO_BINDTODEVICE"
>            ": %s"
>            ,   strerror(errno));
>            close(sockfd);
>            return(-1);
>        }
>
>        if (DEBUGPKT) {
>            PILCallLog(LOG, PIL_DEBUG
>            ,   "bcast_make_send_sock: Modified %d"
>            " Added option SO_BINDTODEVICE."
>            ,   sockfd);
>        }
>
>    }
>#endif
>

So, I hacked the code again, and with some luck this time, I found the key.

In ucast.c:

// function
static int HB_make_send_sock(struct hb_media *mp)
L497: we call setsockopt(sockfd, SOL_SOCKET, SO_BINDTODEVICE, &i, sizeof(i)),
then fcntl(sockfd, F_SETFD, FD_CLOEXEC), and then "return sockfd".
There is no bind.

// But the next function
static int HB_make_receive_sock(struct hb_media *mp)
L587: we call setsockopt(sockfd, SOL_SOCKET, SO_BINDTODEVICE, &i, sizeof(i)),
and then make an important call:
"bind(sockfd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr))"
(this lets us specify the port!). After that we call
fcntl(sockfd, F_SETFD, FD_CLOEXEC), and then "return sockfd".
Here there is a bind.

So, I was confused. What is the purpose? We bind to an interface, but we
also need at least some control over that interface, such as the IP address
and port. I know that if you want to bind to an IP address that isn't up
when the process starts, you have to bind to INADDR_ANY. However, if we
could do more than that, it would be even better! I didn't find a clue in
the documentation shipped with the source package.

For now, I guess that whoever hacked this file forgot to complete it.
(Maybe I am wrong; there may be good reasons that I don't know about yet.)

If we could add an argument to ha.cf, or use udpport+1 directly for sending
packets, that would be friendlier to people like me. ^_^

Thanks a lot!

>
>--
>Alan Robertson
>
>"Openness is the foundation and preservative of friendship... Let me
>claim from you at all times your undisguised opinions." - William
>Wilberforce
>_______________________________________________________
>Linux-HA-Dev: [email protected]
>http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>Home Page: http://linux-ha.org/
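A minimal sketch of the send-socket versus receive-socket difference discussed above, in plain C and independent of the heartbeat sources. It only illustrates the general UDP behaviour on Linux: a socket that is never passed to bind() is given an ephemeral port on the wildcard address by the kernel on its first sendto(), while a socket that is explicitly bound keeps the address it was given. The function names (local_addr, make_unbound_send_sock, make_bound_send_sock), the loopback destination and port 9 are all assumptions made for illustration; none of them come from ucast.c. SO_BINDTODEVICE is left out because it normally needs elevated privileges, and as far as I know it does not change the local address a socket reports anyway.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask the kernel which local address and port sockfd currently has. */
int local_addr(int sockfd, struct sockaddr_in *out)
{
    socklen_t len = sizeof(*out);

    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    return getsockname(sockfd, (struct sockaddr *)out, &len);
}

/*
 * Case 1 (hypothetical helper): a send socket with no bind().
 * The first sendto() makes the kernel pick an ephemeral port; the
 * local address stays the wildcard (0.0.0.0).
 * Returns the socket, or -1 on error; *local receives the local name.
 */
int make_unbound_send_sock(struct sockaddr_in *local)
{
    struct sockaddr_in dst;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9);                      /* arbitrary port, assumption */
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* stay on the local host */

    if (sendto(fd, "x", 1, 0, (struct sockaddr *)&dst, sizeof(dst)) < 0
        || local_addr(fd, local) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Case 2 (hypothetical helper): a send socket that is bound before use.
 * ip is a dotted-quad string; port 0 lets the kernel choose the port.
 * Returns the socket, or -1 on error; *local receives the local name.
 */
int make_bound_send_sock(const char *ip, unsigned short port,
                         struct sockaddr_in *local)
{
    struct sockaddr_in me;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&me, 0, sizeof(me));
    me.sin_family = AF_INET;
    me.sin_port = htons(port);

    if (inet_pton(AF_INET, ip, &me.sin_addr) != 1
        || bind(fd, (struct sockaddr *)&me, sizeof(me)) < 0
        || local_addr(fd, local) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Under these assumptions, make_unbound_send_sock(&a) should leave a.sin_addr at 0.0.0.0 with a kernel-chosen port, which has the same shape as the 0.0.0.0:32800 line in the netstat output, while make_bound_send_sock("127.0.0.1", 0, &b) should report 127.0.0.1 as the local address. Passing a fixed port instead of 0 would correspond to the udpport+1 idea.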
