All:

I have two test machines, ib0-test and ib1-test, connected directly by a pair of QLogic InfiniPath QLE7140 HCAs. There is no router involved. My goal is to test DHCP provisioning over InfiniBand on this simple topology. Unfortunately, I am unable to get the client to accept the offered address.

I have the DHCP daemon running on ib0-test. I have given the ib0 interface the following configuration:

        DEVICE=ib0
        ONBOOT=yes
        BOOTPROTO=static
        IPADDR=172.10.10.2
        NETMASK=255.255.255.0

If I set the ib0 interface on ib1-test to a static IP similar to this one, everything works well. However, if I try DHCP, it will not accept the offered IP address. This is the configuration on ib1-test:

        DEVICE=ib0
        ONBOOT=yes
        BOOTPROTO=dhcp

If I run tcpdump on both nodes during the DHCP request, I get very similar output. On the dhcpd host, ib0-test, I get:

        18:40:32.982491 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:40:32.984083 arp who-has 172.10.10.158 tell 172.10.10.2 hardware #32
        18:40:33.001142 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309
        18:40:33.984028 arp who-has 172.10.10.158 tell 172.10.10.2 hardware #32
        18:40:34.983974 arp who-has 172.10.10.158 tell 172.10.10.2 hardware #32
        18:40:37.991935 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:40:37.992125 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309
        18:40:52.992217 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:40:52.992428 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309
        18:41:13.991209 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:41:13.991384 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309
        18:41:23.991705 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:41:23.991909 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309

Whereas on the client host, ib1-test, I get:

        18:40:32.991272 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:40:32.992963 arp who-has 172.10.10.158 tell 172.10.10.2 hardware #32
        18:40:33.010013 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309
        18:40:33.992895 arp who-has 172.10.10.158 tell 172.10.10.2 hardware #32
        18:40:34.992846 arp who-has 172.10.10.158 tell 172.10.10.2 hardware #32
        18:40:38.000761 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:40:38.001014 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309
        18:40:53.001075 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:40:53.001363 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309
        18:41:14.000117 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:41:14.000366 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309
        18:41:24.000642 IP ib1-test.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request, length: 300
        18:41:24.000917 IP 172.10.10.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length: 309

There are no substantive differences between these two captures. Having said that, there are differences at the packet level. For brevity's sake I will not insert all of the packet data, but I can offer an example. One packet may have the following data:

        < 0000  00 00 00 20 00 00 00 00 00 00 00 00 00 00 08 00
        > 0000  00 04 00 20 00 00 00 00 00 00 00 00 00 00 08 00

while a subsequent packet may have:

        < 0000  00 04 00 20 00 00 00 00 00 00 00 00 00 00 08 06
        > 0000  00 00 00 20 00 00 00 00 00 00 00 00 00 00 08 06

I will not claim to understand enough about TCP/IP transport to say whether this is normal. From a pattern perspective, it appears to be in line with what I would expect.
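The 16 bytes shown at offset 0000 have the shape of the Linux "cooked capture" (SLL) pseudo-header that libpcap can use when it has no native link-layer type for an interface. This is an assumption, not something the capture itself confirms. Under that assumption, a short Python sketch can decode the variants above; the function name decode_sll and the lookup tables are hypothetical, chosen only for illustration:

```python
import struct

# ASSUMPTION: the 16 bytes are a Linux cooked-capture (SLL) pseudo-header:
# packet type (2), ARPHRD type (2), address length (2), address (8), protocol (2).
SLL_PACKET_TYPES = {
    0: "to us",
    1: "broadcast",
    2: "multicast",
    3: "other host",
    4: "sent by us",
}
PROTOCOLS = {0x0800: "IPv4", 0x0806: "ARP"}


def decode_sll(hex_bytes: str) -> dict:
    """Decode a 16-byte header given as space-separated hex, as in the dump."""
    raw = bytes.fromhex(hex_bytes)  # fromhex skips the spaces between bytes
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    pkt_type, arphrd, addr_len, addr, proto = struct.unpack("!HHH8sH", raw)
    return {
        "packet_type": SLL_PACKET_TYPES.get(pkt_type, pkt_type),
        "arphrd": arphrd,
        "address": addr[:addr_len].hex(),
        "protocol": PROTOCOLS.get(proto, hex(proto)),
    }


if __name__ == "__main__":
    for line in (
        "00 00 00 20 00 00 00 00 00 00 00 00 00 00 08 00",
        "00 04 00 20 00 00 00 00 00 00 00 00 00 00 08 00",
        "00 04 00 20 00 00 00 00 00 00 00 00 00 00 08 06",
    ):
        print(decode_sll(line))
```

If that reading holds, the first two bytes only mark the direction of the packet (received versus sent), which would explain why the two hosts show mirrored values, and 0x0020 is hardware type 32, the same number that appears as "hardware #32" in the ARP lines.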

The tcpdump captures are in line with the output I get in /var/log/messages. From the dhcpd host machine I get:

        Jul 18 17:41:35 ib0-test dhcpd: DHCPDISCOVER from  via ib0
        Jul 18 17:41:36 ib0-test dhcpd: DHCPOFFER on 172.10.10.158 to  via ib0
        Jul 18 17:41:40 ib0-test dhcpd: DHCPDISCOVER from  via ib0
        Jul 18 17:41:40 ib0-test dhcpd: DHCPOFFER on 172.10.10.158 to  via ib0
        Jul 18 17:41:54 ib0-test dhcpd: DHCPDISCOVER from  via ib0
        Jul 18 17:41:54 ib0-test dhcpd: DHCPOFFER on 172.10.10.158 to  via ib0
        Jul 18 17:42:13 ib0-test dhcpd: DHCPDISCOVER from  via ib0
        Jul 18 17:42:13 ib0-test dhcpd: DHCPOFFER on 172.10.10.158 to  via ib0
        Jul 18 17:42:23 ib0-test dhcpd: DHCPDISCOVER from  via ib0
        Jul 18 17:42:23 ib0-test dhcpd: DHCPOFFER on 172.10.10.158 to  via ib0

while the client machine has:

        Jul 18 17:41:35 ib1-test dhclient: Sending on   Socket/fallback
        Jul 18 17:41:35 ib1-test dhclient: DHCPDISCOVER on ib0 to 255.255.255.255 port 67 interval 6
        Jul 18 17:41:41 ib1-test dhclient: DHCPDISCOVER on ib0 to 255.255.255.255 port 67 interval 14
        Jul 18 17:41:54 ib1-test dhclient: DHCPDISCOVER on ib0 to 255.255.255.255 port 67 interval 19
        Jul 18 17:42:14 ib1-test dhclient: DHCPDISCOVER on ib0 to 255.255.255.255 port 67 interval 10
        Jul 18 17:42:24 ib1-test dhclient: DHCPDISCOVER on ib0 to 255.255.255.255 port 67 interval 12
        Jul 18 17:42:35 ib1-test dhclient: No DHCPOFFERS received.

What baffles me is that the offer is made but never accepted. I have tried any number of changes to the dhclient.conf file to avoid rejections, to no avail.
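One detail in the dhcpd log lines is that nothing appears between "from" and "via" (or between "to" and "via"), the position where dhcpd would normally print the client's hardware address. A small Python sketch makes that field explicit; the regular expression is written against the lines quoted in this message only, and the names DHCPD_LINE and parse_dhcpd_lines are hypothetical:

```python
import re

# Written against the dhcpd lines quoted in this message; other dhcpd
# versions may log in a different layout (assumption).
DHCPD_LINE = re.compile(
    r"dhcpd: (DHCP[A-Z]+) (?:on (\S+) to|from) (\S*) via (\S+)"
)


def parse_dhcpd_lines(lines):
    """Return one dict per matching dhcpd log line."""
    parsed = []
    for line in lines:
        match = DHCPD_LINE.search(line)
        if match is None:
            continue  # not a DISCOVER/OFFER style line
        message, offered_ip, client, interface = match.groups()
        parsed.append(
            {
                "message": message,
                "offered_ip": offered_ip,  # None for DHCPDISCOVER
                "client": client,          # empty string if nothing was logged
                "interface": interface,
            }
        )
    return parsed


if __name__ == "__main__":
    sample = [
        "Jul 18 17:41:35 ib0-test dhcpd: DHCPDISCOVER from  via ib0",
        "Jul 18 17:41:36 ib0-test dhcpd: DHCPOFFER on 172.10.10.158 to  via ib0",
    ]
    for entry in parse_dhcpd_lines(sample):
        print(entry)
```

Whether the empty client field is related to the client ignoring the offer is not something these logs establish on their own.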

Finally, I will leave you with the relevant configuration files...

/etc/dhcpd.conf on ib0-test:

        ddns-update-style interim;
        ignore client-updates;

        subnet 172.10.10.0 netmask 255.255.255.0 {
                range dynamic-bootp     172.10.10.100 172.10.10.200;
                option domain-name              "univaud.com";
                option domain-name-servers 192.168.31.10, 10.10.0.12, 10.10.0.13;
                #option routers         192.168.31.1;
                option subnet-mask      255.255.255.0;
                option time-offset      -21600; # Central Standard Time
                option ntp-servers 64.113.32.5, 65.111.164.223, 72.52.190.26;
                default-lease-time 21600;
                max-lease-time 43200;
        }

/etc/dhclient.conf on ib1-test:

        interface "ib0" {
                send dhcp-client-identifier 20:00:55:00:01:FE:80:00:00:00:00:00:00:00:11:75:00:00:ff:94:fd;
        }

I created the client identifier from the following information:

        20:<4-byte QP number><8-byte subnet prefix><8-byte GUID>.

This seems to be in line with the patch that we applied to the DHCP software to make it work with InfiniBand.
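As a cross-check on the byte layout, here is a minimal Python sketch that assembles the identifier from its three parts. The function name build_client_identifier is hypothetical, and the way the bytes are split into QP number, subnet prefix and GUID below is simply a reading of the identifier in the dhclient.conf above against the format just described:

```python
def build_client_identifier(qpn_field: bytes, subnet_prefix: bytes, guid: bytes) -> str:
    """Join 0x20 + 4-byte QP number + 8-byte subnet prefix + 8-byte GUID
    into a colon-separated hex string, as used in dhclient.conf."""
    if len(qpn_field) != 4:
        raise ValueError("QP number field must be 4 bytes")
    if len(subnet_prefix) != 8:
        raise ValueError("subnet prefix must be 8 bytes")
    if len(guid) != 8:
        raise ValueError("GUID must be 8 bytes")
    identifier = bytes([0x20]) + qpn_field + subnet_prefix + guid
    return ":".join(f"{byte:02x}" for byte in identifier)


if __name__ == "__main__":
    # Values split out of the identifier shown in dhclient.conf (assumed split).
    qpn_field = bytes.fromhex("00550001")
    subnet_prefix = bytes.fromhex("fe80000000000000")
    guid = bytes.fromhex("0011750000ff94fd")
    print(build_client_identifier(qpn_field, subnet_prefix, guid))
```

The result is 21 bytes long (1 + 4 + 8 + 8), which matches the length of the identifier in the configuration.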

Lastly, an excerpt from /etc/modules.conf on ib1-test:

        alias ib0 ib_ipoib

I tried to use ipath_ether, but QLogic's website only has the RHEL4 release (despite it carrying a RHEL5 label).

Cheers!!!

--
Roderick Flores
Solutions Architect
Univa UD
[EMAIL PROTECTED]
