*** From dhcp-server -- To unsubscribe, see the end of this message. ***

I downloaded 2.0b1pl23 on April 1 (a somewhat dubious date to install
and fire up a new product, I guess), and we're discovering a particular
sort of NT RAS packet that causes the DHCP server to die unexpectedly.

I searched the mailing list and found a reference to a patch that makes
the DHCP server "ignore" RAS queries (or something along those lines),
as well as the ubiquitous "go and turn off the offending RAS server"
messages. Still, I feel I should bring this up for two reasons.

1. Since we're in a university environment it's a bit difficult to get
people to see things our way, and there's no guarantee that servers that
are shut off won't be "secretly" turned back on. Plus, there's no
guarantee that people won't be starting new NT servers in the future.
Tracking these things down will take time and we don't want the DHCP
server to be out of commission while we track the guys down.

2. We're running the ISC DHCP server on two servers (with
non-overlapping dynamic address ranges). One machine is running
2.0b1pl23 and is dying because of this killer packet... the other is
running 2.0b1pl6 and isn't complaining at all. In fact, it's pretty
bullet-proof. We're hoping this is some sort of oversight that can be
quickly remedied.

Here are the details:

DHCP server platform ("proxy1"):
Intel Pentium Pro, Red Hat 2.5 kernel 2.0.36, dhcp-2.0b1pl23

# ldd /usr/sbin/dhcpd (pl23)
        libc.so.6 => /lib/libc.so.6 (0x40004000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00000000)

DHCP server platform ("proxy2"):
Intel Pentium Pro, Red Hat 2.5 kernel 2.0.35, dhcp-2.0b1pl6

# ldd /usr/sbin/dhcpd
        libc.so.6 => /lib/libc.so.6 (0x40003000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00000000)



What happens is that 2.0b1pl23 will continue chugging along (though with
much less output than pl6) until it prints out the following line in
syslog and exits:

Apr  3 10:21:42 proxy1 usr/sbin/dhcpd: DHCPREQUEST for 155.246.18.59
from 52:41:53:20:20:8c:0d:58:7c:48:be:01:02:00:00:00 via 155.246.18.1

Yes, yes, I know, RAS. Here's a similar entry from today on our 2.0b1pl6
server, and it's still running like a champ (as evidenced by handing out
another static IP soon afterward):

Apr  4 01:25:25 proxy2 dhcpd: DHCPREQUEST for 155.246.18.58 from
52:41:53:20:20:8c:0d:58:7c:48:be:01:01:00:00:00 via 155.246.18.1
Apr  4 01:25:27 proxy2 dhcpd: DHCPREQUEST for 155.246.220.28 from
00:20:af:a4:c3:d8 via eth0
Apr  4 01:25:27 proxy2 dhcpd: DHCPACK on 155.246.220.28 to
00:20:af:a4:c3:d8 via eth0
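
For anyone wondering why these addresses are recognizable as RAS at a
glance: the 16-byte "hardware address" in both log entries begins with
the ASCII bytes for "RAS". Below is a minimal Python sketch of that
decoding. The helper name describe_chaddr is made up for illustration
and is not part of dhcpd.

```python
def describe_chaddr(text):
    """Split a colon-separated hardware address into raw bytes and
    return (length in bytes, leading printable-ASCII prefix)."""
    raw = bytes(int(part, 16) for part in text.split(":"))
    prefix = ""
    for b in raw:
        # Stop at the first byte outside printable ASCII (0x20..0x7e).
        if 0x20 <= b <= 0x7E:
            prefix += chr(b)
        else:
            break
    return len(raw), prefix


# Address copied from the pl23 syslog line above.
killer = "52:41:53:20:20:8c:0d:58:7c:48:be:01:02:00:00:00"
length, prefix = describe_chaddr(killer)
print(length, repr(prefix))
```

The address is 16 bytes long rather than the 6 bytes of an ordinary
Ethernet address, and its printable prefix starts with "RAS".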

When 2.0b1pl23 exits, it does not core dump. It "just stops running." To
investigate this more we ran tcpdump with the following options:

/usr/sbin/tcpdump -nexi eth0 -s1500 src net 155.246.18 and \( udp port
68 or udp port 67 \)

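-n suppresses DNS lookups
-e prints the link-level header on the dump line
-x prints the packet in hex
-i eth0 selects the interface to listen on
-s1500 sets the snapshot length to 1500 bytes so the whole packet is captured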

And here's the output:

11:30:00.135841 0:90:92:cd:18:0 0:10:4b:72:2d:ae 0800 342:
155.246.18.37.68 > 155.246.1.108.67: hlen:16 xid:0xb562eb05
G:155.246.18.1 vend-rfc1048 T53:3
T61:1.82.65.83.32.32.140.13.88.124.72.190.1.2.0.0.0 T50:991098523
HN:"SOL^@"
                         4500 0148 e2d0 0000 7e11 0d57 9bf6 1225
                         9bf6 016c 0044 0043 0134 2850 0101 1000
                         b562 eb05 0000 8000 0000 0000 0000 0000
                         0000 0000 9bf6 1201 5241 5320 208c 0d58
                         7c48 be01 0200 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 6382 5363 3501 033d
                         1101 5241 5320 208c 0d58 7c48 be01 0200
                         0000 3204 9bf6 123b 0c04 534f 4c00 ff00
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000
11:30:00.785841 0:90:92:cd:18:0 0:10:4b:72:2d:ae 0800 342:
155.246.18.37.67 > 155.246.1.108.68: hlen:16 xid:0xb562eb05
Y:155.246.18.59 vend-rfc1048 T53:5 T58:134676480 T59:1309409280
T51:269352960 T54:621999771 SM:255.255.255.0
                         4500 0148 e4d0 0000 7e11 0b57 9bf6 1225
                         9bf6 016c 0043 0044 0134 b7c2 0201 1000
                         b562 eb05 0000 0000 0000 0000 9bf6 123b
                         0000 0000 0000 0000 5241 5320 208c 0d58
                         7c48 be01 0200 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 6382 5363 3501 053a
                         0400 0007 083b 0400 000c 4e33 0400 000e
                         1036 049b f612 2501 04ff ffff 00ff 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000

At this point 2.0b1pl23 dies. However, 2.0b1pl6 is still running.
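
To make the hex dump of the request easier to read, here is a rough
Python sketch that unpacks the fixed BOOTP fields (op through chaddr)
from the first packet. The bytes are copied by hand from the dump above,
skipping the 20-byte IP header and the 8-byte UDP header. The function
name parse_bootp_header is hypothetical. This only decodes the packet;
it makes no claim about which field pl23 objects to.

```python
import socket
import struct

# First 44 bytes of the BOOTP payload of the DHCPREQUEST above.
HEADER_HEX = (
    "01011000"                          # op, htype, hlen, hops
    "b562eb05"                          # xid
    "00008000"                          # secs, flags
    "00000000" "00000000" "00000000"    # ciaddr, yiaddr, siaddr
    "9bf61201"                          # giaddr
    "52415320208c0d587c48be0102000000"  # chaddr (16 bytes)
)


def parse_bootp_header(data):
    """Unpack the fixed BOOTP fields from op through chaddr."""
    (op, htype, hlen, hops, xid, secs, flags,
     ciaddr, yiaddr, siaddr, giaddr, chaddr) = struct.unpack(
        "!BBBBIHH4s4s4s4s16s", data[:44])
    return {
        "op": op,
        "htype": htype,
        "hlen": hlen,
        "hops": hops,
        "xid": hex(xid),
        "flags": hex(flags),
        "giaddr": socket.inet_ntoa(giaddr),
        # Show only the first hlen bytes of the 16-byte chaddr field.
        "chaddr": ":".join(f"{b:02x}" for b in chaddr[:hlen]),
    }


fields = parse_bootp_header(bytes.fromhex(HEADER_HEX))
for name, value in fields.items():
    print(f"{name:7} {value}")
```

The decoded fields match what tcpdump printed: htype 1 (Ethernet) but
hlen 16, relayed via gateway 155.246.18.1.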


To further complicate the issue, it appears we have yet another RAS
(YARAS?) situation on another subnet. However, the packets look a little
different, and in this case neither DHCP server suffers any problems
(i.e. 2.0b1pl23 keeps on ticking).

Here's the entry from syslog:

Apr  5 13:07:42 proxy1 usr/sbin/dhcpd: DHCPDISCOVER from
52:41:53:20:a0:22:0d:b8:19:7b:be:01:01:00:00:00 via 155.246.4.1
Apr  5 13:07:42 proxy1 usr/sbin/dhcpd: Ignoring unknown client
52:41:53:20:a0:22:0d:b8:19:7b:be:01:01:00:00:00

So you can see that pl23 is actively ignoring the client (we have "deny
unknown-clients" enabled, if I have the wording of that option right).
Here's the tcpdump output for this "benign" RAS packet, captured with
the following command:

/usr/sbin/tcpdump -nexi eth0 -s 1500 \( udp port 67 and udp port 68 \)
and src 155.246.4.1


11:53:15.749834 0:90:92:cd:18:0 0:10:4b:72:2c:a0 0800 342:
155.246.4.143.68 > 155.246.1.109.67: hlen:16 hops:1 xid:0xec5d4116
G:155.246.4.1 vend-rfc1048 T53:1
T61:1.82.65.83.32.160.34.13.184.25.123.190.1.1.0.0.0 HN:"SQLT1R^@"
                         4500 0148 b27f 0000 7f11 4a3d 9bf6 048f
                         9bf6 016d 0044 0043 0134 2d13 0101 1001
                         ec5d 4116 0000 8000 0000 0000 0000 0000
                         0000 0000 9bf6 0401 5241 5320 a022 0db8
                         197b be01 0100 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 6382 5363 3501 013d
                         1101 5241 5320 a022 0db8 197b be01 0100
                         0000 0c07 5351 4c54 3152 00ff 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000
11:53:25.200208 0:90:92:cd:18:0 0:10:4b:72:2c:a0 0800 342:
155.246.4.143.68 > 155.246.1.109.67: hlen:16 hops:1 xid:0xec5d4116
secs:2304 G:155.246.4.1 vend-rfc1048 T53:1
T61:1.82.65.83.32.160.34.13.184.25.123.190.1.1.0.0.0 HN:"SQLT1R^@"
                         4500 0148 b47f 0000 7f11 483d 9bf6 048f
                         9bf6 016d 0044 0043 0134 2413 0101 1001
                         ec5d 4116 0900 8000 0000 0000 0000 0000
                         0000 0000 9bf6 0401 5241 5320 a022 0db8
                         197b be01 0100 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 6382 5363 3501 013d
                         1101 5241 5320 a022 0db8 197b be01 0100
                         0000 0c07 5351 4c54 3152 00ff 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000


The two previous packets were captured earlier this morning, as the
timestamps show. The one below is more recent; because this happens
sporadically, it's hard to capture everything in one session.


14:51:36.635841 0:90:92:cd:18:0 0:10:4b:72:2d:ae 0800 342:
155.246.4.1.68 > 155.246.1.108.67: hops:1 xid:0x17 secs:13491
G:155.246.4.1 ether 0:4:0:8:4e:68 vend-rfc1048
                         4500 0148 0607 0000 fd11 7944 9bf6 0401
                         9bf6 016c 0044 0043 0134 5f86 0101 0601
                         0000 0017 34b3 8000 0000 0000 0000 0000
                         0000 0000 9bf6 0401 0004 0008 4e68 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 6382 5363 ff00 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0000 0000 0000 0000
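
To spell out how the "killer" RAS packet differs from the "benign" RAS
one at the option level, here is a small Python sketch that walks the
DHCP options (the bytes after the 63825363 magic cookie) of both
packets. The option bytes are copied by hand from the dumps above, and
parse_options is a hypothetical helper, not code from dhcpd. It only
shows the differences; it does not establish which one pl23 trips over.

```python
import socket

# Message type values for option 53, per the DHCP options spec.
MESSAGE_TYPES = {1: "DHCPDISCOVER", 3: "DHCPREQUEST", 5: "DHCPACK"}


def parse_options(data):
    """Parse code/length/value DHCP options into a {code: value} dict."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == 255:          # end option
            break
        if code == 0:            # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(data):   # truncated option, stop quietly
            break
        length = data[i + 1]
        options[code] = data[i + 2:i + 2 + length]
        i += 2 + length
    return options


KILLER_HEX = (
    "350103"                                  # 53: message type 3
    "3d110152415320208c0d587c48be0102000000"  # 61: client identifier
    "32049bf6123b"                            # 50: requested address
    "0c04534f4c00"                            # 12: host name "SOL\0"
    "ff"
)
BENIGN_HEX = (
    "350101"                                  # 53: message type 1
    "3d110152415320a0220db8197bbe0101000000"  # 61: client identifier
    "0c0753514c54315200"                      # 12: host name "SQLT1R\0"
    "ff"
)

killer = parse_options(bytes.fromhex(KILLER_HEX))
benign = parse_options(bytes.fromhex(BENIGN_HEX))

for name, opts in (("killer", killer), ("benign", benign)):
    print(name, MESSAGE_TYPES[opts[53][0]], sorted(opts))

only_in_killer = sorted(set(killer) - set(benign))
print("options only in the killer packet:", only_in_killer)
print("requested address:", socket.inet_ntoa(killer[50]))
```

Both packets carry the same style of 17-byte client identifier. The
visible differences are the message type (DHCPREQUEST versus
DHCPDISCOVER) and the requested-address option, which only the request
carries. Note also that, going by the syslog entry, the benign packet
was dropped as an unknown client.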

Any insight would be appreciated. I understand that the short-term
solution is to go turn off the RAS servers, which we will do. However,
since pl6 is immune to this sort of "sudden death" syndrome, it would be
nice to see pl23 offer the same sort of robustness.

This is the part of our dhcpd.conf declaration for our 18 subnet from
the pl23 server:

shared-network Carnegie {
  option subnet-mask 255.255.255.0;
  default-lease-time 3600;
  max-lease-time 43200;
## use-host-decl-names on;

  subnet 155.246.18.0 netmask 255.255.255.0 {
    option broadcast-address 155.246.18.255;
    option subnet-mask 255.255.255.0;
    option routers 155.246.18.1;
    option netbios-name-servers 155.246.1.109, 155.246.1.108;
    option netbios-node-type 8;
    default-lease-time 3600;
    max-lease-time 43200;

    ## host entries here
  }
}


And this is the same declaration from the conf file for the pl6 server
(they should be identical):

shared-network Carnegie {
  option subnet-mask 255.255.255.0;
  default-lease-time 3600;
  max-lease-time 43200;
## use-host-decl-names on;

  subnet 155.246.18.0 netmask 255.255.255.0 {
    option broadcast-address 155.246.18.255;
    option subnet-mask 255.255.255.0;
    option routers 155.246.18.1;
    option netbios-name-servers 155.246.1.109, 155.246.1.108;
    option netbios-node-type 8;
    default-lease-time 3600;
    max-lease-time 43200;

    # host entries removed

  }
}

Ted, if you read this and want to see the entire conf file, I can make
that available (this message is long enough already!).


cheers,
larry

--
Lawrence Lee
UNIX Systems Programmer
Stevens Institute of Technology



