dhcp-users Digest, Vol 131, Issue 12

dhcp-users-request Tue, 17 Sep 2019 09:04:59 -0700

Send dhcp-users mailing list submissions to
        dhcp-users@lists.isc.org


To subscribe or unsubscribe via the World Wide Web, visit
        https://lists.isc.org/mailman/listinfo/dhcp-users
or, via email, send a message with subject or body 'help' to
        dhcp-users-requ...@lists.isc.org

You can reach the person managing the list at
        dhcp-users-ow...@lists.isc.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of dhcp-users digest..."


Today's Topics:

   1. Re: Esoteric question (Gregory Sloop)
   2. Re: DHCP server assigned its own address (Bob Harold)


----------------------------------------------------------------------

Message: 1
Date: Tue, 17 Sep 2019 08:56:48 -0700
From: Gregory Sloop <gr...@sloop.net>
To: Users of ISC DHCP <dhcp-users@lists.isc.org>
Subject: Re: Esoteric question
Message-ID: <387316097.20190917085...@sloop.net>
Content-Type: text/plain; charset="us-ascii"

Top posting

I don't have captures on Eth1 - though that's probably a good idea. Hard 
though, because it's a site that is in production like 7x12+ - so a PITA to go 
onsite (for the fourth time now) to grab some more data...

The potential of an interface with an overlapping subnet on Eth1 was raised and 
that's a good idea, I think.
But I certainly can't see anything in my config that would do that. I've 
stripped the config down to the very basics; just, essentially, defining the 
two Eth interfaces, the NAT/MASQ, DNS & NTP - in an effort to make sure there 
wasn't something somewhere in the config that was inadvertently causing the 
issue.
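
For anyone wanting to rule out the overlapping-subnet idea mechanically, here is a minimal sketch using Python's standard ipaddress module. The interface names and addresses are the example values from this thread, and the helper name overlapping_pairs is made up for illustration; it only compares configured prefixes and says nothing about what dhcpd itself does internally.

```python
import ipaddress

# Hypothetical interface table built from the example addresses in this thread.
interfaces = {
    "eth0": ipaddress.ip_interface("10.0.0.1/24"),
    "eth1": ipaddress.ip_interface("1.2.3.5/30"),
}

def overlapping_pairs(ifaces):
    """Return (name_a, name_b) pairs whose configured networks overlap."""
    names = sorted(ifaces)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if ifaces[a].network.overlaps(ifaces[b].network):
                pairs.append((a, b))
    return pairs

# An empty list means no two interfaces share address space.
print(overlapping_pairs(interfaces))
```

If the list comes back empty for the real addresses, overlapping prefixes are probably not the cause and the packet captures on Eth1 become the more useful next step.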

A Question, if anyone knows the answer.
If it's doing a full handshake on Eth0 currently, doesn't that indicate that it 
believes that Eth0 is the proper interface for that subnet declaration - and 
so, why would it also be doing it on another interface too? [I get why it would 
be good to verify by doing some packet-caps - but asking for my own 
knowledge/education.]

As for cloud-mgmt/call-home - no there's none of that.

Thanks for the thoughts so far.

-Greg

gsuca> Hi Greg,

gsuca> A very interesting problem... I've heard good reports about both those
gsuca> vendors' hardware, so sounds like a reasonable choice.

gsuca> What do you get if you snoop eth1 while connected to the different WAN
gsuca> devices? I wonder if dhcpd is trying to talk to something else upstream
gsuca> (no idea why it would do that).

gsuca> Does the Ubiquiti have some form of cloud management or call home setup?

gsuca> Best of luck.

gsuca> regards,
gsuca> -glenn

gsuca> On 2019-09-17 09:20, Gregory Sloop wrote:
>> So, this is kind of a wild goose-chase for some direction - but
>> thought there might be some useful answers here.

>> [But I know it's way out there and I'm not going to get direct help on
>> solving the issue on the platform I'm having issues with - just bear
>> with me and see if you have any helpful ideas.]

>> Let me set the background.

>> I'm using specific device hardware - in this case, a Mikrotik RB450G
>> [currently in place] and moving to a Ubiquiti EdgeRouter lite.
>> They're multi-ethernet interface routers - based on Linux.
>> The RB450G works fine and simply needs replacement. [The two devices
>> are configured as identically as I can. They're very different, so
>> we're talking "functionally" identical, not literally with the same
>> conf files.]

>> I'm having issues with DHCPd on the new device. [And queries at
>> Ubiquiti are going nowhere fast. It IS an unusual problem, so I'm not
>> terribly surprised.]

>> Let's assume Eth0/LAN is 10.0.0.1/24
>> DHCPD is set up to hand out addresses for 10.0.0.20-100, say.
>> 14440 second leases.
>> Clients are connected directly to a switch that's directly connected
>> to ETH0. [No DHCP relay etc.]

>> Eth1/WAN is a static /30 - connected directly to a Comcast Modem/BSG.
>> Let's say 1.2.3.5/30
>> The gateway [not that it matters] is 1.2.3.6

>> We're masquerading traffic [NAT] from the local RFC1918 [10.0.0.0/24]
>> network to the static public IP on the WAN.

>> ---
>> So, here's what happens/happened.

>> I went in to swap out the 'Tik box for the new hardware.
>> Plug it in, and none of the clients on the LAN get DHCP addresses. All
>> the DHCP clients time out.
>> After several passes at testing here's what I find.

>> I can't find any configuration problems on the replacement hardware.
>> The *old* 'Tik hardware/software works perfectly.

>> If we have the WAN connected to a simple live ethernet port on the
>> *new hardware,* [EdgeRouter] DHCP works fine for the LAN side. Totally
>> fine.
>> Only when we plug in the Comcast gateway/modem into the WAN port on
>> the new hardware does DHCP fail/timeout. [Remember just plugging it
>> into a regular ethernet switch works fine. It won't pass traffic,
>> because the static IP assignment isn't right - but the LAN side DHCP
>> server works perfectly.]

>> If we take a client on the LAN and plug in a static IP [rather than
>> DHCP], traffic flows out to the internet perfectly fine.

>> Packet caps from the new router show that the router/DHCP server IS
>> seeing all the DHCP protocol handshake. [When it's having the
>> "problem."]
>> The client does a DISCOVER
>> Server responds with OFFER
>> The client responds with REQUEST
>> Then there's a LONG pause. [like 90s+ worth.]
>> The Server responds with ACK. [It actually appears to send several
>> ACKS. I probably cut my captures too short, so I only have about 2m of
>> capture in my largest one. But that's what I see in what I have.]
>> However, the client [Windows in this case] has timed out, and never
>> gets the ACK.
>> And while I'm not 100% certain, the times I've looked, the device
>> believes it's handed out a lease. [I believe it's in the leases file.]
>> But because of the long delay, the client never actually got the
>> lease.

>> Again,
>> -simply unplugging the Comcast modem from the router, and DHCP
>> immediately starts working again.
>> -Plugging Eth1 into a live ethernet port [so that interface is seen as
>> up] also works fine.
>> -It's only when connected to the Comcast gateway/modem that it fails.

>> On the LAN side of the network, we've tinkered replacing the switches
>> - dumb, identically configured managed switches, different managed
>> switch, or no switch at all - simply plugged directly into a single
>> client. No changes on the LAN side make the slightest difference
>> either.

>> Since we're doing NAT/MASQ from LAN->WAN no WAN traffic should leak
>> into the LAN - but I've also explicitly defined rules that prevent
>> anything from the WAN getting to the LOCAL or LAN interfaces - other
>> than established/related traffic.

>> So, I'm not asking for you to solve the issue on this particular
>> hardware. What I'm asking for is some plausible explanation that might
>> have these symptoms. I'm completely at wits end. I've spent a lot of
>> hours trying a whole host of troubleshooting things - but I can't
>> think of any possible way this could be happening. But clearly it is.

>> IMO, either we have some very weird hardware physical layer problem
>> that only impacts DHCP [and not traffic routing] or there's something
>> I'm missing. I'd normally imagine that I'm missing something - but
>> can't figure out what, if anything.

>> I've tried to closely define the setup, but I'm sure I've forgotten
>> something - perhaps lots of somethings - just ask and I'll try to
>> clarify any missing pieces.

>> Given how awesome people on this list are, I'm hopeful someone will
>> have something that might jiggle loose something useful!

>> TIA
>> -Greg
>> _______________________________________________
>> dhcp-users mailing list
>> dhcp-users@lists.isc.org
>> https://lists.isc.org/mailman/listinfo/dhcp-users

-- 
Gregory Sloop, Principal: Sloop Network & Computer Consulting
Voice: 503.251.0452 x82
EMail: gr...@sloop.net
http://www.sloop.net
---

------------------------------

Message: 2
Date: Tue, 17 Sep 2019 12:03:42 -0400
From: Bob Harold <rharo...@umich.edu>
To: Users of ISC DHCP <dhcp-users@lists.isc.org>
Subject: Re: DHCP server assigned its own address
Message-ID:
        <CA+nkc8DZM=uus4kai_ldn0nti8q4u15xayqggrwmu947iv0...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Mon, Sep 16, 2019 at 9:32 PM Larry Apolonio <isc-d...@rh73.com> wrote:

>
> All,
>
> I have a weird problem that I am trying to solve.
>
> In short, for those who don't want to read the details, I am trying to
> figure out why the DHCP server assigned its own IP address to another
> device.
>
>
> My dhcp server is running on CentOS 6.10 and is the regular RPM that
> comes with that distribution dhcp-4.1.1-63.P1.el6.centos.x86_64.
>
> What is a little unusual is that webmin is used to manage the dhcp
> server, for the most part it works for our environment.
>
> Yesterday, I got a nagios alert that the server was no longer available.
>   This nagios server is on the same subnet as the server so there was no
> weird firewall routing issues involved.  With the help of the networking
> guys, we found that another machine took the IP address of our DHCP
> server.  This happened late July this year and it ended up being a human
> error, the person spinning up a machine on this network assigned a
> static IP address to their machine that was the same IP as our server,
> so we thought someone did it again.
>
> The difference this time is that it seems like the DHCP server itself
> assigned its own IP address
>
> Here is a sample of that subnet declaration, with IPs changed to protect
> the innocent
>
> # XXXXXX Subnet
> subnet 192.168.11.0 netmask 255.255.255.0 {
>          range 192.168.11.10 10.254.11.10;
>          option subnet-mask 255.255.255.0;
>          default-lease-time 28800;
>          option broadcast-address 192.168.11.255;
>          option routers 192.168.11.254;
>          option domain-name-servers 208.67.222.222 , 208.67.220.220;
>          option domain-name "example.local";
>          }
>
> The IP address of the DHCP server is 192.168.11.10, I personally would
> not do this, I would have not even had the DHCP server IP address in
> that range.  But please read on
>
> This is a rarely used subnet, so a machine appearing on this subnet is
> rare, in fact I thought this subnet did not have a dhcp declaration
> prior to me looking in to it.  Doesn't this log entry in
> /var/log/messages confirm it? (hostname was changed in this paste)
>
> Sep 12 10:02:12 linuxdhcpserver dhcpd: No subnet declaration for eth0
> (no IPv4 addresses).
> Sep 12 10:02:12 linuxdhcpserver dhcpd: ** Ignoring requests on eth0.  If
> this is not what
> Sep 12 10:02:12 linuxdhcpserver dhcpd:    you want, please write a
> subnet declaration
> Sep 12 10:02:12 linuxdhcpserver dhcpd:    in your dhcpd.conf file for
> the network segment
> Sep 12 10:02:12 linuxdhcpserver dhcpd:    to which interface eth0 is
> attached. **
>
> When the service was restarted 3 hours later, that same message about no
> subnet declaration for eth0 did not appear.
>
> One reason we use webmin is so that non-linux folk (AKA people without
> the root password) can log in to an easy web interface to manage the
> service that the Linux server does, in this case dhcp.
>
> But it also logs what they did, up to a certain point, I can tell who
> edited which subnet declarations but not the exact changes they did.
>
>  From the webmin logs, until yesterday this subnet was not changed.
>
>  From the command line I also ran last to see who logged in, it was
> either root, or a proper Linux server admin, and I admit that someone in
> this group could be holding back, I don't think we did anything via CLI.
>
> So I am at a loss, trying to figure out why a DHCP server would assign
> its own IP address (it is pingable, no iptables rules blocking ICMP), I
> thought conflict resolution would prevent it, if I am reading RFC1541
> section 2.2 correctly.
>
> Did someone do a good job at cleaning up their tracks?  I don't think
> the effort or skill was there.  It would be easier to just admit they
> made a mistake.
>
> Was webmin not logging correctly?  I really don't recall this subnet
> being on this server, because I do recall seeing that message in the
> logs regarding no subnet declaration in the past.
>
> A couple of solutions were proposed so this would not happen again, the
> biggest one is putting this server and its big brother nagios server on
> its lonesome VLAN/subnet and restrict anything else from being on this
> subnet.  Seems overkill but this IP hijack happened twice within 60 days
> when it has been fine for years.
>
> Thank you,
>
> Larry Apolonio
>
> Although I have been speaking English for a while now, I still have
> problems articulating my thoughts, thank you for your patience.
>

Do not depend on "ping before assign" to cover for an incorrect
configuration.   Static devices and dynamic DHCP ranges should never
overlap.  The subnet that the DHCP server is in must be defined, but does
not need to have a dynamic range.  It can have a range if you like, no need
for a separate subnet, just don't define the same IP as both static and
dynamic.
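
A quick way to check for that kind of overlap is sketched below in Python, using only the standard ipaddress module. The server address and range start are taken from the post above; the range end (192.168.11.200) and the helper name statics_in_range are assumptions for illustration only, since the posted range end was altered when the IPs were changed.

```python
import ipaddress

def statics_in_range(range_start, range_end, static_addrs):
    """Return the static addresses that fall inside the dynamic range."""
    lo = ipaddress.ip_address(range_start)
    hi = ipaddress.ip_address(range_end)
    return [a for a in static_addrs if lo <= ipaddress.ip_address(a) <= hi]

# Statically configured hosts: the DHCP server and the router from the post.
static_hosts = ["192.168.11.10", "192.168.11.254"]

# The range end here is an assumed value, not the one from the real config.
conflicts = statics_in_range("192.168.11.10", "192.168.11.200", static_hosts)
print(conflicts)
```

Any address this returns is defined as both static and dynamic, so either the range should be moved or the host renumbered.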

-- 
Bob Harold

------------------------------

Subject: Digest Footer

_______________________________________________
dhcp-users mailing list
dhcp-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/dhcp-users


------------------------------

End of dhcp-users Digest, Vol 131, Issue 12
*******************************************
