Re: Problem I've been having with dhclient dhcp in Fedora27

Thomas Haller Fri, 12 Jan 2018 01:59:24 -0800

On Tue, 2018-01-09 at 12:06 -0500, [email protected] wrote:
> Thanks, Thomas for the hints.
>  
> I include two attachments that should clarify what is going on. The
> output of 'nmcli con show', and then the trace-level log of two
> cases, with **************** lines inserted at the beginning of each.
>  
> The first case, which is what was happening to me that I didn't
> understand, is the trace for the two commands 'nmcli con down br0;
> nmcli con up br0'.
>  
> The second case, which apparently works fine, is the trace for a
> slightly different way of doing roughly the same thing 'nmcli con
> down br0; nmcli con up bridge-slave-eno1'.
>  
> Your email made it clear that the second case is preferred as a way
> to bring up a bridge. In the second case, the bridge acquires the MAC
> address of the slave being brought up. In the first case, the bridge
> gets a new random MAC address.
>  
> I'm still not clear on why or whether the result of the first case is
> "right". But I get why I need to bring up the connection with the
> slave interface to get what I want.
>  
> Note: because br0 is the source of DHCP requests, the first case's
> result really confuses things, and makes br0 unable to get an IP
> address after 'nmcli con up br0', whereas the second case renews the
> IP address properly.
>  
> I'm no expert in what NetworkManager *should* do, but it seemed
> logical to me that the first case should have noticed the slave and
> brought it up first (just as happens in the second case), and doing
> nothing about br0 until the slave was fully up. Then the rule "choose
> the MAC address of the first slave" would have kicked in, and both
> cases would have produced equivalent results.
>  
> Thank you very much for clarifying this order-dependency on bridge
> connections. I wonder if there's somewhere it can be documented (like
> the Networking Guide of Fedora/RHEL). It would save others some
> confusion. I've never contributed to documentation, but I'd be happy
> to write something as a draft and send it somewhere.


Hi,

In the first case, you only activate the master br0.
This may not activate any slaves, depending on your configuration.
You end up with a bridge device with no slaves attached.
The fact that the MAC address is unspecified doesn't matter here.
The problem is that the bridge has no carrier unless slaves are
attached. You cannot meaningfully do DHCP unless you have carrier.
You would also see:
  $ nmcli connection up br0
  Connection successfully activated (master waiting for slaves) (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/10)
and
  $ nmcli device 
  nm-bridge  bridge    connecting (getting IP configuration)  br0

Static IPv4 addresses would avoid that problem, because you can
configure them without carrier. Static IPv6 addresses still have that
problem, because you cannot do duplicate address detection without
carrier.
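
A minimal sketch for checking whether the bridge currently has carrier
and which ports are attached, assuming the kernel's usual sysfs layout
(/sys/class/net/<dev>/carrier and /sys/class/net/<bridge>/brif). The
helper names and the SYSFS_NET override are hypothetical and only there
for illustration:

```shell
#!/bin/sh
# Hypothetical helpers. SYSFS_NET exists only so the sketch can be
# pointed at a test directory instead of the real /sys/class/net.

# Print "yes" if the device reports carrier, "no" otherwise
# (also "no" if the carrier file is missing or cannot be read).
has_carrier() {
    carrier_file="${SYSFS_NET:-/sys/class/net}/$1/carrier"
    if [ "$(cat "$carrier_file" 2>/dev/null)" = "1" ]; then
        echo yes
    else
        echo no
    fi
}

# Print the ports (slaves) currently attached to a bridge, one per line.
bridge_ports() {
    ls "${SYSFS_NET:-/sys/class/net}/$1/brif" 2>/dev/null
}
```

For example, run `has_carrier br0` and `bridge_ports br0` after
activating the master alone. If the port list is empty, DHCP on the
bridge is not expected to complete, for the reason given above.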


To fix that, either:

- configure the bridge master with "connection.autoconnect-slaves=yes"

- always ensure that at least one slave is activated as well. That
  means activating the slaves manually. Note that:

  $ nmcli con up bridge-slave-eno1
    is sufficient, because activating a slave always ensures the
    master is active as well.

  $ nmcli con up br0 && nmcli con up bridge-slave-eno1
    works too, but is redundant and slower.

  $ nmcli con up bridge-slave-eno1 && nmcli con up br0
    is wrong, because after the first command, br0 is already
    fully up. So, issuing another `con up` brings br0 down again
    to re-activate it, and you end up with no slaves again.
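
The two fixes above can be sketched as shell functions. This is only an
illustration: the profile names br0 and bridge-slave-eno1 are the ones
from this thread, the function names are made up, and the value 1 is
used as the numeric form of "yes" for connection.autoconnect-slaves:

```shell
#!/bin/sh
# Profile names as used in this thread; adjust to your own setup.
MASTER=br0
SLAVE=bridge-slave-eno1

# Fix 1: have NetworkManager bring up the slaves whenever the
# master profile is activated (1 means "yes").
enable_autoconnect_slaves() {
    nmcli connection modify "$MASTER" connection.autoconnect-slaves 1
}

# Fix 2: restart the bridge by activating only the slave profile.
# Activating the slave also activates the master, so the master is
# deliberately not activated a second time afterwards.
restart_bridge_via_slave() {
    nmcli connection down "$MASTER"
    nmcli connection up "$SLAVE"
}
```

After enable_autoconnect_slaves has been run once, a plain
`nmcli con down br0; nmcli con up br0` should attach the slave as well.
Without it, restart_bridge_via_slave is the sequence to use.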


best,
Thomas

>  
>  
> -----Original Message-----
> From: "Thomas Haller" <[email protected]>
> Sent: Tuesday, January 9, 2018 12:34am
> To: "[email protected]" <[email protected]>, networkmanager-list@gnome.org
> Subject: Re: Problem I've been having with dhclient dhcp in Fedora27
> 
> On Sat, 2018-01-06 at 17:23 -0500, [email protected] wrote:
> > Help needed to understand if this is a bug/feature/weirdness:
> > 
> > I've made some progress diagnosing this problem with losing DHCP
> > connectivity. I've got it reproducible by a simple command: 'nmcli
> > con down br0; nmcli con up br0' fails to get a DHCP lease in the
> > "up" case.
> > 
> > It seems to be the way that NM handles a bridge connection. When
> > Fedora boots, it comes up with the bridge (br0) using the same MAC
> > address as the slave (eno1), which is the hardware MAC address of
> > the wired card. However, if you do 'nmcli con down br0; nmcli con
> > up br0', the br0 device now has a randomly generated MAC address.
> > 
> > This is a little weird. I suspect I can work around my specific
> > problem by giving the br0 device a fixed ether.mac-address.
> > However, I don't know if that is the right thing for others to do
> > in my situation.
> > 
> > In fact, there is little info about bridge management behavior in
> > NM docs I can find, so it's not obvious what is the "correct"
> > behavior of an NM-managed bridge connection.
> > 
> > Should NM be giving the bridge its MAC address from the slave
> > device the first time? Makes sense, though it's a little unclear
> > what the "default" should be.
> > 
> > And should the second and later times use "random addresses"?
> > 
> > Seems like there may be two different pieces of NM code that do the
> > same function of bringing up the interface, but which are not
> > consistent with each other.
> > 
> > Anyway, I'd like to know what is right.
> 
> Hi,
> 
> your previously sent logfile has no level=TRACE logging enabled, so
> it's not clear what's happening.
> See https://cgit.freedesktop.org/NetworkManager/NetworkManager/tree/contrib/fedora/rpm/NetworkManager.conf
> 
> You might set a fixed MAC address, via "bridge.mac-address" and
> "ethernet.cloned-mac-address". The first property is used when
> creating the bridge interface, the second later when activating.
> On 1.10, "bridge.mac-address" got deprecated, because it's obviously
> redundant.
> 
> Anyway, in general, if you activate a master device alone with `nmcli
> connection up` (be it bridge, bond, or team), then you only activate
> the master alone. There is also a "connection.autoconnect-slaves"
> property, that aims to bring up available slaves. So whether any
> slaves are attached is unclear. But quite possibly, no slaves are
> attached.
> 
> Note that the sequence:
> nmcli connection up "$SLAVE"
> nmcli connection up "$MASTER"
> is wrong, because activating the slave already brings up the master
> as well, so activating the master again results in a disconnect of
> the slave.
> Either do
> nmcli connection up "$MASTER"
> nmcli connection up "$SLAVE"
> or just
> 
> nmcli connection up "$SLAVE"
> 
> As you don't set "bridge.mac-address" nor
> "ethernet.cloned-mac-address", the MAC address of a master device
> without slaves is randomly assigned by kernel.
> If the bridge's MAC address is unset (from kernel's point of view),
> kernel will assign the MAC address of the first slave that attaches.
> So, the MAC address changes. Usually, that shouldn't matter, because
> as long as there are no slaves, the master's MAC address isn't very
> useful.
> 
> Long story short, please send a log file to see what's happening.
> 
> 
> Thanks,
> Thomas
> 
> > 
> > 
> > -----Original Message-----
> > From: "[email protected]" <[email protected]>
> > Sent: Thursday, January 4, 2018 3:59pm
> > To: [email protected]
> > Subject: Problem I've been having with dhclient dhcp in Fedora27
> > 
> > I've been having problems with NetworkManager dhcp on my Fedora27
> > Workstation (desktop, wired). Note, because I'm using VMs on that
> > workstation, the interface is a bridge (br0 with slave eno1).
> > 
> > What seems to happen is this:
> > Workstation wakes up from sleeping, reactivates connection.
> > dhclient issues 4 DHCPDISCOVER tries, none of which get a response.
> > the interface state changes "unknown-->timeout->done", and the
> > DHCPv4 transaction is cancelled.
> > A restart is scheduled for 120 seconds later.
> > The restart 120 seconds later succeeds with DHCPDISCOVER,
> > DHCPREQUEST, DHCPOFFER, DHCPACK.
> > 
> > All the other machines served by my DHCP server have no problems at
> > all, however, they are MacBooks, various storage servers, and
> > RaspberryPis.
> > 
> > It seems to be that NetworkManager somehow interferes with the
> > DHCPDISCOVER after the workstation wakes up.
> > 
> > The attached log file shows this sequence of events.
> > 
> > (I'm wondering if there is a timing issue because the first
> > dhclient call is issued when eno1 is in the "unavailable" state,
> > and before it is captured as a slave to br0)
> > _______________________________________________
> > networkmanager-list mailing list
> > [email protected]
> > https://mail.gnome.org/mailman/listinfo/networkmanager-list
