On Wed, 2017-11-22 at 16:18 +0100, Piotr Figiel wrote:
> Hi,
> I'm looking into couple issues we stumbled on with pppd and
> NetworkManager. It's tested on NetworkManager 1.8.4, ModemManager
> 1.6.8 and pppd 2.4.7. In general there are sporadic issues with
> reconnecting when pppd exits because of LCP echo timeout and when it
> exits due to other reasons.
> 
> In simplest scenario with active pppd connection when pppd crashes/is
> forcibly killed and nm pppd plug-in doesn't have a chance to send the
> status update via D-Bus to NM the connection is not deactivated
> (nmcli
> c shows it's up).
> I've done some analysis and NM does notice the pppd process exiting,
> it enters nm-device-modem.c:ppp_failed() but because the "may-fail"
> ipv4 setting defaults to TRUE not much happens (the connection losses
> IP address and gateway setting but is still active).
> The behavior doesn't seem correct as according to the nm-settings(5)
> the connection should fail if none of ipv4 and ipv6 are available
> (even when they have may-fail setting to TRUE).
> 
> There are actually more issues with nm-device-modem.c:ppp_failed().
> Normally when pppd session ends (e.g. because of LCP echo timeout) it
> seems there are 2 messages sent from pppd which trigger ppp_failed():
> SetState with NM_PPP_STATUS_DISCONNECT and NM_PPP_STATUS_DEAD.
> Additionally pppd process exiting triggers the ppp_failed() for the
> 3rd time. In such conditions NM deactivates connection with error:
> <warn>  [1511357452.8500] device (ttyACM0): PPP failure in unexpected
> state 100
> This seems to be a side effect of the may-fail logic and the fact
> that
> multiple events result in call to ppp_failed() but this side effect
> deactivates the device and triggers reconnect later on (ie fixes the
> issue).
> 
> This though doesn't work always like that: I observed that
> sporadically the nm pppd plugin doesn't send the NM_PPP_STATUS_DEAD
> state (observed with dbus-monitor). I think it's related with
> g_dbus_proxy_call() being used in the plugin code, which is
> asynchronous and it may sometimes not complete before pppd exit.
> In such case and some amount of bad luck ppp_failed() is called
> because of NM_PPP_STATUS_DISCONNECT and pppd exiting only (i.e. only
> two times). If in such case ip4_state is configured (with may-fail)
> and e.g. ip6_state == IP_CONF then the connection will not be
> deactivated and reconnection won't be triggered.
> 
> I think for the LCP echo timeout error case setting may-fail to FALSE
> on ipv4 on ppp connections should solve the issue. But if anything
> from above makes sense to you it seems there are at least couple bugs
> there. I think the simplest solution would be to replace
> nm-device-modem.c:ppp_failed() contents with:
> 
> nm_device_state_changed (device, NM_DEVICE_STATE_FAILED, reason);
> 
> Or is there any case when pppd failure should leave the connection
> active?
> Additionally perhaps g_dbus_proxy_call in
> nm-pppd-plugin.c:nm_phasechange() could be changed to sync
> counterpart.

Hi Piotr,

may-fail is supposed to behave as you explained, and in general we
should implement the logic, that a failure of one address family does
not necessarily fail the entire connection.

I think, at the very least, the if-else-if in ppp_failed()
https://cgit.freedesktop.org/NetworkManager/NetworkManager/tree/src/devices/wwan/nm-device-modem.c?id=ca26d1ae07a4e8a7036051301d0103ab0822fd44#n89
should be

   if (ip4_state_in_conf())
       schedule_ip4_timeout()
   else if (ip4_state_done())
       ip4_method_failed()
   /*BREAK*/
   if (ip6_state_in_conf())
       schedule_ip6_timeout()
   else if (ip6_state_done())
       ip6_method_failed()

But I think (and I think with this I agree with you), if ppp fails,
then the entire ppp connection failed. Not just one address family.
So, your device_state_change(STATE_FAILED) makes sense to me.



best,
Thomas

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
networkmanager-list mailing list
[email protected]
https://mail.gnome.org/mailman/listinfo/networkmanager-list

Reply via email to