Re: How to recover from serial device being force closed.
Hey! > > I completely understand why there may be concerns at editing such critical > code for a use case that is not very common. > > Unfortunately my current situation requires renaming the interfaces. Multiple > modems may take turns being 'selected' (having the interface renamed) to pass > traffic. I hope to remove that need one day, but until then I will need this > patch. > So you have a modem that is in connected state but isn't selected to pass traffic, and when it gets selected, the net interface is renamed to some other name? I think it is the first time I've seen something like this :) Is it because some other app uses a hardcoded net interface name somewhere or something like that? > Even if this is an uncommon problem, I still think this change (or at least > my approach) could be useful to pull into ModemManager mainline. Yes, people > don't rename interfaces often, but it is something that Linux/udev support, > these events effectively breaks the Modem object lifecycle in ModemManager > (which makes ModemManager look buggy), the log messages that are printed out > do not make it clear that renaming the interface caused it (it took my a long > time to figure it out), and I have not seen any piece of documentation > warning that renaming any interface managed by ModemManager is unsupported. > Technically it is completely possible, but my worry is that it may introduce an amount of complexity that is not worth the effort. Updating the documentation to say that MM currently does not support interface renaming once the device has been initially probed would be quicker... and maybe we should do that right away even before thinking about supporting your usecase. > You do bring up an interesting point I had missed: what if someone renamed a > control interface? My code currently would not catch that, but maybe it > should! > Renaming the control interface would be overkill. Renaming the net port may be somewhat easier. > ModemManager handles tons of events and states for modems and all the weird > things they can do. If a user could do something with the ModemManager API > that caused strange errors like I described, the issue would be fixed in the > driver/etc. > > Linux Interfaces can be renamed using the Linux API, and while they are > renamed, ModemManager is in a precarious state until the interface is changed > back to its original name. Removing or resetting the modem in that precarious > state will cause ModemManager to act strangely for that modem until it is > removed again with the original interface name (which could be a problem if > another modem was added while the interface was renamed). And linux+udev > gives us all the tools to handle these known and supported events. I > personally think these are enough reasons to address this behavior. > > I see this as an issue of stability, reliability, and resiliency. Even if > this behavior is rarely encountered, I believe there are substantial reasons > to fix it. That does not mean that my code is the best way to do it. If you > end up agreeing that this behavior should be fixed, I will make revisions to > address your concerns. > MM also has a lot of limitations, and one of the limitations that I always warn about is that the state of the modem should not be touched out of MM's own APIs. E.g. if you run QMI commands that modify the state of the modem using qmicli, MM may not pick them up. If the state needs to be updated, it should be done using MM's own APIs. I think we can say the same about this problem, you're trying to modify the state of how the modem is exposed in the kernel out of MM's eyes, and expect MM to pick that up. Technically both things could be possible, but it may really be too much complexity introduced for little gain, especially when the issues can be avoided in other ways :/ > To the complexity of my code, I tried to made it pretty straight forward and > self contained. When a rename event happens, find the Modem object associated > with the old device name, remove the old name from the modem, and add the new > name. I could make the replacement logic into a function to keep the code > more readable if you would prefer. > Aren't we fully re-probing the modem from scratch upon the net interface rename? We could try to do that as an initial step, in the same way as we do the forced reprobing when e.g. the proxy dies or when the SIM card is switched. The reprobing logic will make MM re-detect the modem object with the new state. > You mention opening the door to more complex interactions, and that is > interesting. If I don't correctly handle the udev events, then I am not > really fixing anything! > > I just looked up all of the udev event types: add, remove, change, move, > online, offline, bind, and unbind. > > ModemManager already has support for add, remove, change (kind of, more on > that later), and with my patch move (I would argue that the upstream version > of
Re: How to recover from serial device being force closed.
Sorry for the late reply, I completely understand why there may be concerns at editing such critical code for a use case that is not very common. Unfortunately my current situation requires renaming the interfaces. Multiple modems may take turns being 'selected' (having the interface renamed) to pass traffic. I hope to remove that need one day, but until then I will need this patch. Even if this is an uncommon problem, I still think this change (or at least my approach) could be useful to pull into ModemManager mainline. Yes, people don't rename interfaces often, but it is something that Linux/udev support, these events effectively breaks the Modem object lifecycle in ModemManager (which makes ModemManager look buggy), the log messages that are printed out do not make it clear that renaming the interface caused it (it took my a long time to figure it out), and I have not seen any piece of documentation warning that renaming any interface managed by ModemManager is unsupported. You do bring up an interesting point I had missed: what if someone renamed a control interface? My code currently would not catch that, but maybe it should! ModemManager handles tons of events and states for modems and all the weird things they can do. If a user could do something with the ModemManager API that caused strange errors like I described, the issue would be fixed in the driver/etc. Linux Interfaces can be renamed using the Linux API, and while they are renamed, ModemManager is in a precarious state until the interface is changed back to its original name. Removing or resetting the modem in that precarious state will cause ModemManager to act strangely for that modem until it is removed again with the original interface name (which could be a problem if another modem was added while the interface was renamed). And linux+udev gives us all the tools to handle these known and supported events. I personally think these are enough reasons to address this behavior. I see this as an issue of stability, reliability, and resiliency. Even if this behavior is rarely encountered, I believe there are substantial reasons to fix it. That does not mean that my code is the best way to do it. If you end up agreeing that this behavior should be fixed, I will make revisions to address your concerns. To the complexity of my code, I tried to made it pretty straight forward and self contained. When a rename event happens, find the Modem object associated with the old device name, remove the old name from the modem, and add the new name. I could make the replacement logic into a function to keep the code more readable if you would prefer. You mention opening the door to more complex interactions, and that is interesting. If I don't correctly handle the udev events, then I am not really fixing anything! I just looked up all of the udev event types: *add*, *remove*, *change*, *move*, *online*, *offline*, *bind*, and *unbind*. ModemManager already has support for *add*, *remove*, *change* (kind of, more on that later), and with my patch *move* (I would argue that the upstream version of the move handler was incorrect). The *online* and *offline* events are supposedly for things like a network interface coming up or going down, so I do not think these events will be relevant to ModemManager at all. The *bind* and *unbind* events are for when drivers are attached or detached from the device, which I also believe is useless to ModemManager and could not affect anything with the Modem lifecycle. But the *change* event is interesting. I am getting contradicting descriptions from different sources about *change* vs *move*. Perhaps you are right that this could cause similar issues. I will probably need to rework this to properly handle both in case *change* can be triggered in a rename, but either way, more research is required. I may also be double handling *move*, which could be handled more gracefully. I will try to fix these up in the coming weeks or months as time allows. I welcome feedback and ideas even before I finish the patch. Thanks as usual to the ModemManager team for building such a useful tool, Jessy Diamond Exum On Tue, Jan 30, 2024 at 2:44 AM Aleksander Morgado < aleksande...@chromium.org> wrote: > Hey, > > > In my application, I use ModemManager to connect the modem, and then I > rename the network interface to a special name that is required by another > part of my system (and change the name back if the modem disconnects). > > Is it not possible to rename the interface before even MM tries to grab > the port? > > We could support a patch like yours, but it seems overly complicated and > may open the door to even more complex interactions, e.g. what if someone > comes with the idea of renaming the control port instead of the net port. > I'm sure it's technically possible to do all that, but not sure how much > benefit it will bring. > > -- > Aleksander >
Re: How to recover from serial device being force closed.
Hey, > In my application, I use ModemManager to connect the modem, and then I rename the network interface to a special name that is required by another part of my system (and change the name back if the modem disconnects). Is it not possible to rename the interface before even MM tries to grab the port? We could support a patch like yours, but it seems overly complicated and may open the door to even more complex interactions, e.g. what if someone comes with the idea of renaming the control port instead of the net port. I'm sure it's technically possible to do all that, but not sure how much benefit it will bring. -- Aleksander
Re: How to recover from serial device being force closed.
I think I figured out why the `forced close` issue is happening to me! Short answer: network interface renaming. In my application, I use ModemManager to connect the modem, and then I rename the network interface to a special name that is required by another part of my system (and change the name back if the modem disconnects). I added some additional logging to ModemManager's hardware detection code and was watching the udev events as the modem was rebooted (comparing when the modem is properly torn down and when it is not). I noticed that when a network device is removed (like the CDC interfaces on the modem) the interface's current name is listed in the remove event (which makes sense), but ModemManager does not update the name of the network interfaces internally, so when a remove uevent comes through with the renamed interface name, ModemManager does not realize it is one of the port that is associated with the modem. And because in this situation there is always one 'port' on the modem (even though it has actually been removed), the modem is not disposed of. modem reset caused zombie modem === [8627]: [1700034180.178810] [ttyACM1/at] unexpected port hangup! [8627]: [1700034180.178885] [ttyACM1/at] forced to close port [8627]: [1700034180.178923] [ttyACM1/at] device open count is 0 (close) [8627]: [1700034180.178969] [ttyACM1/at] closing serial port... [8627]: [1700034180.183019] [ttyACM1/at] serial port closed [8627]: [1700034180.183358] [base-manager] EXTRALOG MMBaseManager::handle_uevent(remove) "tty", "ttyACM0" [8627]: [1700034180.183411] [base-manager] EXTRALOG MMBaseManager::device_removed("tty", "ttyACM0" [8627]: [1700034180.183468] [base-manager] port ttyACM0 released by device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1' [8627]: [1700034180.185135] [base-manager] EXTRALOG MMBaseManager::handle_uevent(remove) "tty", "ttyACM1" [8627]: [1700034180.185214] [base-manager] EXTRALOG MMBaseManager::device_removed("tty", "ttyACM1" [8627]: [1700034180.185271] [base-manager] port ttyACM1 released by device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1' [8627]: [1700034180.190975] [base-manager] EXTRALOG MMBaseManager::handle_uevent(remove) "tty", "ttyACM2" [8627]: [1700034180.191056] [base-manager] EXTRALOG MMBaseManager::device_removed("tty", "ttyACM2" [8627]: [1700034180.19] [base-manager] port ttyACM2 released by device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1' [8627]: [1700034180.193025] [base-manager] EXTRALOG MMBaseManager::handle_uevent(remove) "tty", "ttyACM4" [8627]: [1700034180.193099] [base-manager] EXTRALOG MMBaseManager::device_removed("tty", "ttyACM4" [8627]: [1700034180.193150] [base-manager] port ttyACM4 released by device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1' [8627]: [1700034180.197629] [base-manager] EXTRALOG MMBaseManager::handle_uevent(remove) "tty", "ttyACM3" [8627]: [1700034180.197713] [base-manager] EXTRALOG MMBaseManager::device_removed("tty", "ttyACM3" [8627]: [1700034180.197768] [base-manager] port ttyACM3 released by device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1' [8627]: [1700034180.224472] [base-manager] EXTRALOG MMBaseManager::handle_uevent(remove) "net", "cell_cdc" [8627]: [1700034180.224667] [base-manager] EXTRALOG MMBaseManager::device_removed("net", "cell_cdc" [8627]: [1700034180.253334] [base-manager] EXTRALOG MMBaseManager::handle_uevent(remove) "net", "usb1" [8627]: [1700034180.253410] [base-manager] EXTRALOG MMBaseManager::device_removed("net", "usb1" [8627]: [1700034180.253462] [base-manager] port usb1 released by device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1' === modem boot == [8627]: [1700034187.060079] [base-manager] EXTRALOG MMBaseManager::handle_uevent(add) "tty", "ttyACM1" [8627]: [1700034187.060497] [ttyACM1] port contents loaded (EXTRALOG DEVICE UDEV): [8627]: [1700034187.060540] [ttyACM1] bus: usb [8627]: [1700034187.060594] [ttyACM1] interface: /sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1/5-1:1.2 [8627]: [1700034187.060631] [ttyACM1] device: /sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1 [8627]: [1700034187.060667] [ttyACM1] driver: cdc_acm [8627]: [1700034187.060700] [ttyACM1] vendor: 1e2d [8627]: [1700034187.060733] [ttyACM1] product: 0061 [8627]: [1700034187.060767] [ttyACM1] revision: 0232 [8627]: [1700034187.060802] [base-manager] EXTRALOG MMBaseManager::device_added("ttyACM1", 1, 0) [8627]: [1700034187.060841] [base-manager] adding port ttyACM1 at sysfs path: /sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1/5-1:1.2/tty/ttyACM1 [8627]: [1700034187.060978] [filter] (tty/ttyACM1) port allowed: device is allowlisted by plugin (vid) [8627]: [1700034187.061032] [base-manager] additional port ttyACM1 in device /sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1 [8627
Re: How to recover from serial device being force closed.
Hey, > > it would be great if you could verify the behavior on the main branch > (at least on the last stable 1.20.6), and give us some logs, so we can > have a better idea of the issue and how proceed, whether in a general > manner, or with some specific fixes. > Yes, especially since several bugfixes were done that could help solve the problem (although not sure if they were done in 1.20 or earlier really) > > > > I have encountered an issue while running ModemManager (version 1.18.4) on > > a Cinterion PLAS8 modem. Sometimes, the serial device used for AT > > communication with the modem is force closed (due to G_IO_HUP) without any > > obvious reason. > > Also, are you using NetworkManager along with ModemManager? Or are you running pppd yourself? You must ensure that MM only touches the port once pppd has exited, otherwise they'll use mismatched CLOCAL flags and MM may get a HUP when the modem got disconnected from the network (as opposed to getting a HUP only when the port goes away from the system) -- Aleksander
Re: How to recover from serial device being force closed.
Hello Jessy, it would be great if you could verify the behavior on the main branch (at least on the last stable 1.20.6), and give us some logs, so we can have a better idea of the issue and how proceed, whether in a general manner, or with some specific fixes. Giacinto On Mon, May 1, 2023 at 8:47 AM Jessy Exum wrote: > > Hello ModemManager devs, > > I have encountered an issue while running ModemManager (version 1.18.4) on a > Cinterion PLAS8 modem. Sometimes, the serial device used for AT communication > with the modem is force closed (due to G_IO_HUP) without any obvious reason. > > Despite the modem being unreachable over the dead connection, the modem > remains listed in mmcli (without being marked as having an error), and any > ModemManager command that would send an AT command to the modem fails with an > error similar to: > > GDBus.Error:org.freedesktop.ModemManager1.Error.Core.Connected: Cannot run > sequence: 'Could not open serial device ttyACM1: it has been forced close' > > > When I first saw this issue, I thought it was the modem acting up (it would > not be the first time), but I found that restarting ModemManager recovers the > modem. This tells me that the modem was still functioning, and ModemManager > was unable (or never tried) to re-open the channel. I have not had this error > occur on a device with ModemManager debugging enabled, so I can not be sure > if reconnection is attempted (I will try to find a way to manually trigger > this issue so I can get better debugging). > > If I could reset the modem, that should cause it to be re-initiated in > ModemManger and likely fix the issue. But as far as I know, there is no way > to have ModemManager re-enumerate a modem without removing the device > (physically, through a modem reset, etc). Unfortunately, I am also unable to > recover the modem through ModemManager because a modem reset requires the > ability to send AT commands to the modem. Currently, the only solutions I > have found to recover the modem is to physically power cycle the modem (which > is not always possible), or to stop and restart ModemManager (inconvenient if > there are other modems). I may also be able to use an external script to > detect ModemManager getting into this state, and directly send a reset > command to the modem over the AT channel (without ModemManager), but I would > prefer if ModemManager was able to recover this on its own. > > The error occurs rarely/randomly enough that I am not interested in chasing > down why this happens (it could be the modem being flakey, electrical > interference, whatever). I am more interested in being able to gracefully > recover from this situation, so I want to know what ModemManager should be > doing so I can try to help fix it. > > Any suggestions or information would be greatly appreciated. > > Thanks for all the hard work in this useful tool, > Jessy Diamond Exum >
How to recover from serial device being force closed.
Hello ModemManager devs, I have encountered an issue while running ModemManager (version 1.18.4) on a Cinterion PLAS8 modem. *Sometimes, the serial device used for AT communication with the modem is force closed (due to G_IO_HUP) without any obvious reason.* Despite the modem being unreachable over the dead connection, the modem remains listed in mmcli (without being marked as having an error), and any ModemManager command that would send an AT command to the modem fails with an error similar to: GDBus.Error:org.freedesktop.ModemManager1.Error.Core.Connected: Cannot run sequence: 'Could not open serial device ttyACM1: it has been forced close' When I first saw this issue, I thought it was the modem acting up (it would not be the first time), but I found that restarting ModemManager recovers the modem. This tells me that the modem was still functioning, and ModemManager was unable (or never tried) to re-open the channel. I have not had this error occur on a device with ModemManager debugging enabled, so I can not be sure if reconnection is attempted (I will try to find a way to manually trigger this issue so I can get better debugging). If I could reset the modem, that should cause it to be re-initiated in ModemManger and likely fix the issue. But as far as I know, there is no way to have ModemManager re-enumerate a modem without removing the device (physically, through a modem reset, etc). Unfortunately, I am also unable to recover the modem through ModemManager because a modem reset requires the ability to send AT commands to the modem. Currently, the only solutions I have found to recover the modem is to physically power cycle the modem (which is not always possible), or to stop and restart ModemManager (inconvenient if there are other modems). I may also be able to use an external script to detect ModemManager getting into this state, and directly send a reset command to the modem over the AT channel (without ModemManager), but I would prefer if ModemManager was able to recover this on its own. The error occurs rarely/randomly enough that I am not interested in chasing down why this happens (it could be the modem being flakey, electrical interference, whatever). I am more interested in being able to gracefully recover from this situation, so *I want to know what ModemManager should be doing so I can try to help fix it.* Any suggestions or information would be greatly appreciated. Thanks for all the hard work in this useful tool, Jessy Diamond Exum