Re: E173s-1 stops working after weeks of uptime

2019-07-21 Thread De Alti, Cristiano
>Yes, definitely useful there if the platform has a way of actually
>killing power to the device, whether that's USB or GPIO or whatever.
>And I think it would be fine to add a method to the Modem object to do
>this.
>
>If there's a lot of variability in platforms (and I think there is :)
>we could theoretically have MM call out to a script to poke GPIOs or
>sysfs or run some other program.
>
>A start would be gathering the data on the different ways that
>platforms allow killing power to modems. To start with, if it's USB and
>the hub reports per-port power control in wHubCharacteristic via lsusb
>-v then we might be able to power cycle.
>
>Dan

I think calling a script after giving up on the modem is a great idea.
We have platforms with power and reset control of the modem, and I’d be able to
test on real-world hardware.
Please add this to your roadmap.

Ciao,
  Cristiano

___
ModemManager-devel mailing list
ModemManager-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/modemmanager-devel

Re: E173s-1 stops working after weeks of uptime

2019-05-31 Thread Dan Williams
On Fri, 2019-05-31 at 11:18 +0200, Ladislav Michl wrote:
> On Thu, May 30, 2019 at 09:46:01PM -0500, Dan Williams wrote:
> > On Fri, 2019-05-31 at 00:01 +0200, Ladislav Michl wrote:
> [...]
> ...and I'll eventually create a new thread so as not to keep hijacking
> this one.
> > > I'm using MM restart and if that doesn't help, then modem powercycle
> > > and then daemon restart. Boards do have modem power supply controlled
> > > by gpio, so that's easy to do, but rather hackish. Is there any plan
> > > to add some "modem hw reset" infrastructure here?
> > 
> > We've thought about it before, but the problem is that it's pretty
> > unreliable for USB ports on most machines that aren't embedded.
> > Basically, you cannot guarantee that the USB host controller supports
> > the command to power cycle a port, and a lot of them just don't.
> > 
> > You can't expect the USB port reset to work, because that's a USB
> > request to the firmware running on the device IIRC and if the device
> > is already hung it's surely not going to pay attention to a reset
> > request.
> 
> Even worse, many Quectel modules have nRST and PWR_KEY pins whose names
> suggest they reset or power the module down/up, but to the module those
> are just GPIOs, so the only reliable way to reset is a power cycle. And
> most sane hardware designers create their hardware with that in mind.
> 
> > So basically yes, that infrastructure could be added, but there's no
> > way you can depend on the request actually succeeding especially on
> > x86 machines.
> 
> I'm not talking about making it mandatory in any way, nor about
> ordinary laptops or PCs, where a frustrated user can always replug the
> poor USB modem once their favourite web page stops loading.
> 
> ModemManager hit OpenWrt recently and has been in PTXdist for ages, so
> the number of "embedded" installations could easily outgrow the
> "ordinary" ones. And each such embedded device has to deal with an
> unreliable modem somehow, so I'd say it would be useful to have at
> least some recommendation on how to handle such power cycles.

Yes, definitely useful there if the platform has a way of actually
killing power to the device, whether that's USB or GPIO or whatever.
And I think it would be fine to add a method to the Modem object to do
this.

If there's a lot of variability in platforms (and I think there is :)
we could theoretically have MM call out to a script to poke GPIOs or
sysfs or run some other program.

A start would be gathering the data on the different ways that
platforms allow killing power to modems. To start with, if it's USB and
the hub reports per-port power control in wHubCharacteristic via lsusb
-v then we might be able to power cycle.
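
For illustration only, here is a minimal Python sketch of the check described
above. It pulls the wHubCharacteristic values out of captured `lsusb -v`
output and decodes the two low bits, which per the USB 2.0 hub descriptor
encode the logical power switching mode. The function names are made up for
this example, and a hub advertising per-port switching is only a hint: it
does not guarantee that the port power is actually wired to a switch.

```python
import re

# Bits 0-1 of the hub descriptor's wHubCharacteristics field:
#   00 = ganged switching, 01 = individual (per-port) switching,
#   1x = no power switching.
_POWER_MODES = {0b00: "ganged", 0b01: "per-port"}

# lsusb -v prints the field as "wHubCharacteristic 0x0009".
_HUB_CHAR_RE = re.compile(r"wHubCharacteristic\s+0x([0-9a-fA-F]+)")


def power_switching_mode(value):
    """Decode the power switching mode from a wHubCharacteristics value."""
    return _POWER_MODES.get(value & 0b11, "none")


def hub_power_modes(lsusb_output):
    """Return the decoded power switching mode of every hub in the output."""
    return [
        power_switching_mode(int(match.group(1), 16))
        for match in _HUB_CHAR_RE.finditer(lsusb_output)
    ]
```

Feeding it the text of `lsusb -v` (run as root so hub descriptors are
readable) gives one entry per hub, in the order lsusb lists them.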

Dan

> The thing is that those devices often do not have enough storage to
> run MM at debug log level, are otherwise unreachable, have a CPU that
> is often less powerful than those used in modems (which could
> theoretically run all the software running on the board they are
> plugged into, but that's a different story), and hang too rarely to
> be reasonably debuggable. All that makes power cycling the modem the
> only reliable way to recover in the field (except a power-on reset
> of the whole machine, of course).


Re: E173s-1 stops working after weeks of uptime

2019-05-31 Thread Ladislav Michl
On Thu, May 30, 2019 at 09:46:01PM -0500, Dan Williams wrote:
> On Fri, 2019-05-31 at 00:01 +0200, Ladislav Michl wrote:
[...]
...and I'll eventually create a new thread so as not to keep hijacking this one.
> > I'm using MM restart and if that doesn't help, then modem powercycle
> > and then daemon restart. Boards do have modem power supply controlled
> > by gpio, so that's easy to do, but rather hackish. Is there any plan
> > to add some "modem hw reset" infrastructure here?
> 
> We've thought about it before, but the problem is that it's pretty
> unreliable for USB ports on most machines that aren't embedded.
> Basically, you cannot guarantee that the USB host controller supports
> the command to power cycle a port, and a lot of them just don't.
>
> You can't expect the USB port reset to work, because that's a USB
> request to the firmware running on the device IIRC and if the device is
> already hung it's surely not going to pay attention to a reset request.

Even worse, many Quectel modules have nRST and PWR_KEY pins whose names
suggest they reset or power the module down/up, but to the module those
are just GPIOs, so the only reliable way to reset is a power cycle. And
most sane hardware designers create their hardware with that in mind.

> So basically yes, that infrastructure could be added, but there's no
> way you can depend on the request actually succeeding especially on x86
> machines.

I'm not talking about making it mandatory in any way, nor about ordinary
laptops or PCs, where a frustrated user can always replug the poor USB
modem once their favourite web page stops loading.

ModemManager hit OpenWrt recently and has been in PTXdist for ages, so the
number of "embedded" installations could easily outgrow the "ordinary" ones.
And each such embedded device has to deal with an unreliable modem somehow,
so I'd say it would be useful to have at least some recommendation on
how to handle such power cycles.

The thing is that those devices often do not have enough storage to run
MM at debug log level, are otherwise unreachable, have a CPU that is often
less powerful than those used in modems (which could theoretically run
all the software running on the board they are plugged into, but that's a
different story), and hang too rarely to be reasonably debuggable. All
that makes power cycling the modem the only reliable way to recover in
the field (except a power-on reset of the whole machine, of course).

ladis

Re: E173s-1 stops working after weeks of uptime

2019-05-30 Thread Ladislav Michl
On Thu, May 30, 2019 at 08:49:15PM +0200, Dario Nieuwenhuis wrote:
> Hello,
> 
> We have some embedded devices deployed in the field using a Huawei E173s-1
> modem. We're having some issues where connectivity stops working randomly,
> after 1-3 weeks of uptime.
> 
> First, there's a PPP disconnection. Then there are a few loops of this error
> for 1-2 seconds, rather fast:
> 
> failed to connect modem: Couldn't connect: cannot keep data port
> open.Could not open serial device ttyUSB0: reopen operation in progress
> 
> Afterwards, there are 10 attempts at reconnecting, with "at port timed out
> X consecutive times" errors. After 10 attempts, ModemManager gives up
> permanently, and never tries again to bring the device back up.
> 
> (tty/ttyUSB0) at port timed out 10 consecutive times, marking modem
> '/org/freedesktop/ModemManager1/Modem/0' as invalid
> 
> A few hours later, when we could get onsite, restarting ModemManager and
> NetworkManager brought the modem back online with no issues (with no
> unplug/replug/powercycle of the modem or the Linux board.) This means the
> modem is not irreversibly crashed/failed, so I think this is a software
> issue that should be fixable.
> 
> Unfortunately, ModemManager was set to INFO log level because we thought
> DEBUG was too verbose. We have set DEBUG log level, so the next time it
> happens we will have more logs.
> 
> I would really appreciate any input you may have on how to solve this.
> - I thought of patching out the "max 10 timeouts" limit, so ModemManager
> keeps retrying indefinitely. Is this a good idea?
> - What can I try so next time this happens we can get more info on the
> issue? (besides debug log level)
> - Any recommendations in general on how to ensure the device is always
> connected? Any config knobs to tweak? We've been thinking of adding an "if
> no internet for 10 minutes, powercycle everything" watchdog, but that
> feels like giving up on getting this working properly.

I'm using MM restart and if that doesn't help, then modem powercycle and
then daemon restart. Boards do have modem power supply controlled by gpio,
so that's easy to do, but rather hackish. Is there any plan to add some
"modem hw reset" infrastructure here?

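To make the sequence above concrete, here is a rough Python sketch, not
anything MM ships. Everything board-specific is an assumption: the sysfs
GPIO path and its active-high polarity are hypothetical, the systemd unit
name may differ on your distribution, and the delays are placeholders.

```python
import subprocess
import time


def restart_modemmanager():
    """Restart the daemon (assumes a systemd unit called ModemManager)."""
    subprocess.run(["systemctl", "restart", "ModemManager"], check=False)


def power_cycle_modem(value_path="/sys/class/gpio/gpio42/value",
                      off_seconds=5, sleep=time.sleep):
    """Cut and restore modem power through a sysfs GPIO value file.

    The path is hypothetical and the pin is assumed to be active-high.
    """
    with open(value_path, "w") as gpio:
        gpio.write("0")
    sleep(off_seconds)
    with open(value_path, "w") as gpio:
        gpio.write("1")


def recover(is_online, steps, settle_seconds=60, sleep=time.sleep):
    """Run recovery steps in order until is_online() reports success.

    Returns the name of the step that helped, or None if none did.
    """
    for name, action in steps:
        action()
        sleep(settle_seconds)  # give the modem time to re-register
        if is_online():
            return name
    return None


def cycle_then_restart():
    """Second stage: power-cycle the modem, then restart the daemon."""
    power_cycle_modem()
    restart_modemmanager()


STEPS = [("mm-restart", restart_modemmanager),
         ("power-cycle", cycle_then_restart)]
```

A caller would pass its own connectivity probe, for example
`recover(my_probe, STEPS)`, where `my_probe` is whatever check fits the
deployment.
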
ladis

E173s-1 stops working after weeks of uptime

2019-05-30 Thread Dario Nieuwenhuis
Hello,

We have some embedded devices deployed in the field using a Huawei E173s-1
modem. We're having some issues where connectivity stops working randomly,
after 1-3 weeks of uptime.

First, there's a PPP disconnection. Then there are a few loops of this error
for 1-2 seconds, rather fast:

failed to connect modem: Couldn't connect: cannot keep data port
open.Could not open serial device ttyUSB0: reopen operation in progress

Afterwards, there are 10 attempts at reconnecting, with "at port timed out
X consecutive times" errors. After 10 attempts, ModemManager gives up
permanently, and never tries again to bring the device back up.

(tty/ttyUSB0) at port timed out 10 consecutive times, marking modem
'/org/freedesktop/ModemManager1/Modem/0' as invalid
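
For anyone who wants to react to this message automatically, a small Python
sketch of matching it follows. It assumes the message appears on a single
line in the actual log (the wrap above comes from the mail client) and that
the wording matches what is quoted here; the function name is made up for
this example.

```python
import re

# Pattern built from the log message quoted above.
_INVALID_RE = re.compile(
    r"\((?P<port>[^)]+)\) at port timed out (?P<count>\d+) consecutive times, "
    r"marking modem '(?P<modem>[^']+)' as invalid"
)


def find_invalidated_modem(log_line):
    """Return (port, timeout_count, modem_path) if the line reports a modem
    being marked invalid, otherwise None."""
    match = _INVALID_RE.search(log_line)
    if match is None:
        return None
    return match["port"], int(match["count"]), match["modem"]
```

A log follower could call this on each line and trigger whatever recovery
action the platform supports when it returns a result.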

A few hours later, when we could get onsite, restarting ModemManager and
NetworkManager brought the modem back online with no issues (with no
unplug/replug/powercycle of the modem or the Linux board.) This means the
modem is not irreversibly crashed/failed, so I think this is a software
issue that should be fixable.

Unfortunately, ModemManager was set to INFO log level because we thought
DEBUG was too verbose. We have set DEBUG log level, so the next time it
happens we will have more logs.

I would really appreciate any input you may have on how to solve this.
- I thought of patching out the "max 10 timeouts" limit, so ModemManager
keeps retrying indefinitely. Is this a good idea?
- What can I try so next time this happens we can get more info on the
issue? (besides debug log level)
- Any recommendations in general on how to ensure the device is always
connected? Any config knobs to tweak? We've been thinking of adding an "if
no internet for 10 minutes, powercycle everything" watchdog, but that
feels like giving up on getting this working properly.
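
The watchdog idea in the last item could look roughly like the Python sketch
below. The class name, the 600-second default, and the injected clock are
choices made for this example; how connectivity is probed and what "power
cycle everything" means are left to the caller.

```python
import time


class ConnectivityWatchdog:
    """Signal when connectivity has been missing for too long."""

    def __init__(self, timeout_seconds=600, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.last_ok = clock()

    def update(self, online):
        """Record one probe result; return True when recovery should fire."""
        now = self.clock()
        if online:
            self.last_ok = now
            return False
        if now - self.last_ok >= self.timeout:
            # Restart the window so recovery is not re-triggered on every
            # probe while the modem is still coming back up.
            self.last_ok = now
            return True
        return False
```

A periodic job would call `update()` with the result of each probe and start
the power cycle whenever it returns True. Using a monotonic clock keeps the
timeout immune to wall-clock jumps, for example after an NTP sync.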

Thanks in advance!

Dario

---

Logs of the time of failure:
https://gist.github.com/Dirbaio/bdea5235b832ee12d4b56caf7576fe3d
Output of mmcli -m 0:
https://gist.github.com/Dirbaio/13c97d9ec508faba0829ce9de0cf79de
lsusb output:
https://gist.github.com/Dirbaio/0e58a51fd551272f92530f96c612b8c6

Relevant software versions:
- ModemManager 1.10.0
- NetworkManager 1.16.0
- Linux 5.0.5
- pppd 2.4.7