John: good sleuthing!

Although, I didn't quite understand your comment about exiting
genDavisLoopPackets(). Are you saying you might as well set max_tries=1
because it never recovers?
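
To make the question concrete, here is a rough sketch of the kind of retry loop I have in mind. It is not the actual driver code: ReadError, gen_loop_packets() and read_packet are made-up names for illustration, and I'm assuming the generator simply returns once the tries are used up.

```python
class ReadError(Exception):
    """Hypothetical stand-in for the I/O error raised on a short read."""


def gen_loop_packets(read_packet, n_packets, max_tries=4):
    """Yield up to n_packets, retrying each read at most max_tries times.

    read_packet is any callable that returns a packet or raises ReadError.
    """
    for _ in range(n_packets):
        for _attempt in range(max_tries):
            try:
                packet = read_packet()
            except ReadError:
                continue  # same packet slot, next attempt
            yield packet
            break
        else:
            # Every attempt for this slot failed: leave the generator so the
            # caller can start a fresh LOOP request.
            return
```

With max_tries=1 this leaves the generator on the first failed read, which is how I read your mod. With a larger value it keeps reading, which only helps if the console is still sending.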

It's possible that after a time sync, the logger is occupied for a bit and
unable to generate new packets. A possible solution might be to sleep for 5
or 10 seconds after the sync.
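
Something along these lines, as an untested sketch. set_time_then_settle() and its arguments are hypothetical names, not anything in the driver, and the 5 second default is just the guess from above:

```python
import time


def set_time_then_settle(set_time, settle_seconds=5.0, sleep=time.sleep):
    """Call set_time(), then pause before LOOP requests resume.

    set_time is whatever callable actually sets the console clock.
    sleep is injectable so the delay can be stubbed out when testing.
    """
    set_time()
    # Give the logger a moment to recover before asking for packets again.
    sleep(settle_seconds)
```

Whether 5 or 10 seconds is enough would have to be checked against the logs.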

In any case, I'm 1,200 miles from my station, so I can't offer much for a
few months.

-tk



On Sun, Dec 12, 2021 at 6:00 PM John Kline <[email protected]> wrote:

> 
> I’ve been studying the LOOP errors for over a year now.  I see them on
> three different installs, 1 NUC7i5 (w/ console) and two Raspberry Pi 4s (1
> w/ console, 1 w/ envoy).
>
> I’ve got a modified and instrumented driver.  The important mod is to exit
> genDavisLoopPackets() when the error is hit.  The error is almost always
> “Expected to read 99 chars; got 0 instead”, but it does not have to be 0 that
> was actually received.  If you don’t exit out of that loop, and you don’t get
> lucky to be near the end of the loop, weewx will restart.  With the change,
> weewx never restarts.
>
> Next is the question of why.  It may very well be what Tom suggests.  I’ve
> suspected as much and have been meaning to write a C program to read the
> serial (USB) port and feed weewx.
>
> One thing I can say is that, for me, ON ALL THREE consoles, the error
> occurs *almost* always after a time set.
>
> Here’s an example from the log:
>
>  Vantage Clock Set/Short Reads Info:
>    Dec 12 00:30:04 judy weewx[491] INFO user.vantagenext: Clock set to 2021-12-12 00:30:06 PST (1639297806) (9128, 1639297804.599486, 1.303189)
>    Dec 12 00:30:09 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9129)
>    Dec 12 00:30:13 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9130)
>    Dec 12 00:30:23 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9134)
>    Dec 12 00:30:27 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9135)
>    Dec 12 00:30:31 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9136)
>    Dec 12 00:30:35 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9137)
>    Dec 12 00:30:39 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9138)
>    Dec 12 00:30:44 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9139)
>    Dec 12 00:30:49 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9141)
>
> In the above, when the clock set happened, the driver had served 9128
> successful loop packets.  It served one more after the clock set and then
> got the zero bytes error (9129).  The same is true for the next, one
> success and then one error.  After that, 4 successes and one error.  And so
> on.
>
> Sometimes after a time set, I see no errors, but that is not often.  The
> number of errors varies.  This is on the high side.  1 - 4 errors might be
> more common.
>
> Sometimes there is an error that occurs NOT after the time set.  This is
> very rare and, when it does happen, it is very often an error that I see on
> all three consoles.  And, when it does occur, it’s always a single error.
>
> I’m still studying this (because it bothers the heck out of me).
>
> On Dec 12, 2021, at 5:32 PM, Tom Keffer <[email protected]> wrote:
>
> 
> Naturally, that excerpt didn't have any LOOP errors.
>
> I can definitely see a scenario where an overloaded machine could miss
> LOOP packets. The LOOP packets are requested in big bunches, typically 200
> at a time, then the driver blocks, waiting for them. If a reporting thread
> hogs the CPU, the driver may be starved for time and not get back in time
> to get the waiting packet. The Vantage then assumes the driver has gone
> away and stops sending the packets.
>
> You have a lot going on with your system. Is there anything that could
> prevent the Python runtime engine from switching threads?
>
> On Sun, Dec 12, 2021 at 5:23 PM vince <[email protected]> wrote:
>
>> On Sunday, December 12, 2021 at 4:19:33 PM UTC-8 Tom Keffer wrote:
>>
>>> Vince, you should know better! That's not much of a log excerpt. What
>>> was before that DMPAFT log entry? Did weewx put something in the database?
>>> If not, that looks like a classic case of memory corruption.
>>>
>>> As for the LOOP errors, I see those every once in a while, particularly
>>> when the logger is busy after a large catch up. Again, more of the log
>>> would help.
>>>
>>>
>> Pot. Kettle. Black. Moi? - Guilty as charged, sir :-)
>>
>> I guess I was trying to get a feel for whether "occasionally seeing this or
>> that" is not that unusual nor necessarily actionable if everything else
>> works ok.
>>
>> But to answer - everything seems to be working ok.
>> The db is getting updated just fine, as are the web pages and images.
>> All the extensions posting to WU/PWS/CWOP and the MQTT stuff (publish and
>> subscribe) are working fine too.
>>
>> I've attached a gzipped syslog from a restart of weewx and a couple
>> cycles afterward.  Typically the rsync stuff completes last, so I added a
>> blank line between cycles to make this a little easier to parse.   I did
>> notice an API key error in my forecast setup (fixed) so the initial startup
>> might be a little longer than normal since the forecast stuff had to catch
>> up.   Timings for subsequent runs look normal for here - a little over 3
>> minutes til the rsync stuff ends a 5-minute archive set of stuff...
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "weewx-user" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/weewx-user/61b8a4db-7ee4-4449-9fcc-7780f1775078n%40googlegroups.com
>> <https://groups.google.com/d/msgid/weewx-user/61b8a4db-7ee4-4449-9fcc-7780f1775078n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
