John: good sleuthing! Although, I didn't quite understand your comment about exiting genDavisLoopPackets(). Are you saying you might as well set max_tries=1 because it never recovers?
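To check that I'm reading you right, here is a toy sketch of the two behaviours as I understand them. It is not the driver code. Every name in it (`FakeConsole`, `gen_batch`, `serve`, `RetriesExceeded`) is made up for illustration. It also assumes that once a read comes back short, the console stays silent until a fresh LOOP request is sent, which is a guess on my part.

```python
class RetriesExceeded(Exception):
    """Hypothetical stand-in for the error that forces a restart."""


class FakeConsole:
    """Toy console: after one missed read it stays silent until LOOP is re-sent."""

    def __init__(self, fail_at):
        self.fail_at = fail_at  # 1-based index of the read that comes back empty
        self.reads = 0
        self.silent = False

    def request_loop(self):
        self.silent = False  # assumption: a fresh LOOP request restarts the stream

    def read(self):
        self.reads += 1
        if self.reads == self.fail_at:
            self.silent = True
        return b"" if self.silent else b"x" * 99


def gen_batch(console, n, max_tries, exit_on_error):
    """Request n packets, then yield each one that arrives complete."""
    console.request_loop()
    tries = 0
    for _ in range(n):
        data = console.read()
        if len(data) == 99:
            tries = 0
            yield data
            continue
        if exit_on_error:
            return  # abandon this batch so the caller re-requests
        tries += 1  # otherwise keep reading from the same request
        if tries >= max_tries:
            raise RetriesExceeded("too many short reads in a row")


def serve(console, wanted, n=10, max_tries=3, exit_on_error=False):
    """Keep requesting batches until `wanted` packets have been served."""
    got = 0
    while got < wanted:
        for _ in gen_batch(console, n, max_tries, exit_on_error):
            got += 1
            if got == wanted:
                break
    return got
```

In this toy, `serve(FakeConsole(3), 20, exit_on_error=True)` recovers, while the same call with `exit_on_error=False` raises `RetriesExceeded`, unless the short read happens within the last two reads of a batch (for example `FakeConsole(9)`), in which case the batch ends before the retry limit is hit. If that matches what you see, then the retries inside the batch never help, and max_tries=1 would behave about the same.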
It's possible that after a time sync, the logger is occupied for a bit and unable to generate new packets. A possible solution might be to sleep for 5 or 10 seconds after the sync. In any case, I'm 1,200 miles from my station, so I can't offer much for a few months.

-tk

On Sun, Dec 12, 2021 at 6:00 PM John Kline <[email protected]> wrote:

> I’ve been studying the LOOP errors for over a year now. I see them on three different installs, 1 NUC7i5 (w/ console) and two Raspberry Pi 4s (1 w/ console, 1 w/ envoy).
>
> I’ve got a modified and instrumented driver. The important mod is to exit genDavisLoopPackets() when the error is hit. The error is almost always “Expected to read 99 chars; got 0 instead”, but it does not have to be 0 that was actually received. If you don’t exit out of that loop, and you don’t get lucky to be near the end of the loop, weewx will restart. With the change, weewx never restarts.
>
> Next is the question of why. It may very well be what Tom suggests. I’ve suspected as much and have been meaning to write a C program to read the serial (USB) port and feed weewx.
>
> One thing I can say is that, for me, ON ALL THREE consoles, the error occurs *almost* always after a time set.
>
> Here’s an example from the log:
>
> Vantage Clock Set/Short Reads Info:
> Dec 12 00:30:04 judy weewx[491] INFO user.vantagenext: Clock set to 2021-12-12 00:30:06 PST (1639297806) (9128, 1639297804.599486, 1.303189)
> Dec 12 00:30:09 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9129)
> Dec 12 00:30:13 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9130)
> Dec 12 00:30:23 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9134)
> Dec 12 00:30:27 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9135)
> Dec 12 00:30:31 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9136)
> Dec 12 00:30:35 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9137)
> Dec 12 00:30:39 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9138)
> Dec 12 00:30:44 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9139)
> Dec 12 00:30:49 judy weewx[491] INFO user.vantagenext: get_packet returned 0 bytes. (9141)
>
> In the above, when the clock set happened, the driver had served 9128 successful loop packets. It served one more after the clock set and then got the zero bytes error (9129). The same is true for the next, one success and then one error. After that, 4 successes and one error. And so on.
>
> Sometimes after a time set, I see no errors, but that is not often. The number of errors varies. This is on the high side. 1 to 4 errors might be more common.
>
> Sometimes there is an error that occurs NOT after the time set. This is very rare and, when it does happen, it is very often an error that I see on all three consoles. And, when it does occur, it’s always a single error.
>
> I’m still studying this (because it bothers the heck out of me).
>
> On Dec 12, 2021, at 5:32 PM, Tom Keffer <[email protected]> wrote:
>
> Naturally, that excerpt didn't have any LOOP errors.
>
> I can definitely see a scenario where an overloaded machine could miss LOOP packets. The LOOP packets are requested in big bunches, typically 200 at a time, then the driver blocks, waiting for them. If a reporting thread hogs the CPU, the driver may be starved for time and not get back in time to get the waiting packet. The Vantage then assumes the driver has gone away and stops sending the packets.
>
> You have a lot going on with your system. Is there anything that could prevent the Python runtime engine from switching threads?
>
> On Sun, Dec 12, 2021 at 5:23 PM vince <[email protected]> wrote:
>
>> On Sunday, December 12, 2021 at 4:19:33 PM UTC-8 Tom Keffer wrote:
>>
>>> Vince, you should know better! That's not much of a log excerpt. What was before that DMPAFT log entry? Did weewx put something in the database? If not, that looks like a classic case of memory corruption.
>>>
>>> As for the LOOP errors, I see those every once in a while, particularly when the logger is busy after a large catch up. Again, more of the log would help.
>>
>> Pot. Kettle. Black. Moi? - Guilty as charged, sir :-)
>>
>> I guess I was trying to get a feel for "occasionally seeing this or that" is not that unusual nor necessarily actionable if everything else works ok.
>>
>> But to answer - everything seems to be working ok.
>> The db is getting updated just fine, as are the web pages and images.
>> All the extensions posting to WU/PWS/CWOP and the MQTT stuff (publish and subscribe) are working fine too.
>>
>> I've attached a gzipped syslog from a restart of weewx and a couple cycles afterward. Typically the rsync stuff completes last, so I added a blank line between cycles to make this a little easier to parse. I did notice an API key error in my forecast setup (fixed) so the initial startup might be a little longer than normal since the forecast stuff had to catch up. Timings for subsequent runs look normal for here - a little over 3 minutes til the rsync stuff ends a 5-minute archive set of stuff...
>>
>> --
>> You received this message because you are subscribed to the Google Groups "weewx-user" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>> To view this discussion on the web visit https://groups.google.com/d/msgid/weewx-user/61b8a4db-7ee4-4449-9fcc-7780f1775078n%40googlegroups.com.
