Hi Tom,
In particular, if I get the read too few bytes exception, I simply return from
genDavisLoopPackets(). genDavisLoopPackets() will be immediately called again
asking for another 200 packets.
Just in case I’m not explaining it well, here’s the code that checks for such
an error and simply returns. With this code, weewx never restarts.
for loop in range(N):
for count in range(self.max_tries):
try:
loop_packet = self._get_packet()
log.debug('%s: Loop packet success!' %
weeutil.weeutil.timestamp_to_string(loop_packet['dateTime']))
except weewx.WeeWxIOError as e:
if str(e).endswith('instead'):
if str(e).endswith('got 0 instead'):
log.info('get_packet returned 0 bytes. (%d)' %
self.pkt_count)
else:
log.info('get_packet returned too few bytes. (%d)'
% self.pkt_count)
# If already on a bad read, log that fact.
if self.on_bad_read:
log.info('genDavisLoopPackets: repeated bad read.')
self.on_bad_read = True
return
log.error("LOOP try #%d; error: %s", count + 1, e)
time.sleep(self.wait_before_retry)
else:
self.on_bad_read = False
self.pkt_count += 1
yield loop_packet
break
else:
log.error("LOOP max tries (%d) exceeded.", self.max_tries)
raise weewx.RetriesExceeded("Max tries exceeded while getting
LOOP data.")
> On Dec 13, 2021, at 4:13 AM, Tom Keffer <[email protected]> wrote:
>
>
> John: good sleuthing!
>
> Although, I didn't quite understand your comment about exiting
> genDavisLoopPackets(). Are you saying you might as well set max_tries=1
> because it never recovers?
>
> It's possible that after a time sync, the logger is occupied for a bit and
> unable to generate new packets. A possible solution might be to sleep for 5
> or 10 seconds after the sync.
>
> In any case, I'm 1,200 miles from my station, so I can't offer much for a few
> months.
>
> -tk
>
>
>
>> On Sun, Dec 12, 2021 at 6:00 PM John Kline <[email protected]> wrote:
>>
>> I’ve been studying the LOOP errors for over a year now. I see them on three
>> different installs, 1 NUC7i5 (w/ console) and two Raspberry Pi 4s (1 w/
>> console, 1 w/ envoy).
>>
>> I’ve got a modified and instrumented driver. The important mod is to exit
>> genDavisLoopPackets() when the error is hit. The error is almost always
>> “Expected to read 99 chars; got 0 instead”, but it does not have to be 0
>> that was actually received. I you don’t exit out of that loop, and you
>> don’t get lucky to be near the end of the loop, weewx will restart. With
>> the change, weewx never restarts.
>>
>> Next is the question of why. It may very well be what Tom suggests. I’ve
>> suspected as such and have been meaning to write a C program to read the
>> serial (USB) port and feed weewx.
>>
>> One thing I can say is that, for me, ON ALL THREE consoles, the error occurs
>> *almost* always after a time set.
>>
>> Here’s an example from the log:
>> Vantage Clock Set/Short Reads Info:
>> Dec 12 00:30:04 judy weewx[491] INFO user.vantagenext: Clock set to
>> 2021-12-12 00:30:06 PST (1639297806) (9128, 1639297804.599486, 1.303189)
>> Dec 12 00:30:09 judy weewx[491] INFO user.vantagenext: get_packet
>> returned 0 bytes. (9129)
>> Dec 12 00:30:13 judy weewx[491] INFO user.vantagenext: get_packet
>> returned 0 bytes. (9130)
>> Dec 12 00:30:23 judy weewx[491] INFO user.vantagenext: get_packet
>> returned 0 bytes. (9134)
>> Dec 12 00:30:27 judy weewx[491] INFO user.vantagenext: get_packet
>> returned 0 bytes. (9135)
>> Dec 12 00:30:31 judy weewx[491] INFO user.vantagenext: get_packet
>> returned 0 bytes. (9136)
>> Dec 12 00:30:35 judy weewx[491] INFO user.vantagenext: get_packet
>> returned 0 bytes. (9137)
>> Dec 12 00:30:39 judy weewx[491] INFO user.vantagenext: get_packet
>> returned 0 bytes. (9138)
>> Dec 12 00:30:44 judy weewx[491] INFO user.vantagenext: get_packet
>> returned 0 bytes. (9139)
>> Dec 12 00:30:49 judy weewx[491] INFO user.vantagenext: get_packet
>> returned 0 bytes. (9141)
>> In the above, when the clock set happened, the driver had served 9128
>> successful loop packets. It served one more after the clock set and then
>> got the zero bytes error (9129). The same is true for the next, one success
>> and then one error. After that, 4 successes and one error. And so on.
>>
>> Sometimes after a time set, I see no errors, but that is not often. The
>> number of errors varies. This is on the high side. 1 - 4 errors might be
>> more common.
>>
>> Sometimes there is an error that occurs NOT after the time set. This is
>> very rare and, when it does happen, it is very often an error that I see on
>> all three consoles. And, when it does occur, it’s always a single error.
>>
>> I’m still studying this (because it bothers the heck out of me).
>>
>>>> On Dec 12, 2021, at 5:32 PM, Tom Keffer <[email protected]> wrote:
>>>>
>>>
>>> Naturally, that excerpt didn't have any LOOP errors.
>>>
>>> I can definitely see a scenario where an overloaded machine could miss LOOP
>>> packets. The LOOP packets are requested in big bunches, typically 200 at a
>>> time, then the driver blocks, waiting for them. If a reporting thread hogs
>>> the CPU, the driver may be starved for time and not get back in time to get
>>> the waiting packet. The Vantage then assumes the driver has gone away and
>>> stops sending the packets.
>>>
>>> You have a lot going on with your system. Is there anything that could
>>> prevent the Python runtime engine from switching threads?
>>>
>>>> On Sun, Dec 12, 2021 at 5:23 PM vince <[email protected]> wrote:
>>>>> On Sunday, December 12, 2021 at 4:19:33 PM UTC-8 Tom Keffer wrote:
>>>>
>>>>> Vince, you should know better! That's not much of a log excerpt. What was
>>>>> before that DMPAFT log entry? Did weewx put something in the database? If
>>>>> not, that looks like a classic case of memory corruption.
>>>>>
>>>>> As for the LOOP errors, I see those every once in a while, particularly
>>>>> when the logger is busy after a large catch up. Again, more of the log
>>>>> would help.
>>>>>
>>>>
>>>> Pot. Kettle .Black. Moi ? - Guilty as charged, sir :-)
>>>>
>>>> I guess I was trying to get a feel for "occasionally seeing this or that"
>>>> is not that unusual nor necessarily actionable if everything else works ok.
>>>>
>>>> But to answer - everything seems to be working ok.
>>>> The db is getting updated just fine, as are the web pages and images.
>>>> All the extensions posting to WU/PWS/CWOP and the MQTT stuff (publish and
>>>> subscribe) are working fine too.
>>>>
>>>> I've attached a gzipped syslog from a restart of weewx and a couple cycles
>>>> afterward. Typically the rsync stuff completes last, so I added a blank
>>>> line between cycles to make this a little easier to parse. I did notice
>>>> a api key error in my forecast setup (fixed) so the initial startup might
>>>> be a little longer than normal since the forecast stuff had to catch up.
>>>> Timings for subsequent runs look normal for here - a little over 3 minutes
>>>> til the rsync stuff ends a 5-minute archive set of stuff...
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups
>>>> "weewx-user" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an
>>>> email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/weewx-user/61b8a4db-7ee4-4449-9fcc-7780f1775078n%40googlegroups.com.
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "weewx-user" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/weewx-user/CAPq0zEBV7S%2B7hP55BZ_UU9SRdJTfnpJiw-SG%3DKfc0tmrnkYJTQ%40mail.gmail.com.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "weewx-user" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/weewx-user/327F1AFB-9303-493E-B6BA-265B31D4A380%40johnkline.com.
>
> --
> You received this message because you are subscribed to the Google Groups
> "weewx-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/weewx-user/CAPq0zEDUKgHvsK6e5gJdXJVF7Wz%3DdzgqwkhrJ0eCEXDi2yP9Jw%40mail.gmail.com.
--
You received this message because you are subscribed to the Google Groups
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/weewx-user/63D224BC-5CBB-4070-B2AB-44AEC4B91624%40johnkline.com.