Makes sense. That could easily be simplified with a subclass of
WeeWxIOError. Call it WeeWxZeroBytesError.

On Mon, Dec 13, 2021 at 6:51 AM John Kline <[email protected]> wrote:

> Hi Tom,
>
> In particular, if I get the read too few bytes exception, I simply return
> from genDavisLoopPackets().  genDavisLoopPackets() will be immediately
> called again asking for another 200 packets.
>
> Just in case I’m not explaining it well, here’s the code that checks for
> such an error and simply returns.  With this code, weewx never restarts.
> for loop in range(N):
> for count in range(self.max_tries):
> try:
> loop_packet = self._get_packet()
> log.debug('%s: Loop packet success!' % weeutil.weeutil.timestamp_to_string
> (loop_packet['dateTime']))
> except weewx.WeeWxIOError as e:
> if str(e).endswith('instead'):
> if str(e).endswith('got 0 instead'):
> log.info('get_packet returned 0 bytes. (%d)' % self.pkt_count)
> else:
> log.info('get_packet returned too few bytes. (%d)' % self.pkt_count)
> # If already on a bad read, log that fact.
> if self.on_bad_read:
> log.info('genDavisLoopPackets: repeated bad read.')
> self.on_bad_read = True
> return
> log.error("LOOP try #%d; error: %s", count + 1, e)
> time.sleep(self.wait_before_retry)
> else:
> self.on_bad_read = False
> self.pkt_count += 1
> yield loop_packet
> break
> else:
> log.error("LOOP max tries (%d) exceeded.", self.max_tries)
> raise weewx.RetriesExceeded("Max tries exceeded while getting LOOP data.")
>
>
> On Dec 13, 2021, at 4:13 AM, Tom Keffer <[email protected]> wrote:
>
> 
> John: good sleuthing!
>
> Although, I didn't quite understand your comment about exiting
> genDavisLoopPackets(). Are you saying you might as well set max_tries=1
> because it never recovers?
>
> It's possible that after a time sync, the logger is occupied for a bit and
> unable to generate new packets. A possible solution might be to sleep for 5
> or 10 seconds after the sync.
>
> In any case, I'm 1,200 miles from my station, so I can't offer much for a
> few months.
>
> -tk
>
>
>
> On Sun, Dec 12, 2021 at 6:00 PM John Kline <[email protected]> wrote:
>
>> 
>> I’ve been studying the LOOP errors for over a year now.  I see them on
>> three different installs, 1 NUC7i5 (w/ console) and two Raspberry Pi 4s (1
>> w/ console, 1 w/ envoy).
>>
>> I’ve got a modified and instrumented driver.  The important mod is to
>> exit genDavisLoopPackets() when the error is hit.  The error is almost
>> always “Expected to read 99 chars; got 0 instead”, but it does not have
>> to be 0 that was actually received.  I you don’t exit out of that loop, and
>> you don’t get lucky to be near the end of the loop, weewx will restart.
>> With the change, weewx never restarts.
>>
>> Next is the question of why.  It may very well be what Tom suggests.
>> I’ve suspected as such and have been meaning to write a C program to read
>> the serial (USB) port and feed weewx.
>>
>> One thing I can say is that, for me, ON ALL THREE consoles, the error
>> occurs *almost* always after a time set.
>>
>> Here’s an example from the log:
>>
>>  Vantage Clock Set/Short Reads Info:
>>    Dec 12 00:30:04 judy weewx[491] INFO user.vantagenext: Clock set to 
>> 2021-12-12 00:30:06 PST (1639297806) (9128, 1639297804.599486, 1.303189)
>>    Dec 12 00:30:09 judy weewx[491] INFO user.vantagenext: get_packet 
>> returned 0 bytes. (9129)
>>    Dec 12 00:30:13 judy weewx[491] INFO user.vantagenext: get_packet 
>> returned 0 bytes. (9130)
>>    Dec 12 00:30:23 judy weewx[491] INFO user.vantagenext: get_packet 
>> returned 0 bytes. (9134)
>>    Dec 12 00:30:27 judy weewx[491] INFO user.vantagenext: get_packet 
>> returned 0 bytes. (9135)
>>    Dec 12 00:30:31 judy weewx[491] INFO user.vantagenext: get_packet 
>> returned 0 bytes. (9136)
>>    Dec 12 00:30:35 judy weewx[491] INFO user.vantagenext: get_packet 
>> returned 0 bytes. (9137)
>>    Dec 12 00:30:39 judy weewx[491] INFO user.vantagenext: get_packet 
>> returned 0 bytes. (9138)
>>    Dec 12 00:30:44 judy weewx[491] INFO user.vantagenext: get_packet 
>> returned 0 bytes. (9139)
>>    Dec 12 00:30:49 judy weewx[491] INFO user.vantagenext: get_packet 
>> returned 0 bytes. (9141)
>>
>> In the above, when the clock set happened, the driver had served 9128
>> successful loop packets.  It served one more after the clock set and then
>> got the zero bytes error (9129).  The same is true for the next, one
>> success and then one error.  After that, 4 successes and one error.  And so
>> on.
>>
>> Sometimes after a time set, I see no errors, but that is not often.  The
>> number of errors varies.  This is on the high side.  1 - 4 errors might be
>> more common.
>>
>> Sometimes there is an error that occurs NOT after the time set.  This is
>> very rare and, when it does happen, it is very often an error that I see on
>> all three consoles.  And, when it does occur, it’s always a single error.
>>
>> I’m still studying this (because it bothers the heck out of me).
>>
>> On Dec 12, 2021, at 5:32 PM, Tom Keffer <[email protected]> wrote:
>>
>> 
>> Naturally, that excerpt didn't have any LOOP errors.
>>
>> I can definitely see a scenario where an overloaded machine could miss
>> LOOP packets. The LOOP packets are requested in big bunches, typically 200
>> at a time, then the driver blocks, waiting for them. If a reporting thread
>> hogs the CPU, the driver may be starved for time and not get back in time
>> to get the waiting packet. The Vantage then assumes the driver has gone
>> away and stops sending the packets.
>>
>> You have a lot going on with your system. Is there anything that could
>> prevent the Python runtime engine from switching threads?
>>
>> On Sun, Dec 12, 2021 at 5:23 PM vince <[email protected]> wrote:
>>
>>> On Sunday, December 12, 2021 at 4:19:33 PM UTC-8 Tom Keffer wrote:
>>>
>>>> Vince, you should know better! That's not much of a log excerpt. What
>>>> was before that DMPAFT log entry? Did weewx put something in the database?
>>>> If not, that looks like a classic case of memory corruption.
>>>>
>>>> As for the LOOP errors, I see those every once in a while, particularly
>>>> when the logger is busy after a large catch up. Again, more of the log
>>>> would help.
>>>>
>>>>
>>> Pot. Kettle .Black. Moi ? - Guilty as charged, sir :-)
>>>
>>> I guess I was trying to get a feel for "occasionally seeing this or
>>> that" is not that unusual nor necessarily actionable if everything else
>>> works ok.
>>>
>>> But to answer - everything seems to be working ok.
>>> The db is getting updated just fine, as are the web pages and images.
>>> All the extensions posting to WU/PWS/CWOP and the MQTT stuff (publish
>>> and subscribe) are working fine too.
>>>
>>> I've attached a gzipped syslog from a restart of weewx and a couple
>>> cycles afterward.  Typically the rsync stuff completes last, so I added a
>>> blank line between cycles to make this a little easier to parse.   I did
>>> notice a api key error in my forecast setup (fixed) so the initial startup
>>> might be a little longer than normal since the forecast stuff had to catch
>>> up.   Timings for subsequent runs look normal for here - a little over 3
>>> minutes til the rsync stuff ends a 5-minute archive set of stuff...
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "weewx-user" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/weewx-user/61b8a4db-7ee4-4449-9fcc-7780f1775078n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/weewx-user/61b8a4db-7ee4-4449-9fcc-7780f1775078n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "weewx-user" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/weewx-user/CAPq0zEBV7S%2B7hP55BZ_UU9SRdJTfnpJiw-SG%3DKfc0tmrnkYJTQ%40mail.gmail.com
>> <https://groups.google.com/d/msgid/weewx-user/CAPq0zEBV7S%2B7hP55BZ_UU9SRdJTfnpJiw-SG%3DKfc0tmrnkYJTQ%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> .
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "weewx-user" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/weewx-user/327F1AFB-9303-493E-B6BA-265B31D4A380%40johnkline.com
>> <https://groups.google.com/d/msgid/weewx-user/327F1AFB-9303-493E-B6BA-265B31D4A380%40johnkline.com?utm_medium=email&utm_source=footer>
>> .
>>
> --
> You received this message because you are subscribed to the Google Groups
> "weewx-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/weewx-user/CAPq0zEDUKgHvsK6e5gJdXJVF7Wz%3DdzgqwkhrJ0eCEXDi2yP9Jw%40mail.gmail.com
> <https://groups.google.com/d/msgid/weewx-user/CAPq0zEDUKgHvsK6e5gJdXJVF7Wz%3DdzgqwkhrJ0eCEXDi2yP9Jw%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
> --
> You received this message because you are subscribed to the Google Groups
> "weewx-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/weewx-user/63D224BC-5CBB-4070-B2AB-44AEC4B91624%40johnkline.com
> <https://groups.google.com/d/msgid/weewx-user/63D224BC-5CBB-4070-B2AB-44AEC4B91624%40johnkline.com?utm_medium=email&utm_source=footer>
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/weewx-user/CAPq0zED_5wh8adJpn_ObJvW%2Bz5_TGz_kE4OrqAApXDALUbbDPw%40mail.gmail.com.

Reply via email to