Well now I really hope this doesn’t turn out to be an instance of “what could possibly go wrong?” :)
I did some poking around the intermittent failures and am pretty convinced they occur when weewx starts up while the datalogger is uploading to Weatherlink. As long as weewx is running, the datalogger doesn’t try to upload to Weatherlink, but as soon as weewx stops the uploads begin, consisting of a single-packet HTTP PUT every minute. With opening and closing the connection, the whole transaction takes 1-1.5 sec. If weewx connects and tries to do the wakeup during or immediately after an upload, the TCP connection is established but the datalogger doesn’t respond to the wakeup.

The failure is only occasional for random restarts because there’s a 2-3 second window each minute when it will happen. But doing a “systemctl restart weewx” restarts weewx fast enough that it is very likely to hit the problem.

The driver’s retry on the established TCP connection after a short delay seems to work reliably. For this particular issue, that's better than closing and re-opening the connection, because that would probably trigger the race with Weatherlink all over again.

BTW, my LAN for the datalogger is just a cable between it and the RPi running weewx; there’s no other network traffic to confuse things.

-Les

> On 18 Feb 2021, at 5:11, Tom Keffer <[email protected]> wrote:
>
> Normally I am very reluctant to make changes in the Vantage driver because it
> has been so reliable for so long.
>
> However, thanks to your careful sleuthing, you have uncovered a subtle, and
> hard-to-find, bug. Thanks so much, Les!
>
> Commit 9605ec9
> <https://github.com/weewx/weewx/commit/9605ec91c86d38b81c45a839aa95f77af8e32b21>.
>
> -tk
>
> On Wed, Feb 17, 2021 at 12:28 PM Les Niles <[email protected]> wrote:
> In the course of doing a fresh install (4.3.0 debian package), I’ve been
> doing a lot of restarting weewx. The restart would fail pretty often due to
> not waking up the Vantage console (Davis Ethernet datalogger). The driver
> would report an ip-read error, followed by a series of ip-write errors until
> max_tries was used up. This all happens very quickly because the sleep for
> wait_before_retry is inside the try clause, so there’s no delay when there’s a
> WeeWxIOError exception. (Lines 110-115 in vantage.py.) I moved the sleep
> outside of the try/except block and it fixed the problem: with the delay,
> the wakeup succeeds after a few retries. (diff attached)
>
> I’m not sure if there was a specific reason for skipping the delay in case of
> WeeWxIOError. It seems like there wouldn’t be any disadvantage to putting the
> delay outside of the exception handling, other than taking slightly longer for
> weewx to exit in case of an unrecoverable error, and there certainly is the
> possibility that having a delay between retries makes it more likely to
> succeed. Nor do I see a reason this wouldn’t work with a USB datalogger,
> but I have no way to test that.
>
> Thoughts?
>
> -Les
>
>
> *** /usr/share/weewx/weewx/drivers/vantage.py.dist 2021-01-04 11:43:12.000000000 -0800
> --- /usr/share/weewx/weewx/drivers/vantage.py 2021-02-13 10:11:53.084750115 -0800
> ***************
> *** 107,117 ****
>                   if _resp == b'\n\r':
>                       log.debug("Rude wake up of console successful")
>                       return
> -                 print("Unable to wake up console... sleeping")
> -                 time.sleep(self.wait_before_retry)
> -                 print("Unable to wake up console... retrying")
>               except weewx.WeeWxIOError:
>                   pass
>               log.debug("Retry #%d failed", count)
>   
>           log.error("Unable to wake up console")
> --- 107,118 ----
>                   if _resp == b'\n\r':
>                       log.debug("Rude wake up of console successful")
>                       return
>               except weewx.WeeWxIOError:
>                   pass
> + 
> +             print("Unable to wake up console... sleeping")
> +             time.sleep(self.wait_before_retry)
> +             print("Unable to wake up console... retrying")
>               log.debug("Retry #%d failed", count)
>   
>           log.error("Unable to wake up console")
>
> --
> You received this message because you are subscribed to the Google Groups
> "weewx-development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/weewx-development/7D296A46-154D-4518-AF5D-AA2DA6B4F2AD%402pi.org.

--
To view this discussion on the web visit https://groups.google.com/d/msgid/weewx-development/00EA82BE-8563-4E95-A437-61C011CAF096%402pi.org.
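
For anyone reading this thread later, the retry pattern discussed above (sleeping between attempts whether or not the attempt raised an exception) can be sketched roughly as follows. This is a standalone illustration, not the actual vantage.py code: DeviceIOError is a hypothetical stand-in for weewx.WeeWxIOError, wake_up and make_flaky_device are made-up helper names, and the default delay is arbitrary.

```python
import time


class DeviceIOError(Exception):
    """Hypothetical stand-in for weewx.WeeWxIOError."""


def wake_up(try_once, max_tries=3, wait_before_retry=1.0, sleep=time.sleep):
    """Call try_once() up to max_tries times, pausing after every failed attempt.

    try_once() returns True on success, returns False if the device gave an
    unexpected answer, or raises DeviceIOError on a read/write failure.
    Returns the number of attempts used; raises DeviceIOError if all fail.
    """
    for count in range(max_tries):
        try:
            if try_once():
                return count + 1
        except DeviceIOError:
            pass  # treat an I/O error like any other failed attempt
        # The delay sits outside the try/except, so it also runs after an
        # exception instead of being skipped.
        sleep(wait_before_retry)
    raise DeviceIOError("Unable to wake up console after %d tries" % max_tries)


def make_flaky_device(failures):
    """Return a try_once() that raises DeviceIOError `failures` times, then succeeds."""
    state = {"left": failures}

    def try_once():
        if state["left"] > 0:
            state["left"] -= 1
            raise DeviceIOError("no response")
        return True

    return try_once


if __name__ == "__main__":
    delays = []  # record the requested delays instead of really sleeping
    attempts = wake_up(make_flaky_device(2), max_tries=4,
                       wait_before_retry=0.5, sleep=delays.append)
    print(attempts, delays)  # 3 [0.5, 0.5]
```

Passing the sleep function in as a parameter is only there so the example can be exercised without real delays; the point is simply that the pause happens on every failed pass through the loop, including the ones that end in an exception.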
