Re: [weewx-user] Watchdog example to warn when station is not reporting for any reason

Leon Shaner Thu, 02 May 2019 08:55:08 -0700

Hey, Mikael,

Nice catch.  That message about 1556807461 seconds is a left-over from the 
earlier version.  It is not only redundant, but also incorrect, due to what 
amounts to re-use of an earlier variable.  :S   I have removed it.


https://github.com/UberEclectic/weewx/tree/watchdog/examples/watchdog

Your weewx.conf archive interval of 600 seconds is not really a problem; it's 
up to you what to put there.  In that case, though, you probably want to 
increase my watchdogsecs by roughly however long it takes your system to 
boot and for weewx to post the first record.  My RPI boots in under a minute, 
so if yours is similar, even watchdogsecs=660 should suffice.

Hmm.  Maybe longer.  Please check your syslog and see if you can approximate 
how long after a reboot it takes for this type of message to appear from 
weewx:

weewx[9809]: manager: Added record 2019-05-02 06:24:00 EDT (1556792640) to database
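
As a rough illustration of that sizing (a sketch only, not part of the watchdog script), the epoch in parentheses can be pulled out of an "Added record" line and compared with the boot time.  The function names and the boot_epoch input are hypothetical, and it assumes the first record after a reboot carries a timestamp later than the boot itself:

```python
import re

# Matches the epoch in a syslog line such as:
#   weewx[9809]: manager: Added record 2019-05-02 06:24:00 EDT (1556792640) to database
ADDED_RECORD = re.compile(r"manager: Added record .*\((\d+)\)")


def record_epoch(syslog_line):
    """Return the record's epoch, or None if this is not an 'Added record' line."""
    match = ADDED_RECORD.search(syslog_line)
    return int(match.group(1)) if match else None


def suggested_watchdogsecs(archive_interval, boot_epoch, first_record_epoch):
    """Archive interval plus the observed delay between boot and the first record."""
    startup_delay = max(0, first_record_epoch - boot_epoch)
    return archive_interval + startup_delay
```

With a 600 second archive interval and a first record about a minute after boot, this works out to the 660 mentioned above.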

To your next questions...
It isn't 100% clear to me which way you have the toggles set.
Did you ultimately enable both remediation steps before seeing the 
erroneous message?

doweewxrestart=1
dohostreboot=1

Even that erroneous message shouldn't have appeared unless you have 
dohostreboot=1, but you mention mainly that you tried doweewxrestart=1 
first.  Do you mean that after trying the restart option you then went on 
to also try dohostreboot=1 and saw the erroneous message?

Just want to be sure, because the logic is the most important thing, and I did 
test that extensively.  =D
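
For reference, here is a minimal sketch of the decision logic as described in this thread.  It is not the actual weewx_watchdog code.  The function names and the restart_already_tried flag are made up for illustration, and the query assumes the stock weewx SQLite schema (an archive table with an epoch dateTime column):

```python
import sqlite3


def seconds_since_last_record(conn, now):
    """Age in seconds of the newest archive record, or None if the table is empty."""
    (latest,) = conn.execute("SELECT MAX(dateTime) FROM archive").fetchone()
    return None if latest is None else now - latest


def choose_actions(age, uptime, watchdogsecs=600, doweewxrestart=0,
                   dohostreboot=0, restart_already_tried=False):
    """Return the list of actions for one watchdog run."""
    if age is not None and age <= watchdogsecs:
        return []                      # station is reporting; nothing to do
    actions = ["notify"]               # notification is not optional
    if doweewxrestart and not restart_already_tried:
        actions.append("restart_weewx")
    elif dohostreboot and uptime >= 2 * watchdogsecs:
        # A host that booted within 2x watchdogsecs is left alone,
        # which is what prevents reboot loops.
        actions.append("reboot_host")
    return actions
```

With both toggles set, the first stale run yields a weewx restart, and only a later run that is still stale (with the restart already tried) yields the reboot.  A connection for the first function would come from sqlite3.connect() on your weewx database file.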

I didn't notice the erroneous message because it only goes to syslog and I was 
mainly watching the more interesting log in the case of weewx_watchdog, which 
is at /var/log/weewx.log.  Would be helpful to see that log if there are any 
other issues.

Cheers! =D
\Leon
--
Leon Shaner :: Dearborn, Michigan (iPad Pro)



> On May 2, 2019, at 10:56 AM, [email protected] wrote:
> 
> Hi Leon,
> 
> Just installed the updated script now.
> 
> Seems to be running fine despite one thing that occurred; see part of the log here:
> 
> May  2 16:21:08 raspberrypi weewx[2209]: cheetahgenerator: Generated 10 files 
> for report Sofaskin-FW2205-master in 6.93 seconds
> May  2 16:21:10 raspberrypi weewx[2209]: imagegenerator: Generated 9 images 
> for Sofaskin-FW2205-master in 1.03 seconds
> May  2 16:21:10 raspberrypi weewx[2209]: copygenerator: copied 8 files to 
> /var/www/html/weewx/Sofaskin-FW2205-master
> May  2 16:21:10 raspberrypi weewx[2209]: GaugeGenerator: Generated 6 images 
> for Bootstrap in 0.55 seconds
> May  2 16:21:10 raspberrypi weewx[2209]: historygenerator.pyc: Generated 9 
> tables in 0.18 seconds
> May  2 16:21:13 raspberrypi weewx[2209]: cheetahgenerator: Generated 10 files 
> for report Bootstrap in 2.86 seconds
> May  2 16:21:13 raspberrypi weewx[2209]: copygenerator: copied 3 files to 
> /var/www/html/weewx/Bootstrap
> May  2 16:21:17 raspberrypi weewx[2209]: cheetahgenerator: Generated 6 files 
> for report Bjurdammen in 4.13 seconds
> May  2 16:21:17 raspberrypi weewx[2209]: copygenerator: copied 3 files to 
> /var/www/html/weewx/Bjurdammen
> May  2 16:21:19 raspberrypi weewx[2209]: cheetahgenerator: Generated 6 files 
> for report SeasonsReport in 1.46 seconds
> May  2 16:21:19 raspberrypi weewx[2209]: copygenerator: copied 3 files to 
> /var/www/html/weewx
> May  2 16:22:28 raspberrypi vncserver-x11[438]: Connections: rejecting 
> blacklisted connection: 217.61.106.100
> May  2 16:27:01 raspberrypi CRON[2616]: (root) CMD 
> (/var/www/html/scripts/Watchdog/weewx_watchdog2)
> May  2 16:29:14 raspberrypi crontab[2636]: (root) BEGIN EDIT (root)
> May  2 16:29:24 raspberrypi crontab[2636]: (root) REPLACE (root)
> May  2 16:29:24 raspberrypi crontab[2636]: (root) END EDIT (root)
> May  2 16:29:41 raspberrypi systemd[1]: Stopping LSB: weewx weather system...
> May  2 16:29:41 raspberrypi weewx[2209]: engine: Main loop exiting. Shutting 
> engine down.
> May  2 16:29:41 raspberrypi weewx[2209]: engine: Shutting down StdReport 
> thread
> May  2 16:29:41 raspberrypi weewx[2209]: engine: Terminating weewx version 
> 3.9.1
> May  2 16:29:46 raspberrypi weewx[2683]: Stopping weewx weather system: 
> weewx..
> May  2 16:29:46 raspberrypi systemd[1]: Stopped LSB: weewx weather system.
> May  2 16:30:01 raspberrypi cron[304]: (root) RELOAD (crontabs/root)
> May  2 16:30:01 raspberrypi CRON[2727]: (pliggen) CMD (/usr/bin/php7.0 
> /var/www/html/weewx/smhi_warnings_bjurdammen.php > /dev/null 2>&1)
> May  2 16:31:01 raspberrypi CRON[2741]: (root) CMD 
> (/var/www/html/scripts/Watchdog/weewx_watchdog2)
> May  2 16:31:01 raspberrypi root[2748]: weewx_watchdog: 661 seconds have 
> passed since weewx logged any records!
> May  2 16:31:01 raspberrypi root[2759]: weewx_watchdog: 1556807461 seconds 
> have passed since weewx logged any records!  Rebooting!
> May  2 16:31:17 raspberrypi systemd-modules-load[85]: Inserted module 
> 'i2c_dev'
> May  2 16:31:17 raspberrypi fake-hwclock[88]: tor  2 maj 2019 14:31:15 UTC
> 
> I stopped weewx via service weewx stop.
> Then I had the script run by a cron job.
> It first tried to restart and wrote that 661 seconds have passed since weewx 
> logged any records, and that is correct, as I have archive_interval = 
> 600 set in weewx.conf. Or should I change this time?
> Then the script wrote that 1556807461 seconds have passed since weewx logged 
> any records! Rebooting!
> Is this correct behavior? Because weewx logged a record 660 seconds earlier...
> 
> Thanks for the update and I will continue to try it and hope to have it 
> running :)
> 
> BR Mikael
>> On Thursday, 2 May 2019 at 06:35:20 UTC+2, Leon Shaner wrote:
>> Hi, Mikael,
>> 
>> I've added the simple toggles at the top of the script to control 
>> remediation actions.
>> The script will always notify, but the restart weewx and reboot host actions 
>> are completely optional and controlled simply by the variables at the top of 
>> the script.
>> 
>> If both restart weewx and reboot host actions are enabled, these become 
>> nested in that first the restart weewx will be tried one time and if after 
>> watchdogsecs the station is still not reporting, then the host reboot will 
>> occur.  You can enable one or the other or both.
>> 
>> I also added logic to check if the host was rebooted within 2x watchdogsecs 
>> to avoid reboot loops.   I strongly recommend not lowering the watchdogsecs 
>> below 600 (10 minutes).
>> 
>> It's important to note that my approach isn't actually checking whether the 
>> weewx process is running, rather, the loss of communications checking is 
>> based on whether weewx has written any records to the database lately. 
>> 
>> You'll find the update over here:
>> 
>> https://github.com/UberEclectic/weewx/tree/watchdog/examples/watchdog
>> 
>> BTW, if you use my script don't run another kind of weewx restarter script, 
>> or they will step on each other.  (Or at least be sure to disable the weewx 
>> restart toggle in mine).
>> 
>> Regards,
>> Leon
>> --
>> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>> 
>>> On May 1, 2019, at 2:24 PM, [email protected] wrote:
>>> 
>>> Hi,
>>> 
>>> I now have this script running. 
>>> Indeed this shouldn't be necessary but I've had both issues (Weewx crash 
>>> and USB issue) so I think these scripts are nice to have in case something 
>>> goes wrong until these issues are ironed out. Of course we should provide TF 
>>> with logs from these events.
>>> 
>>> I still have commented out the settings that initiate a reboot, so I will 
>>> have to do that manually in that case, for now.
>>> So it's good to have those scripts sending email on errors.
>>> 
>>> Yes, it would be great to have these functions in the same script, with 
>>> toggles, so I would appreciate it. Let's see if there is anyone else who 
>>> would like it.
>>> 
>>> Well, thank you so far for all help! Learning all the time now :)
>>> 
>>> BR Mikael
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On Wednesday, 1 May 2019 at 16:26:33 UTC+2, Leon Shaner wrote:
>>>> Hi, Mikael,
>>>> 
>>>> My watchdog script could certainly be "complementary" to Constantine's 
>>>> script, and with minor modifications could potentially replace it.   IIRC, 
>>>> isn't mine at least the third script recently mentioned on the alias to 
>>>> address nearly the same issues?
>>>> 
>>>> For sure these scripts shouldn't be necessary, but things happen, and bugs 
>>>> that can be fixed may eventually be fixed, but in the mean-time, I'd 
>>>> rather have a script notify / remediate / workaround issues than keep 
>>>> randomly discovering that my station isn't reporting only hours after it 
>>>> "went offline."   :-/
>>>> 
>>>> So, anyway, in my weewx_watchdog script I am merely sending an e-mail 
>>>> notification when the station hasn't reported, but there is an example 
>>>> included (commented out) that initiates a reboot, which is the only remedy 
>>>> I have found for when my WMR300 stops communicating on my RPI.
>>>> 
>>>> Instead of the "shutdown -r now" it would be very easy to instead use 
>>>> "systemctl status weewx" to check if the service is running and then 
>>>> "systemctl start weewx" to start it (or "systemctl restart weewx").
>>>> 
>>>> With a little more work, I could do a series of remediation steps, like 
>>>> first attempting to restart weewx, then IFF the station still is not 
>>>> reporting, do the reboot the next time.   I didn't bother, because never 
>>>> once ever did a simple restart of weewx fix the USB issues with my WMR300 
>>>> on RPI.   Also, I didn't bother to check if weewx was running, because my 
>>>> weewx has never actually stopped running.  It's always the USB issue at 
>>>> fault in my case.  :-/
>>>> 
>>>> If there is interest I could add a weewx restart remediation and even put 
>>>> some simple toggles at the top to control which remediation steps are 
>>>> desired, such as restart_weewx vs. reboot_host, and maybe even do the 
>>>> logic to first try one and then the other if both are enabled.   Could put 
>>>> a toggle for whether to do the wunderfixer steps, too.   I'm going to 
>>>> write a separate post about that in a minute.  ;-)
>>>> 
>>>> Incidentally, I had to move the script.  It's in its own branch now:
>>>> 
>>>> https://github.com/UberEclectic/weewx/tree/watchdog/examples/watchdog
>>>> 
>>>> Regards,
>>>> Leon
>>>> --
>>>> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>>>> 
>>>>> On May 1, 2019, at 4:13 AM, [email protected] wrote:
>>>>> 
>>>>> Hi Leon,
>>>>> 
>>>>> Could this be used as a complement to Constantine Samaklis's script you 
>>>>> helped me with yesterday? 
>>>>> Sometimes the wh1080 freezes the USB connection to my Raspberry Pi but the 
>>>>> weewx service is still running, so Constantine's script won't have any 
>>>>> effect in this case?
>>>>> I think this issue with whxxxx is well known.
>>>>> 
>>>>> BR Mikael
>>>>> 
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google Groups 
>>>>> "weewx-user" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>>> email to [email protected].
>>>>> For more options, visit https://groups.google.com/d/optout.

