Re: [weewx-development] Memory leak in StdReport?

Glenn McKechnie Mon, 30 Mar 2020 04:31:09 -0700

Luc,

(Bugger, and this one to the list)
NB:
The script lines wordwrap with my mailer, some may need.. straightening?

On 30/03/2020, Lucas Heijst <ljm.hei...@gmail.com> wrote:
> Glen,
>
> I like the method you use by firing the report in another way but I am not
> keen in modifying engine.py (each time with a new version of weewx).
> I could put the report calls in a cron job task, but then they are executed
> even when weewx is not running.
> Let me think about it. I don't expect the timing of the report start to be
> a problem. I have run the report by hand many times while the
> weewx instances were running somewhere in their cycles.

The following is a description of bypassing the internal weewx report
generation, and replacing it with a separate instance.
It looks complicated but in actual fact, it's pretty simple and runs
flawlessly here.

To recap...
The objective is to remove any memory problems that occur during the
report cycle (whatever the cause) by ensuring that the report cycle is
separated from the main weewx instance. This allows it to release all
its memory upon completion. It starts afresh each time it's called.
The reason for doing it this way is to ensure that the reports are
generated at the same time weewx would have done them, the theory
being that there is less chance of interference/breakage of weewx.

It consists of ...
1. patching engine.py, and tweaking weewx.conf
2. adding a call to rsyslog
3. creating a calling script for rsyslog to action.
4. creating a wee_report calling script, that gets out of the way of rsyslog.

The following expands on those 4 points.

1.
$ diff -Naur /home/weewx/bin/weewx/engine.py.org /home/weewx/bin/weewx/engine.py
--- /home/weewx/bin/weewx/engine.py.org 2020-03-30 17:49:40.760732824 +1100
+++ /home/weewx/bin/weewx/engine.py     2020-01-26 03:47:05.588871567 +1100
@@ -602,6 +602,11 @@
                 raise ValueError("Unknown station record generation value %s" % self.record_generation)
             self.old_accumulator = None

+        # Glenn McKechnie - flag to run wee_reports
+        # also turn off StdReport
+        # report_services = weewx.engine.StdPrint # , weewx.engine.StdReport
+        log.info("engine: weewx loop has finished")
+
         # Set the time of the next break loop:
         self.end_archive_delay_ts = self.end_archive_period_ts + self.archive_delay

Those 5 lines that start with a "+" are the only change that's made to
engine.py. It doesn't break anything. If it's not there the following
won't work and no independent report run will be performed.
Only one line actually matters (the log.info call); the comments are
there to aid my memory when I need to, very occasionally, find and
revisit it.
If it is there and you don't use it, your logs suffer the indignity of
a pointless message.
If it is there and weewx is not running, no message is logged and so
wee_reports will not be called by the scripts. A cron entry, by
contrast, would only stay silent if it could detect a stopped weewx.

The other change is to weewx.conf where the report_services line has
the weewx.engine.StdReport entry commented out.
 report_services = weewx.engine.StdPrint # , weewx.engine.StdReport
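
Before restarting weewx it may be worth confirming that both halves of step 1 are in place. The following is only a sketch: the function names patch_present and stdreport_enabled are made up for illustration, and the file paths are whatever you pass in. It assumes the report_services entry sits on a single line, as in the example above.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, paths are supplied by the caller.

# Succeeds if the given engine.py contains the trigger message from the patch.
patch_present() {
    grep -q 'weewx loop has finished' "$1"
}

# Succeeds if StdReport is still active on the report_services line,
# i.e. it appears before any '#' inline comment on that line.
stdreport_enabled() {
    grep -E '^[[:space:]]*report_services[[:space:]]*=' "$1" |
        sed 's/#.*$//' |
        grep -q 'weewx\.engine\.StdReport'
}
```

For example, `stdreport_enabled /home/weewx/weatherpi.conf && echo "StdReport is still enabled"` would warn that the internal report run has not been switched off yet.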

2.
I've redirected my syslog output to a weewx specific log file so I use
the rsyslog.d/weewx.conf example file. The next step is to add the
following additional lines to the top of that rsyslog.d conf file.
$ /etc/rsyslog.d/weewx.conf
#
# This will break rsyslog if the called command takes too long to complete.
# Therefore, the script that this points to has to be simplistic. In fact
# this points to a one-line call to the real script before it backgrounds/completes.
# This script is mode 0644
# source https://www.slideshare.net/rainergerhards1/writing-rsyslog-p
#
module(load="omprog")
if $rawmsg contains 'weewx loop has finished' then
    action(type="omprog"
           binary="/home/weewx/bin/user/runreports.sh")
#

3.
The script that is being called consists of six lines, but only one
truly matters and that's the call to run_weereports.sh
$  /home/weewx/bin/user/runreports.sh
#!/bin/sh
# script called from rsyslog.conf or equivalent ... rsyslog.d/weewx.conf
bash /home/weewx/bin/user/run_weereports.sh >/dev/null 2>&1 &
exit 0
# This script needs to be executable
# run chmod 0755 /home/weewx/bin/user/runreports.sh

It calls /home/weewx/bin/user/run_weereports.sh, puts it into the
background and thereby returns control to the rsyslog daemon. It
may not look like much but this script is an important intermediate
step. If this were the actual script that invoked wee_reports it would
take too long to complete and you'd find rsyslog would break (i.e.
you'd start getting log messages to your terminal and nothing to any
of the log files!).
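
To see why the intermediate script can return almost immediately, here is a small stand-alone sketch. launch_detached is a hypothetical helper (not part of weewx or rsyslog) and sleep merely stands in for the slow report run. The streams are redirected so that nothing keeps the caller waiting on the child's output.

```shell
#!/bin/sh
# Sketch only: launch_detached is a hypothetical helper for illustration.

# Start a command in the background with its standard streams detached,
# so the caller does not wait for the child to finish.
launch_detached() {
    "$@" </dev/null >/dev/null 2>&1 &
}

start=$(date +%s)
launch_detached sleep 2      # stands in for the slow wee_reports run
end=$(date +%s)
echo "launcher returned after $((end - start)) second(s)"
```

The launcher hands control back well before the two seconds are up, which is the behaviour the rsyslog action needs from runreports.sh.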

4.
Now, the following is the actual call to wee_reports, run_weereports.sh.
This is mine, yours can be whatever you like. I write a PID file and
with that I can decide to run wee_reports every archive interval, or
every alternate archive interval. It doesn't matter, just so long as it
runs wee_reports once and that you know what's going on.
This has syslog entries; you can comment them out once you've finished
with them.

$ /home/weewx/bin/user/run_weereports.sh
#!/bin/bash
# script called from /home/weewx/bin/user/runreports.sh
pid_file='/var/run/runreports.pid'
tymestamp=$(date +%d%m%Y-%H:%M)   # not used below; handy if you add it to a log message

if [ -f "$pid_file" ]
then
   rm -f "$pid_file"
   logger --id --tag weewx_weatherpi "Skipped this weewx report by removing $pid_file"
else
   # run the reports in the background and record the PID
   /home/weewx/bin/wee_reports --config=/home/weewx/weatherpi.conf > /dev/null &
   echo $! > "$pid_file"
   logger --id --tag weewx_weatherpi "Ran wee_reports manually, created $pid_file"
   # leave the next line commented out to skip alternate wee_reports runs;
   # uncomment it to run wee_reports every archive interval
   # rm -f "$pid_file"
fi
# This script can be made executable if you wish
# run chmod 0755 /home/weewx/bin/user/run_weereports.sh

exit 0
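
The every-other-interval behaviour comes purely from the presence or absence of the PID file. Stripped of the weewx specifics, the toggle can be sketched like this; report_action and the flag file location are assumptions for illustration, and nothing here calls wee_reports.

```shell
#!/bin/sh
# Sketch only: report_action and the flag path are hypothetical.

# Print "run" and create the flag file if it is absent;
# print "skip" and remove the flag file if it is present.
report_action() {
    _flag=$1
    if [ -f "$_flag" ]; then
        rm -f "$_flag"
        echo skip
    else
        : > "$_flag"
        echo run
    fi
}

flag="${TMPDIR:-/tmp}/report_toggle_demo.$$"
rm -f "$flag"
for interval in 1 2 3 4; do
    echo "interval $interval: $(report_action "$flag")"
done
rm -f "$flag"
```

Successive calls alternate between run and skip, which is what the PID file achieves in run_weereports.sh when the final rm line is left commented out.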

Putting it all together...

Change anything in the scripts that fits your setup. My weatherpi.conf
is probably the equivalent of your weewx.conf. My directories may not
match yours. Tweak to suit.

I'd suggest starting with the last script, run it and make sure it's
doing what you want. (running wee_reports)

Next, create the second (calling) script and make sure that it does
what you expect it to. (running the above script, that then runs
wee_reports)

Then I'd add the rsyslog modifications and restart rsyslog. Nothing
spectacular should happen, except that it restarts and stays running.
Ah... If you don't have separate weewx logging, then that addition I
mention (the 12 lines including module(load="omprog") ) goes into
your /etc/rsyslog.conf file after the #### RULES #### header (It's
been a while since I did it that way, I'm pretty sure that was the
spot.)

The next step is to test it by running the command...
logger "weewx loop has finished"
and that simulates what the (not yet) modified engine.py will utter.
When that message hits syslog wee_reports should fire up and do its
thing. (ie:- the above calling script runs and that executes the
script that actually fires up wee_reports)
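
One way to confirm that each trigger was acted on is to count the relevant messages in the log. This is a rough sketch: summarise_log is a made-up helper, /var/log/weewx.log is only an assumed location for the weewx-specific log file, and the phrases searched for are the ones used in the scripts above.

```shell
#!/bin/sh
# Sketch only: summarise_log is hypothetical and the default log path is an assumption.

# Count trigger messages and the two outcomes logged by run_weereports.sh.
summarise_log() {
    _log=$1
    _triggers=$(grep -c 'weewx loop has finished' "$_log")
    _ran=$(grep -c 'Ran wee_reports manually' "$_log")
    _skipped=$(grep -c 'Skipped this weewx report' "$_log")
    echo "triggers=$_triggers ran=$_ran skipped=$_skipped"
}

log_file=${1:-/var/log/weewx.log}
if [ -r "$log_file" ]; then
    summarise_log "$log_file"
fi
```

If everything is wired up, the ran and skipped counts together should keep pace with the number of triggers.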

Up to this point, the dominoes have been placed but only a manual step
can set them in motion.
So, if all the above works as intended then modify engine.py, then
weewx.conf and you've set the trigger. Restart weewx and it should
fire. Review your reports. Watch your logs; a series of wee_reports
entries should occur at the appropriate point (i.e. same as before) in
the weewx cycle.

Run it for a while and check your memory usage; it should be
flatlining. That's if StdReport was the cause of your memory leak (which
is probably in the version of PIL or other C lib as Tom notes.)
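
For keeping an eye on the memory figure, something along these lines could be run from time to time. It is a sketch that assumes a Linux system with /proc mounted; rss_kb is a hypothetical helper, and how you find the weewx PID (a pid file, pgrep, or similar) is left to your setup.

```shell
#!/bin/sh
# Sketch only: rss_kb is hypothetical; relies on Linux /proc/<pid>/status.

# Print the resident set size (VmRSS, in kB) of the given process ID.
rss_kb() {
    while read -r _key _value _unit; do
        if [ "$_key" = "VmRSS:" ]; then
            echo "$_value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Demonstrate on the current shell; substitute the weewx PID in practice.
echo "this shell: $(rss_kb $$) kB resident"
```

Logging that number once per archive interval for a few days should show whether the curve has gone flat.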

If you still get a message about the report cycle being blocked by a
previous run (I was fairly certain that message still occurred but as I
think about it, doubt enters. Hmmm.) then there is a serious problem
within the actual report run. I've had something similar with mysql
hitting the roof due to a ... Okay, I should go back and revisit that
issue. :-(
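
If you suspect report runs are piling up, the recorded PID can be used to check whether the previous run is still alive. This is a sketch only: previous_run_alive is a made-up helper, and sleep stands in for wee_reports. Bear in mind that in my script the PID file also doubles as the alternate-run flag, so a check like this would want its own file.

```shell
#!/bin/sh
# Sketch only: previous_run_alive is hypothetical; sleep stands in for wee_reports.

# Succeeds if the PID stored in the given file belongs to a live process.
previous_run_alive() {
    [ -f "$1" ] || return 1
    _pid=$(cat "$1")
    [ -n "$_pid" ] && kill -0 "$_pid" 2>/dev/null
}

pid_file="${TMPDIR:-/tmp}/alive_demo.$$"
sleep 1 &                       # stand-in for a report run
echo $! > "$pid_file"
if previous_run_alive "$pid_file"; then
    echo "previous run still active"
fi
wait                            # let the stand-in finish
if ! previous_run_alive "$pid_file"; then
    echo "previous run has finished"
fi
rm -f "$pid_file"
```

A run that is still alive when the next trigger arrives would point at the report run itself as the thing to investigate.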

That's it. Hopefully I've missed nothing, and that it's of use.

> On Sunday, 29 March 2020 23:32:44 UTC-3, Glenn McKechnie wrote:
>>
>> On 30/03/2020, Lucas Heijst <ljm....@gmail.com> wrote:
>> [...]
>> > 2. With weewx reports (mben with 84+ generated graphs each run).
>> >
>> > Luc
>>
>> I have a fairly complex set of graphs ( multiple plots, large size,
>> anti-alias turned on) and I have a persistent and steady climb until
>> the OOM killer takes over. If I simplify the graphing then it holds up
>> very well, but there's no fun in that!
>>
>> Rather than a Cronjob to force a set restart, I've turned off  StdReport
>>   report_services = weewx.engine.StdPrint    # , weewx.engine.StdReport
>> and then once engine.py has finished its post_loop(self, _event):
>> function, it writes a log message to loginfo (rsyslog).
>> syslog picks up that message and then fires off an instance of
>> wee_reports.
>>
>> The advantage of this is that...
>> 1. The Reports are called at the correct point in the weewx cycle.
>> 2. Once wee_reports finishes it closes completely (assuming another
>> problem isn't holding it open) and completely frees the memory it's
>> used.
>> 3. Weewx stays up and I don't waste anymore time looking for a memory
>> leak that I can do nothing about, or is way above my skill set. I've
>> chased PIL versions and run memory profilers but never had a suitable
>> Eureka moment.
>>
>> The disadvantage is that...
>> 1. weeWX uptime is reported as zero
>> 2. engine.py needs a one line patch (a suitable loginfo message)
>> 3. rsyslog is easy but a set procedure needs following. One that's
>> simple in hindsight.
>>
>> now, in a week, weewx might use 2 Meg. It basically flatlines with this
>> action.
>>
>> If you want I'll track down the scripts, post them and outline the full
>> logic.

-- 

Cheers
 Glenn

rorpi - read only raspberry pi & various weewx addons
https://github.com/glennmckechnie
