After many hours of troubleshooting and testing, I think I have an idea what's happening.
Background: I want to use weewx with the Belchertown skin and my extension, which reads numerous RSS/Atom feeds, on a Pi Zero with either a Davis Pro V2 or Weatherflow weather station. I will eventually have four weather stations in my area. I thought I had things working reasonably well, so I deployed one station. It stopped working after about 3 days. I can't access that system remotely, so I built a very similar system (the next one to deploy) and it also failed due to lack of memory. Weewx memory usage on my development system also increases over time - it can grow to over 1 GB in a few hours!

My extension was an extension to Belchertown, based on the METAR extension. It created Belchertown index hook include files each archive cycle. While researching this problem I re-read the weewx customization guide and noted that extensions should NOT be dependent on other extensions, so I decided to rewrite my extension as a service (which looks like a much cleaner solution) without any dependencies on Belchertown (other than include file names). All I've done so far is create a service (WxFeedsMemoryTest.py) to test weewx/Belchertown memory consumption.

The Problem: weewx/Belchertown memory usage increases over time. It starts out at about 45 MB and grows at about 3 MB per archive period/cycle (using my test case). A 512 MB Pi will exhaust memory within a few days.

It appears that the problem is associated with the creation of the Belchertown include files while weewx/Belchertown is running:

- if the include file is 'static', as in not (re)created while weewx/Belchertown is running, memory usage is static - it does not grow beyond about 50 MB.
- if the include file is 'dynamic', as in (re)created while weewx/Belchertown is running, memory usage increases.
- if the include file is created once, and becomes 'static', memory usage increases and then stabilizes.
- if the include file is recreated continuously (such as on each archive cycle), memory usage increases each cycle.
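One way to put numbers on the growth described above is to log the resident set size (RSS) of the weewx process on each archive cycle. The following is a minimal sketch, assuming a Linux system (such as Raspberry Pi OS) where /proc/<pid>/status is available. The helper names parse_vmrss_kb, current_rss_kb and cycles_until_exhausted are hypothetical and are not part of weewx, Belchertown or the attached service.

```python
import os


def parse_vmrss_kb(status_text):
    """Return the VmRSS value (in kB) from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like: "VmRSS:\t   46080 kB"
            return int(line.split()[1])
    return None


def current_rss_kb(pid=None):
    """Read the resident set size of a process on Linux (defaults to this process)."""
    pid = os.getpid() if pid is None else pid
    try:
        with open("/proc/%d/status" % pid) as status_file:
            return parse_vmrss_kb(status_file.read())
    except OSError:
        return None  # /proc is not available (for example, on a non-Linux system)


def cycles_until_exhausted(start_mb, growth_mb_per_cycle, limit_mb):
    """Number of whole archive cycles before usage would exceed limit_mb."""
    return int((limit_mb - start_mb) // growth_mb_per_cycle)
```

Calling current_rss_kb() from new_archive_record() and logging the result would show the per-cycle growth directly. With the figures above (about 45 MB at start, about 3 MB per cycle), cycles_until_exhausted(45, 3, 512) gives 155 cycles. That is an upper bound, since not all of a 512 MB Pi's memory is available to weewx.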
It does not appear to matter if the include file is created directly, or created as a temporary file and then copied or renamed.

The attached service (WxFeedsMemoryTest.py) can be used to demonstrate the problem. Please see installation and use instructions within the WxFeedsMemoryTest.py.

I'm going to continue to work on moving my extension from "an extension to an extension" to a service in the hope that this memory problem can be resolved.

With apologies in advance if I'm doing something to cause the problem, please review, advise and let me know what I can do to avoid the problem.

Regards,

Garry

On Thursday, December 31, 2020 at 7:05:15 PM UTC-8 [email protected] wrote:

> Got MemoryError after about 9 hours after restart. Have removed cmon by commenting out any mention of cmon in weewx.conf and restarted.
>
> Regards,
>
> Garry Lockyer
> Former DEC Product Support Engineer :^)
> Kepner-Tregoe Trained :^))
>
> On Dec 31, 2020, at 11:44, vince <[email protected]> wrote:
>
> On Thursday, December 31, 2020 at 11:39:39 AM UTC-8 [email protected] wrote:
>
>> Re: editing the Belchertown skin, nope haven’t touched it, *other than interfacing with it via the include files (as generated by my BelchertownWxFeeds extension)*. When all the endpoints (for testing) are enabled index.html is about 1.8MB, so perhaps that’s causing the problem. I can easily reduce / eliminate endpoints and prefer to do that before eliminating the Belchertown skin.
>
> There it is. You touched it :-)
>
> Usual debugging rules apply. Reset it to a baseline unmodified config. Add in changes one-by-one. If it goes sideways, revert to the last known good and reverify that it stays good.
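The "indirect" approach mentioned in the post (write a temporary file, then rename it over the include file) can be sketched as follows. This is only an illustration of the technique, not the code in the attached service; the function name write_include_file_atomically is hypothetical. It relies on os.replace(), which swaps the file in a single step when both paths are on the same filesystem, so a reader never sees a half-written include file.

```python
import os
import tempfile


def write_include_file_atomically(include_path, html):
    """Write html to a temp file in the same directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(include_path))
    # mkstemp creates the file with owner-only permissions (0600); that is
    # assumed to be acceptable here because weewx itself reads the include file.
    fd, temp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "w") as temp_file:
            temp_file.write(html)
            temp_file.flush()
            os.fsync(temp_file.fileno())
        # Replaces any existing include file, so no prior os.remove() is needed.
        os.replace(temp_path, include_path)
    except BaseException:
        # Do not leave a stray temp file behind if anything went wrong.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

As noted in the post, switching between direct and indirect creation did not change the memory growth, so this sketch addresses file consistency only, not the leak.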
# WxFeedsMemoryTest.py
#
# Copyright (c) 2021 Garry Lockyer - [email protected]
# See the file LICENSE.txt for your rights.
#
# Based on alarm.py by:
# Copyright (c) 2009-2019 Tom Keffer <[email protected]>
# See the file LICENSE.txt for your rights.

"""
To use this service, add the following to the weewx configuration file:

[WxFeedsMemoryTest]
    # No options are required or processed at this time.

This service requests data from an RSS feed:

    feedURL = "https://511.alberta.ca/api/v2/get/winterroads?format=json&lang=en"

and creates an include file:

    file = open( "/home/weewx/skins/Belchertown/index_hook_after_charts.inc", "wt", 1 )

that the Belchertown Skin can use.

*******************************************************************************

To enable this service:

1) Copy this file to the user directory. See https://bit.ly/33YHsqX for where
   your user directory is located.

2) Modify the weewx configuration file by adding this service to the option
   "report_services", located in section [Engine][[Services]].

[Engine]
  [[Services]]
    ...
    report_services = weewx.engine.StdPrint, weewx.engine.StdReport, user.WxFeedsMemoryTest.WxFeedsMemoryTest

    # I want WxFeedsMemoryTest to run before StdReport so I actually use:
    report_services = weewx.engine.StdPrint, user.WxFeedsMemoryTest.WxFeedsMemoryTest, weewx.engine.StdReport
"""

import gc
import logging
import os
import shutil
import sys

import requests

import weewx
from weewx.engine import StdService


# Inherit from the base class StdService:
class WxFeedsMemoryTest(StdService):
    """Service that will someday read many RSS/Atom feeds and create files that
    can be used by weewx skins such as Belchertown.

    Currently, all it does is get one RSS feed and produce one file to
    demonstrate that weewx memory usage grows after each archive period."""

    def __init__(self, engine, config_dict):
        # Pass the initialization information on to my superclass:
        super(WxFeedsMemoryTest, self).__init__(engine, config_dict)

        self.log = logging.getLogger( __name__ )
        assert self.log is not None

        self.log.info( "WxFeedsMemoryTest::__init__(). . ." )

        self.bind( weewx.NEW_ARCHIVE_RECORD, self.new_archive_record )

    def log_memory_use(self):
        """Log gc counters and allocated blocks before and after gc.collect()."""
        self.log.info( "Memory use before gc.collect():" )
        x, y, z = gc.get_count()
        self.log.info( "gc.get_count(): %d, %d, %d" % ( x, y, z ) )
        self.log.info( "sys.getallocatedblocks(): %d" % sys.getallocatedblocks() )

        #sys._clear_type_cache()
        gc.collect()

        self.log.info( "Memory use after gc.collect():" )
        x, y, z = gc.get_count()
        self.log.info( "gc.get_count(): %d, %d, %d" % ( x, y, z ) )
        self.log.info( "sys.getallocatedblocks(): %d" % sys.getallocatedblocks() )
        self.log.info( "" )

    def new_archive_record(self, event):
        """Gets called on a new archive record event."""

        self.log.info( "WxFeedsMemoryTest::new_archive_record(). . ." )
        self.log.info( "Process ID: %d" % os.getpid() )

        # This URL is for Alberta 511's Road Conditions RSS feed.
        feedURL = "https://511.alberta.ca/api/v2/get/winterroads?format=json&lang=en"

        # The include file is used by the Belchertown Skin.
        #
        # The problem:
        #
        # If the include file is static, as in it is not re-created while
        # weewx/Belchertown is running, weewx memory usage does not grow.
        #
        # If the include file is dynamic, as in it is re-created while
        # weewx/Belchertown is running, weewx memory usage increases
        # after each file re-creation.
        includeFileName = "/home/weewx/skins/Belchertown/index_hook_after_charts.inc"
        tempFileName = "/home/weewx/skins/Belchertown/WxFeedsTemp.inc"

        # The include file can be created DIRECTLY or INDIRECTLY:
        #
        # DIRECTLY: File is opened, written to and closed.
        #
        # INDIRECTLY: A temp file is opened, written to and closed and then
        # renamed or copied to the Belchertown skin filename. During
        # testing, the thinking was the problem was associated with the
        # actual writing of the file and that the problem might go away if
        # the file was created and then renamed or copied. It did not.

        # If True, include file is created directly, if False, include file is
        # created as a 'temporary' file and then copied or renamed.
        # Does not seem to affect problem.
        directFileCreation = True

        # For indirect include file creation:
        # If True, temporary file is copied, if False, temporary file is renamed.
        # Does not seem to affect problem.
        copyFile = True

        # If True, recreate the include file, if False, do not recreate the
        # include file.
        #
        # Recreating the include file appears to cause weewx memory usage to grow.
        #
        # If True, weewx memory usage will increase on each archive cycle.
        #
        # If False and the include file does not exist when weewx/Belchertown
        # starts, it will be created the first time this method is called but
        # not on subsequent calls. Memory usage will stabilize.
        recreateFile = True

        assert gc.isenabled()

        self.log.info( "WxFeedsMemoryTest:" )
        self.log_memory_use()

        session = requests.Session()
        try:
            response = session.get( feedURL )
        except requests.RequestException as e:
            # No response object exists if the request itself fails, so the
            # exception is logged rather than a status code.
            self.log.info( "Alberta511RoadConditions: requests.get() failed: %s" % e )
            session.close()
            return

        if response.status_code == requests.codes.ok:
            roadConditions = response.json()
            self.log.info( "Alberta511RoadConditions: Received %d entries." % len( roadConditions ) )

            if os.path.exists( includeFileName ) and not recreateFile:
                response.close()
                session.close()
                return

            if os.path.exists( includeFileName ):
                os.remove( includeFileName )

            if os.path.exists( tempFileName ):
                os.remove( tempFileName )

            if directFileCreation:
                file = open( includeFileName, "wt", 1 )
            else:
                file = open( tempFileName, "wt", 1 )

            for roadCondition in roadConditions:
                HTML = ( "<table>\n"
                         + "<thead>\n"
                         + "<tr>\n"
                         + "<th></th>\n"
                         + "<th></th>\n"
                         + "</tr>\n"
                         + "</thead>\n"
                         + "<tbody>\n"
                         + "<tr>\n"
                         + "<td><b>Roadway Name: </b></td>\n"
                         + "<td>%s</td>\n" % roadCondition.get( "RoadwayName" )
                         + "</tr>\n"
                         + "<tr>\n"
                         + "<td><b>Location Description: </b></td>\n"
                         + "<td>%s</td>\n" % roadCondition.get( "LocationDescription" )
                         + "</tr>\n"
                         + "<tr>\n"
                         + "<td><b>Primary Condition: </b></td>\n"
                         + "<td>%s</td>\n" % roadCondition.get( "Primary Condition" )
                         + "</tr>\n"
                         + "<tr>\n"
                         + "<td><b>Id: </b></td>\n"
                         + "<td>%s</td>\n" % roadCondition.get( "Id" )
                         + "</tr>\n"
                         + "<tr>\n"
                         + "<td><b>Area Name: </b></td>\n"
                         + "<td>%s</td>\n" % roadCondition.get( "AreaName" )
                         + "</tr>\n"
                         + "<tr>\n"
                         + "<td><b>Last Updated: </b></td>\n"
                         + "<td>%s</td>\n" % roadCondition.get( "LastUpdated" )
                         + "</tr>\n"
                         + "</tbody>\n"
                         + "</table>\n"
                         + "<hr>\n" )
                file.write( HTML )

            file.flush()
            os.fsync( file.fileno() )

            del roadConditions

            file.close()
            del file

            if not directFileCreation:
                # The include file should have been deleted above but
                # so that we don't get any exceptions because the destination
                # already exists, we'll delete it here, before a copy or rename.
                if os.path.exists( includeFileName ):
                    os.remove( includeFileName )

                if copyFile:
                    shutil.copy2( tempFileName, includeFileName )
                else:
                    os.replace( tempFileName, includeFileName )
        else:
            self.log.info( "Alberta511RoadConditions: requests.get() returned error: %d!" % response.status_code )

        response.close()
        del response

        session.close()
        del session

        self.log.info( "WxFeedsMemoryTest:" )
        self.log.info( "Garbage collection after writing file:" )
        self.log_memory_use()
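The gc.get_count() and sys.getallocatedblocks() figures logged by the service show totals but not where the allocations come from. As a hedged suggestion, the standard library's tracemalloc module can compare snapshots between archive cycles and list the source lines whose allocations grew the most. The AllocationTracker class below is a hypothetical helper, not part of weewx or the service above, and it only sees memory allocated through Python's allocators, so growth inside a C extension might not appear.

```python
import tracemalloc


class AllocationTracker:
    """Compare Python heap snapshots between calls to see which lines keep growing."""

    def __init__(self, frames=1):
        # Start tracing only if nothing else has already started it.
        if not tracemalloc.is_tracing():
            tracemalloc.start(frames)
        self._previous = tracemalloc.take_snapshot()

    def report(self, top=5):
        """Return up to `top` source lines with the largest change since the last call."""
        current = tracemalloc.take_snapshot()
        differences = current.compare_to(self._previous, "lineno")
        self._previous = current
        return [
            "%s: %+d bytes, %+d blocks" % (diff.traceback, diff.size_diff, diff.count_diff)
            for diff in differences[:top]
        ]
```

A service could create one AllocationTracker in __init__() and log each string returned by report() at the end of new_archive_record(). Lines that show a positive size difference on every cycle would be the first places to look. Tracing adds some overhead, so it is probably best left enabled only while testing.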
