On Tue, Jun 13, 2017 at 11:45 AM, sebb <seb...@gmail.com> wrote:
> The apmail listing jobs (mods, subs) are generally quite expensive to run.
> However the output does not change very frequently.
>
> So a possible approach might be to run a cheap(er) check to look for
> changes to the source files and only run the extraction when there is
> a change.
>
> This should allow list-subs and list-mods to be run hourly rather than
> 6-hourly.
>
> I'm happy to look at how to implement this if people think it's worth
> pursuing?

Please do.

You might want to first check whether I am being too conservative.
From memory: the scripts ran faster than I expected. IIRC, most of the
time went to 'paging back in' the contents of directories into cache,
as in if you ran the script a second time it would run much faster
than the first time.

Also, running "ls -ltr lists/*/*/Log" indicates that there is likely
*some* change to at least one mailing list every hour.
"ls -ltr lists/*/*/mod/Log" indicates that moderation changes are
considerably less frequent, as in every couple of days there is a
batch update.

Perhaps it might be worth exploring splitting the updates by DNS
address? As in, once every 10 to 15 minutes look for a changed log
file, and if one is found, extract the mods or subs as the case may be
and send them over.

At the moment, there are only two whimsy tools that parse this data,
so changing the structure of the data would not require much work.

- Sam Ruby
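
For illustration only, here is a minimal sketch of the "look for a
changed log file" check described above. It is not the existing
whimsy or apmail code. The function names, the `root` argument, and
the reading of the two wildcard directory levels as (domain, list)
are assumptions made for the example; only the lists/*/*/Log and
lists/*/*/mod/Log layouts come from the ls commands above.

```python
import glob
import os


def changed_logs(root, since, moderators=False):
    """Return Log files under root modified after the epoch time `since`.

    Hypothetical helper: `root` is whatever directory contains `lists/`.
    """
    # Same layout as the ls globs: lists/*/*/Log or lists/*/*/mod/Log.
    parts = ["lists", "*", "*"] + (["mod"] if moderators else []) + ["Log"]
    pattern = os.path.join(root, *parts)
    return sorted(p for p in glob.glob(pattern) if os.path.getmtime(p) > since)


def changed_lists(root, since, moderators=False):
    """Return sorted (domain, list) pairs whose Log changed after `since`.

    Assumption: the first wildcard level is the domain and the second
    is the list name.
    """
    base = os.path.join(root, "lists")
    pairs = set()
    for path in changed_logs(root, since, moderators):
        rel = os.path.relpath(path, base).split(os.sep)
        pairs.add((rel[0], rel[1]))
    return sorted(pairs)
```

A job run every 10 to 15 minutes could record the time of its previous
run, pass it as `since`, and only invoke the expensive extraction for
the pairs returned. Comparing modification times only needs a stat per
Log file, which should be much cheaper than re-reading every list.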