>From: Kevin Golding <k...@caomhin.org>
>On Thu, 01 Jun 2017 13:45:26 +0100, David Jones <djo...@ena.com.invalid> wrote:
>> Why do you think it was pointless?

>Because I got a daily email telling me that rsync failed while the system
>was offline. Running a masscheck every day for no purpose seemed a little
>pointless. I thought I'd suspend it until the system was back online. If
>that's a problem I apologise, it wasn't made clear that I should keep
>donating resources during a period when the system was offline.

Sure. I completely understand, given that the server was offline. Sounds like we need a little enhancement in the automasscheck-minimal.sh script to detect when the rsync fails and not waste processing resources.

>> I have an idea that will allow masscheckers to cron the automasscheck.sh
>> script hourly which would only run the full masscheck when they detect a
>> new tagged ruleset to work with. Basically it would do a quick rsync of
>> the latest tagged build dir like it does today but if there are no rsync
>> changes, it would simply exit.

>Presumably there would also be a 24hr window that meant even if no rules
>were updated we would rescore after that period to have recent score
>adjustments? That would seem more effective than maintaining the morning
>run and throwing in an additional one as needed since we could run checks
>an hour before the morning run.

Yes. We would still keep the daily tagged build of rules that we have today, so the existing 24 hour processing would work just like it does today for those who don't want to go to the hourly.

Keep in mind, this wouldn't mean you would need to masscheck hourly just for nothing. The script would run hourly, "phone home", then exit if there was nothing new to masscheck against. Even the current nightly masscheck would have benefited from this logic while the server was down, and would not have wasted resources.

>It would logically also require an amendment to suggest running sa-update
>hourly instead of daily too.

Sure. Fair point.
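
To make the "phone home, then exit" idea a bit more concrete, here is a rough Python sketch of the decision logic only. It is not taken from automasscheck.sh or automasscheck-minimal.sh. It assumes rsync is run with --itemize-changes, where lines starting with "." describe items that were not transferred. The function names, the status strings, the 24-hour constant, and the example file path are all hypothetical.

```python
import subprocess
from typing import List, Optional, Tuple

# Assumed fallback window, based on the "24hr window" raised in the thread.
MAX_AGE_HOURS = 24.0


def decide(rsync_returncode: int, itemized_lines: List[str],
           hours_since_last_run: Optional[float]) -> str:
    """Decide what an hourly cron run should do (hypothetical helper).

    Returns one of: 'skip-rsync-failed', 'run-new-rules',
    'run-stale', 'skip-no-change'.
    """
    # rsync exits non-zero on failure, e.g. when the server is offline.
    if rsync_returncode != 0:
        return "skip-rsync-failed"
    # In --itemize-changes output, a leading "." means the item was not
    # transferred (at most its attributes changed), so ignore those lines.
    changed = [ln for ln in itemized_lines
               if ln.strip() and not ln.startswith(".")]
    if changed:
        return "run-new-rules"
    # Nothing new: still run if the last masscheck is older than the window.
    if hours_since_last_run is None or hours_since_last_run >= MAX_AGE_HOURS:
        return "run-stale"
    return "skip-no-change"


def sync_tagged_rules(source: str, dest: str) -> Tuple[int, List[str]]:
    """Run rsync and return (exit code, itemized lines).

    source and dest are placeholders supplied by the caller.
    """
    proc = subprocess.run(
        ["rsync", "-a", "--itemize-changes", source, dest],
        capture_output=True, text=True,
    )
    return proc.returncode, proc.stdout.splitlines()


if __name__ == "__main__":
    # Made-up inputs; no rsync call is made here.
    print(decide(10, [], 3.0))                                # skip-rsync-failed
    print(decide(0, [">f+++++++++ rules/example.cf"], 3.0))   # run-new-rules
    print(decide(0, [], 3.0))                                 # skip-no-change
    print(decide(0, [], 30.0))                                # run-stale
```

Keeping the decision separate from the rsync call means the failed-rsync case and the nothing-new case can both be checked without touching the network.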
With sa-update running "randomly" all over the Internet from different locations and time zones, there could be some not getting updates for up to 48 hours even when everything is running perfectly. The average update around the world would go from 24 hours down to 12 hours for any hourly updates, assuming we could get enough masscheckers to go hourly.

I am still in the planning stages of this after sorting through all of the scripts, so I am definitely open to ideas and suggestions like this. The idea is that when we find an issue like the recent one with Yahoo changing their message ID format (see FORGED_MUA_MOZILLA & FORGED_YAHOO_RCVD thread), the fix could go out in hours instead of days.

>As a sidenote, not all of us use the main automasscheck.sh script so
>depending on how the changes are rolled out I can't promise an
>uninterrupted supply of masscheck data.

Thanks for that feedback. My goal is to add the hourly functionality without changing the current directory structure or timing of cron jobs, so this would not impact the existing masscheck submissions. I am still working on getting the current masscheck processing finished up and we are probably months away from the hourly stuff, so I will take things slowly and try to fully understand things before adding the hourly logic. I will test on my own masscheck processing for a while first.
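
For the 48-hour and 24-to-12-hour figures mentioned earlier, here is one rough way to see where numbers like these can come from. The two-stage model is an assumption made for illustration, not something spelled out in the thread: a rule fix first waits for the next masscheck cycle, then for the next sa-update run on the receiving system, each arriving at a uniformly random point in its cycle. Time spent scoring and publishing is ignored, and the function names are hypothetical.

```python
from typing import Tuple


def wait_stats(period_hours: float) -> Tuple[float, float]:
    """Mean and worst-case wait for the next run of a periodic job,
    assuming the triggering event lands uniformly at random in the cycle."""
    return period_hours / 2.0, float(period_hours)


def pipeline_delay(masscheck_period: float,
                   sa_update_period: float) -> Tuple[float, float]:
    """(mean, worst case) hours from a rule fix to a site picking it up,
    under the simplified two-stage model described above."""
    mc_mean, mc_max = wait_stats(masscheck_period)
    up_mean, up_max = wait_stats(sa_update_period)
    return mc_mean + up_mean, mc_max + up_max


if __name__ == "__main__":
    print(pipeline_delay(24, 24))  # daily masscheck, daily sa-update: (24.0, 48.0)
    print(pipeline_delay(1, 24))   # hourly masscheck, daily sa-update: (12.5, 25.0)
    print(pipeline_delay(1, 1))    # both hourly: (1.0, 2.0)
```

Under these assumptions the daily/daily case gives a 24-hour average and a 48-hour worst case, and moving only the masscheck side to hourly brings the average to roughly 12 hours. Moving sa-update to hourly as well, as suggested above, would shrink it much further.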