Re[10]: [sniffer] Persistent Sniffer
On Saturday, April 2, 2005, 1:07:56 PM, Andrew wrote:

CA> Pete, your metaphors are wonderful. :-)

<snip/>

CA> If I remember correctly, the MaxPollTime was originally much lower. I
CA> now use the full 4 seconds, but I don't know how often that's needed. I
CA> easily see Declude processes taking longer than this, sometimes at 100%
CA> of my CPU (with Task Manager update speed set to High)

In persistent mode the max poll time does not matter. It would only matter if the system fell back into peer-server mode. With persistent mode the client instances coordinate their timing with the server instance based on the data in the .stat file.

CA> I also set Lifetime to 0 (because I don't expect the service to need
CA> stopping), and Persistence to 12 hours. I'm hedging my bet with
CA> Persistence, because I normally expect a twice daily rulebase update,
CA> and my update mechanism should initiate a reload.

This seems fine given that you issue a reload with your updates. However, you should know that updates are generally much more frequent than every 12 hours. More in the range of every 5 hours or so at this time.

Best,
_M

This E-Mail came from the Message Sniffer mailing list. For information and (un)subscription instructions go to http://www.sortmonster.com/MessageSniffer/Help/Help.html
Re[2]: [sniffer] Persistent Sniffer
On Friday, April 1, 2005, 8:04:27 AM, Keith wrote:

KJ> I have read forum results that this behavior is the reverse of
KJ> what should happen, I should get a reduction in CPU. I did this
KJ> around 11pm last night, usually during peak times this server
KJ> would stay at 65% load. Is there anything I can tweak to install
KJ> the Sniffer persistent server and achieve desired results? Thanks
KJ> for the aid.

Can you share more about your server's configuration and can you also post the .stat file that was produced?

Server OS?
Server CPU(s)?
Drive System(s)?
Mail Server SW?

_M
RE: Re[2]: [sniffer] Persistent Sniffer
Pete,

Thanks for the reply.

Running on an IBM Xseries 225 Dual Xeon 2.4Ghz w/ 1GB RAM - running IBM's ServerRAID 5i in IBM's RAID 10 config (4 73GB 10K drives) - O/S is Windows 2000 Standard Server SP4.

Running Imail 8.15HF1 with Declude JM/Virus 1.82 - BIND DNS Server is 1 hop away (on switch backbone).

I had to drop back to the non-persistent mode, thus the .stat file disappeared. I will run it again tonight and copy the file away and post it here tonight.

Thanks again for the time and aid.

Keith Johnson
Re[4]: [sniffer] Persistent Sniffer
On Friday, April 1, 2005, 11:44:07 AM, Keith wrote:

KJ> Pete,
KJ> Thanks for the reply.
KJ> Running on an IBM Xseries 225 Dual Xeon 2.4Ghz w/ 1GB RAM -
KJ> running IBM's ServerRAID 5i in IBM's RAID 10 config (4 73GB 10K drives)
KJ> - O/S is Windows 2000 Standard Server SP4
KJ> Running Imail 8.15HF1 with Declude JM/Virus 1.82 - BIND DNS
KJ> Server is 1 hop away (on switch backbone). I had to drop back to the
KJ> non-persistent mode, thus the .stat file disappeared. I will run it
KJ> again tonight and copy the file away and post it here tonight.
KJ> Thanks again for the time and aid.

I don't see any problems with this setup. Your description sounds like your server is fairly heavily loaded (35-55% cpu in peer-server mode), though I would expect more from the hardware you've described.

I suspect that you may have run into the far side of the power curve when you went to persistent server mode. In peer-server mode the failure mode for overload conditions is much softer than with the persistent server mode. Up to the failure point in the power curve the persistent server mode will provide a significant savings over peer-server; however, once that point is reached the persistent server mode tends to degrade much more quickly and requires a significant drop in load before recovery occurs.

I'm working on some strategies to soften that curve a bit, but in the meantime let's explore these options to get the best performance from your server and reduce its load. Then we can see if the persistent server engine will give you even more headroom:

1. I recommend running AVAFTERJM - are you doing this? Typically 80% or more of email traffic is spam and so there is no good reason to attempt a virus scan on these messages. If you hold messages and occasionally re-insert them into the queue then they will not be scanned, however there are ways to work around this when needed - and it is very likely you would not re-insert a message that contained a virus anyway.

2. Consider running bind as a dns resolver on your mail server and pointing the server to itself via the loopback address (127.0.0.1) for DNS services. This tends to speed up processing significantly, which also reduces the number of message processes that are running at any given time. YMMV, but I have seen this work consistently to improve performance.

--- when trying persistent mode (minor adjustments really) ---

A. Set the Persistence value in your snflicid.cfg file to 3600. - no need to check for a new rulebase every 10 minutes usually. These loop events tear down the server momentarily, which can perturb an otherwise smooth running system when under heavy loads - thus minimizing the frequency of these events may help.

B. Set LogFormat in your snflicid.cfg file to SingleLine. This provides sufficient data for our purposes (most of the time) and should significantly reduce the size of your log file.

C. Be sure to keep any unnecessary files out of the SNF working directory - in particular you should clean out any orphaned files that might still be lurking from previous crashes.

--- General ---

Be sure your drives are regularly defragmented.

Hope this helps,
_M

PS: I just had another random thought really --- Could it be that the high CPU value was appropriate? If you had built up a queue of messages to be processed then once the persistent server was put in place and the system started processing messages again the CPU would probably be much higher for a period of time until all of the backlog had been eliminated. The persistent server would do its best to nail up at least one of the CPUs until this was accomplished, so looking at the CPU load during that period might not be the best way to understand the situation. The CPU load would not drop back down again until the backlog had been eliminated.

In comparison to the persistent server mode, the peer-server mode imposes a greater restriction on message throughput and puts a higher load on IO due to repeatedly loading the rulebase file. This can have the effect of reducing the overall CPU load at the expense of raw throughput under some circumstances. This, in fact, explains why the peer-server mode has a softer performance failure curve than the persistent server mode (in theory).

Put another way, sometimes the peer-server mode prevents the CPU from getting out of its own way to scan the messages - so the CPU load looks lower because it spends more time waiting. In these cases putting the persistent server in place has the effect of removing the obstacles so the CPU works harder and the messages go through faster.

A better way to judge might be to check the overflow queue... the rate at which it is being emptied (or the slowness at which it grows) would indicate a better throughput and that is probably the better goal. -- just a random thought.
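Pete's suggestion above, judging throughput by how fast the overflow queue empties rather than by CPU load, can be sketched roughly as follows. This is only an illustration under assumptions: the queue is taken to be a directory holding one file per waiting message, and the function names and the sample figures are made up for the example, not taken from IMail, Declude, or Sniffer.

```python
import os
from typing import List, Tuple


def count_queued(path: str) -> int:
    """Count regular files in a queue directory.

    Assumption: one file per waiting message; adjust the filter if the
    real queue keeps several files per message.
    """
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_file())


def drain_rate_per_min(samples: List[Tuple[float, int]]) -> float:
    """Net messages per minute leaving the queue.

    `samples` is a list of (timestamp_in_seconds, queue_length) pairs.
    Only the first and last sample are used. A positive result means the
    queue is shrinking, a negative one means it is growing.
    """
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive amount of time")
    return (n0 - n1) * 60.0 / (t1 - t0)


# Hypothetical readings taken two minutes apart: 500 files, then 380.
print(drain_rate_per_min([(0.0, 500), (120.0, 380)]))
```

The figure is a net rate (arrivals minus completions), so it is best compared between the two modes at similar times of day.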
RE: Re[4]: [sniffer] Persistent Sniffer
Pete,

Wow, thank you for the explanation. I did let the persistent server run for 30 min after I restarted the services. However, I did stop the services, then started the Sniffer service, then restarted the Imail services. I could have gotten a backlog of retries at that moment that pegged the CPU as you stated.

We have batted around running BIND for NT/2000 on the local machine, but my fear was the overhead of another major process running. I don't have any good stats on how much CPU/Memory BIND on an Imail Server requires, thus, we have a SUN/BIND box local to the switch. Are you aware of any stats on this?

We don't run the AVAFTERJM switch. This is in part because so many of our customers still look at their spam email from time to time. We heavily use the ROUTETO and MAILBOX commands, thus, if I let a virus go through to their mailbox, they could potentially open a virus spam email and hurt themselves.

We defrag each partition every night using Diskeeper and it works great. I regularly look at the Sniffer directory to ensure there are no leftover .fin files or others that could cause server load.

I will retry it again tonight and see what type of results I get and post them here. It could be as you say, I am on the far side :)

Thanks again,
Keith
Re: [sniffer] Persistent Sniffer
Keith,

Windows DNS service will handle over a million lookups a day without blinking. There should be no reason to switch to a different DNS server. It hardly even registers any CPU load on my boxes.

The biggest CPU hogs are the virus scanners, and choosing your virus scanners carefully will have a great benefit. F-Prot is the champ, followed by ClamAV in daemon mode (the non-daemon is a hog), followed by McAfee at a distant third, though there are many others that are far worse. The AVAFTERJM switch will stop most messages from being virus scanned and hence the magic there, however if you don't delete any messages with JunkMail there is no real advantage. I'm not clear on whether or not the ROUTETO will bypass scanning, but you could create some filters using VBScript to tag messages with attachments associated with viruses and handle them differently.

Personally, I haven't found a huge impact from running Sniffer in persistent mode, but it does have a slightly measurable effect on my server. If you are hurting for disk I/O or memory, this could help immensely.

If you are running into an issue with disk I/O, it could back things up significantly. Also, if you have any domains where the addresses aren't validated (nobody aliases or gateway domains), this could easily be attacked in such a way that it overwhelmed your server. We are presently only validating for about 2/3 of our customer base, and this morning the address validation software/service failed an automatic restart; it allowed everything through to IMail/Declude and pegged our server at 100% until it was turned back on. Normally at that time of day, our server runs at an average of about 25% (and it will get better when the other 1/3 becomes validated).

BODY and ANYWHERE filters in Declude can also be huge hogs if you don't limit them to a reasonable level. I probably have about 1,500 lines of BODY filters and that isn't causing me any real issues, but I am also using SKIPIFWEIGHT and other methods of skipping such filters when it isn't beneficial to run them. Managing my Declude filtering better definitely helped me steal back some CPU.

Placing Sniffer in persistent mode definitely shouldn't cause things to slow down unless maybe it was configured improperly. I use the same SERVANY setup that you said that you are using and it has worked flawlessly for me since the day that Pete released that functionality.

I am thinking that you might want to scrutinize your setup. Hope that this helps.

Matt
Re[6]: [sniffer] Persistent Sniffer
On Friday, April 1, 2005, 3:37:33 PM, Keith wrote:

<snip/>

KJ> pegged the CPU as you stated. We have batted around running BIND
KJ> for NT/2000 on the local machine, but my fear was overhead of
KJ> another major process running. I don't have any good stats on how
KJ> much CPU/Memory BIND on an Imail Server requires, thus, we have a
KJ> SUN/BIND box local to the switch. Are you aware of any stats on
KJ> this?

No hard data on hand, however a back of the envelope calculation suggests that you probably have a good chunk of ram left - and that this will probably expand if you can retire messages more quickly -- that has a tendency to speed up everything since everything has more room etc.

I've never heard a bad experience with this approach, and I have proven it several times on otherwise overwhelmed machines. Paradoxically, for example, my woefully underpowered P2/450 will choke if I don't run bind locally - even if the DNS server it points at is on a hot, dedicated box on the same switch. The minute I put bind on the same box as the server it recovers nicely.

KJ> We don't run the AVAFTERJM switch. This is done in part due to
KJ> so many of our customers still look at their spam email from time to
KJ> time. We heavily use the ROUTETO and MAILBOX command, thus, if I let a
KJ> virus go through to their to mailbox, they could potentially open a
KJ> virus spam email and hurt themselves.

Understood. What about prescan?

KJ> We defrag each partition every night using Diskeeper and it
KJ> works great. I regularly look at the Sniffer directory to ensure no
KJ> left over .fin files and others that could cause server load.

Sounds good - I like Diskeeper too - won't run a Winx box without it.

KJ> I will
KJ> retry it again tonight and see what type of results I get and post them
KJ> here. It could be as you say, I am on the far side :) Thanks

Good Luck,
_M
RE: Re[8]: [sniffer] Persistent Sniffer
Pete,

Yes, the file is changing every few seconds or sooner. Sorry, I just did a 'grab' of it and posted. The 307 is due to me stopping it after 30 min or so and making a few changes to the .conf file. I will continue to monitor it over the weekend. However, so far so good. Thanks again for taking the time to help out.

Keith

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Pete McNeil
Sent: Fri 4/1/2005 10:18 PM
To: Keith Johnson
Cc:
Subject: Re[8]: [sniffer] Persistent Sniffer

On Friday, April 1, 2005, 9:36:05 PM, Keith wrote:

KJ> Pete/Matt/Andrew,
KJ> Thanks for all your wonderful input. Maybe I didn't
KJ> give it a fair shake or time enough as mentioned by Pete earlier.
KJ> I turned it on again about 30 min ago and have seen my system
KJ> stable, currently it is:

KJ> TicToc: 1112391330
KJ> Loop: 264
KJ> Poll: 445
KJ> Jobs: 290
KJ> Secs: 307
KJ> Msg/Min: 56.6775
KJ> Current-Load: 21.4724
KJ> Average-Load: 22.4706

KJ> These numbers were up around 120 Msg/Min and Current
KJ> Load at 90+. CPU is aver. about 17% right now. However, could
KJ> be skewed a bit since it is Friday night. I will continue to
KJ> watch it over the weekend and see how it goes. Still considering
KJ> running Win DNS local or BIND 9.3 for NT/2000/2003. Have a great
KJ> weekend.

Hrmmm... Something here doesn't add up. Is the .stat file changing every second or so? If not then the persistent engine has stopped. In fact, 307 seconds is scarcely 5 minutes - not 30.

It appears that at the time you sampled the file your system was happily humming along at about 1 msg/sec... which is a lull for you. Remember that your average would be about 1.7 messages per second. I also note that the load and poll time indicated a good deal of dead air, so the system was definitely not working hard at the time.

Take a look at it again and make sure that the .stat file changes every 1-4 seconds or so. If not then the persistent server has stopped - at least the client instances will see it that way.

Hope this helps,
_M
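The two checks Pete describes above (does the rate match the counters, and is the .stat file still being refreshed every 1-4 seconds) can be sketched as below. The "Name: value" layout is assumed from the way Keith pasted the numbers; the real file may be laid out differently. The relation Msg/Min = Jobs * 60 / Secs is inferred from the posted figures (290 * 60 / 307 is about 56.68), not taken from Sniffer documentation, and the function names are invented for this example.

```python
import os
import time

# Figures as posted in the thread; the one-pair-per-line layout is an assumption.
SAMPLE = """TicToc: 1112391330
Loop: 264
Poll: 445
Jobs: 290
Secs: 307
Msg/Min: 56.6775
Current-Load: 21.4724
Average-Load: 22.4706
"""


def parse_stat(text):
    """Turn 'Name: value' lines into a dict of floats, skipping other lines."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            stats[name.strip()] = float(value)
    return stats


def msgs_per_min(stats):
    """Recompute the message rate from the Jobs and Secs counters."""
    return stats["Jobs"] * 60.0 / stats["Secs"]


def is_stale(path, max_age_secs=4.0, now=None):
    """True if the file was last modified more than max_age_secs ago."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) > max_age_secs


stats = parse_stat(SAMPLE)
print(round(msgs_per_min(stats), 4))    # message rate recomputed from the counters
print(round(stats["Secs"] / 60.0, 1))   # minutes covered by the sample
```

With the posted numbers the sample covers a little over five minutes at just under one message per second, which matches Pete's reading of the file.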
Re[2]: [sniffer] Persistent Sniffer
On Wednesday, March 30, 2005, 10:50:36 PM, Keith wrote:

KJ> Pete,
KJ> Thanks for the follow-up. I was monitoring the
KJ> filename.persistent.stat file that yields stats as messages are
KJ> processed. Is it normal for it to every now and then flash [File
KJ> is Empty], thus no stats at all. Usually within a few seconds
KJ> stats would appear again. Thanks again,

This usually means that you were reading the file just as it was being written. When the file is output it is opened with O_TRUNC to truncate the file so it can be replaced. If you read it at that moment you will see nothing, so this is normal.

Hope this helps,
_M
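Since the writer truncates the file before rewriting it, a monitoring script can simply treat an empty read as "try again shortly". The following is a minimal sketch of that idea, not how the Sniffer client instances read the file; the helper names and the retry settings are assumptions made for the example.

```python
import time


def read_file(path):
    """Read the whole file as text in one go."""
    with open(path, "r") as handle:
        return handle.read()


def read_with_retry(read_once, attempts=5, delay_secs=0.2):
    """Call read_once() until it returns non-blank text.

    An empty result is treated as 'caught the writer just after the
    truncate', so the read is repeated after a short pause. An OSError
    (for example a sharing violation while the file is being replaced)
    is handled the same way. Returns "" if every attempt came back empty.
    """
    for attempt in range(attempts):
        try:
            text = read_once()
        except OSError:
            text = ""
        if text.strip():
            return text
        if attempt < attempts - 1:
            time.sleep(delay_secs)
    return ""
```

A caller would wrap the real path, for example `read_with_retry(lambda: read_file("filename.persistent.stat"))`. A read could in principle also land halfway through a write and return only part of the file, so a cautious reader might additionally check that all expected fields are present.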
[sniffer] Persistent Sniffer
I noticed in the archives a mention of a .cfg file one can configure for use when running Persistent sniffer. How do you download it or obtain it? Thanks for the aid.

Keith
Re: [sniffer] Persistent Sniffer
On Wednesday, March 30, 2005, 4:08:35 PM, Keith wrote:

KJ> I noticed in the archives about a .cfg file one can configure for use
KJ> when running Persistent sniffer. How do you download it or obtain it?
KJ> Thanks for the aid.

You can find a sample .cfg file in the latest distribution. If you don't already have a .cfg then chances are your .exe file won't understand it anyway ;-)

You can always find the latest distribution on this page:

http://www.sortmonster.com/MessageSniffer/Try-It.html

Best,
_M