I should add that Sniffer has been pretty much trouble free for us.  We have
been using it since before the ARM research days (10+ years as a guess).
One of the specialized clients we host for goes through a cycle every few
years where they are very publically visible and there are a number of
attempts to infect them and do other things to take mail and other services
down as well as huge volumes of SPAM directed a couple of hundred additional
published and very visible email addresses.  For several weeks the mail
volume and bandwidth utilization go up dramatically (10X).  Sniffer is
in-line with other processes and has handled the load without a wrinkle.
Whatever we've seen just recently, and as I reflect on it possibly once
before, is definitely out of the ordinary for us.

 

 

From: Message Sniffer Community [mailto:sniffer@sortmonster.com] On Behalf
Of Matt
Sent: Friday, June 28, 2013 11:31 AM
To: Message Sniffer Community
Subject: [sniffer] Re: Slow processing times, errors

 

I'll certainly look more closely next time.  Hopefully I'll be migrated
before this happens again :)

Matt



On 6/28/2013 1:44 PM, Darin Cox wrote:

How about running performance monitor to watch disk I/O, mem, cpu, page
file, etc. over time in the hopes of catching one of the events?

 

Darin.

 

From: Matt <mailto:for...@mailpure.com>  

Sent: Friday, June 28, 2013 12:10 PM

To: Message Sniffer Community <mailto:sniffer@sortmonster.com>  

Subject: [sniffer] Re: Slow processing times, errors

 

Pete,

I'm near positive that it's not system resources that are causing Sniffer to
not be able to access the files.  I believe these errors are a symptom and
not the cause.

You have to keep in mind that on the messages that don't throw errors, they
were taking 30-90 seconds to scan, but immediately after a restart it was
under 1 second.  The system stayed the same, it was just the state of the
service that was off in a bad way.

I did add a larger client about a month ago around the time that this
started, which did inch up load by between 1% and 5% I figure, but I can't
say for sure that the two things are connected.  I've seen much bigger
changes however in spam volumes from single spammers.  I have looked at my
SNFclient.exe.err log and found that the previous slowdowns were all
represented in this file, and nothing else really since a smattering in 2012
of other stuff.  I believe that I/O could be the trigger, or general system
load, but the error in the service that misses opening some files, and is
otherwise slower than normal by 100 times, will persist when everything else
is fine again.  I figure that this is all triggered by a short-term lack of
resources or a killer message type of issue that does something like run
away with memory.  Certainly there were no recent changes on the server
prior to this starting to happen, including Sniffer itself which has been
perfectly solid up until 5/22.

Regarding the ERROR_MSG_FILE batch that I sent you in that log, it did
happen exactly when I restarted Sniffer, and in fact the SNFclient.exe.err
log showed a different error while this was happening, and maybe this will
point you to something else?  That log says "Could Not Connect!" when the
regular Sniffer log shows "ERROR_MSG_FILE" about 1/8th of the time while in
a bad state.  When I restarted the Sniffer service, the regular log showed a
bunch of "ERROR_MSG_FILE" in a row, but the SNFclient.exe.err log below
shows "XCI Error!: FileError snf_EngineHandler::scanMessageFile()
Open/Seek".  You can match the message ID's with the other log that I
provided.  I believe that block of messages was already called to
SNFclient.exe, but the Sniffer service haddn't yet responded, and so they
were dumped as a batch into both logs during shut down of the service.

20130627183807, arg1=F:\\proc\work\D862600e600004269.smd : Could Not
Connect!
20130627183808, arg1=F:\\proc\work\D864401770000431f.smd : Could Not
Connect!
20130627183808, arg1=F:\\proc\work\D861200ce000041ce.smd : Could Not
Connect!
20130627183809, arg1=F:\\proc\work\D8644017300004321.smd : Could Not
Connect!
20130627183809, arg1=F:\\proc\work\D861400da000041e3.smd : Could Not
Connect!
20130627183810, arg1=F:\\proc\work\D862600d70000425f.smd : Could Not
Connect!
20130627183811, arg1=F:\\proc\work\D864a00e900004346.smd : Could Not
Connect!
20130627183811, arg1=F:\\proc\work\D8615019b000041f4.smd : Could Not
Connect!
20130627183813, arg1=F:\\proc\work\D862900e900004282.smd : Could Not
Connect!
20130627183815, arg1=F:\\proc\work\D863d015800004306.smd : Could Not
Connect!
20130627183817, arg1=F:\\proc\work\D860301580000416f.smd : Could Not
Connect!
20130627183818, arg1=F:\\proc\work\D862300e900004255.smd : Could Not
Connect!
20130627183819, arg1=F:\\proc\work\D862900e600004281.smd : Could Not
Connect!
20130627183819, arg1=F:\\proc\work\D864b00d700004357.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D864800d70000433c.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8619017300004205.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D861d017700004230.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8641016d00004310.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D865000e600004363.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D865000e100004361.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D85fe00e600004152.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8610016d000041c5.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D861f00ce00004231.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8644018b00004320.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D86150173000041ef.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D864100da00004314.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8644009500004329.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8633016d000042c1.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D861200da000041d1.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D865700d700004376.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D862c01770000429a.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D86360170000042ce.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D86360183000042d0.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8655019700004373.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D863600e9000042cd.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D86350177000042c6.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D862500ce0000425b.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D864400da00004327.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8644016d00004325.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D864800e10000433d.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D862600e10000426e.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D865c00e600004391.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8638018b000042e1.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D865d009500004394.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8628017300004273.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8657018000004378.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D86390177000042eb.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D865d00d700004396.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D8659015800004388.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D865700e10000437b.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183819, arg1=F:\\proc\work\D863500ce000042c7.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D864a015800004348.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D864800e600004343.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D865b01970000438f.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D865c016d00004392.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D865b00e10000438c.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D865b00e90000438b.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D864a017d00004347.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D86380180000042df.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D861901770000420d.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D865700950000437a.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D8629017700004280.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D8638016d000042db.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D862b00e600004292.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D86380173000042e0.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D865d018000004393.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D86310177000042b1.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek
20130627183820, arg1=F:\\proc\work\D85b100e300003f7f.smd : XCI Error!:
FileError snf_EngineHandler::scanMessageFile() Open/Seek


I am looking to retool presently just because it's time.  So if you are
convinced that this is due to low resources, don't concern yourself with it.

Matt



On 6/28/2013 10:36 AM, Pete McNeil wrote:

On 2013-06-27 20:01, Matt wrote:

I'm attaching a snippet of my log.  About 100 lines past the start, where
you see a smattering of error messages, you then see a large block of them
while the Sniffer service is restarting, and then after that no errors at
all.  There have in fact been no errors at all in several hours since this
restart of Sniffer. 


I can promise you that the error you're reporting comes directly from a
problem with the file system. Here is the code where that error is
generated. Put simply the code tries to open the file and determine it's
size. If that doesn't work it throws the ERROR_MSG_FILE exception in one of
two forms -- that's what ends up in the log.




    try 
{ 
// Try opening the message file.
 
MessageFile.open(MessageFilePath.c_str(), ios::in | ios::binary); 
// Open the file, binary mode.
 
MessageFile.seekg(0, ios::end); 
// Find the end of the file,
 
MessageFileSize = MessageFile.tellg(); 
// read that position as the size,
 
MessageFile.seekg(0, ios::beg); 
// then go back to the beginning.
 
MyScanData.ScanSize = MessageFileSize; 
// Capture the message file size.
    }
    catch(...) 
{                                                                // Trouble?
Throw FileError.
 
MyRulebase->MyLOGmgr.logThisError( 
// Log the error.
 
MyScanData, "scanMessageFile().open",
          snf_ERROR_MSG_FILE, "ERROR_MSG_FILE"
        );
        throw 
FileError("snf_EngineHandler::scanMessageFile() 
Open/Seek");
    }
 
    if(0 >= 
MessageFileSize) {                                                  //
Handle zero length files.
 
MessageFile.close(); 
// No need to keep this open.
 
MyRulebase->MyLOGmgr.logThisError( 
// Log the error.
 
MyScanData, "scanMessageFile().isFileEmpty?",
          snf_ERROR_MSG_FILE, "ERROR_MSG_FILE"
        );
        throw 
FileError("snf_EngineHandler::scanMessageFile() 
FileEmpty!");
    }


Another clue is that in the log snippet you provide, there are hints of a
problem brewing when there are sporadic instances of this error. Then, when
there is a large block -- virtually all requests to open the files for scan
are rejected by the OS. Either something made those files unavailable, or
the OS was unable to handle the request. I find it interesting also that the
time required to report the error started at about 172 milliseconds and
continued to climb to 406, 578, and then 656 before the restart.

SNF does not make log entries in the classic log during a restart, by the
way.

Note also the timestamps associated with these events and you can see that
the event was precipitated by a dramatic rise in message rates. The first
part of your log seems to indicate about 7-10 messages per second. During
the large block of errors, the message rate appears to have been in excess
of 120 (I counted approximately 126 at timestamp 20130627183819). That's an
increase at least an order of magnitude higher than the rate that was
causing sporadic errors.

I suspect based on the data you have provided that something on your system
generated a very large spike of activity that your IO subsystem was unable
to manage and this caused snf scans to fail because snf was unable to open
the files it was asked to scan.

Your restart of SNF apparently coincided with the event, but since all of
the SMD file names are unique during the event, and since SNF has no way to
generate scan requests on it's own, SNF does not appear to have been the
cause of the event in any way. It was able to record the event, none the
less.

So the question in my mind now is: 

* Is there a way to improve your IO subsystem so that it can gain some
headroom above 10 msg/sec?
* What caused the sudden dramatic spike that led to this episode?

On most tiny systems I monitor, scan times are generally < 100 ms. On your
system they are frequently in excess of 400 ms which leads me to believe
your system is a bit underpowered for it's current load.

Hope this helps,
_M




-- 
Pete McNeil
Chief Scientist
ARM Research Labs, LLC
www.armresearch.com <http://www.armresearch.com> 
866-770-1044 x7010
twitter/codedweller
#############################################################
 
This message is sent to you because you are subscribed to
 
  the mailing list mailto:sniffer@sortmonster.com.
 
This list is for discussing Message Sniffer,
 
Anti-spam, Anti-Malware, and related email topics.
 
For More information see http://www.armresearch.com
 
To unsubscribe, E-mail to: mailto:sniffer-...@sortmonster.com
 
To switch to the DIGEST mode, E-mail to
mailto:sniffer-dig...@sortmonster.com
 
To switch to the INDEX mode, E-mail to mailto:sniffer-in...@sortmonster.com
 
Send administrative queries to  mailto:sniffer-requ...@sortmonster.com
 

 

 

Reply via email to