Pete,
 
The speed problem has been found. McAfee Netshield 4.51 was making our server RIDICULOUSLY slow, despite the fact that we tried excluding the Sniffer folder and even disabling the service from the tray-icon. Upgrading to Virusscan Enterprise 7.x fixed our problem and our performance levels are in the regions that you mentioned.
 
Thanks for thinking along!
 
 
Groet, (regards)
------------------------------------------
ing. Michiel Prins bsc   [EMAIL PROTECTED]
SOS Small Office Solutions / Reject / ****
Wannepad 27   -   1066 HW   -    Amsterdam
t.+31(0)20-4082627  -  f.+31-(0)20-4082628
------------------------------------------
Consultancy -  Installation -  Maintenance
Network Security   -  Internet  -   E-mail
Software Development -  Project Management
------------------------------------------
 
 


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michiel Prins
Sent: donderdag 8 april 2004 21:11
To: [EMAIL PROTECTED]
Subject: RE: [sniffer] Final beta (b2) for snfrv2r3

Preliminary tests show there's no I/O problem but I'll do some additional benchmarking here and get back to you on this.
 
Groet, (regards)
------------------------------------------
ing. Michiel Prins bsc   [EMAIL PROTECTED]
SOS Small Office Solutions / Reject / ****
Wannepad 27   -   1066 HW   -    Amsterdam
t.+31(0)20-4082627  -  f.+31-(0)20-4082628
------------------------------------------
Consultancy -  Installation -  Maintenance
Network Security   -  Internet  -   E-mail
Software Development -  Project Management
------------------------------------------
 
 


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pete McNeil
Sent: woensdag 7 april 2004 17:38
To: [EMAIL PROTECTED]
Subject: RE: [sniffer] Final beta (b2) for snfrv2r3

Extraordinary...
Compare with a snippet from our IMail/NT4 test platform (severely underpowered)...

snf2beta 20040407140913 D0b86122.SMD 30 90 Final 75148 63 0 6891 68
snf2beta 20040407140913 D0b8614e.SMD 90 140 Final 103691 57 0 8878 72
snf2beta 20040407140914 D0b88122.SMD 40 141 Final 103689 57 0 9003 71
snf2beta 20040407140915 D0b880b6.SMD 90 20 Final 106244 52 0 817 65
snf2beta 20040407140916 D0b8a0de.SMD 40 210 Final 104044 52 0 8779 76
snf2beta 20040407140917 D0b8b122.SMD 30 60 Final 70077 53 0 3727 73
snf2beta 20040407140920 D0b8e0b6.SMD 20 40 Clean 0 0 0 2958 54
snf2beta 20040407140927 D0b960b6.SMD 30 80 Final 30439 54 0 3885 73
snf2beta 20040407140934 D0b930b6.SMD 20 40 Clean 0 0 0 2647 67
snf2beta 20040407140935 D0b9e0a8.SMD 20 130 Final 73558 52 0 6242 80
snf2beta 20040407140942 D0ba414e.SMD 20 160 Final 105444 52 0 8252 87
snf2beta 20040407140942 D0ba40de.SMD 201 60 Final 105825 52 0 3351 68
snf2beta 20040407140947 D0baa0b6.SMD 30 121 Final 30439 54 0 3898 72
snf2beta 20040407140947 D0baa14e.SMD 40 80 Final 66835 52 0 5358 64
snf2beta 20040407140952 D0bad122.SMD 20 110 Final 97422 57 0 6104 79
snf2beta 20040407140952 D0bae0d2.SMD 30 81 Final 83761 57 0 4790 72
snf2beta 20040407140952 D0bac0b6.SMD 40 90 Final 1686 48 0 5415 80
snf2beta 20040407141003 D0bb90b6.SMD 20 40 Final 49992 54 0 2186 69

The first thing I notice is that the setup times (first number) on your system are consistently large. According to your log entries it is taking a quarter of a second to scan the working directory for a job... That's a LOT of time for a directory scan to take.

The message scan itself doesn't seem to be out of range.

The next thing I notice is that your messages arrive several seconds apart consistently. I see 10 sec, 16, 12, 4, 10, etc... In our log we frequently scan several messages in the same second.

I see two things going on based on this data:

I suspect your system is I/O bound. There is no reason that a directory scan should take more than a few tens of milliseconds except occasionally... That puts your numbers out by nearly an order of magnitude (compare 20s & 30s w/ 109, 187, 280+!).

Be sure that Sniffer's working directory does not have any extra files in it. Sniffer instances measure their apparent work load by counting the number of files in their working directory... The theory is that aside from a handful of necessary files the rest are jobs waiting to be processed... so if the number of files is large then the load must be high and so a Sniffer instance should be prepared to wait a bit longer for service.

Sniffer should be running in it's own directory with no other files present that don't need to be there. Be sure to clean out any dead job files that might have built up with a prior error etc...

My thinking on I/O is that if it takes 100-280 msec to scan the directory for job files then it's likely to take quite a while to load any program - including the shell. This can explain the additional time you are seeing in your measurements. Under normal circumstances I would expect that operation to happen almost instantaneously since the Sniffer executable, command shell, and other files that must load should remain consistently in memory due to their being called so frequently. It's a good bet that much of your delay time is bound in this part of the equation.

The next place I think you're finding delays is in sleeping. There are several seconds between messages on your system consistently so Sniffer is going to sleep much of the time. If Sniffer can't find work for several seconds the poll delay times will expand accordingly. It's a good bet that the rest of the time in your 1.5 seconds is due to the fact that the next message you're going to process is 5-10 seconds away from the last.

After waiting 1 second the poll delay will be ~ 630ms
After about 2.5 seconds the poll delay will be ~ 1650ms
...
By the time you get beyond 5 seconds the poll delay will be 4000ms, so your average sleep time will be 2 secs. Based on this I think 1.5 seconds is not unlikely... on the other hand since the next message is likely to be 5 or more seconds away this should have no apparent effect on throughput, and since Sniffer is sleeping most of the time your system will have plenty of resources for other work and should be very responsive.

If you are uncomfortable with this part of the process you can now put a ceiling on the poll time by setting the MaxPollTime parameter in your .cfg file. You indicate that you've set this to 50ms (this is very aggressive!). Does the monitor output from the beta reflect the new time delay? If it does not then the timing scenarios I've discussed above would still apply.

Another thought or two. When you are not running a persistent instance of sniffer you are guaranteed that each instance will have to load the rulebase file simply because you have so much time between each message.

If I've understood your post correctly the increase in performance is due to the fact that the rulebase file will not be reloaded once a persistent instance of Sniffer is present. That much is expected. What is interesting though is that the setup/queue times remain high even after that load is removed - this seems to point again at an I/O limitation on your system.

~500ms processing times for normal Sniffer processing are understandable given the numbers that are coming from your system. Considering that it's taking 260+ to load the rulebase it's reasonable to expect that launching the shell will take a comparable amount of time.

The extra delays seen when a persistent Sniffer instance is running are most likely a combination of the timing issues described above and more significantly the file I/O delays that seem to be present on your system.

Consider that a Sniffer client will scan the directory to see if there is a server present. Let's give that 200ms. Then it will post it's job file. The server instance will scan the directory looking for a job and will find it in about 200ms. The file will be loaded and scanned in about 50ms. Then the server instance will post the finished job result. The client will have waited at least 200ms by now, so it might sleep for another 200 (even more if there are a lot of files in the working directory). Then it will scan for it's result and find it in about 200ms.

So far we've got 200+200+50+200+200 = 850ms. That's a sizeable chunk of your 1.5 seconds.

Solve the I/O problem and your single message processing time will shrink quite a bit... and you might prevent some headaches later on... In my mind, there's no good reason for a directory scan to consistently take 200ms on today's hardware (or even yesterdays as indicated by my clunky p2/450, 256Mram, 4500rpm IDE system can do this in under 10ms!!). I think this may be a big problem waiting to bite you in other ways. I suspect that there are a lot of ugly things happening on your system that you may not yet know about.

Have you defragmented your HD recently?
How much free space do you have?
What other processes are running (especially heavy I/O work)?

That said... if it takes 1.5 seconds to process a message that arrives once every 5-10 seconds then the processing time is "in the noise" and does not represent a throughput problem. To put this in the frame of queue theory the service requirement is one message per 7 seconds and it takes 1.5 seconds to process the message then the "load" is ~22% indicating very short wait-times (relatively), rapid recovery from peaks, and no appreciable standing queue.

Sorry for the bandwidth and the rambling nature of this post - I was thinking out loud for much of it... (+ been up all night)

Hope this helps,
_M


At 07:14 AM 4/7/2004, you wrote:
Hmmm, log file from sniffer shows significant increase in performance (up to 50% faster, see below). However, according to my own logs, the total time that sniffer takes is way longer. During non-persistent operation about 300 ms on top of what sniffer logs, which could be because of loading times of sniffer itself. When sniffer is persistent, 'loading' time is about 1.5 seconds.
 
My conclusion from this, is that when sniffer is running persistent, cpu usage and rulebase loading times are decreased but total execution time seems to have tripled from about 550 ms to about 1650 ms.
 
To calculate the total execution time, I store system time in ms just before and after ShellExecuteEx() and calculate the difference. That seems like an honest and reliable way to determine execution time for sniffer.
 
sniffer log:
h0t861s4 20040407080330 md50000581512.msg 265 32 Clean 0 0 0 2214 32
h0t861s4 20040407080340 md50000581513.msg 265 16 Clean 0 0 0 1503 35
h0t861s4 20040407080356 md50000581514.msg 282 78 Clean 0 0 0 13664 40
h0t861s4 20040407080408 md50000581515.msg 265 110 Clean 0 0 0 26929 44
h0t861s4 20040407080412 md50000581516.msg 281 32 Clean 0 0 0 2199 35
h0t861s4 20040407080422 md50000581517.msg 281 16 Final 33612 54 0 2520 40
h0t861s4 20040407080426 md50000581518.msg 250 31 Clean 0 0 0 2636 35
h0t861s4 20040407080431 md50000581519.msg 266 31 Clean 0 0 0 5913 41
h0t861s4 20040407080436 md50000581520.msg 188 46 Final 105667 52 0 3522 41
h0t861s4 20040407080446 md50000581521.msg 109 32 Clean 0 0 0 2152 36
h0t861s4 20040407080454 md50000581522.msg 125 47 Clean 0 0 0 4083 35
h0t861s4 20040407080506 md50000581523.msg 187 47 Clean 0 0 0 5205 32
h0t861s4 20040407080514 md50000581524.msg 188 47 Clean 0 0 0 5632 34
h0t861s4 20040407080524 md50000581525.msg 188 109 Clean 0 0 0 24763 43
h0t861s4 20040407080531 md50000581526.msg 188 47 Final 105667 52 0 2742 39
h0t861s4 20040407080538 md50000581527.msg 188 16 Clean 0 0 0 1967 35
h0t861s4 20040407080550 md50000581528.msg 187 125 Clean 0 0 0 24718 50
h0t861s4 20040407080557 md50000581529.msg 187 32 Clean 0 0 0 3236 34
h0t861s4 20040407080607 md50000581530.msg 125 31 Clean 0 0 0 2918 32
h0t861s4 20040407080620 md50000581531.msg 187 32 Final 105073 50 0 2374 44
h0t861s4 20040407080632 md50000581532.msg 188 15 Clean 0 0 0 3611 33
h0t861s4 20040407080638 md50000581533.msg 125 125 Clean 0 0 0 27568 45
h0t861s4 20040407080650 md50000581534.msg 187 78 Clean 0 0 0 16155 33
I'm really puzzled about the cause for the extra delays.
 
 
Groet, (regards)
------------------------------------------
ing. Michiel Prins bsc   [EMAIL PROTECTED]
SOS Small Office Solutions / Reject / ****
Wannepad 27   -   1066 HW   -    Amsterdam
t.+31(0)20-4082627  -  f.+31-(0)20-4082628
------------------------------------------
Consultancy -  Installation -  Maintenance
Network Security   -  Internet  -   E-mail
Software Development -  Project Management
------------------------------------------
 
 


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Pete McNeil
Sent: woensdag 7 april 2004 11:21
To: [EMAIL PROTECTED]
Subject: RE: [sniffer] Final beta (b2) for snfrv2r3

What does the sniffer log show during this time?
_M

At 04:48 AM 4/7/2004, you wrote:

Pete,

Despite my suggestions with less polling time, I can't seem to get the persistent version to speed up my message processing. I've copied part of my custom log file below. Bold numbers are the amount of ms it takes to execute sniffer (timed by an external program that executes it). Persistent sniffer was turned ON on the
blue lines. I've set max polling time to 50ms for this test. However, scanning takes more than a second longer...

0,"2004-04-07 10:03:31",md50000581512.msg,672,546,78,0,2223,0,0,3,1
0,"2004-04-07 10:03:40",md50000581513.msg,657,531,93,0,1490,0,0,3,1
0,"2004-04-07 10:03:57",md50000581514.msg,734,594,93,0,14601,0,0,3,1
0,"2004-04-07 10:04:09",md50000581515.msg,797,624,93,0,29398,0,0,3,1
0,"2004-04-07 10:04:13",md50000581516.msg,686,562,93,0,42408,2,0,3,1
0,"2004-04-07 10:04:22",md50000581517.msg,749,547,93,0,2611,1,0,3,1
0,"2004-04-07 10:04:26",md50000581518.msg,656,532,93,0,43402,2,0,3,1
0,"2004-04-07 10:04:32",md50000581519.msg,671,547,93,0,6022,0,0,3,1
0,"2004-04-07 10:04:37",md50000581520.msg,1905,1672,92,0,3564,1,0,3,1
0,"2004-04-07 10:04:47",md50000581521.msg,1811,1688,93,0,2152,0,0,3,1
0,"2004-04-07 10:04:55",md50000581522.msg,1811,1688,78,0,4122,0,0,3,1
0,"2004-04-07 10:05:05",md50000581523.msg,1843,1671,93,0,5250,0,0,3,1
0,"2004-04-07 10:05:13",md50000581524.msg,1811,1688,78,0,5677,0,0,3,1
0,"2004-04-07 10:05:21",md50000581525.msg,1797,1671,93,0,273387,0,0,3,1
0,"2004-04-07 10:05:30",md50000581526.msg,1891,1671,93,0,2760,1,0,3,1
0,"2004-04-07 10:05:37",md50000581527.msg,1811,1672,93,0,36384,2,0,3,1
0,"2004-04-07 10:05:49",md50000581528.msg,1796,1656,93,0,27065,0,0,3,1
0,"2004-04-07 10:05:56",md50000581529.msg,1812,1686,79,0,3554,2,0,3,1
0,"2004-04-07 10:06:06",md50000581530.msg,1843,1671,78,0,44939,2,0,3,1
0,"2004-04-07 10:06:19",md50000581531.msg,1874,1655,94,0,2363,1,0,3,1
0,"2004-04-07 10:06:31",md50000581532.msg,1811,1671,94,0,3670,0,0,3,1
0,"2004-04-07 10:06:39",md50000581533.msg,1812,1686,78,0,30054,0,0,3,1
0,"2004-04-07 10:06:52",md50000581534.msg,1812,1671,93,0,21543,0,0,3,1
0,"2004-04-07 10:07:10",md50000581535.msg,1891,1672,78,0,3390,0,0,3,1
0,"2004-04-07 10:07:15",md50000581536.msg,594,452,93,0,3184,0,0,3,1
0,"2004-04-07 10:07:25",md50000581537.msg,687,532,93,0,3971,0,0,3,1
0,"2004-04-07 10:07:29",md50000581538.msg,671,532,78,0,3954,0,0,3,1
0,"2004-04-07 10:07:35",md50000581539.msg,686,547,78,0,4251,0,0,3,1
0,"2004-04-07 10:07:39",md50000581540.msg,672,546,94,0,42536,2,0,3,1
0,"2004-04-07 10:07:43",md50000581541.msg,686,562,93,0,2573,0,0,3,1
0,"2004-04-07 10:07:47",md50000581542.msg,780,640,78,0,144294,0,0,3,1
0,"2004-04-07 10:07:57",md50000581543.msg,718,594,94,0,2628,0,0,3,1

I'm sort of puzzled about what's going on here... can you help?


Groet, (regards)
------------------------------------------
ing. Michiel Prins bsc   [EMAIL PROTECTED]
SOS Small Office Solutions / Reject / ****
Wannepad 27   -   1066 HW   -    Amsterdam
t.+31(0)20-4082627  -  f.+31-(0)20-4082628
------------------------------------------
Consultancy -  Installation -  Maintenance
Network Security   -  Internet  -   E-mail
Software Development -  Project Management
------------------------------------------


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Pete McNeil
Sent: woensdag 7 april 2004 5:36
To: [EMAIL PROTECTED]
Subject: [sniffer] Final beta (b2) for snfrv2r3

Hello folks,

I'm posting the final beta for version 2-3.
You can get it at the following location:

http://www.sortmonster.com/MessageSniffer/Betas/snfrv2r3b2-dist.zip

This version still spits out some monitoring data - the final production version will have this removed - but there will be no other changes (hopefully). This beta has all of the V2-3 features implemented.

Some of the nicest (I think) are:

- Stable Persistent Server Instance w/ Command Line controls.
- Command line log rotation.
- A shorter logging format is available.
- RulePanic feature for handling critical FP problems.

This version has been running and stable for the past few hours in testing and on our NT test server and appears solid as a rock. Please let me know how it does for you. Here is the major feature list from an excerpt in the main source file...

// 20040406 _M - Finished feature set for V2-3b2. This is the production candidate for // verison 2-3. Features now compete:
//
// * Persistent Server Instance Capability // * Implemented configuration file for extended features.
// * Three logging modes - Full, NoDups, SingleLine.
// * Command line rotate feature - active log renamed & timestamped:
sniffer rotate
// * Command line reload feature for persistent server instances: sniffer reload // * Command line stop feature for persistent server instances: sniffer stop // * Tunable persistent server lifetime - program exit after ### secs.
// * Tunable instance persistence - program reload after ### secs.
// * Tunable max job polling time - dynamic from 0ms to ### msecs.
// * Implemented RulePanic table - temporarily block chronic FP rules.

The distribution contains a win32 executable (P2 or better) as well as the source which _should_ compile on any linux or bsd system (I didn't test it
- but I don't expect any trouble.)

It would be a good idea for all users to take a look at the testbed in this distribution. There is a full configuration file with a good deal of documentation and some .CMD files the show how to use the new reload, rotate, and stop command line options for persistent servers.

We will create documentation on our web site once this moves from beta to production.

This version installs just like the usual sniffer executable and will act just like it (except for log format features if you wish). The real benefits happen when you launch an additional instance of the program as a persistent server. See the previous thread on this. There are a lot of ways to do this including running it as a service using the win2k toolkit, or one of many third party utilities.

See:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg00165.html

Everyone please report back. If there are no errors or problems for a few
days then I will strip out the monitoring code and create the final 2-3
distribution.

Good Luck!
Thanks!
_M


This E-Mail came from the Message Sniffer mailing list. For information and (un)subscription instructions go to http://www.sortmonster.com/MessageSniffer/Help/Help.html

Reply via email to