I just looked through my history of the SNFclient.exe.err log and I was a bit off. There were two small occurrences before 5/22, and both were so small they were without a doubt unnoticeable. Here's my history since installation on 4/10/2012 with the number of occurrences of "Could Not Connect!" noted:

   11/19/2012 (949 occurrences) - * No backup in excess of 2 minutes occurred
   4/26/2012 (181 occurrences) - * No backup in excess of 2 minutes occurred
   5/22 (3,568 occurrences) - * No backup in excess of 2 minutes occurred
   5/31 (13,745 occurrences)
   6/3 (2,630 occurrences) - * No backup in excess of 2 minutes occurred
   6/11 (26,194 occurrences)
   6/13 (28,633 occurrences)
   6/14 (83,342 occurrences)
   6/17 (5,952 occurrences) - * No backup in excess of 2 minutes occurred
   6/21 (10,894 occurrences)
   6/27 (34,959 occurrences)
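
For anyone wanting to reproduce a tally like the one above, here is a minimal sketch in Python. It only counts how often a phrase appears in a single file, and it assumes each failure writes the literal text "Could Not Connect!" to the log. It does not attempt the per-day grouping, since the layout of the SNFclient.exe.err file is not described here.

```python
def count_occurrences(log_path, phrase="Could Not Connect!"):
    """Count how many times `phrase` appears in the file at `log_path`."""
    total = 0
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(log_path, "r", errors="replace") as handle:
        for line in handle:
            # A single line could in principle hold the phrase more than once.
            total += line.count(phrase)
    return total
```

Usage would be something like count_occurrences(r"C:\SNF\SNFclient.exe.err"), where that path is hypothetical and the real location of the log has to be substituted.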

At peak times, we are probably doing between 30,000 and 40,000 messages per hour. When this hit at peak hour yesterday, the server was backed up 2 minutes within 10 minutes of this starting. If it happens outside of business hours, it's likely not seen at all. The backup was resolved within 1 hour using different Declude settings, but Sniffer wasn't restarted for 2 hours and 45 minutes after the problem started. So those ~35,000 occurrences were from 165 minutes of email.
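
As a rough check on those numbers, the arithmetic can be sketched in Python. The only inputs are the figures quoted above (34,959 occurrences on 6/27 and a window of 2 hours 45 minutes); the function name is just for illustration. This gives a rate of error lines, not a message failure rate, since whether one log line corresponds to exactly one message is not established here.

```python
def rate_per_minute(occurrences, hours, minutes):
    """Average occurrences per minute over a window of hours + minutes."""
    window_minutes = hours * 60 + minutes  # 2 h 45 min -> 165 minutes
    return occurrences / window_minutes

per_minute = rate_per_minute(34_959, hours=2, minutes=45)
per_hour = per_minute * 60

print(f"{per_minute:.1f} per minute, {per_hour:.0f} per hour")
# prints: 211.9 per minute, 12712 per hour
```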

After looking through my logs, it seems that I rebooted on hardly any of these occasions. I tend to not want to take the server down outside of rush hour. I do tweak Declude for rush processing when backed up, so Declude is restarted, which stops it sending files to Sniffer for a period of time. It seems that Sniffer is mostly healing itself without a restart of its service, though maybe the updates cause a service restart/reset?

I also checked and found that we added that larger client on 6/1, and outside of that, customer counts have been fairly consistent.

Matt


On 6/28/2013 2:56 PM, e...@protologic.com wrote:
Matt:
Coincidentally (I hope) this happened to us on the 22nd also. It did not stop working completely although we didn't get the throughput you did. We also saw the messages indicating it was not able to open the file. Pretty much the same message as in your first post and not one I've seen before.
Eric





On 2013-06-28, at 11:39 AM, Matt wrote:

> Eric,
>
> I'm guessing, based on what you were seeing, that it was unrelated to what I was seeing. Sniffer never actually died, it just got over 100 times slower, and 1/8th of the time it timed out. This never happened before 5/22, and this same server has been there for years, and the same installation of Sniffer for 2 years or so. I would think that if the issue were I/O (under normal conditions), it would have happened before 5/22: there were clearly bursty periods often enough before then, and my own traffic didn't change dramatically enough to explain it happening 4 to 5 times in one month.
>
> The server itself could have some issues that could be causing this. Maybe the file system is screwy, or Windows itself, or memory errors, or whatever.
>
> Matt
>
>
> On 6/28/2013 2:12 PM, E. H. (Eric) Fletcher wrote:
>> Matt:
>>
>> I mentioned in a previous post that we had experienced something similar at
>> about that time and resolved it a day or so later by re-installing sniffer
>> when service restarts, reboots and some basic troubleshooting did not give
>> us the results we needed. At this point that still seems to have been
>> effective (about 5 days now).
>>
>> At the time, we did move things around to see whether it was related to the
>> number of items in the queue or anywhere else within the structure of the
>> mail system and found it made no difference. A single item arriving in an
>> empty Queue was still not processed. CPU utilization was modest (single
>> digit across 4 cores) and disk I/O was lighter than usual as it took place
>> over a weekend. Memory utilization was a little higher than I'd like to
>> see, we are addressing that now.
>>
>> Following a suggestion from another ISP, we moved the spool folders onto a
>> RAM drive a couple of months ago. That has worked well for us, we did rule
>> it out as the source of the problem by moving back onto the conventional
>> hard disk during the last part of the troubleshooting and for the first hour
>> or two following the reload. We are processing on the Ramdisk now and have
>> been for over 4 days again.
>>
>> For what it's worth . . .
>>
>> Eric
>>
>>
>> -----Original Message-----
>> From: Message Sniffer Community [mailto:sniffer@sortmonster.com] On Behalf
>> Of Matt
>> Sent: Friday, June 28, 2013 10:32 AM
>> To: Message Sniffer Community
>> Subject: [sniffer] Re: Slow processing times, errors
>>
>> Pete,
>>
>> Just after the restart of the Sniffer service, times dropped back down into
>> the ms from 30+ seconds before, so what I am saying is that if I/O was the
>> issue, it was merely the trigger for something that put the service in a bad
>> state when it started. I/O issues are not persistent, but could happen from
>> time to time I'm sure. Restarting Sniffer with a backlog of 2,500 messages
>> and normal peak traffic will not re-trigger the condition, and I press
>> Declude to run up to 300 messages at a time in situations like that, and the
>> CPU's are pegged until the backlog clears. In the past, I restarted the
>> whole system, not knowing why it worked. During normal peak times (without
>> bursts), the Declude is processing about 125 messages at a time which take
>> an average of 6 seconds to fully process, and therefore Sniffer is probably
>> handling only about 10 messages at a time (at peak).
>>
>> Since 5/22 I have seen 4 or 5 different events like this, and I confirmed
>> that they are all present in the SNFclient.exe.err log.
>>
>> Matt
>>
>>
>>
>> On 6/28/2013 12:41 PM, Pete McNeil wrote:
>>> On 2013-06-28 12:10, Matt wrote:
>>>> I am looking to retool presently just because it's time. So if you
>>>> are convinced that this is due to low resources, don't concern
>>>> yourself with it.
>>> Ok. It makes sense that the ~200 messages all at once could have
>>> happened at the restart. SNFClient will keep trying for 30-90 seconds
>>> before it gives up and spits out its error file. That's where your
>>> delays are coming from. SNF itself was clocking only about 100-800ms
>>> for all of the scans.
>>>
>>> The error result you report is exactly the one sent by SNF -- that it
>>> was unable to open the file.
>>>
>>> I am very sure this is resource related -- your scans should not be
>>> taking the amount of time they are and I suspect most of that time is
>>> eaten up trying to get to the files. The occasional errors of the same
>>> type are a good hint that IO is to blame.
>>>
>>> The new spam that we've seen often includes large messages -- so
>>> that's going to put a higher load on IO resources -- I'll bet that the
>>> increased volume and large message sizes are pushing IO over the edge
>>> or at least very close to it.
>>>
>>> Best,
>>>
>>> _M
>>
>>
>> #############################################################
>> This message is sent to you because you are subscribed to
>> the mailing list .
>> This list is for discussing Message Sniffer, Anti-spam, Anti-Malware, and
>> related email topics.
>> For More information see http://www.armresearch.com To unsubscribe, E-mail
>> to: To switch to the DIGEST mode, E-mail to
>> To switch to the INDEX mode, E-mail to
>> Send administrative queries to
>>
>
>
>
>

