I just looked through my history of the SNFclient.exe.err log and I was
a bit off. There were two small occurrences before 5/22, and both were
so small they were without a doubt unnoticeable. Here's my history
since installation on 4/10/2012 with the number of occurrences of "Could
Not Connect!" noted:
11/19/2012 (949 occurrences) - * No backup in excess of 2 minutes occurred
4/26/2012 (181 occurrences) - * No backup in excess of 2 minutes occurred
5/22 (3,568 occurrences) - * No backup in excess of 2 minutes occurred
5/31 (13,745 occurrences)
6/3 (2,630 occurrences) - * No backup in excess of 2 minutes occurred
6/11 (26,194 occurrences)
6/13 (28,633 occurrences)
6/14 (83,342 occurrences)
6/17 (5,952 occurrences) - * No backup in excess of 2 minutes occurred
6/21 (10,894 occurrences)
6/27 (34,959 occurrences)
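For anyone wanting to pull the same per-day counts out of their own SNFclient.exe.err, here is a minimal Python sketch. It assumes each log line begins with a 14-digit YYYYMMDDhhmmss timestamp; that layout is an assumption and is not confirmed anywhere in this thread, so adjust the pattern to whatever your log actually looks like. The sample lines are made up for illustration, and count_by_day is a hypothetical helper name.

```python
import re
from collections import Counter

# ASSUMPTION: each log line starts with a YYYYMMDDhhmmss timestamp.
# The real SNFclient.exe.err layout may differ; change this pattern to match.
TIMESTAMP = re.compile(r"^(\d{4})(\d{2})(\d{2})\d{6}")
MARKER = "Could Not Connect!"


def count_by_day(lines, marker=MARKER):
    """Count lines containing `marker`, grouped by the date in the timestamp."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue
        match = TIMESTAMP.match(line)
        # Lines without a recognizable timestamp are grouped under "unknown".
        day = "-".join(match.groups()) if match else "unknown"
        counts[day] += 1
    return counts


# Made-up sample lines, for illustration only (not taken from a real log).
sample = [
    "20130627140501, arg1=C:\\spool\\a.smd : Could Not Connect!",
    "20130627140502, arg1=C:\\spool\\b.smd : Could Not Connect!",
    "20130621090000, arg1=C:\\spool\\c.smd : Could Not Connect!",
    "20130621090001, arg1=C:\\spool\\d.smd : some other error",
]

for day, n in sorted(count_by_day(sample).items()):
    print(day, n)
```

Against a real log, passing an open file handle in place of `sample` should work the same way, since the function only iterates over lines.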
At peak times, we are probably doing between 30,000 and 40,000 messages
per hour. When this hit at peak hour yesterday, the server was backed
up 2 minutes within 10 minutes of this starting. If it happens outside
of business hours, it's likely not seen at all. The backup was resolved
within 1 hour using different Declude settings, but Sniffer wasn't
restarted for 2 hours and 45 minutes after the problem started. So
those ~35,000 occurrences were from 165 minutes of email.
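To put that last figure in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes that every logged occurrence corresponds to one message and that traffic ran at the peak rate for the whole 165 minutes; neither is confirmed, so treat the percentages as ballpark only.

```python
def per_minute(count, minutes):
    """Average number of events per minute over a window."""
    return count / minutes


occurrences = 34_959            # "Could Not Connect!" count logged for 6/27
window_minutes = 2 * 60 + 45    # 2 hours 45 minutes = 165 minutes

failure_rate = per_minute(occurrences, window_minutes)
print(f"about {failure_rate:.0f} occurrences per minute")

# ASSUMPTION: one occurrence == one message, and traffic stayed at the
# quoted peak range (30,000 to 40,000 messages per hour) the whole time.
for hourly in (30_000, 40_000):
    share = failure_rate / per_minute(hourly, 60)
    print(f"at {hourly:,} msgs/hour: roughly {share:.0%} of messages")
```

If real traffic during the window was below peak, the share of affected messages would be correspondingly higher.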
After looking through my logs, it seems that I rebooted for hardly
any of these events. I tend to not want to take the server down outside
of rush hour. I do tweak Declude for rush processing when backed up, so
Declude is restarted, which stops it from sending files to Sniffer for a
period of time. It seems that Sniffer is mostly healing itself without
a restart of its service, though maybe the updates cause a service
restart/reset?
I also checked and found that we added that larger client on 6/1, and
outside of that, customer counts have been fairly consistent.
Matt
On 6/28/2013 2:56 PM, e...@protologic.com wrote:
Matt:
Coincidentally (I hope) this happened to us on the 22nd also. It did
not stop working completely although we didn't get the throughput you
did. We also saw the messages indicating it was not able to open the
file. Pretty much the same message as in your first post and not one
I've seen before.
Eric
On 2013-06-28, at 11:39 AM, Matt wrote:
> Eric,
>
> I'm guessing, based on what you were seeing, that it was unrelated to
> what I was seeing. Sniffer never actually died; it just got over 100
> times slower, and 1/8th of the time it timed out. This never happened
> before 5/22, and this same server has been there for years, with the
> same installation of Sniffer for 2 years or so. I would think that if
> the issue was I/O (under normal conditions), it would have happened
> before 5/22, as there were clearly bursty periods often enough before
> then, and my own traffic didn't change dramatically enough to explain
> it happening 4 to 5 times in one month.
>
> The server itself could have some issues that could be causing this.
> Maybe the file system is screwy, or Windows itself, or memory errors,
> or whatever.
>
> Matt
>
>
> On 6/28/2013 2:12 PM, E. H. (Eric) Fletcher wrote:
>> Matt:
>>
>> I mentioned in a previous post that we had experienced something
>> similar at about that time and resolved it a day or so later by
>> re-installing Sniffer when service restarts, reboots and some basic
>> troubleshooting did not give us the results we needed. At this point
>> that still seems to have been effective (about 5 days now).
>>
>> At the time, we did move things around to see whether it was related
>> to the number of items in the queue or anywhere else within the
>> structure of the mail system, and found it made no difference. A
>> single item arriving in an empty queue was still not processed. CPU
>> utilization was modest (single digit across 4 cores) and disk I/O was
>> lighter than usual, as it took place over a weekend. Memory
>> utilization was a little higher than I'd like to see; we are
>> addressing that now.
>>
>> Following a suggestion from another ISP, we moved the spool folders
>> onto a RAM drive a couple of months ago. That has worked well for us.
>> We did rule it out as the source of the problem by moving back onto
>> the conventional hard disk during the last part of the
>> troubleshooting and for the first hour or two following the reload.
>> We are processing on the RAM disk now and have been for over 4 days
>> again.
>>
>> For what it's worth . . .
>>
>> Eric
>>
>>
>> -----Original Message-----
>> From: Message Sniffer Community [mailto:sniffer@sortmonster.com] On Behalf Of Matt
>> Sent: Friday, June 28, 2013 10:32 AM
>> To: Message Sniffer Community
>> Subject: [sniffer] Re: Slow processing times, errors
>>
>> Pete,
>>
>> Just after the restart of the Sniffer service, times dropped back
>> down into the ms from 30+ seconds before, so what I am saying is that
>> if I/O was the issue, it was merely the trigger for something that
>> put the service in a bad state when it started. I/O issues are not
>> persistent, but could happen from time to time, I'm sure. Restarting
>> Sniffer with a backlog of 2,500 messages and normal peak traffic will
>> not re-trigger the condition, and I press Declude to run up to 300
>> messages at a time in situations like that, and the CPUs are pegged
>> until the backlog clears. In the past, I restarted the whole system,
>> not knowing why it worked. During normal peak times (without bursts),
>> Declude is processing about 125 messages at a time, which take an
>> average of 6 seconds to fully process, and therefore Sniffer is
>> probably handling only about 10 messages at a time (at peak).
>>
>> Since 5/22 I have seen 4 or 5 different events like this, and I
>> confirmed that they are all present in the SNFclient.exe.err log.
>>
>> Matt
>>
>>
>>
>> On 6/28/2013 12:41 PM, Pete McNeil wrote:
>>> On 2013-06-28 12:10, Matt wrote:
>>>> I am looking to retool presently just because it's time. So if you
>>>> are convinced that this is due to low resources, don't concern
>>>> yourself with it.
>>> Ok. It makes sense that the ~200 messages all at once could have
>>> happened at the restart. SNFClient will keep trying for 30-90 seconds
>>> before it gives up and spits out its error file. That's where your
>>> delays are coming from. SNF itself was clocking only about 100-800ms
>>> for all of the scans.
>>>
>>> The error result you report is exactly the one sent by SNF -- that it
>>> was unable to open the file.
>>>
>>> I am very sure this is resource related -- your scans should not be
>>> taking the amount of time they are and I suspect most of that time is
>>> eaten up trying to get to the files. The occasional errors of the same
>>> time are a good hint that IO is to blame.
>>>
>>> The new spam that we've seen often includes large messages -- so
>>> that's going to put a higher load on IO resources -- I'll bet that the
>>> increased volume and large message sizes are pushing IO over the edge
>>> or at least very close to it.
>>>
>>> Best,
>>>
>>> _M
>>
>>
>> #############################################################
>> This message is sent to you because you are subscribed to
>> the mailing list .
>> This list is for discussing Message Sniffer,
>> Anti-spam, Anti-Malware, and related email topics.
>> For More information see http://www.armresearch.com
>> To unsubscribe, E-mail to:
>> To switch to the DIGEST mode, E-mail to
>> To switch to the INDEX mode, E-mail to
>> Send administrative queries to