Getting back to basics, you might look in the Task Scheduler to see if
there are any recurring tasks.  In addition, you might run the "at"
command at the command prompt to list any jobs scheduled that way.  You
might also check the Task Scheduler's logs.  Something with this
consistency almost screams scheduled task.

-----Original Message-----
From: Stephen Wimberly [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, July 16, 2008 1:34 PM
To: NT System Admin Issues
Subject: RE: Disconnected on a schedule???

We will "un-team" in the next couple of days as a test; but keep in mind
the SQL Server is teamed using the same NICs as well with no issues,
that's why it hasn't been suspect yet.

I'm going to look into the firmware tomorrow morning when we have
scheduled downtime, thanks for mentioning.

As for a software firewall: we normally run the Windows firewall, but
turned that off for testing with no change.

The problem occurred again today at 1:15 PM.  It seems that Windows
Explorer 'freezes' on almost all domain computers and no one can access
their file shares for a few seconds, until a reconnect can be
established.  One diagnostic script we have running appends to a text
file on the server every 15 seconds, and during the outage it could not
append for a full five minutes!

Network ports are not ours to swap; they belong to our network team.
Once they give the word we could try that.

There are hardware firewalls at play as well; the firewall team is
looking into those to determine possible issues with load balancing,
etc.

Thanks for your suggestions!



-----Original Message-----
From: Miller Bonnie L. [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 16, 2008 1:42 PM
To: NT System Admin Issues
Subject: RE: Disconnected on a schedule???

Hmm.. sounds like it's already been set then, but I don't know as I've
always done both the reg entry and the RSS on the Bcom NIC itself.  We
also are not using teaming at the moment, so I don't know if that might
have a separate issue.

Just re-read your post.  I see you mentioned all drivers updated, but
how about firmware?

Are you able to swap a network port the file server is using with the
SQL server that works?  What else is running on your file servers that
is the same across both--any software firewalls?

-----Original Message-----
From: Stephen Wimberly [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 16, 2008 8:23 AM
To: NT System Admin Issues
Subject: RE: Disconnected on a schedule???

All the registry entries are as you have them.... Although; my "Broadcom
BCM5708C NetXtreme II GigE" cards were set to ENABLE 'Receive Side
Scaling'.
I changed them to 'Disable'.  Each card disabled for a moment, then auto
re-enabled; so I assume this does not need a restart.

These servers have teamed NICs; all our servers do.  The BACS (BroadCom
Advanced Control Suite) is set up for switch failover as each NIC is
physically plugged to a different switch for failover.

-----Original Message-----
From: Miller Bonnie L. [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 16, 2008 10:29 AM
To: NT System Admin Issues
Subject: RE: Disconnected on a schedule???

They're in the same area of the registry--My .reg file that I import
looks like this:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"EnableTCPA"=dword:00000000
"EnableRSS"=dword:00000000
"EnableTCPChimney"=dword:00000000

Also, on the Broadcom NIC(s) properties, look at the advanced tab.  Make
sure "Receive Side Scaling" is set to Disable.

I haven't done the netsh method, but I understand that can change it
w/out needing a server reboot.
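
As a rough sketch only, the three values in the .reg file above could be
checked from an exported copy of the key.  The assumptions here: the
export contains only the Tcpip\Parameters key, a missing value is
treated as "not disabled", and the helper names (parse_dwords,
not_disabled) are made up for illustration.  A file exported by regedit
is normally UTF-16, so it would need to be read with that encoding
before being passed in.

```python
import re

# Value names taken from the .reg file in this message.
EXPECTED_DISABLED = ("EnableTCPA", "EnableRSS", "EnableTCPChimney")

# Matches lines such as "EnableRSS"=dword:00000000
DWORD_LINE = re.compile(r'^"(?P<name>[^"]+)"=dword:(?P<value>[0-9a-fA-F]{8})$')


def parse_dwords(reg_text):
    """Collect every "Name"=dword:XXXXXXXX line into a {name: int} dict."""
    values = {}
    for line in reg_text.splitlines():
        match = DWORD_LINE.match(line.strip())
        if match:
            values[match.group("name")] = int(match.group("value"), 16)
    return values


def not_disabled(reg_text, names=EXPECTED_DISABLED):
    """Return the names that are missing or set to something other than 0."""
    values = parse_dwords(reg_text)
    return [name for name in names if values.get(name) != 0]
```

An empty list from not_disabled() would mean all three values are
present and set to 0 in the exported text.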

-Bonnie

-----Original Message-----
From: Stephen Wimberly [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 16, 2008 7:23 AM
To: NT System Admin Issues
Subject: RE: Disconnected on a schedule???

Thanks Bonnie!

The TCP Chimney options are off!
(I had to look, @
HKLM\System\CurrentControlSet\Services\Tcpip\Parameters\EnableTCPChimney
=0
I've never configured them either way!)

The SNP I don't know how to check.  I see where I can use a netsh to set
it to disabled, but how would I see its current state?




-----Original Message-----
From: Miller Bonnie L. [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 16, 2008 8:56 AM
To: NT System Admin Issues
Subject: RE: Disconnected on a schedule???

Any kind of backup or snapshot taking place at those times?  Although I
can't say this would happen like clockwork, have you already disabled
the Chimney/SNP network options on those servers?



-Bonnie



From: Stephen Wimberly [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 16, 2008 5:51 AM
To: NT System Admin Issues
Subject: Disconnected on a schedule???



We have workstations that appear to be losing connection to the file
share on the server at almost precise times, every six hours.  7 AM, 1
PM, 7 PM, 1 AM; Repeat.



The event logs on the workstations and on the servers (domain
controllers and file share servers) are clean, so I assume the loss is
not long enough for the OS to recognize it.  However, we have a custom
application running on many machines that can't seem to handle the brief
outage and fails like clockwork.  The application vendor tells us it has
a sixty second timeout before it will fail; certainly long enough to
handle any brief disconnect.



Network traces (using wireshark) from the server to workstation and
workstation to server do not show any sign of failure.



A script that updates a text file on the server every fifteen seconds
does show the failure: it fails to update the text file for up to four
_minutes_ at a time!  During that four minute failure period it is
still able to update once or twice, so it's not a total blackout.
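
For anyone wanting to reproduce this kind of check, here is a minimal
sketch of such a heartbeat script.  It is not the script we actually
run; the timestamp format, the five second tolerance, and the function
names are all assumptions made for illustration.  The idea is that a
failed (or hung) append simply leaves a hole in the log, which
find_gaps() then reports.

```python
import time
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed log line format


def append_heartbeat(path, now=None):
    """Append one timestamp line to the log file on the share."""
    now = now or datetime.now()
    with open(path, "a") as log:
        log.write(now.strftime(STAMP_FORMAT) + "\n")


def find_gaps(lines, interval_seconds=15, tolerance_seconds=5):
    """Return (previous, current, gap) for consecutive stamps that are
    further apart than interval + tolerance."""
    limit = timedelta(seconds=interval_seconds + tolerance_seconds)
    stamps = [
        datetime.strptime(line.strip(), STAMP_FORMAT)
        for line in lines
        if line.strip()
    ]
    gaps = []
    for previous, current in zip(stamps, stamps[1:]):
        if current - previous > limit:
            gaps.append((previous, current, current - previous))
    return gaps


def run(path, interval_seconds=15):
    """Write heartbeats until interrupted; a failed append is a missing line."""
    while True:
        try:
            append_heartbeat(path)
        except OSError:
            pass  # share unreachable; the hole shows up later in find_gaps()
        time.sleep(interval_seconds)
```

Running run() against a path on the mapped drive and later feeding the
file's lines to find_gaps() gives the start, end, and length of each
period where the share could not be written.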



Workstations map a drive to the file share using a DFS path, i.e.
\\domain\share.  So we tested a direct mapping using \\server\share,
and we get the same result.



We mapped drives to two different file servers, each file server is in a
different building on different ends of campus.  The workstations used
four test drive mappings, two for each server, one DFS on each server
and one direct for each server.  All four drive mappings failed at the
same time.



The connection to the SQL server is never lost.  The SQL server is
plugged into the same network switch as the file server.



The Windows Domain has no trusts; it's a single domain forest.  There
are no services on any server with a six hour schedule that we know of.
Backup runs daily at midnight and completes prior to 7 AM.  Virus scan
is still running at the 7 AM hour, but is long since complete by the 1
PM hour.



Both file servers are Dell PE 2950 running Windows Server 2003 R2; All
drivers seem up to date with Dell's support site.



Workstations are a variety of makes, running Windows XP Pro SP2,
Windows XP Pro SP3, or Windows Vista SP1, and are scattered all over
campus on different network subnets.



Our network department is telling us that the network is fine, it's
either a workstation or a server issue.



Anyone seen this type of thing before???



Thanks!


~ Upgrade to Next Generation Antispam/Antivirus with Ninja!    ~
~ <http://www.sunbelt-software.com/SunbeltMessagingNinja.cfm>  ~

