The cluster is in a specific VLAN. The new switch doesn't have that VLAN
sourced to it yet, but it will after a change. I want to try that first
before we go any further.

 

I did find a few things on Google about NC373i NICs and network
issues. The only other thing I can see is that the NIC firmware and the
BIOS firmware need an update on each node (each is about one revision
behind).


 

But both servers connected to the same switch and seeing the same problem
at the same time doesn't sound like a server issue to me, at least not
right now.
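
A rough way to back up the "same problem at the same time" reasoning is to
line up the lost-ping timestamps from both nodes and see whether the loss
windows coincide. The following is only a minimal sketch: the function
names, the 5-second grouping gap, and the 5-second slack are assumptions
made for illustration, and the timestamps would have to come from whatever
ping logging is already in place.

```python
from datetime import datetime, timedelta


def loss_windows(timestamps, gap=timedelta(seconds=5)):
    """Group lost-ping timestamps into (start, end) windows.

    Two losses belong to the same window when they are at most `gap` apart.
    """
    windows = []
    for ts in sorted(timestamps):
        if windows and ts - windows[-1][1] <= gap:
            # Extend the current window to include this loss.
            windows[-1] = (windows[-1][0], ts)
        else:
            windows.append((ts, ts))
    return windows


def windows_overlap(a, b, slack=timedelta(seconds=5)):
    """True if windows a and b overlap, allowing `slack` for clock skew."""
    return a[0] - slack <= b[1] and b[0] - slack <= a[1]


def shared_windows(node_a_losses, node_b_losses):
    """Return pairs of loss windows that both nodes saw at about the same time."""
    return [
        (wa, wb)
        for wa in loss_windows(node_a_losses)
        for wb in loss_windows(node_b_losses)
        if windows_overlap(wa, wb)
    ]
```

If nearly every loss window on one node has a matching window on the other,
that points at something the two nodes share (switch, VLAN, uplink) rather
than at either server.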

 

Z

Edward E. Ziots
CISSP, Network +, Security +
Network Engineer
Lifespan Organization
Email:[email protected]
Cell:401-639-3505

 

From: William Robbins [mailto:[email protected]] 
Sent: Friday, February 18, 2011 11:46 AM
To: NT System Admin Issues
Subject: Re: Sounding board on issue we are seeing with a Windows 2003
Cluster with SQL 2005

 

I agree. I would think that if it were a physical NIC problem you would
see the dropped packets...

So your private NICs are directly connected?  Are you running a trace on
both machines watching the private NICs?

The specific timeframe of the disconnect doesn't lend itself to a NIC
problem either.  Hopefully you can change switches to see if that
changes anything.  Any hope of getting the cluster on an isolated switch?
Is it in a specific VLAN now?

 - WJR



On Fri, Feb 18, 2011 at 10:35, Ziots, Edward <[email protected]>
wrote:

Not going to tell anyone to STFU; that's why I am asking for a sounding
board. Right now I am at my wit's end. I also agree on the switch issue.
I ran across a few internet posts complaining about NC373i and HP
Broadcom NICs and lost packets, and I have some action items to update
the BIOS on the server and the NIC firmware to the latest supported
version and see if that helps. But I would definitely like to try moving
to another switch first to eliminate the switch as the issue.

Here is the real kicker, though: if it were a physical NIC issue,
wouldn't I also see dropped packets on the private NIC? We didn't see
any (although the private NICs are connected via a crossover cable).

Also, I replaced the primary NIC cables and verified the other cables
are fine (cable tester), so the only thing I can say right now is that,
if it is a NIC issue, it is not showing itself to me. (I am not sure how
you could add another NIC to the server and then make it the public NIC
without breaking the cluster itself or bringing the clustered groups
down.)


Z

Edward E. Ziots
CISSP, Network +, Security +
Network Engineer
Lifespan Organization
Email:[email protected]
Cell:401-639-3505



-----Original Message-----
From: William Robbins [mailto:[email protected]]
Sent: Friday, February 18, 2011 10:33 AM
To: NT System Admin Issues
Subject: Re: Sounding board on issue we are seeing with a Windows 2003
Cluster with SQL 2005

Also that's a very specific timeframe...even if it's not backups on
the cluster, could there be a backup or scheduled task on another
server on the same switch in that timeframe?
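
Before hunting for a job on a neighbouring server, it may help to confirm
how tightly the losses cluster by time of day across several nights. This
is just a sketch under assumptions: the function names are made up here,
and the input is assumed to be a list of timestamps of lost pings collected
over more than one day.

```python
from collections import defaultdict
from datetime import datetime, time


def daily_loss_span(timestamps):
    """Map each date to the (earliest, latest) time of day a ping was lost."""
    by_day = defaultdict(list)
    for ts in timestamps:
        by_day[ts.date()].append(ts.time())
    return {day: (min(times), max(times)) for day, times in by_day.items()}


def recurs_within(timestamps, start, end):
    """True if, on every day with losses, all losses fall between start and end."""
    spans = daily_loss_span(timestamps)
    return bool(spans) and all(
        start <= first and last <= end for first, last in spans.values()
    )
```

For example, calling `recurs_within(losses, time(17, 50), time(18, 0))`
checks whether every night's losses stay inside the 5:50-6:00pm window; if
they do, any scheduled task on that switch starting around then becomes a
candidate worth looking at.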

Feel free to tell me to STFU...I'm just spitballing.  :)

 - WJR



On Fri, Feb 18, 2011 at 07:48, Ziots, Edward <[email protected]>
wrote:
> I have a two-node x64 Windows 2003 SP2 Enterprise Edition cluster
> running SQL 2005 Standard Edition 64-bit.
>
> What I am seeing is event IDs 1123 and 1124 in the event logs on each
> cluster node, and we are getting complaints of disconnects from the
> database.
>
> We are seeing it happen around 5:50-6:00pm each night (it shows in the
> cluster log and we saw it via pings).
>
> 1)      We have eliminated the backup of the server, which happens at
> 3:30am (via Legato).
>
> 2)      I have gone through the entire KB 892422, which covers these
> errors, with Microsoft Support.
>
> 3)      I have switched out the cables to the public and the private
> NICs with no change in the issue.
>
> 4)      RSS/TCP Chimney are disabled in the registry and on the NICs on
> each node.
>
> 5)      NIC drivers are the latest from the HP site (NC373i), and each
> node runs EMC PowerPath software 5.3 SP1 for the SAN disk.
>
> Basically we are pinging the owning node server from our workstations
> and we lose about 5-10 pings during this time, on both the primary and
> the secondary nodes of the cluster (both are connected to the same
> Cisco 45xx switch).
>
> We were also pinging each of the servers from the other (both on the
> same switch/VLAN) and we saw the ping loss at the same time.
>
> The only idea I had is to move the public NICs to another switch to
> eliminate the switch as the point of contention, or to get new hardware,
> migrate the databases off this cluster, and decommission it.
>
> I checked other cluster nodes connected to these switches (32-bit) and
> we don't see this problem.
>
> Anything I might be missing or have overlooked? Questions, or bouncing
> some ideas off the wall, are appreciated...
>
>
>
> Z
>
> Edward E. Ziots
> CISSP, Network +, Security +
> Network Engineer
> Lifespan Organization
> Email:[email protected]
> Cell:401-639-3505
>
>
>
> ~ Finally, powerful endpoint security that ISN'T a resource hog! ~
> ~ <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/>  ~
>
> ---
> To manage subscriptions click here:
> http://lyris.sunbelt-software.com/read/my_forums/
> or send an email to [email protected]
> with the body: unsubscribe ntsysadmin
