Me either. When will the change occur? - WJR
On Fri, Feb 18, 2011 at 13:05, Ziots, Edward <[email protected]> wrote:
> Cluster is in a specific VLAN, the new switch doesn’t have that VLAN
> sourced to it yet, but it will after a change. I want to try that first,
> before we move further.
>
> I did catch a few things on Google about NC373i NICs and network issues.
> The only other thing I can see is that the NIC firmware and the BIOS
> firmware need an update on each node (about one revision behind).
>
> But both servers connected into the same switch, seeing the same problem at
> the same time, doesn’t sound like a server issue to me, at least not right
> now.
>
> Z
>
> Edward E. Ziots
> CISSP, Network +, Security +
> Network Engineer
> Lifespan Organization
> Email:[email protected]
> Cell:401-639-3505
>
> *From:* William Robbins [mailto:[email protected]]
> *Sent:* Friday, February 18, 2011 11:46 AM
> *To:* NT System Admin Issues
> *Subject:* Re: Sounding board on issue we are seeing with a Windows 2003
> Cluster with SQL 2005
>
> I agree I would *think* if it were a physical NIC problem you could see
> the dropped packets...
>
> So your private NICs are direct connected? Are you running a trace on
> both machines watching the private NICs?
>
> The specific timeframe of the disconnect doesn't lend itself to a NIC
> problem either. Hopefully you can change switches to see if that changes
> anything. Any hopes to get the cluster on an isolated switch? Is it in a
> specific VLAN now?
>
> - WJR
>
> On Fri, Feb 18, 2011 at 10:35, Ziots, Edward <[email protected]> wrote:
>
> Not going to tell anyone to STFU, it's why I am asking for a sounding board.
> Right now I am at wits' end. I also agree on the switch issue. I ran across
> a few internet posts complaining about NC373i and HP Broadcom NICs and
> lost packets, and I got some action items to update the BIOS on the server
> and the NIC firmware to the latest supported version and see if that helps.
> But I would definitely like to try moving to another switch first to
> eliminate the switch as the issue.
>
> Here is the real kicker though: if it was a NIC issue (physical NIC
> issue), then wouldn't I also see the dropped packets on the private NIC,
> which we didn't see (even though they are connected via a crossover
> cable)?
>
> Also, I replaced the primary NIC cables and verified the other cables are
> fine (cable tester), so the only thing I can say right now is that it may be
> a NIC issue not showing itself to me. (I am not sure how you could add
> another NIC to the server and then make it the public NIC without breaking
> the cluster itself or bringing the clustered groups down.)
>
> Z
>
> Edward E. Ziots
> CISSP, Network +, Security +
> Network Engineer
> Lifespan Organization
> Email:[email protected]
> Cell:401-639-3505
>
> -----Original Message-----
> From: William Robbins [mailto:[email protected]]
> Sent: Friday, February 18, 2011 10:33 AM
> To: NT System Admin Issues
> Subject: Re: Sounding board on issue we are seeing with a Windows 2003
> Cluster with SQL 2005
>
> Also that's a very specific timeframe...even if it's not backups on
> the cluster, could there be a backup or scheduled task on another
> server on the same switch in that timeframe?
>
> Feel free to tell me to STFU...I'm just spitballing. :)
>
> - WJR
>
> On Fri, Feb 18, 2011 at 07:48, Ziots, Edward <[email protected]> wrote:
> > I have a two-node x64 Windows 2003 SP2 Enterprise Edition cluster running
> > SQL 2005 Standard Edition 64-bit.
> >
> > What I am seeing is event IDs 1123 and 1124 in the event logs on each
> > cluster node, and we are getting complaints of disconnects from the
> > database.
> >
> > We are seeing it happen around 5:50-6:00pm each night.
> > (It shows in the cluster log, and we saw it via pings.)
> >
> > 1) We have eliminated the backup of the server, which happens at 3:30am
> >    (via Legato).
> > 2) I have gone through the entire KB 892422, which covers these errors,
> >    with Microsoft Support.
> > 3) I have switched out the cables to the public and the private NICs
> >    with no change in issues.
> > 4) RSS/TCP Chimney are disabled in the registry and on the NICs on
> >    each node.
> > 5) NIC drivers are the latest from the HP site (NC373i), plus EMC
> >    PowerPath software 5.3 SP1 for the SAN disk on each node.
> >
> > Basically we are pinging the owning node server from our workstations and
> > we lose about 5-10 pings during this time, on both the primary and the
> > secondary nodes of the cluster (both are plugged into the same Cisco 45xx
> > switch).
> >
> > We were also pinging each server from the other (both on the same
> > switch/VLAN), and we saw the ping loss at the same time.
> >
> > The only idea I had is to move the public NICs to another switch to
> > eliminate the switch as the point of contention, or get new hardware and
> > migrate the databases off this cluster and decommission it.
> >
> > I checked other cluster nodes connected to these switches (32-bit) and we
> > don't see this problem.
> >
> > Anything I might be missing or have overlooked? Questions or bouncing some
> > ideas off the wall are appreciated...
> >
> > Z
> >
> > Edward E. Ziots
> > CISSP, Network +, Security +
> > Network Engineer
> > Lifespan Organization
> > Email:[email protected]
> > Cell:401-639-3505
> >
> > ~ Finally, powerful endpoint security that ISN'T a resource hog! ~
> > ~ <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/> ~
> >
> > ---
> > To manage subscriptions click here:
> > http://lyris.sunbelt-software.com/read/my_forums/
> > or send an email to [email protected]
> > with the body: unsubscribe ntsysadmin
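
For anyone following along: one rough way to check the "both nodes drop pings at the same time" observation from the original post is to log one ping result per second against each node and then compare the loss windows. The sketch below is Python and covers only the comparison step. The names `loss_windows` and `shared_windows`, the 5-second slack, and the sample data are all made up for illustration; none of it comes from the thread or from any cluster tool.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# One sample per ping: (time the ping was sent, True if a reply came back).
Sample = Tuple[datetime, bool]
# One loss window: (first lost ping, last lost ping, number of lost pings).
Window = Tuple[datetime, datetime, int]


def loss_windows(samples: List[Sample], min_lost: int = 1) -> List[Window]:
    """Group consecutive lost pings into windows of at least `min_lost` pings."""
    windows: List[Window] = []
    start = last = None
    count = 0
    for when, ok in sorted(samples, key=lambda s: s[0]):
        if not ok:
            if start is None:
                start = when
            last = when
            count += 1
        elif start is not None:
            # A reply arrived, so the current run of losses has ended.
            if count >= min_lost:
                windows.append((start, last, count))
            start = last = None
            count = 0
    # Handle a run of losses that continues to the end of the log.
    if start is not None and count >= min_lost:
        windows.append((start, last, count))
    return windows


def shared_windows(a: List[Window], b: List[Window],
                   slack: timedelta = timedelta(seconds=5)):
    """Pair up windows from two hosts that overlap, allowing `slack` of clock drift."""
    return [(wa, wb) for wa in a for wb in b
            if wa[0] <= wb[1] + slack and wb[0] <= wa[1] + slack]


if __name__ == "__main__":
    # Synthetic data only: 20 one-second pings per node, with a made-up outage.
    base = datetime(2011, 2, 17, 17, 50, 0)
    node1 = [(base + timedelta(seconds=i), not (5 <= i <= 11)) for i in range(20)]
    node2 = [(base + timedelta(seconds=i), not (6 <= i <= 12)) for i in range(20)]
    for wa, wb in shared_windows(loss_windows(node1), loss_windows(node2)):
        print("node1 lost", wa[2], "pings from", wa[0], "| node2 lost", wb[2], "from", wb[0])
```

If the windows from both nodes (and from any other host on the same switch) line up night after night, that points away from a single server's NIC and toward something shared, which is the same reasoning as in the thread. Feeding it real data would mean logging timestamped ping results yourself; that part is deliberately left out here.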
