Re: [CentOS] Networking just stopped working

2010-07-11 Thread Christopher Chan

 Are there 'services' that the network 'depends' on, but which are
 started *later* than network?  Running 'service network restart' as a cure
 suggests this.  Do you have any special or custom init scripts relating
 to your bonding (maybe something that loads special kernel modules or
 something like that)?

Hmm, now that you mention it, I highly suspect the qemu/libvirt network,
but I have already shut down these two services along with dnsmasq. What
else would set up the 192.168.122.0 space?
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Networking just stopped working

2010-07-09 Thread Robert Heller
At Fri, 09 Jul 2010 10:30:06 +0800 CentOS mailing list centos@centos.org 
wrote:

 
 On Thursday, July 08, 2010 09:40 PM, JohnS wrote:
 
  On Thu, 2010-07-08 at 07:51 -0500, Les Mikesell wrote:
  I think some bridge or vlan scenarios require promiscuous mode (and the
  corresponding disabling of hardware acceleration).  Maybe the real issue 
  is that
 something accidentally disabled it and you now only work when tcpdump
  re-enables it.  I'm not sure how this is supposed to be managed atomically 
  when
  multiple programs may manipulate it and it needs to be propagated across
  multiple bonded nics, but maybe something went wrong there.  At least some
  things log the change so maybe you can get a hint about when it was turned 
  on
  and off.
  ---
 
  Check out /proc/net/bonding/YOUR_BOND (for example /proc/net/bonding/bond0).
  Make sure each slave's Aggregator ID matches the active Aggregator ID.  If
  not, it will cause the problem you're having.  It could also be bad NIC
  hardware; it's failing over for a reason, as the log showed.
 
 
 They check out. What did help besides running tcpdump forever was to do 
 a 'service network restart'. That made the network behave. I wonder 
 what's going on...

Are there 'services' that the network 'depends' on, but which are
started *later* than network?  Running 'service network restart' as a cure
suggests this.  Do you have any special or custom init scripts relating
to your bonding (maybe something that loads special kernel modules or
something like that)?
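
As a rough illustration of the start-order question above, the following
Python sketch lists which SysV start links run after the network script in a
runlevel directory. It assumes the usual CentOS 5 layout (/etc/rc.d/rc3.d with
links named SNNname); the function name started_after and the sample link
names are hypothetical, not taken from the affected box.

```python
def started_after(links, service="network"):
    """Return the S-links that sort after `service` in a runlevel directory.

    Assumption: links are named S<two digits><name>, and the rc script
    starts them in sorted order.
    """
    start_links = sorted(name for name in links if name.startswith("S"))
    # Find the start link whose name part (after "SNN") is the service.
    target = next((n for n in start_links if n[3:] == service), None)
    if target is None:
        return []
    return [n for n in start_links if n > target]


# Hypothetical directory listing; on a real box this could come from
# os.listdir("/etc/rc.d/rc3.d").
sample = ["S10network", "S08iptables", "K35dnsmasq", "S98libvirtd", "S12syslog"]
print(started_after(sample))
```

Anything in the printed list that touches interfaces, bridges or kernel
modules would be a candidate for the late-starting service described above.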

 


-- 
Robert Heller -- Get the Deepwoods Software FireFox Toolbar!
Deepwoods Software-- Linux Installation and Administration
http://www.deepsoft.com/  -- Web Hosting, with CGI and Database
hel...@deepsoft.com   -- Contract Programming: C/C++, Tcl/Tk


  


Re: [CentOS] Networking just stopped working

2010-07-08 Thread Kahlil Hodgson
On 08/07/10 15:41, Christopher Chan wrote:
 No new boxes. Not possible for any other box to be assigned the same
 ip internally via dhcp and definitely not the same Internet ip.

Exactly.  DHCP server would check for a conflict before assigning an
address and is definitely not the source of the problem.

 Perhaps you care to explain why BOTH vlan interfaces stopped working?
 The odd chance that two other boxes each took one of the other ip
 address?

Did not know that both had stopped. Conflicting IP addresses was just a
suggestion.  May not be the problem at all.  With bonding, breaking one
might break both down at the MAC level ...

Hmmm ... which bond mode are you using?

 The box with the problem just so happens to be the only box using 
 bonding, 802.1q and a four port Qlogic Netxen NIC. I think the
 chances of there being a problem between these three more likely than
 some 'ghost' boxes getting assigned the same ip addresses when I am
 the only admin around.

If you are the only admin, then it's not that likely.  Then again, I once
had a power spike reset a wireless router on my network without me
knowing.  Default settings were close but not quite right, and it took me
a couple of days to track down the problem :-(

If it was working, then suddenly stops, then something must have
changed.  I gather you have some configuration and change management
system in place?  Backups of conf files?

K




Re: [CentOS] Networking just stopped working

2010-07-08 Thread Christopher Chan

 Did not know that both had stopped. Conflicting IP addresses was just a
 suggestion.  May not be the problem at all.  With bonding, breaking one
 might break both down at the MAC level ...

 Hmmm ... which bond mode are you using?

Why, mode 4 of course.


 The box with the problem just so happens to be the only box using
 bonding, 802.1q and a four port Qlogic Netxen NIC. I think the
 chances of there being a problem between these three more likely than
 some 'ghost' boxes getting assigned the same ip addresses when I am
 the only admin around.

 If you are the only admin, then its not that likely.  Then again, I once
 had a power spike reset a wireless router on my network without me
 knowing.  Default settings were close by not quite right, and it took me
 a couple of days to track down the problem :-(

Too bad there are no defaults that use the subnet assigned to the school 
or the 192.168.0.0/16 (no, not my idea - inherited)



 If it was working, then suddenly stops, then something must have
 changed.  I gather you have some configuration and change management
 system in place?  Backups of conf files?


Hahaha, that was the best part. It just stopped. And stayed that way too 
after a reboot, reboot of switches and only started working again when I 
ran tcpdump for some reason.

But another colleague did find this in the iLo report:

Repaired  Network  07/06/2010 12:35  07/06/2010 12:00  2  Network Adapters Redundancy Reduced (Slot 10, Port 3)
Repaired  Network  07/06/2010 12:35  07/06/2010 12:00  2  Network Adapters Redundancy Reduced (Slot 10, Port 4)
Repaired  Network  07/06/2010 12:35  07/06/2010 12:00  2  Network Adapters Redundancy Reduced (Slot 10, Port 1)
Repaired  Network  07/06/2010 12:01  07/06/2010 12:00  1  Network Adapter Link Down (Slot 10, Port 2)

Time to ask the HP chap what this is all about.



Re: [CentOS] Networking just stopped working

2010-07-08 Thread Kahlil Hodgson
On 07/08/2010 05:08 PM, Christopher Chan wrote:
 Hmmm ... which bond mode are you using?
 
 Why mode 4 of course.

Ouch.  Never used that mode.

snip
mode=4 (802.3ad)
IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that
share the same speed and duplex settings. Utilizes all slaves in the
active aggregator according to the 802.3ad specification.

Pre-requisites:
1. Ethtool support in the base drivers for retrieving
the speed and duplex of each slave.
2. A switch that supports IEEE 802.3ad Dynamic link
aggregation.
Most switches will require some type of configuration
to enable 802.3ad mode.
/snip

So I gather the bonding on the CentOS box is cooperating with the
switches in some non-trivial fashion.

 Too bad there are no defaults that use the subnet assigned to the school 
 or the 192.168.0.0/16 (no, not my idea - inherited)

That is a big network.  Might make sense in a school though.  How many
nodes are on it?  Any chance a (ahem) staff member plugged an unauthorised
piece of hardware in somewhere?

 If it was working, then suddenly stops, then something must have
 changed.  I gather you have some configuration and change management
 system in place?  Backups of conf files?
 
 Hahaha, that was the best part. It just stopped. And stayed that way too 
 after a reboot, reboot of switches and only started working again when I 
 ran tcpdump for some reason.

tcpdump is probably putting your interface into promiscuous mode which
is triggering something. Perhaps ARP packets.

I think something (perhaps obscure) has changed, you may just not be
aware of it.  Comparing your event timeline against your configuration
change management systems may help.

 But another colleague did find this in the iLo report:

You're the only admin but you have a colleague with access to an iLo
report?  That puts a big question mark over a previous assertion :-)

 Repaired Network 07/06/2010 12:35 07/06/2010 12:00 2 Network Adapters 
 Redundancy Reduced (Slot 10, Port 3)
 
 Repaired Network 07/06/2010 12:35 07/06/2010 12:00 2 Network Adapters 
 Redundancy Reduced (Slot 10, Port 4)
 
 Repaired Network 07/06/2010 12:35 07/06/2010 12:00 2 Network Adapters 
 Redundancy Reduced (Slot 10, Port 1)
 
 Repaired Network 07/06/2010 12:01 07/06/2010 12:00 1 Network Adapter 
 Link Down (Slot 10, Port 2)
 
 Time to ask the HP chap what this is all about.

Looks like the bonding failover process is doing what it should.

A bit more info on your setup might help.

1. What is the purpose of the box with the fat network?
2. Are all 4 interfaces being used?
3. Are they plugged into the same switch?
4. You've got at least 2 networks, plus 2 vlans, plus a public internet
connection to this box?

K


Re: [CentOS] Networking just stopped working

2010-07-08 Thread Christopher Chan
On Thursday, July 08, 2010 05:09 PM, Kahlil Hodgson wrote:
 On 07/08/2010 05:08 PM, Christopher Chan wrote:
 Hmmm ... which bond mode are you using?

 Why mode 4 of course.

 Ouch.  Never used that mode.

Huh? Like why? It's the recommended mode unless the switch does not
support it or the boards don't.


 snip
 mode=4 (802.3ad)
 IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that
 share the same speed and duplex settings. Utilizes all slaves in the
 active aggregator according to the 802.3ad specification.

   Pre-requisites:
   1. Ethtool support in the base drivers for retrieving
   the speed and duplex of each slave.
   2. A switch that supports IEEE 802.3ad Dynamic link
   aggregation.
   Most switches will require some type of configuration
   to enable 802.3ad mode.
 /snip

 So I gather the bonding on the CentOS box is cooperating with the
 switches in some non-trivial fashion.

And it works just fine thank you very much.



 Too bad there are no defaults that use the subnet assigned to the school
 or the 192.168.0.0/16 (no, not my idea - inherited)

 That is a big network.  Might make sense in a school though.  How many
 nodes on it?  Any chance aahem  staff member plugged an unauthorised
 piece of hardware in somewhere.

Nada, zip, zilch. School is closed, and it has now been very reliably
demonstrated that running tcpdump makes the network behave and that it is
gone the moment you stop tcpdump. So there are no external factors in
this problem. I have been on the phone with HP. I will be upgrading the HP
packages to the latest version to see if that fixes things.


 If it was working, then suddenly stops, then something must have
 changed.  I gather you have some configuration and change management
 system in place?  Backups of conf files?

 Hahaha, that was the best part. It just stopped. And stayed that way too
 after a reboot, reboot of switches and only started working again when I
 ran tcpdump for some reason.

 tcpdump is probably putting your interface into promiscuous mode which
 is triggering something. Perhaps ARP packets.

Yeah, it is triggering something alright.



 I think something (perhaps obscure) has changed, you may just not be
 aware of it.  Comparing your event timeline against your configuration
 change management systems may help.

No changes have been made to the box whether by me or by my colleague at 
the HQ. I checked the logs too. No reboot prior to the manifestation of 
the problem. Really stumped here...



 But another colleague did find this in the iLo report:

 You're the only admin but you have a colleague with access to an iLo
 report?  That puts a big question mark over a previous assertion :-)

He is not physically on site so he cannot add anything. Nor have the 
logs shown anything done by him.


 Repaired Network 07/06/2010 12:35 07/06/2010 12:00 2 Network Adapters
 Redundancy Reduced (Slot 10, Port 3)

 Repaired Network 07/06/2010 12:35 07/06/2010 12:00 2 Network Adapters
 Redundancy Reduced (Slot 10, Port 4)

 Repaired Network 07/06/2010 12:35 07/06/2010 12:00 2 Network Adapters
 Redundancy Reduced (Slot 10, Port 1)

 Repaired Network 07/06/2010 12:01 07/06/2010 12:00 1 Network Adapter
 Link Down (Slot 10, Port 2)

 Time to ask the HP chap what this is all about.

 Looks like the bonding failover process is doing what it should.

 A bit more info on you setup might help.

 1. What is the purpose of the box with the fat network?

Besides being able to saturate the network, what other reason can there be?


 2. are all 4 interfaces being used?

Oh yes!


 3. are they plugged into the same switch?

Yup.


 4. you've got at least 2 networks, plus 2 vlans, plus a public internet
 connection to this box?


The vlans use bond0 as their phy interface. One vlan is internal and the 
other is the Internet subnet.


Re: [CentOS] Networking just stopped working

2010-07-08 Thread Chan Chung Hang Christopher
Christopher Chan wrote:
 On Thursday, July 08, 2010 05:09 PM, Kahlil Hodgson wrote:
 On 07/08/2010 05:08 PM, Christopher Chan wrote:
 Hmmm ... which bond mode are you using?
 Why mode 4 of course.
 Ouch.  Never used that mode.
 
 Huh? Like why? It's the recommended mode unless the switch does not 
 suppoprt it or the boards don't.
 

Oh sorry, got a bit grouchy there. I don't like overtime and was getting 
tired too. Did not read your mail properly.


Re: [CentOS] Networking just stopped working

2010-07-08 Thread Hakan Koseoglu
Hi Christopher,

On 08/07/10 10:25, Christopher Chan wrote:
 Why mode 4 of course.
 Huh? Like why? It's the recommended mode unless the switch does not
 suppoprt it or the boards don't.
I never realised this is the recommended mode. Do you have pointers to
where it is recommended so that I can read up on why?

Cheers


Re: [CentOS] Networking just stopped working

2010-07-08 Thread Les Mikesell
Chan Chung Hang Christopher wrote:
 Christopher Chan wrote:
 On Thursday, July 08, 2010 05:09 PM, Kahlil Hodgson wrote:
 On 07/08/2010 05:08 PM, Christopher Chan wrote:
 Hmmm ... which bond mode are you using?
 Why mode 4 of course.
 Ouch.  Never used that mode.
 Huh? Like why? It's the recommended mode unless the switch does not 
 suppoprt it or the boards don't.

 
 Oh sorry, got a bit grouchy there. I don't like overtime and was getting 
 tired too. Did not read your mail properly.


I think some bridge or vlan scenarios require promiscuous mode (and the
corresponding disabling of hardware acceleration).  Maybe the real issue is
that something accidentally disabled it and you now only work when tcpdump
re-enables it.  I'm not sure how this is supposed to be managed atomically
when multiple programs may manipulate it and it needs to be propagated across
multiple bonded nics, but maybe something went wrong there.  At least some
things log the change so maybe you can get a hint about when it was turned on
and off.
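
One way to watch for the promiscuous-mode change described above is to read
the interface flags the kernel exposes in sysfs. The Python sketch below is a
minimal example under assumptions: it relies on /sys/class/net/<iface>/flags
holding the flags as a hex string and on IFF_PROMISC being 0x100 (the value in
<linux/if.h>); the helper names and the interface names in the comment are
illustrative only.

```python
IFF_PROMISC = 0x100  # promiscuous bit, as defined in <linux/if.h>


def is_promiscuous(flags_text):
    """Return True if a hex flags string (e.g. '0x1103') has IFF_PROMISC set."""
    return bool(int(flags_text.strip(), 16) & IFF_PROMISC)


def read_flags(iface):
    """Read the raw flags string for one interface from sysfs (Linux only)."""
    with open(f"/sys/class/net/{iface}/flags") as fh:
        return fh.read()


# Example with literal values rather than a live interface:
print(is_promiscuous("0x1103"))  # promiscuous bit set
print(is_promiscuous("0x1003"))  # promiscuous bit clear

# On the affected box one might loop over the bond and its slaves, e.g.
# for name in ("bond0", "eth0", "eth1"): print(name, is_promiscuous(read_flags(name)))
```

Comparing the result with tcpdump running and stopped would show whether the
flag really is the only thing that changes.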

--
   Les Mikesell
lesmikes...@gmail.com


Re: [CentOS] Networking just stopped working

2010-07-08 Thread JohnS

On Thu, 2010-07-08 at 07:51 -0500, Les Mikesell wrote:
 I think some bridge or vlan scenarios require promiscuous mode (and the 
 corresponding disabling of hardware acceleration).  Maybe the real issue is 
 that 
   something accidentally disabled it and you now only work when tcpdump 
 re-enables it.  I'm not sure how this is supposed to be managed atomically 
 when 
 multiple programs may manipulate it and it needs to be propagated across 
 multiple bonded nics, but maybe something went wrong there.  At least some 
 things log the change so maybe you can get a hint about when it was turned on 
 and off.
---

Check out /proc/net/bonding/YOUR_BOND (for example /proc/net/bonding/bond0).
Make sure each slave's Aggregator ID matches the active Aggregator ID.  If
not, it will cause the problem you're having.  It could also be bad NIC
hardware; it's failing over for a reason, as the log showed.
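
The aggregator check above can be sketched in Python. This is only an
illustration: the sample text imitates the layout the bonding driver prints
for 802.3ad mode (an "Active Aggregator Info" section followed by one
"Slave Interface" section per NIC), the values in it are made up, and the
function names are hypothetical.

```python
def aggregator_ids(text):
    """Return (active_aggregator_id, {slave_name: aggregator_id})."""
    active = None
    slaves = {}
    current = None  # name of the slave section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Aggregator ID:"):
            value = int(line.split(":", 1)[1])
            if current is None:
                active = value  # seen before any slave section
            else:
                slaves[current] = value
    return active, slaves


def mismatched(text):
    """Return slaves whose Aggregator ID differs from the active one."""
    active, slaves = aggregator_ids(text)
    return sorted(name for name, agg in slaves.items() if agg != active)


# Made-up sample; on a real box the text would be read from
# /proc/net/bonding/bond0.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation

Active Aggregator Info:
    Aggregator ID: 1
    Number of ports: 1

Slave Interface: eth0
MII Status: up
Aggregator ID: 1

Slave Interface: eth1
MII Status: up
Aggregator ID: 2
"""

print(mismatched(SAMPLE))
```

An empty list would mean every slave has joined the active aggregator.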

John



Re: [CentOS] Networking just stopped working

2010-07-08 Thread Chan Chung Hang Christopher
Hakan Koseoglu wrote:
 HiChristopher,
 
 On 08/07/10 10:25, Christopher Chan wrote:
 Why mode 4 of course.
 Huh? Like why? It's the recommended mode unless the switch does not
 suppoprt it or the boards don't.
 I never realised this is the recommended mode. Do you have pointers 
 where it is recommended so that I can read on why?
 

Maybe 'the recommended' is a bit too much. But here is a read.

http://useopensource.blogspot.com/2010/02/linux-nic-teaming-recommendations.html




Re: [CentOS] Networking just stopped working

2010-07-08 Thread Chan Chung Hang Christopher
JohnS wrote:
 On Thu, 2010-07-08 at 07:51 -0500, Les Mikesell wrote:
 I think some bridge or vlan scenarios require promiscuous mode (and the 
 corresponding disabling of hardware acceleration).  Maybe the real issue is 
 that 
   something accidentally disabled it and you now only work when tcpdump 
 re-enables it.  I'm not sure how this is supposed to be managed atomically 
 when 
 multiple programs may manipulate it and it needs to be propagated across 
 multiple bonded nics, but maybe something went wrong there.  At least some 
 things log the change so maybe you can get a hint about when it was turned 
 on 
 and off.
 ---
 
 Check out /proc/net/bonding/YOUR_BOND (for example /proc/net/bonding/bond0).
 Make sure each slave's Aggregator ID matches the active Aggregator ID.  If
 not, it will cause the problem you're having.  It could also be bad NIC
 hardware; it's failing over for a reason, as the log showed.
 

Okay, I'll take a look tomorrow when I get in to work.


Re: [CentOS] Networking just stopped working

2010-07-08 Thread Chan Chung Hang Christopher
Les Mikesell wrote:
 Chan Chung Hang Christopher wrote:
 Christopher Chan wrote:
 On Thursday, July 08, 2010 05:09 PM, Kahlil Hodgson wrote:
 On 07/08/2010 05:08 PM, Christopher Chan wrote:
 Hmmm ... which bond mode are you using?
 Why mode 4 of course.
 Ouch.  Never used that mode.
 Huh? Like why? It's the recommended mode unless the switch does not 
 suppoprt it or the boards don't.

 Oh sorry, got a bit grouchy there. I don't like overtime and was getting 
 tired too. Did not read your mail properly.

 
 I think some bridge or vlan scenarios require promiscuous mode (and the 
 corresponding disabling of hardware acceleration).  Maybe the real issue is 
 that 
   something accidentally disabled it and you now only work when tcpdump 
 re-enables it.  I'm not sure how this is supposed to be managed atomically 
 when 
 multiple programs may manipulate it and it needs to be propagated across 
 multiple bonded nics, but maybe something went wrong there.  At least some 
 things log the change so maybe you can get a hint about when it was turned on 
 and off.
 

/me wonders if the loading of the bridge and another related module has 
anything to do with this.

I'll prepare a list of targets for rmmod.


Re: [CentOS] Networking just stopped working

2010-07-08 Thread Christopher Chan
On Thursday, July 08, 2010 09:40 PM, JohnS wrote:

 On Thu, 2010-07-08 at 07:51 -0500, Les Mikesell wrote:
 I think some bridge or vlan scenarios require promiscuous mode (and the
 corresponding disabling of hardware acceleration).  Maybe the real issue is 
 that
something accidentally disabled it and you now only work when tcpdump
 re-enables it.  I'm not sure how this is supposed to be managed atomically 
 when
 multiple programs may manipulate it and it needs to be propagated across
 multiple bonded nics, but maybe something went wrong there.  At least some
 things log the change so maybe you can get a hint about when it was turned on
 and off.
 ---

 Check out /proc/net/bonding/YOUR_BOND (for example /proc/net/bonding/bond0).
 Make sure each slave's Aggregator ID matches the active Aggregator ID.  If
 not, it will cause the problem you're having.  It could also be bad NIC
 hardware; it's failing over for a reason, as the log showed.


They check out. What did help besides running tcpdump forever was to do 
a 'service network restart'. That made the network behave. I wonder 
what's going on...


Re: [CentOS] Networking just stopped working

2010-07-07 Thread Kahlil Hodgson
On 06/07/10 22:48, Les Mikesell wrote:
 Chan Chung Hang Christopher wrote:
 Christopher Chan wrote:
 And now the thing is working again...

 It's not working again.

 Running tcpdump -i vlan seems to trigger something to get the network 
 working again but as soon as I stop tcpdump...nada, zip, zilch.


If you have two machines on the same network with the same IP address
you get behaviour like this.  Had this happen once when an engineer
reset a UPS and it took on the IP address of a main switch.
arpwatch is your friend.

K




Re: [CentOS] Networking just stopped working

2010-07-07 Thread Christopher Chan
On Thursday, July 08, 2010 09:26 AM, Kahlil Hodgson wrote:
 On 06/07/10 22:48, Les Mikesell wrote:
 Chan Chung Hang Christopher wrote:
 Christopher Chan wrote:
 And now the thing is working again...

 It's not working again.

 Running tcpdump -i vlan seems to trigger something to get the network
 working again but as soon as I stop tcpdump...nada, zip, zilch.


 If you have two machines on the same network with the same IP address
 you get behaviour like this.  Had this happen once when an engineer
 reset a UPSs and it took on the IP address of a main switch.
 arpwatch is your friend.


Unfortunately all addresses, both internal and Internet, on this box are 
static and assigned so there is no hope of a collision. The dhcp server 
does not serve any address in the same range that the box uses internally.


Re: [CentOS] Networking just stopped working

2010-07-07 Thread Kahlil Hodgson
On 08/07/10 14:58, Christopher Chan wrote:
 If you have two machines on the same network with the same IP address
 you get behaviour like this.  Had this happen once when an engineer
 reset a UPSs and it took on the IP address of a main switch.
 arpwatch is your friend.

 Unfortunately all addresses, both internal and Internet, on this box are 
 static and assigned so there is no hope of a collision. The dhcp server 
 does not serve any address in the same range that the box uses internally.

I was referring to the case where another box (or network device) on the
same network (i.e. plugged into the same switch/router/hub) has been
given a static IP address the same as that used by the problem box.
This could be a new server, a printer, a UPS, or any number of other
network devices.  It could also be a device being reset to factory
settings which conflicts with the problem box.

If you have another Linux machine on the same network that is not
having the same problem, try installing arpwatch.  It should pick up the
conflict within 30 minutes or so.
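
To make the duplicate-address idea concrete, here is a small Python sketch of
the kind of check a tool like arpwatch performs: given IP/MAC pairs seen on
the wire, report any IP claimed by more than one MAC. It is not arpwatch's
actual implementation; the function name and all addresses in the sample are
hypothetical.

```python
from collections import defaultdict


def conflicting_ips(sightings):
    """Map each IP seen with more than one MAC to its sorted MAC list."""
    macs_by_ip = defaultdict(set)
    for ip, mac in sightings:
        macs_by_ip[ip].add(mac.lower())  # normalise case before comparing
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical observations, e.g. collected from ARP replies on the segment.
seen = [
    ("192.168.1.10", "00:11:22:33:44:55"),
    ("192.168.1.10", "00:AA:BB:CC:DD:EE"),
    ("192.168.1.11", "00:11:22:33:44:66"),
]
print(conflicting_ips(seen))
```

An empty result over a reasonable observation window would argue against an
address conflict.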

K


Re: [CentOS] Networking just stopped working

2010-07-07 Thread Christopher Chan
On Thursday, July 08, 2010 01:32 PM, Kahlil Hodgson wrote:
 On 08/07/10 14:58, Christopher Chan wrote:
 If you have two machines on the same network with the same IP address
 you get behaviour like this.  Had this happen once when an engineer
 reset a UPSs and it took on the IP address of a main switch.
 arpwatch is your friend.

 Unfortunately all addresses, both internal and Internet, on this box are
 static and assigned so there is no hope of a collision. The dhcp server
 does not serve any address in the same range that the box uses internally.

 I was referring to the case where another box (or network device) on the
 same network (i.e. plugged into the same switch/router/hub) has been
 given a static IP address the same as that used by the problem box.
 This could be a new server, a printer, a UPS, or any number of other
 network devices.  It could also be a device being reset to factory
 settings which conflicts with the problem box.

No new boxes. Not possible for any other box to be assigned the same IP
internally via dhcp and definitely not the same Internet IP. Perhaps you
care to explain why BOTH vlan interfaces stopped working? The odd chance
that two other boxes each took one of the IP addresses?


 I'm you have another Linux machine on the same network that is not
 having the same problem, try installing arpwatch.  It should pick up the
 conflict with 30mins or so.

The box with the problem just so happens to be the only box using
bonding, 802.1q and a four port Qlogic Netxen NIC. I think a problem
among these three is more likely than some 'ghost' boxes getting
assigned the same IP addresses when I am the only admin around.


Re: [CentOS] Networking just stopped working

2010-07-06 Thread Christopher Chan
And now the thing is working again...


Re: [CentOS] Networking just stopped working

2010-07-06 Thread Chan Chung Hang Christopher
Christopher Chan wrote:
 And now the thing is working again...

It's not working again.

Running tcpdump -i vlan seems to trigger something to get the network 
working again but as soon as I stop tcpdump...nada, zip, zilch.

Any ideas? I see no errors in the logs whether of the switch or the box, 
just about everything reports fine. Would the loading of the kernel 
bridge module cause this?


Re: [CentOS] Networking just stopped working

2010-07-06 Thread Les Mikesell
Chan Chung Hang Christopher wrote:
 Christopher Chan wrote:
 And now the thing is working again...
 
 It's not working again.
 
 Running tcpdump -i vlan seems to trigger something to get the network 
 working again but as soon as I stop tcpdump...nada, zip, zilch.
 
 Any ideas? I see no errors in the logs whether of the switch or the box, 
 just about everything reports fine. Would the loading of the kernel 
 bridge module cause this?

Running tcpdump would put the interface in promiscuous mode.  Does your setup 
need this to work?

-- 
   Les Mikesell
lesmikes...@gmail.com


Re: [CentOS] Networking just stopped working

2010-07-06 Thread Chan Chung Hang Christopher
Les Mikesell wrote:
 Chan Chung Hang Christopher wrote:
 Christopher Chan wrote:
 And now the thing is working again...
 It's not working again.

 Running tcpdump -i vlan seems to trigger something to get the network 
 working again but as soon as I stop tcpdump...nada, zip, zilch.

 Any ideas? I see no errors in the logs whether of the switch or the box, 
 just about everything reports fine. Would the loading of the kernel 
 bridge module cause this?
 
 Running tcpdump would put the interface in promiscuous mode.  Does your setup 
 need this to work?
 

I don't think so. The thing was working fine since December last year 
until this morning. Then poof! I just realized I forgot to boot older 
kernels to check for the same problem...


Re: [CentOS] Networking just stopped working

2010-07-06 Thread Christopher Chan
On Tuesday, July 06, 2010 09:21 PM, Chan Chung Hang Christopher wrote:
 Les Mikesell wrote:
 Chan Chung Hang Christopher wrote:
 Christopher Chan wrote:
 And now the thing is working again...
 It's not working again.

 Running tcpdump -i vlan seems to trigger something to get the network
 working again but as soon as I stop tcpdump...nada, zip, zilch.

 Any ideas? I see no errors in the logs whether of the switch or the box,
 just about everything reports fine. Would the loading of the kernel
 bridge module cause this?

 Running tcpdump would put the interface in promiscuous mode.  Does your setup
 need this to work?


 I don't think so. The thing was working fine since December last year
 until this morning. Then poof! I just realized I forgot to boot older
 kernels to check for the same problem...

Box behaving for the moment after tcpdump was run on one of the 
interfaces and then stopped. I'll just wait for the next weirdo event.


[CentOS] Networking just stopped working

2010-07-05 Thread Christopher Chan
Hi all,

I have a box with a quad port Netxen NIC running CentOS 5. All four
interfaces are slaves of bond0 and bond0 is used by two vlan interfaces.

All was working just fine until just recently when everything just
stopped working. ethtool reports all the individual interfaces are just
fine. The switch is not complaining either. But I cannot ping anywhere,
nor can others 'see' the box. Any ideas?

I have turned off iptables, rebooted the switches but still the thing 
won't work.

Christopher