Svavar Örn Eysteinsson wrote:
> 
> The Tx Unit hang messages don't show up in my logs anymore. So the TX
> hang issue is gone, I think.
This is good news to me.

> After I execute my firewall script I tell my friend to download from
> my FTP server.
> Then, some minutes later, these messages suddenly pop up in my logs
> (note: PUBLIC_IP was replaced):
> 
> Oct  9 09:15:54 localhost NetworkManager: <WARN>  check_one_address():
> (eth0) error -99 returned from rtnl_addr_delete(): Sucess#012
> Oct  9 09:15:54 localhost NetworkManager: <WARN>  check_one_address():
> (eth0) error -99 returned from rtnl_addr_delete(): Sucess#012
> ....(more like this)
> .... 
I don't know much about NetworkManager and avahi-daemon, but as far as I
know NetworkManager is primarily intended to dynamically select and
configure connections on the available interfaces, and avahi-daemon
handles zeroconf service discovery. Both are mainly useful for mobile
platforms that connect to different wireless and/or wired networks as
they move around. For some reason we haven't yet figured out, one of
your interfaces (eth0, your internal wired LAN) occasionally loses link,
and it may well be that avahi/NetworkManager is reconfiguring your
network when that happens. If so, by the time eth0 comes back up,
whatever it has done has left your firewall configuration in ruins.
Because you are managing a fixed server, I think you might be better off
without NetworkManager and avahi-daemon. Why not disable them on your
server and retest? Hopefully your firewall will then remain intact even
if eth0 has a link bounce.

We'll still need to figure out why eth0 drops under the high traffic 
condition, but your customers will be happy while we debug that 
remaining issue.

Please try disabling NetworkManager and avahi-daemon and let me know the
result.
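
One way to compare the logs before and after the retest is to count the
NetworkManager warnings per interface and timestamp. The following is
only a rough sketch: it assumes each syslog entry is on a single line in
the format of the messages quoted above, and the function name
count_nm_warnings and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches entries like the ones quoted above, e.g.
# "Oct  9 09:15:54 localhost NetworkManager: <WARN>  check_one_address(): (eth0) error -99 ..."
NM_WARN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+NetworkManager:\s+<WARN>\s+"
    r"check_one_address\(\):\s+\((?P<iface>\w+)\)\s+error\s+(?P<code>-?\d+)"
)


def count_nm_warnings(lines):
    """Count check_one_address() warnings per (interface, timestamp)."""
    counts = Counter()
    for line in lines:
        match = NM_WARN.match(line)
        if match:
            counts[(match.group("iface"), match.group("stamp"))] += 1
    return counts


if __name__ == "__main__":
    # Sample input: two warnings as quoted above plus one unrelated,
    # made-up line that should be ignored.
    sample = [
        "Oct  9 09:15:54 localhost NetworkManager: <WARN>  check_one_address(): "
        "(eth0) error -99 returned from rtnl_addr_delete(): Sucess#012",
        "Oct  9 09:15:54 localhost NetworkManager: <WARN>  check_one_address(): "
        "(eth0) error -99 returned from rtnl_addr_delete(): Sucess#012",
        "Oct  9 09:16:01 localhost some-other-daemon: unrelated message",
    ]
    for (iface, stamp), n in sorted(count_nm_warnings(sample).items()):
        print(f"{stamp}  {iface}  {n} warning(s)")
```

If the counts drop to zero once NetworkManager is disabled but the
firewall still breaks on a link bounce, then NetworkManager was probably
not the culprit.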
> 
> I totally forgot to mention that I also have a router PC in front,
> which also has e1000 cards (the same model) and runs Zebra and only
> Zebra.
> All this machine does is forward packets with Zebra.
> 
> I also updated the drivers from 7.3 to 8.0.16 on that machine today,
> and turned TSO off on those cards (also set TxDescriptorStep=4,4).
> 
> So my firewall and my router are now both connected to a Gigabit
> switch, and both report: 1000 Mbps Full Duplex, Flow Control: RX/TX.
> 
Svavar, could you please send me a simple network diagram that I can
refer to when I'm configuring my system? Something that shows the
numbered (ethX) interfaces, which ones are DHCP served, which are DHCP
serving, the IPv4 network numbers, where the email and FTP servers are,
and where you initiate the FTP test from. I think I already have it in
my head, but a simple diagram would be a great help too. Thanks.
> 
> 
>>> One thing I don't understand is, if the interface goes down and comes
>>> right back up after 1-2 sec or so, why don't the firewall (netfilter)
>>> rules stay in place?
>>> Why do I have to relaunch the interface config and netfilter rules?
> 
>> I don't know. It's possible (though unlikely) that if one of the
>> gateway interfaces is DHCP served, it could come back up and be
>> served a new IP address, which would break any existing connections.
>> As I say, this is not likely. I am running a NAT firewall now (with an
>> 82541PI on the 'private' side and an 82572EI on the 'public'
>> interface) in an attempt to repro this very issue, and I see only a
>> momentary interruption in continuous traffic streaming when I ifdown
>> and then ifup either the public or private interface. Do you think
>> it is possible that your firewall itself may be causing a problem?
>> If your firewall is simply a set of iptables rules, we could try to
>> run it (or something like it) here.
> 
> I don't think the firewall rules have anything to do with this
> problem, as I have used these rules and this script (extending it and
> configuring it to my needs from time to time, of course) on 3 or so
> PCs.
You are probably right.
> Yes, my firewall script is generated with fwbuilder.
> 
> 
> 
>>> In the end, I will definitely try to replace my HP Procurve 2524 with
>>> another one, but I don't think the Procurve is the problem.... Or
>>> what?
>>> As of today, the port and the interface are both configured with
>>> AutoNeg and Full Duplex…
>> I agree, the switch is probably not the problem here.
> 
> I just had to check it. I replaced it with a Gigabit switch to see if
> the Procurve just couldn't handle all the load on the 100 Mbit fiber.
> 10 - 11 MB/s is clearly all that the link can handle.
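That figure is about what one would expect from a 100 Mbit link. As a
rough back-of-the-envelope sketch, assuming standard Ethernet framing
with a 1500-byte MTU and TCP over IPv4 without options (the function
name max_tcp_payload_rate is just for illustration):

```python
def max_tcp_payload_rate(link_bps, mtu=1500, ip_tcp_headers=40):
    """Best-case TCP payload rate in bytes/s over an Ethernet link.

    Assumes full-size frames, no TCP/IP options and no retransmissions.
    """
    # On-wire cost per frame: preamble + SFD (8), Ethernet header (14),
    # frame check sequence (4) and inter-frame gap (12), plus the MTU.
    wire_bytes = mtu + 8 + 14 + 4 + 12
    # The TCP payload is the MTU minus the IPv4 (20) and TCP (20) headers.
    payload_bytes = mtu - ip_tcp_headers
    return link_bps / 8 * payload_bytes / wire_bytes


if __name__ == "__main__":
    rate = max_tcp_payload_rate(100e6)
    print(f"{rate / 1e6:.2f} MB/s best case on a 100 Mbit link")
```

Under those assumptions the ceiling comes out a little under 12 MB/s, so
10 - 11 MB/s of FTP traffic means the link is close to saturated.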
> 
> 
> 
> So, this is the status as of now:
> 
> 1. Replaced my switch with a Gigabit switch.
> 2. Updated the e1000 drivers from 7.3 to 8.0.16 on my router.
> 3. Both the firewall and the router have TSO turned off.
> 4. Both the firewall and the router report a 1000 Mbps Full Duplex,
>    Flow Control: RX/TX connection to my switch.
> 
>      (So I would think it can handle my 100 Mbit fiber connection)
> 
> Today I will try to load up my link with large, fast transfers.
> 
> 
> Thanks a lot, Dave, for your time and solutions.
> 
> And sorry for my bad English writing.
It's not so bad :)
> 
> Will post status today.
Thanks !
> 
>>> Before, I had another firewall server that had the old 3Com 3C905
>>> cards.
>>> From time to time, I also got Tx Unit hangs on those cards, but the
>>> internet link, netfilter rules, and network configuration never
>>> crashed.
>>> It just kept going and going regardless of those Tx Unit hangs.
>>> That was one of my main reasons for upgrading to Intel e1000 cards: to
>>> upgrade my old PC and 3Com cards, and to handle my 100 Mbit dark fiber
>>> with ~200 devices connected in 3-4 networks doing NAT and pretty
>>> things.
>>> Correct me if I'm wrong, but isn't this Tx Unit hang problem on e1000
>>> mostly related to AMD and/or AMD chipset platforms?
>> Yes, you are right. There are a lot of things that can cause a "TX
>> Unit Hang", but I was hoping that this one was the one that we had
>> already tied to older AMD platforms.
>>> Today I found an old PC that only collects dust in my company. It has
>>> the legendary Intel 440BX chipset (same as in the Cisco PIX) and also
>>> 2x SMP Intel Celeron 533 CPUs.
>>> Would it solve the problem to change the rusty AMD/VIA combination
>>> in my current firewall to a rock-solid 440BX with Intel CPUs?
>> I would hope so.
>>> I remember that I never had any problems at all with PIII and/or
>>> Intel chipset mobos in the past.
>>> Thanks a lot.
>>> In desperate need of help. :)
>>> Best regards,
>>> Svavar O
>>> Reykjavik - Iceland
> 
> 


_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
