I was actually thinking about disabling NetworkManager and the
avahi-daemon, but I wasn't sure whether I need them.

But I have some great news.

Yesterday I asked my friend to start a download from my FTP server.
Everything went smoothly, with no crash at all. So I decided to try it
myself, just to make sure everything was OK.
I connected my laptop to my public network switch and gave it a valid
public IP from my network.
I started a download from the FTP server and got a download rate of
about 10.8 - 10.9 MB/sec.
I did that for about 20-30 minutes, with no log messages and no crash
on my firewall or router machine.
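
To put that rate in context, here is a rough sketch of the arithmetic for a 100 Mbit link. It assumes the FTP client's "MB/sec" means 10^6 bytes per second (an assumption, since the client is not named here), and the function name `utilisation` is made up for illustration.

```python
# Rough link-utilisation check for the FTP test described above.
# Assumption: "MB/sec" means 10^6 bytes per second. If the client actually
# reports MiB/s (2^20 bytes), the real figures are roughly 5% higher.

LINK_MBIT = 100.0  # nominal capacity of the fiber link, in Mbit/s


def utilisation(rate_mbyte_per_s: float, link_mbit: float = LINK_MBIT) -> float:
    """Return the fraction of the link used by a payload rate given in MB/s."""
    payload_mbit = rate_mbyte_per_s * 8.0  # 8 bits per byte
    return payload_mbit / link_mbit


if __name__ == "__main__":
    for rate in (10.8, 10.9):
        print(f"{rate} MB/s = {rate * 8:.1f} Mbit/s "
              f"= {utilisation(rate):.1%} of the link")
```

Under that assumption the payload alone is about 86-87 Mbit/s. Ethernet, IP and TCP headers take up part of the remainder, so the link is probably close to saturated.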

I also downloaded while my friend was downloading, which should put
more stress on the link, right?

So I think the problem is gone, with the updated drivers (8.0.16) and
the configuration (the TxDescriptorStep setting) on both the firewall
and my router machine, thanks to you, David.
I also replaced the 100 Mbit switch with a 1 Gbit switch (an unmanaged
one, though :( ).

I'm still going to run more of a stress test on my link after the
weekend. This situation is too good to be true :)
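
For the longer stress test, a small script can confirm that the log stays clean. The following is only a sketch: the two patterns are based on the messages mentioned in this thread (the "Tx Unit Hang" message and the NetworkManager `rtnl_addr_delete()` warning), the exact wording in a given log may differ, and the script name `scan_log.py` is hypothetical.

```python
import re
import sys
from collections import Counter

# Patterns based on the messages discussed in this thread. The exact wording
# in your own logs may differ, so adjust them as needed.
PATTERNS = {
    "tx_unit_hang": re.compile(r"tx unit hang", re.IGNORECASE),
    "nm_addr_delete": re.compile(r"NetworkManager: .*rtnl_addr_delete\(\)"),
}


def count_matches(lines):
    """Count how many of the given lines match each pattern."""
    counts = Counter({name: 0 for name in PATTERNS})
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # The log file path must be given explicitly; no default is assumed.
    if len(sys.argv) != 2:
        print("usage: scan_log.py LOGFILE")
    else:
        with open(sys.argv[1], errors="replace") as log:
            for name, n in count_matches(log).items():
                print(f"{name}: {n}")
```

Running it against the system log before and after a test run gives a quick count of both kinds of message. If both counts stay unchanged, nothing new was logged during the run.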


Here's a simple and ugly diagram of the network in PDF format:

http://dl.getdropbox.com/u/16134/g37-network-diagram.pdf





On 9.10.2009, at 16:33, Graham, David wrote:

> Svavar Örn Eysteinsson wrote:
>> The Tx Unit Hang messages no longer show up in my logs, so I think
>> the TX hang issue is gone.
> This is good news to me.
>
>> After I execute my firewall script I tell my friend to download from
>> my FTP server.
>> Then, some minutes later, these messages suddenly pop up in my logs
>> (note: PUBLIC_IP was replaced):
>> Oct  9 09:15:54 localhost NetworkManager: <WARN>  check_one_address (): (eth0) error -99 returned from rtnl_addr_delete(): Sucess#012
>> Oct  9 09:15:54 localhost NetworkManager: <WARN>  check_one_address (): (eth0) error -99 returned from rtnl_addr_delete(): Sucess#012
>> ....(more like this)
>> ....
> I don't know much about Network Manager and avahi-daemon, but they  
> are primarily intended to dynamically select & prepare routes from  
> configured interfaces, which is mainly useful for mobile platforms  
> which might connect to different wireless &/or wired networks as  
> they move around. For some reason we haven't yet figured out, you  
> have one interface (eth0, your internal wired LAN) connection that  
> occasionally loses link, and it may well be that avahi/Network  
> Manager is reconfiguring your network when that happens. And when  
> eth0 comes back up, whatever it has done has left your firewall  
> configuration in ruins. Because you are managing a fixed server, I  
> think you might be better off without network manager and avahi- 
> daemon. Why not disable them on your server, and retest. Hopefully  
> your firewall will remain intact even if eth0 has a link bounce.
>
> We'll still need to figure out why eth0 drops under the high traffic  
> condition, but your customers will be happy while we debug that  
> remaining issue.
>
> Please try disabling Network Manager & avahi-daemon and let me know  
> the result.
>> I totally forgot to mention that I also have a router PC in front,
>> which has the same e1000 cards and runs Zebra and only Zebra.
>> All this machine does is forward packets with Zebra.
>> Today I also updated the drivers on that machine from 7.3 to 8.0.16
>> and turned TSO off on those cards (also setting TxDescriptorStep=4,4).
>> So my firewall and my router are now connected to a Gigabit switch,
>> and both report: 1000 Mbps Full Duplex, Flow Control: RX/TX.
> Svavar, could you please send me a simple network diagram so that I  
> can refer to it when I'm configuring my system. Something that shows  
> the numbered (ethx) interfaces, which ones are dhcp served, dhcp  
> serving, IPV4 network numbers, where the email & ftp servers are,  
> and where you initiate the FTP test from. I think I already have it  
> in my head, but a simple diagram would be a great help too. Thanks.
>>>> One thing I don't understand: if the interface goes down and comes
>>>> right back up after 1-2 seconds or so, why don't the firewall
>>>> (netfilter) rules stay in place?
>>>> Why do I have to relaunch the interface config and the netfilter
>>>> rules?
>>> I don't know. It's possible (though unlikely) that if one of the
>>> Gateway interfaces is DHCP served, it could come back up and be
>>> served a new IP address, which would break any existing connections.
>>> As I say, this is not likely. I am running a NAT firewall now (with
>>> an 82541PI on the 'private' side and an 82572EI on the 'public'
>>> interface) in an attempt to repro this very issue, and I see only a
>>> momentary interruption in continuous traffic streaming when I ifdown
>>> and then ifup either the public or private interface. Do you think
>>> it is possible that your firewall itself may be causing a problem?
>>> If your firewall is simply a set of iptables rules, we could try to
>>> run it (or something like it) here.
>> I don't think the firewall rules have anything to do with this
>> problem, as I have used these rules and this script (extending and
>> configuring it to my needs from time to time, of course) on 3 or so
>> PC machines.
> You are probably right.
>> Yes, my firewall script is generated with fwbuilder.
>>>> In the end, I will definitely try to replace my HP Procurve 2524
>>>> with another one, but I don't think the Procurve is the problem...
>>>> or is it?
>>>> As of today, the port and the interface are both configured with
>>>> AutoNeg and Full Duplex…
>>> I agree, the switch is probably not the problem here.
>> I just had to check it. I replaced it with a Gigabit switch to see
>> whether the Procurve just couldn't handle all the load on the
>> 100 Mbit fiber. 10 - 11 MB/s is clearly all that the link can handle.
>> So this is the status for now:
>> 1. Replaced my switch with a Gigabit switch.
>> 2. Updated the e1000 drivers from 7.3 to 8.0.16 on my router.
>> 3. Both the firewall and the router machine have TSO turned off.
>> 4. Both the firewall and the router report a 1000 Mbps Full Duplex,
>>    Flow Control: RX/TX connection to my switch.
>>    (So I would think it can handle my 100 Mbit fiber connection.)
>> Today I will try to load up my link with fast and huge data.
>> Thanks a lot, Dave, for your time and solutions.
>> And sorry for my bad English writing.
> It's not so bad :)
>> Will post status today.
> Thanks !
>>>> Before this, I had another firewall server that had the old 3COM
>>>> 3C905 cards.
>>>> From time to time I also got Tx Unit hangs on those cards, but the
>>>> internet link, the netfilter rules and the network configuration
>>>> never crashed.
>>>> It just kept going and going regardless of those Tx Unit hangs.
>>>> That was one of my main reasons for upgrading to Intel e1000 cards:
>>>> to upgrade my old PC and 3COM cards, and to handle my 100 Mbit dark
>>>> fiber with ~200 devices connected in 3-4 networks doing NAT and
>>>> pretty things.
>>>> Correct me if I'm wrong, but isn't this Tx Unit hang problem on
>>>> e1000 mostly related to AMD and/or AMD chipset platforms?
>>> Yes, you are right. There are a lot of things that can cause a "TX
>>> Unit Hang", but I was hoping that this one was the one that we had
>>> already tied to older AMD platforms.
>>>> Today I found an old PC that only collects dust in my company. It
>>>> has the legendary Intel 440BX chipset (same as in the Cisco Pix)
>>>> and also 2x SMP Intel Celeron 533 CPUs.
>>>> Would it solve the problem to change the rusty AMD/VIA combination
>>>> in my current firewall to a rock-solid 440BX with Intel CPUs?
>>> I would hope so.
>>>> I remember that I never had any problems at all with PIII and/or
>>>> Intel chipset mobos in the past.
>>>> Thanks a lot.
>>>> In desperate need of help. :)
>>>> Best regards,
>>>> Svavar O
>>>> Reykjavik - Iceland
>



_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
