On Sun, 08 Apr 2012 01:34:58 +0200
Soren Kristensen <[email protected]> wrote:
> Alvar Kusma wrote:
> >
> >> As I have stated before, afaik the net5501 do not have any design
> >> issues, Attila's problem is most likely either software related,
> >
> > Please, can you explain, why similar board from PCEngines (Alix 2D13)
> > with same software (OpenWRT image) just works, but Soekris board shows
> > some unstability? Can you explain, why this same exact software works on
> > one net5501 without a glitch over year now, but two other units show
> > unstability signs - random hangs, sometimes works over month, sometimes
> > crashes 2 times a day? This is still a mystery for me. Just bad luck?
>
> No, pretty simple:
>
> The Linux VT6105M driver has interrupt race problems, reported to have
> been fixed recently, don't know if it have ported to the main Linux sources.
>
> The Atheros wlan drivers seems to also have interrupt race problems,
> don't remember if that have been fixed too.

You repeat this argument over and over, but apparently you are the only
one who knows about these race conditions. I cannot find any reference
to a race condition in the VT6105M driver at all. And for the ath9k race,
the only one I could find was fixed in October 2010 in the mainline kernel.
Can you provide us with references to the race conditions you mean and
where they are to be found?

But to kill that driver bug argument once and for all, please explain the
following, which I saw during my tests:

Setup:
net5501 running Debian stable with a self-built vanilla Linux kernel,
version 3.2.1. Connected to the net5501 are a notebook SATA hard disk
and an AR9200 WLAN card. The LAN is connected on eth0.

If the WLAN card is _not_ running (driver not loaded, or disabled by rfkill),
no problems can be seen. No crashes, nothing. For months.

Test #1:
Setup as above. WLAN card enabled, traffic going through both WLAN and eth0.
Result: System crashes in 2 minutes (+/- 1 minute). No Oops on the serial
console, as would be seen with most driver bugs. It just hangs.

Test #2:
Setup and test procedure as in Test #1, but with two 1000uF capacitors
connected to J5 at the 5V and 3.3V power supplies.
Result: System crashes in 5 minutes (-1 min, +2 min).

Test #3:
Setup and test procedure as in Test #2, but with three dozen ceramic capacitors
soldered onto the board.
Result: No crash at all after one week. Even heavy system load doesn't
affect the system anymore.

Notes:
1) Test #1 and Test #2 were repeated several dozen times. Although I did not
write down the times it took to crash the system and didn't do a
mathematically rigorous statistical analysis, I can state that the
difference between Test #1 and #2 is significant, i.e. the additional
capacitors improve the situation considerably.
2) I ran Test #1 and #2 before I did the modifications for Test #3 to ensure
the bug was still present and could be reproduced. I did not do any software
upgrades or configuration changes in between, i.e. if it were a software
bug, it would be present in all three tests.
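
For anyone repeating Test #1 and Test #2 and writing the crash times down
this time, the comparison in note 1 could be checked with something like
a permutation test. What follows is only a minimal sketch in Python: the
function name permutation_test and the numbers in the usage part are
made up for illustration, they are not measurements from the tests above.

```python
import random
from statistics import mean


def permutation_test(a, b, rounds=10000, seed=0):
    """One-sided permutation test on the difference of means.

    Returns an estimate of the probability that mean(b) - mean(a) is at
    least as large as observed if the group labels were assigned at random.
    """
    rng = random.Random(seed)
    observed = mean(b) - mean(a)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        # First len(a) values play the role of group a, the rest of group b.
        diff = mean(pooled[len(a):]) - mean(pooled[:len(a)])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)


# PLACEHOLDER values (minutes until hang), NOT real measurements.
test1_times = [1.5, 2.0, 2.5, 1.0, 3.0, 2.0]   # hypothetical: no extra capacitors
test2_times = [4.0, 5.0, 6.5, 4.5, 7.0, 5.0]   # hypothetical: 2x 1000uF on J5

p_value = permutation_test(test1_times, test2_times)
print("p-value:", p_value)
```

A small p-value would support the claim that the capacitors lengthen the
time to crash. The permutation test is chosen here only because it makes
no assumption about how the crash times are distributed.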

Soren, if you really have an explanation for how a software bug (a race
condition, as you say) can be fixed with a soldering iron, I would really
like to hear it. I have systems that experience race conditions
every once in a while, and I'd like to fix those with my soldering
iron as well.
Attila Kinali
--
Why does it take years to find the answers to
the questions one should have asked long ago?