2012/4/8 Soren Kristensen <[email protected]>:
> Hi Attila,
>
> Attila Kinali wrote:
>> On Sun, 08 Apr 2012 01:34:58 +0200
>> Soren Kristensen<[email protected]>  wrote:
>>
>>> Alvar Kusma wrote:
>>>>
>>>>> As I have stated before, afaik the net5501 does not have any design
>>>>> issues, Attila's problem is most likely either software related,
>>>>
>>>> Please, can you explain why a similar board from PCEngines (Alix 2D13)
>>>> with the same software (OpenWRT image) just works, but the Soekris board
>>>> shows some instability? Can you explain why this exact same software has
>>>> worked on one net5501 without a glitch for over a year now, but two other
>>>> units show signs of instability - random hangs, sometimes working for over
>>>> a month, sometimes crashing 2 times a day? This is still a mystery to me.
>>>> Just bad luck?
>>>
>>> No, pretty simple:
>>>
>>> The Linux VT6105M driver has interrupt race problems, reported to have
>>> been fixed recently, don't know if the fix has been ported to the main
>>> Linux sources.
>>>
>>> The Atheros wlan drivers also seem to have interrupt race problems,
>>> don't remember if that has been fixed too.
>>
>> You repeat this argument over and over. But apparently, you are the only
>> one who knows about these race conditions. I cannot find any reference
>> to the race condition on the VT6105M at all. And for the ath9k race,
>> the only one I could find was fixed in October 2010 in the mainline kernel.
>> Can you provide us with references to what race conditions you mean and
>> where they are to be found?
>
> From my post on 12/7/2011:
>
> Looking through the archives I found two reported issues, both on Linux.
>
> 1) The thread in Sept/Oct 2010, concluding with Andrey Safonov reporting
> that the Linux VIA VT6105M driver has a bug, and how to fix it:
>
> http://lists.soekris.com/pipermail/soekris-tech/2010-October/016884.html
> http://lists.soekris.com/pipermail/soekris-tech/2010-October/016889.html

Hey,

I tried submitting a patch containing those two lines upstream,
resulting in some work from Francois Romieu that fixes it the right
way. (See the mail I sent to this list on January 22 requesting help
with testing, to which nobody replied.)
Those patches were merged into mainline in Linux 3.3-rc1, so version
3.3 and later contain those fixes, which help fix the interrupt
crashes.

So for everyone using a kernel below version 3.3 and complaining about
crashes, they really should have read their mail and tested with
something newer - they had been notified :)
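
For anyone unsure which side of 3.3 they are on, here is a rough sketch of a
check. It assumes Python is available on the box and that the release string
starts with "major.minor"; the names parse_release, has_fix and
running_kernel_has_fix are made up for illustration. It only compares version
numbers, so a distribution kernel with the fix backported would still be
reported as too old.

```python
import platform
import re

# First mainline release containing the fixes, per the message above.
# Everything else in this sketch is an assumption for illustration.
FIXED_IN = (3, 3)


def parse_release(release):
    """Return (major, minor) from a release string such as '3.2.1' or '3.3-rc1'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def has_fix(release):
    """True if the release is 3.3 or newer. Backported fixes are not detected."""
    return parse_release(release) >= FIXED_IN


def running_kernel_has_fix():
    """Apply the check to the running kernel (same string as `uname -r`)."""
    return has_fix(platform.release())
```

Calling running_kernel_has_fix() on the net5501 itself, or has_fix("3.2.1")
with a version string copied from elsewhere, gives a quick yes/no.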

/Bjarke

> 2) And "green" reporting a fix to either ath9k, or all wireless drivers,
> in his post on Jan 25, 2011:
>
> http://lists.soekris.com/pipermail/soekris-tech/2011-January/017001.html
>
>>
>> But to kill that driver bug argument once and for all please explain the
>> following, which I've seen during my tests:
>>
>> Setup:
>> net5501 running debian/stable with a self-built vanilla Linux kernel,
>> version 3.2.1. Connected to the net5501 are a notebook SATA hard disk
>> and an AR9200 wlan card. The LAN is connected on eth0.
>>
>> If the WLAN card is _not_ running (driver not loaded or disabled by rfkill)
>> no problems can be seen. No crashes, nothing. For months.
>>
>> Test #1:
>> Setup as above. WLAN card enabled, traffic going through both WLAN and eth0.
>> Result: System crashes in 2 minutes (+/- 1 minute). No Oops on the serial
>> console, as would be seen with most driver bugs. It just hangs.
>>
>> Test #2:
>> Setup and test procedure as in Test #1, but with two 1000uF capacitors
>> connected to J5 at 5V and 3.3V power supplies.
>> Result: System crashes in 5 minutes (-1 min, +2 min).
>>
>> Test #3:
>> Setup and test procedure as in Test #2, but with three dozen ceramic
>> capacitors soldered on the board.
>> Result: No crash at all after one week. Even heavy system load doesn't
>> affect the system anymore.
>>
>>
>> Notes:
>> 1) Test #1 and Test #2 were repeated several dozen times. Although I did not
>> write down the times it took to crash the system and didn't do a
>> mathematically rigorous statistical analysis, I can state that the
>> difference between Test #1 and #2 is significant, i.e. the additional
>> capacitors improve the situation considerably.
>>
>> 2) I ran Test #1 and #2 before I did the modifications for Test #3 to ensure
>> the bug was still present and could be reproduced. I did not do any software
>> upgrades or any configuration changes in between, i.e. if it were a
>> software bug, it would be present in all three tests.
>>
>>
>> Soren, if you really have an explanation of how a software bug (a race
>> condition as you say) can be fixed with a soldering iron, I would really
>> like to hear it. I have systems that experience race conditions
>> every once in a while and I'd like to fix those as well with my soldering
>> iron.
>
> Attila, thanks for the detailed testing. I agree with you that
> adding capacitors should not change behavior if it's a software problem
> alone.
>
> I will still state that the net5501 has the decoupling it needs for
> itself and the expansions it's designed for. One possible source of the
> problem could be the power supply regulators, as they are located just behind
> the mini-PCI slot. RF could be affecting e.g. the compensation circuit,
> so adding decoupling capacitors would just fix the symptoms.
>
> I would also like to investigate the problem further. Can you please
> tell me the exact wlan card?
>
> And can you please ensure that the vt6105 driver is updated to a fixed
> one, would really love data after that is done....
>
> I still have the problem that nobody running FreeBSD or OpenBSD has
> reported similar issues, somebody correct me if I'm wrong.
>
>
> Best Regards,
>
>
> Soren Kristensen
>
> CEO & Chief Engineer
> Soekris Engineering, Inc.
> _______________________________________________
> Soekris-tech mailing list
> [email protected]
> http://lists.soekris.com/mailman/listinfo/soekris-tech
