In message <[EMAIL PROTECTED]>, "Kevin S. Martin" writes:

>99.999% of the time the interrupt latencies have a range of 20-40uSec
>which is fine and acceptable. However, at a rate of about once every 2-4
>days there is a "glitch". I don't know what causes the "glitch" but when
>it happens my application's interrupt is held off for 2-4mSec :O. This
>totally breaks my realtime signal processing and causes the application
>to "trip".
Well, that's certainly very noticeable! You say you use 50% of the CPU. Is
anything else happening on the system normally? Is it possible to run this
application in an artificial environment (say, no network) on a test box,
to see whether the glitch still occurs?

>The only other processing going on in the system (other then kernel
>processing) is the network. The system is on Ethernet. Normal network
>communication to the system does not cause this problem. I have
>exercised the system (via the network) up to the point of the CPU being
>100% utilized with affecting my high priority realtime application.

I am assuming you mean "without". Have you tried with no network cable at
all? I understand that the application may be useless without the network,
but if you did long runs with the cable unplugged, recording only the
latencies, you could see whether the network can be ruled out as the
source of error: if the glitch still happens with no network, it's
probably not the network. (A sketch of one way to log latencies under
eCos is appended below my sig.)

>Is it possible that on rare occasions there is a network "storm" which
>causes many network interrupts, one right after the next, causing my
>applications interrupt not to get handled? If so, how can I see this and
>then of course stop it?

Such a storm is certainly possible, and you could probably generate one
deliberately to test the theory. Is your network switched? Do you have an
easy way to interpose another machine that logs absolutely every packet on
the wire that reaches your machine, with timing close enough to compare
against your latency records? Given the latencies you're talking about, I
assume your network activity isn't realtime during processing, so a couple
of extra ms of latency on that path presumably wouldn't kill you.

>One idea that I had was to run the Ethernet driver in polled mode rather
>then interrupt mode. This would prevent a network "storm" from affecting
>my high priority application. Can this be done?

It should be possible in principle; the eCos Ethernet drivers already have
a polled entry point (RedBoot drives them that way), though the TCP/IP
stack normally expects interrupt operation. (See the second sketch below
my sig.)

>Is there any other possible cause for this "glitch" that any one can
>think of?

1. Garbage collection. I don't know that there should be any, but I know
   some C++ implementations use it for some things.

2. Other hardware. Do you have any local storage devices? For instance, if
   you had a disk, a disk write could occasionally hang for a couple of
   mSec.

Sounds like a really fascinating problem, which would probably be a lot of
fun if you were doing it purely as a hobby. :)

-s
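P.S. The latency-logging sketch, in the style of the eCos tm_basic test:
attach an ISR to a hardware timer vector and read the clock counter at ISR
entry. If the counter counts up from zero each period (true on most
targets), its value at entry approximates the latency since the timer
interrupt was asserted. This assumes the RTC vector is free to grab (i.e.
the kernel clock is disabled in the configuration, as tm_basic requires);
the bucket count and names are mine, and the header that provides
HAL_CLOCK_READ can vary by target:

#include <cyg/kernel/kapi.h>
#include <cyg/hal/hal_intr.h>  /* HAL_CLOCK_READ, CYGNUM_HAL_INTERRUPT_RTC
                                  (location is target-specific) */

#define NBUCKETS 64
static volatile cyg_uint32 histogram[NBUCKETS];
static volatile cyg_uint32 worst_count = 0;

static cyg_interrupt rtc_intr_obj;
static cyg_handle_t  rtc_intr_handle;

static cyg_uint32 latency_isr(cyg_vector_t vector, cyg_addrword_t data)
{
    cyg_uint32 now;
    HAL_CLOCK_READ(&now);            /* counter units since the tick fired;
                                        converting to uSec is target-specific */
    if (now > worst_count)
        worst_count = now;           /* a 2-4 mSec glitch shows up here */
    histogram[now < NBUCKETS ? now : NBUCKETS - 1]++;
    cyg_interrupt_acknowledge(vector);
    return CYG_ISR_HANDLED;          /* all the work is done in the ISR */
}

void start_latency_logger(void)
{
    /* Priority argument is HAL-specific; 0 here means "highest". */
    cyg_interrupt_create(CYGNUM_HAL_INTERRUPT_RTC, 0, 0,
                         latency_isr, NULL,
                         &rtc_intr_handle, &rtc_intr_obj);
    cyg_interrupt_attach(rtc_intr_handle);
    cyg_interrupt_unmask(CYGNUM_HAL_INTERRUPT_RTC);
}

Let it run for a week with and without the network cable, then dump the
histogram and worst_count; if worst_count only ever blows out with the
cable in, you've got your answer.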
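P.P.S. The polled-mode sketch: most eCos Ethernet drivers export poll and
int_vector entries in their eth_drv_sc function table (that's what RedBoot
uses), so the general shape would be to mask the NIC's interrupt vector
and call the poll entry from a thread prioritized below your realtime
thread. "eth0_sc" is a placeholder for the real driver instance name,
which is driver-specific, and this ignores how the TCP/IP stack normally
schedules packet delivery, so treat it as the shape of the idea rather
than a drop-in:

#include <cyg/kernel/kapi.h>
#include <cyg/io/eth/eth_drv.h>       /* struct eth_drv_sc */

extern struct eth_drv_sc eth0_sc;     /* placeholder instance name */

static char poll_stack[4096];
static cyg_thread   poll_thread_obj;
static cyg_handle_t poll_thread_handle;

static void eth_poll_loop(cyg_addrword_t data)
{
    struct eth_drv_sc *sc = &eth0_sc;

    /* Keep the NIC from interrupting at all; int_vector() reports the
       vector the driver would otherwise use. */
    cyg_interrupt_mask((sc->funs->int_vector)(sc));

    for (;;) {
        (sc->funs->poll)(sc);         /* the same entry RedBoot polls */
        cyg_thread_delay(1);          /* once per system tick */
    }
}

void start_eth_polling(void)
{
    /* In eCos a smaller number is a higher priority, so 10 sits well
       below a priority-1 realtime thread. */
    cyg_thread_create(10, eth_poll_loop, 0, "eth_poll",
                      poll_stack, sizeof(poll_stack),
                      &poll_thread_handle, &poll_thread_obj);
    cyg_thread_resume(poll_thread_handle);
}

Whether the full stack tolerates this depends on the driver; the poll
entry typically just runs the ISR/deliver path synchronously, which is
exactly why a storm could no longer preempt your application.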
