In a sense, this is a classic scenario for a backoff algorithm. However... I really don't know when this situation happens, or whether it's ever temporary. What I've done in my local patch is also to log a message every time we get this zero, before throwing the exception. So, I'll be able to get a sense of just how common this situation is.
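For what it's worth, here is roughly what Jerome's sleep-and-retry idea could look like. This is just a sketch; the helper name, attempt count, and sleep interval are my own invention, not anything from the Restlet tree:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipWithRetry {

    /**
     * Skips exactly n bytes, sleeping briefly whenever skip() reports
     * zero progress, and giving up after maxAttempts consecutive
     * zero-length skips instead of spinning forever.
     */
    static long skipFully(InputStream in, long n, int maxAttempts, long sleepMillis)
            throws IOException {
        long remaining = n;
        int zeroSkips = 0;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped <= 0) {
                if (++zeroSkips >= maxAttempts) {
                    throw new IOException("Cannot skip " + remaining + " remaining bytes");
                }
                try {
                    Thread.sleep(sleepMillis); // back off before retrying
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("Interrupted while skipping", e);
                }
            } else {
                zeroSkips = 0; // made progress, reset the counter
                remaining -= skipped;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // ByteArrayInputStream always skips successfully, so this completes at once.
        InputStream in = new ByteArrayInputStream(new byte[100]);
        System.out.println("skipped=" + skipFully(in, 60, 3, 10)); // skipped=60
        System.out.println("left=" + in.available());              // left=40
    }
}
```

The nice thing about this shape is that a genuinely transient stall costs only a few milliseconds of sleep, while a permanent one still surfaces as an IOException rather than a pegged CPU.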
-Tal

On 08/17/2009 06:27 AM, Jerome Louvel wrote:
> Hi Tal,
>
> Thanks for your investigation. Let us know how your workaround behaves.
>
> I was wondering: instead of throwing an IO exception immediately, couldn't we sleep for a bit and try a couple of times to see whether this is a temporary issue or not?
>
> What do you think?
>
> Best regards,
> Jerome Louvel
> --
> Restlet ~ Founder and Lead developer ~ http://www.restlet.org
> Noelios Technologies ~ Co-founder ~ http://www.noelios.com
>
>
> -----Original Message-----
> From: Tal Liron [mailto:[email protected]]
> Sent: Sunday, August 16, 2009 21:48
> To: [email protected]
> Subject: Re: CPU at 100%
>
> Alright, folks, I'm fairly confident that this is a Restlet bug. To be more precise, it's a case in which Restlet could be more robust than it is right now. Possibly an easy fix! Read on.
>
> Here's the code, from org.restlet.engine.application.RangeInputStream.read (line 149):
>
>     public int read(byte[] b, int off, int len) throws IOException {
>         // Reach the start index.
>         while (!(position >= startIndex)) {
>             position += skip(startIndex - position);
>         }
>
> As you can see, not the most robust code. It assumes that skip() will always move you ahead, when in fact there is no such guarantee. Here's the JDK documentation:
>
> "The skip method may, for a variety of reasons, end up skipping over some smaller number of bytes, possibly 0. The actual number of bytes skipped is returned."
>
> And that's what causes this loop to stay stuck and thrash the thread. As to why this happens, it may very well be a bug in the implementation, or even in the JDK. I did see this same bug in both Grizzly and Jetty. My intuition tells me that these are certain nio-based requests that get cut somewhere down the line, which can happen anywhere between the client and your host OS, and that the upper-level world of simulated streams doesn't or can't acknowledge this through instant failure.
> Perhaps there's indirect acknowledgment of this exactly in the "silent" failure of methods such as skip()?
>
> Anyway. My rather trivial suggested solution/workaround:
>
>     public int read(byte[] b, int off, int len) throws IOException {
>         // Reach the start index.
>         while (!(position >= startIndex)) {
>             long skipped = skip(startIndex - position);
>             if (skipped <= 0) {
>                 throw new IOException("Cannot skip ahead in FilterInputStream");
>             }
>             position += skipped;
>         }
>
> I will give this a try with a patched build, and let you know how it goes. (It can take a few days for the bug to appear.) However, I'd love to hear thoughts about this issue from the rest of the Restlet community!
>
> -Tal
>
> On 08/13/2009 08:14 AM, Tal Liron wrote:
>
>> Unfortunately, the thread thrashing re-appears even with the latest version of Sun's JDK6. So, I'm considering this a Restlet bug and will continue investigating.
>>
>> Using a recent Restlet svn build (a few days after 2.0M4). Here is the stack trace of the evil thread, in case anyone would like to lend their brain to the task:
>>
>> org.restlet.engine.application.RangeInputStream.read(RangeInputStream.java:149)
>> java.io.FilterInputStream.read(FilterInputStream.java:90)
>> org.restlet.engine.io.ByteUtils.write(ByteUtils.java:506)
>> org.restlet.engine.application.RangeRepresentation.write(RangeRepresentation.java:86)
>> org.restlet.engine.http.HttpServerCall.writeResponseBody(HttpServerCall.java:492)
>> org.restlet.engine.http.HttpServerCall.sendResponse(HttpServerCall.java:150)
>> org.restlet.ext.jetty.JettyServerHelper$WrappedServer.handle(JettyServerHelper.java:184)
>> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>> org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:539)
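The failure mode is easy to reproduce, since the JDK explicitly allows skip() to return 0. Here's a small self-contained demonstration (class and method names are mine, not Restlet's) of how the guard in the patch above turns the infinite loop into a fast, clean failure:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StuckSkipDemo {

    /** A stream whose skip() never makes progress -- legal per the JDK contract. */
    static class NoSkipStream extends FilterInputStream {
        NoSkipStream(InputStream in) { super(in); }
        @Override public long skip(long n) { return 0; }
    }

    /** The patched loop: throw instead of spinning when skip() stalls. */
    static long reachStartIndex(InputStream in, long startIndex) throws IOException {
        long position = 0;
        while (position < startIndex) {
            long skipped = in.skip(startIndex - position);
            if (skipped <= 0) {
                throw new IOException("Cannot skip ahead in FilterInputStream");
            }
            position += skipped;
        }
        return position;
    }

    public static void main(String[] args) {
        InputStream stuck = new NoSkipStream(new ByteArrayInputStream(new byte[10]));
        try {
            reachStartIndex(stuck, 5); // the unpatched loop would spin here forever
            System.out.println("unexpected success");
        } catch (IOException e) {
            // failed fast: Cannot skip ahead in FilterInputStream
            System.out.println("failed fast: " + e.getMessage());
        }
    }
}
```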
>> org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>> org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:520)
>>
>> -Tal
>>
>> On 08/12/2009 02:30 PM, Rob Heittman wrote:
>>
>>> Yes. Sun JDK 6U9 and above on Ubuntu Jaunty have not demonstrated any real showstoppers for my purposes. My highest-traffic server (a Rackspace dedicated box) handles about 15 requests per second continuously, so while we're not pressing up against the theoretical maximums here, I feel pretty happy that the JDK platform is stable anyway. We haven't seen any unpleasantness on Slicehost or Amazon EC2, the other two hosting providers we use. EC2/Jaunty/Sun is where we put all our new applications. We use the "alestic" EC2 images as an Ubuntu baseline and then prepare our own AMIs.
>>>
>>> Unrelated, since you're also a scripting guy... those high-load servers still need an occasional JVM restart to clear memory leaks associated with script engine use. I can't officially blame Rhino, Jython, or JRuby for this, as the real culprit is almost certainly antipatterns in user-authored scripts, but definitely the sites that run lots of scripts do leak memory over time. That's the only stability problem that forces occasional restarts on our Restlet 1.2M2-based platform.
>>>
>>> Currently I'm benchmarking stuff on 2.0M4 (well, technically a snapshot from 2 days later that fixes a Mac bug) and the performance seems better, stability just as good.
>>>
>>> - R
>>>
>>> On Wed, Aug 12, 2009 at 3:03 PM, Tal Liron <[email protected]> wrote:
>>>
>>> This is a very useful thread!
>>>
>>> So, when you listed the problems with OpenJDK6 on Hardy and Jaunty, does this mean you didn't see the same problems using Sun's JDK6 on those platforms?
>>>
>>> -Tal

> ------------------------------------------------------
> http://restlet.tigris.org/ds/viewMessage.do?dsForumId=4447&dsMessageId=2384109

------------------------------------------------------
http://restlet.tigris.org/ds/viewMessage.do?dsForumId=4447&dsMessageId=2384291

------------------------------------------------------
http://restlet.tigris.org/ds/viewMessage.do?dsForumId=4447&dsMessageId=2384320

