In a sense, this is a classic scenario for a backoff algorithm. However... I really don't know when this situation happens, or whether it's ever temporary. What I've done in my local patch is also to log a message every time we get this zero, before throwing the exception. So, I'll be able to get a sense of just how common this situation is.
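For what it's worth, here is roughly what Jerome's sleep-and-retry idea could look like. This is just a sketch; the helper name, attempt count, and sleep interval are my own invention, not anything from the Restlet tree:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipWithRetry {

    /**
     * Skips exactly n bytes, sleeping briefly whenever skip() reports
     * zero progress, and giving up after maxAttempts consecutive
     * zero-length skips instead of spinning forever.
     */
    static long skipFully(InputStream in, long n, int maxAttempts, long sleepMillis)
            throws IOException {
        long remaining = n;
        int zeroSkips = 0;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped <= 0) {
                if (++zeroSkips >= maxAttempts) {
                    throw new IOException("Cannot skip " + remaining + " remaining bytes");
                }
                try {
                    Thread.sleep(sleepMillis); // back off before retrying
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("Interrupted while skipping", e);
                }
            } else {
                zeroSkips = 0; // made progress, reset the counter
                remaining -= skipped;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // ByteArrayInputStream always skips successfully, so this completes at once.
        InputStream in = new ByteArrayInputStream(new byte[100]);
        System.out.println("skipped=" + skipFully(in, 60, 3, 10)); // skipped=60
        System.out.println("left=" + in.available());              // left=40
    }
}
```

The nice thing about this shape is that a genuinely transient stall costs only a few milliseconds of sleep, while a permanent one still surfaces as an IOException rather than a pegged CPU.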
-Tal

On 08/17/2009 06:27 AM, Jerome Louvel wrote:
> Hi Tal,
>
> Thanks for your investigation. Let us know how your workaround behaves.
>
> I was wondering: instead of throwing an IO exception immediately, couldn't we sleep for a bit and try a couple of times to see whether this is a temporary issue or not?
>
> What do you think?
>
> Best regards,
> Jerome Louvel
> --
> Restlet ~ Founder and Lead developer ~ http://www.restlet.org
> Noelios Technologies ~ Co-founder ~ http://www.noelios.com
>
>
> -----Original Message-----
> From: Tal Liron [mailto:[email protected]]
> Sent: Sunday, August 16, 2009 21:48
> To: [email protected]
> Subject: Re: CPU at 100%
>
> Alright, folks, I'm fairly confident that this is a Restlet bug. To be more precise, it's a case in which Restlet could be more robust than it is right now. Possibly an easy fix! Read on.
>
> Here's the code, from org.restlet.engine.application.RangeInputStream.read (line 149):
>
>     public int read(byte[] b, int off, int len) throws IOException {
>         // Reach the start index.
>         while (!(position >= startIndex)) {
>             position += skip(startIndex - position);
>         }
>
> As you can see, not the most robust code. It assumes that skip() will always move you ahead, when in fact there is no such guarantee. Here's the JDK documentation:
>
> "The skip method may, for a variety of reasons, end up skipping over some smaller number of bytes, possibly 0. The actual number of bytes skipped is returned."
>
> And that's what causes this loop to stay stuck and thrash the thread. As to why this happens, it may very well be a bug in the implementation, or even in the JDK. I did see this same bug in both Grizzly and Jetty. My intuition tells me that these are certain nio-based requests that get cut somewhere down the line, which can happen anywhere between the client and your host OS, and that the upper-level world of simulated streams doesn't or can't acknowledge this through instant failure.
> Perhaps there's indirect acknowledgment of this exactly in the "silent" failure of methods such as skip()?
>
> Anyway. My rather trivial suggested solution/workaround:
>
>     public int read(byte[] b, int off, int len) throws IOException {
>         // Reach the start index.
>         while (!(position >= startIndex)) {
>             long skipped = skip(startIndex - position);
>             if (skipped <= 0) {
>                 throw new IOException("Cannot skip ahead in FilterInputStream");
>             }
>             position += skipped;
>         }
>
> I will give this a try with a patched build, and let you know how it goes. (It can take a few days for the bug to appear.) However, I'd love to hear thoughts about this issue from the rest of the Restlet community!
>
> -Tal
>
> On 08/13/2009 08:14 AM, Tal Liron wrote:
>
>> Unfortunately, the thread thrashing re-appears even with the latest version of Sun's JDK6. So, I'm considering this a Restlet bug and will continue investigating.
>>
>> Using a recent Restlet svn build (a few days after 2.0M4). Here is the stack trace of the evil thread, in case anyone would like to lend their brain to the task:
>>
>> org.restlet.engine.application.RangeInputStream.read(RangeInputStream.java:149)
>> java.io.FilterInputStream.read(FilterInputStream.java:90)
>> org.restlet.engine.io.ByteUtils.write(ByteUtils.java:506)
>> org.restlet.engine.application.RangeRepresentation.write(RangeRepresentation.java:86)
>> org.restlet.engine.http.HttpServerCall.writeResponseBody(HttpServerCall.java:492)
>> org.restlet.engine.http.HttpServerCall.sendResponse(HttpServerCall.java:150)
>> org.restlet.ext.jetty.JettyServerHelper$WrappedServer.handle(JettyServerHelper.java:184)
>> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>> org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:539)
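The failure mode is easy to reproduce, since the JDK explicitly allows skip() to return 0. Here's a small self-contained demonstration (class and method names are mine, not Restlet's) of how the guard in the patch above turns the infinite loop into a fast, clean failure:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StuckSkipDemo {

    /** A stream whose skip() never makes progress -- legal per the JDK contract. */
    static class NoSkipStream extends FilterInputStream {
        NoSkipStream(InputStream in) { super(in); }
        @Override public long skip(long n) { return 0; }
    }

    /** The patched loop: throw instead of spinning when skip() stalls. */
    static long reachStartIndex(InputStream in, long startIndex) throws IOException {
        long position = 0;
        while (position < startIndex) {
            long skipped = in.skip(startIndex - position);
            if (skipped <= 0) {
                throw new IOException("Cannot skip ahead in FilterInputStream");
            }
            position += skipped;
        }
        return position;
    }

    public static void main(String[] args) {
        InputStream stuck = new NoSkipStream(new ByteArrayInputStream(new byte[10]));
        try {
            reachStartIndex(stuck, 5); // the unpatched loop would spin here forever
            System.out.println("unexpected success");
        } catch (IOException e) {
            // failed fast: Cannot skip ahead in FilterInputStream
            System.out.println("failed fast: " + e.getMessage());
        }
    }
}
```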
>> org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>> org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:520)
>>
>> -Tal
>>
>> On 08/12/2009 02:30 PM, Rob Heittman wrote:
>>
>>> Yes. Sun JDK 6U9 and above on Ubuntu Jaunty have not demonstrated any real showstoppers for my purposes. My highest-traffic server (a Rackspace dedicated box) handles about 15 requests per second continuously, so while we're not pressing up against the theoretical maximums here, I feel pretty happy that the JDK platform is stable anyway. We haven't seen any unpleasantness on Slicehost or Amazon EC2, the other two hosting providers we use. EC2/Jaunty/Sun is where we put all our new applications. We use the "alestic" EC2 images as an Ubuntu baseline and then prepare our own AMIs.
>>>
>>> Unrelated, since you're also a scripting guy... those high-load servers still need an occasional JVM restart to clear memory leaks associated with script engine use. I can't officially blame Rhino, Jython, or JRuby for this, as the real culprit is almost certainly antipatterns in user-authored scripts, but definitely the sites that run lots of scripts do leak memory over time. That's the only stability problem that forces occasional restarts on our Restlet 1.2M2-based platform.
>>>
>>> Currently I'm benchmarking stuff on 2.0M4 (well, technically a snapshot from 2 days later that fixes a Mac bug) and the performance seems better, stability just as good.
>>>
>>> - R
>>>
>>> On Wed, Aug 12, 2009 at 3:03 PM, Tal Liron <[email protected]> wrote:
>>>
>>> This is a very useful thread!
>>>
>>> So, when you listed the problems with OpenJDK6 on Hardy and Jaunty, does this mean you didn't see the same problems using Sun's JDK6 on those platforms?
>>>
>>> -Tal

> ------------------------------------------------------
> http://restlet.tigris.org/ds/viewMessage.do?dsForumId=4447&dsMessageId=2384109

------------------------------------------------------
http://restlet.tigris.org/ds/viewMessage.do?dsForumId=4447&dsMessageId=2384291

------------------------------------------------------
http://restlet.tigris.org/ds/viewMessage.do?dsForumId=4447&dsMessageId=2384320

