Re: [Haskell-cafe] thread killed

2012-04-05 Thread tsuraan
>> I think I might know what your problem is. You're accepting file uploads
>> using handleMultipart, yes? Snap kills uploads that are going too slow,
>> otherwise you would be vulnerable to slowloris
>> (http://ha.ckers.org/slowloris/) DoS attacks. What's probably happening here
>> is that you're doing slow work inside the "Iteratee IO a" handler you pass
>> to that function, which makes Snap think the client is trickling bytes to
>> you. If that's the case, either finish the iteratee more quickly and do the
>> slow work back in the Snap handler (preferable), or disable the minimum
>> upload rate guard (although that's not recommended on a server talking to
>> the public internet.)

Ok, so I butchered Snap by replacing all of snap-server's killThread
calls with putStrLn calls, and the putStrLn that is triggered by
Snap.Internal.Http.Server.SimpleBackend's runSession (line 163 in
snap-server 0.8.0.1) seems to be the culprit.  Is that a rate limiter,
or is that something else?  Anyhow, I think there's a bug in there
somewhere.  I'll be poking at it a bit more, but that seems to be the
top-level source of the errors.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] thread killed

2012-04-05 Thread tsuraan
>> I think I might know what your problem is. You're accepting file uploads
>> using handleMultipart, yes? Snap kills uploads that are going too slow,
>> otherwise you would be vulnerable to slowloris
>> (http://ha.ckers.org/slowloris/) DoS attacks. What's probably happening here
>> is that you're doing slow work inside the "Iteratee IO a" handler you pass
>> to that function, which makes Snap think the client is trickling bytes to
>> you. If that's the case, either finish the iteratee more quickly and do the
>> slow work back in the Snap handler (preferable), or disable the minimum
>> upload rate guard (although that's not recommended on a server talking to
>> the public internet.)

I tried adding a "setMinimumUploadRate 0" to my handleMultipart and
doing the upload, and it's still getting killed.  The uploads are
pretty fast; a 150MB file takes around 10s.  I really don't think the
problem is with Snap, but pulling my code out of snap would be pretty
painful.  I have some pretty nasty crap going on with what I'm doing
anyhow, with threads communicating asynchronously, sockets being held
open between requests, that sort of horrid ugliness.  I'm going to try
to get rid of that, and then see if any bugs I had in there were
causing the problem.  Hopefully I'll be done with that by the end of
the weekend, and I'll either have the problem fixed, or have a
reasonable way to reproduce the errors.

Thanks for the idea, anyhow.  And, if setMinimumUploadRate 0 doesn't
actually disable the rate limiter, I'd be happy to try something more
correct.
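
For reference, the guard described in the quoted advice can be pictured
with a small model.  This is only a sketch under assumptions, not Snap's
code: RateGuard, tooSlow, and the numbers used are all hypothetical.  It
shows why slow work inside the upload iteratee looks the same as a client
trickling bytes: the guard only sees bytes consumed and time elapsed.

```haskell
-- Simplified model of a minimum-upload-rate guard.
-- Hypothetical names; this is NOT Snap's implementation.
data RateGuard = RateGuard
  { minBytesPerSec :: Double -- average rate below this counts as "too slow"
  , graceSeconds   :: Double -- no verdict until this much time has elapsed
  }

-- | Decide whether an upload should be aborted, given the bytes consumed
-- so far and the seconds elapsed since the upload started.
tooSlow :: RateGuard -> Int -> Double -> Bool
tooSlow g bytesSoFar elapsed =
  elapsed > graceSeconds g
    && fromIntegral bytesSoFar / elapsed < minBytesPerSec g
```

With a guard of 1000 bytes/s and a 5 second grace period, a consumer that
has taken 30 seconds to get through one 8192-byte chunk is flagged, while
150MB in 10 seconds is not.  In this model a minimum rate of 0 can never
be undercut, so the guard never fires; whether setMinimumUploadRate 0
behaves the same way in snap-core is not something this sketch establishes.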



Re: [Haskell-cafe] thread killed

2012-04-05 Thread Michael Snoyman
On Thu, Apr 5, 2012 at 12:05 PM, Gregory Collins wrote:
> +haskell-cafe, oops
>
> On Thu, Apr 5, 2012 at 11:04 AM, Gregory Collins wrote:
>>
>> On Wed, Apr 4, 2012 at 10:09 PM, tsuraan  wrote:
>>>
>>> > It's hard to rule Snap timeouts out; try building snap-core with the
>>> > "-fdebug" flag and running your app with "DEBUG=1", you'll get a spew of
>>> > debugging output from Snap on stderr.
>>>
>>> Heh, that was quite a spew.  I normally get the exceptions tens of MB
>>> into files that are hundreds of MB, and I sometimes don't get them at
>>> all, so printing out the entire request body was a bit slow :)  After
>>> commenting out some of the more talkative debug statements, I got the
>>> exception to happen, and it looks generally like this:
>>
>>
>> I think I might know what your problem is. You're accepting file uploads
>> using handleMultipart, yes? Snap kills uploads that are going too slow,
>> otherwise you would be vulnerable to slowloris
>> (http://ha.ckers.org/slowloris/) DoS attacks. What's probably happening here
>> is that you're doing slow work inside the "Iteratee IO a" handler you pass
>> to that function, which makes Snap think the client is trickling bytes to
>> you. If that's the case, either finish the iteratee more quickly and do the
>> slow work back in the Snap handler (preferable), or disable the minimum
>> upload rate guard (although that's not recommended on a server talking to
>> the public internet.)

Wouldn't it make more sense to pause the timeout handler when running
user code? That's what we do in Warp.

Michael
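
A pausable timeout of the kind suggested here can be sketched with one
mutable state cell per connection and a periodic sweep.  This is a minimal
illustration under assumptions, not Warp's or Snap's actual code; every
name below (TimeoutHandle, register, tickle, pause, resume, sweep, demo)
is hypothetical.

```haskell
import Control.Monad (when)
import Data.IORef
  ( IORef, atomicModifyIORef', atomicWriteIORef
  , modifyIORef', newIORef, readIORef )

-- Active: saw activity since the last sweep.
-- Inactive: no activity since the last sweep; the next sweep fires.
-- Paused: user code is running; sweeps leave the handle alone.
data State = Active | Inactive | Paused

newtype TimeoutHandle = TimeoutHandle (IORef State)

register :: IO TimeoutHandle
register = TimeoutHandle <$> newIORef Active

tickle, pause, resume :: TimeoutHandle -> IO ()
tickle (TimeoutHandle ref) = atomicWriteIORef ref Active
pause  (TimeoutHandle ref) = atomicWriteIORef ref Paused
resume = tickle

-- | One pass of the reaper.  A real server would call this from a
-- background thread at a fixed interval and pass a kill action.
sweep :: TimeoutHandle -> IO () -> IO ()
sweep (TimeoutHandle ref) onTimeout = do
  expired <- atomicModifyIORef' ref step
  when expired onTimeout
  where
    step Active   = (Inactive, False)
    step Inactive = (Inactive, True)
    step Paused   = (Paused, False)

-- | Counts how often the timeout fires across a pause/resume cycle.
demo :: IO Int
demo = do
  fired <- newIORef (0 :: Int)
  h <- register
  let tick = sweep h (modifyIORef' fired (+ 1))
  tick          -- Active -> Inactive, nothing fires
  pause h       -- user code starts
  tick >> tick  -- ignored while paused
  resume h      -- user code finished
  tick >> tick  -- Active -> Inactive, then the timeout fires once
  readIORef fired
```

The point of the design is that time spent in user code between pause and
resume is never charged against the client, so a slow handler cannot be
mistaken for a slow connection.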



Re: [Haskell-cafe] thread killed

2012-04-05 Thread Gregory Collins
+haskell-cafe, oops

On Thu, Apr 5, 2012 at 11:04 AM, Gregory Collins wrote:

> On Wed, Apr 4, 2012 at 10:09 PM, tsuraan  wrote:
>
>> > It's hard to rule Snap timeouts out; try building snap-core with the
>> > "-fdebug" flag and running your app with "DEBUG=1", you'll get a spew of
>> > debugging output from Snap on stderr.
>>
>> Heh, that was quite a spew.  I normally get the exceptions tens of MB
>> into files that are hundreds of MB, and I sometimes don't get them at
>> all, so printing out the entire request body was a bit slow :)  After
>> commenting out some of the more talkative debug statements, I got the
>> exception to happen, and it looks generally like this:
>>
>
> I think I might know what your problem is. You're accepting file uploads
> using handleMultipart, yes? Snap kills uploads that are going too slow,
> otherwise you would be vulnerable to slowloris (
> http://ha.ckers.org/slowloris/) DoS attacks. What's probably happening
> here is that you're doing slow work inside the "Iteratee IO a" handler you
> pass to that function, which makes Snap think the client is trickling bytes
> to you. If that's the case, either finish the iteratee more quickly and do
> the slow work back in the Snap handler (preferable), or disable the minimum
> upload rate guard (although that's not recommended on a server talking to
> the public internet.)
>
> G
> --
> Gregory Collins 
>



-- 
Gregory Collins 


Re: [Haskell-cafe] thread killed

2012-04-04 Thread tsuraan
> Whenever I've deadlocked, it terminated the program with "thread
> blocked indefinitely in an MVar operation".

Well, I guess that's probably not what I'm seeing.  I'm currently
trying to simplify the heck out of the code near where the "thread
killed" exceptions are emanating; maybe once that's done, the thread
killing will either magically go away, or at least I'll have a smaller
surface area to try to debug.



Re: [Haskell-cafe] thread killed

2012-04-04 Thread Clark Gaebel
Whenever I've deadlocked, it terminated the program with "thread
blocked indefinitely in an MVar operation".

On Wed, Apr 4, 2012 at 5:59 PM, tsuraan  wrote:
> My Snap handlers communicate with various resource pools, often
> through MVars.  Is it possible that MVar deadlock would be causing the
> runtime system to kill off a contending thread, giving it a
> ThreadKilled exception?  It looks like ghc does do deadlock detection,
> but I can't find any docs on how exactly it deals with deadlocks.



Re: [Haskell-cafe] thread killed

2012-04-04 Thread tsuraan
My Snap handlers communicate with various resource pools, often
through MVars.  Is it possible that MVar deadlock would be causing the
runtime system to kill off a contending thread, giving it a
ThreadKilled exception?  It looks like ghc does do deadlock detection,
but I can't find any docs on how exactly it deals with deadlocks.
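
On the deadlock question: when GHC's runtime finds a thread blocked on an
MVar that no other live thread can reach, it raises
BlockedIndefinitelyOnMVar in that thread, which is a different exception
from ThreadKilled.  A minimal sketch that provokes it on purpose
(deadlockMessage is a hypothetical name, and detection depends on the
runtime doing a garbage collection while the program is idle):

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar)
import Control.Exception (BlockedIndefinitelyOnMVar, try)

-- | Block on an MVar that nobody else holds and report how the runtime
-- reacts.  The Right branch is only reached if something fills the MVar,
-- which cannot happen here.
deadlockMessage :: IO String
deadlockMessage = do
  mv <- newEmptyMVar :: IO (MVar ())
  result <- try (takeMVar mv) :: IO (Either BlockedIndefinitelyOnMVar ())
  pure (either show (const "no deadlock detected") result)
```

The string this should yield is the same "thread blocked indefinitely in
an MVar operation" message quoted elsewhere in this thread, so a detected
MVar deadlock would not normally show up as "thread killed".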



Re: [Haskell-cafe] thread killed

2012-04-04 Thread tsuraan
> This is a long shot, but it's easy to test - turn off GHC's RTS timer,
> +RTS -V0 -RTS.  That removes a source of SIGALRM interrupts.

I was really hoping this one would reveal something interesting, but
it seems to have no effect.  Thanks for the hint though.



Re: [Haskell-cafe] thread killed

2012-04-04 Thread tsuraan
> It's hard to rule Snap timeouts out; try building snap-core with the
> "-fdebug" flag and running your app with "DEBUG=1", you'll get a spew of
> debugging output from Snap on stderr.

Heh, that was quite a spew.  I normally get the exceptions tens of MB
into files that are hundreds of MB, and I sometimes don't get them at
all, so printing out the entire request body was a bit slow :)  After
commenting out some of the more talkative debug statements, I got the
exception to happen, and it looks generally like this:

[  16] killIfTooSlow: continue
[  16] rqBody iterator: continue
[  16] httpSession iteratee: continue
[  16] SimpleBackend.enumerate(13): got continue
[  16] SimpleBackend.enumerate(13): reading from socket
[  16] SimpleBackend.enumerate(13): got 8192 bytes from read end
[  16] SimpleBackend.enumerate(13): sending 8192 bytes to continuation
[  16] killIfTooSlow: continue
[  16] rqBody iterator: continue
[  16] httpSession iteratee: continue
[  16] SimpleBackend.enumerate(13): got continue
[  16] SimpleBackend.enumerate(13): reading from socket
[  16] SimpleBackend.enumerate(13): got 8192 bytes from read end
[  16] SimpleBackend.enumerate(13): sending 8192 bytes to continuation
[  16] killIfTooSlow: continue
[  16] rqBody iterator: continue
[  16] httpSession iteratee: continue
[  16] SimpleBackend.enumerate(13): got continue
[  16] SimpleBackend.enumerate(13): reading from socket
[  16] SimpleBackend.enumerate(13): got 1878 bytes from read end
[  16] SimpleBackend.enumerate(13): sending 1878 bytes to continuation
[  16] killIfTooSlow: continue
[  16] rqBody iterator: continue
[  16] rateLimit: caught thread killed
[  16] Snap.Http.Server.Config errorHandler:
[  16] During processing of request from 127.0.0.1:38088
< a bunch of headers snipped >
[  16] Server.httpSession: finished running user handler
[  16] Server.httpSession: handled, skipping request body
[  16] httpSession/skipToEof: BEGIN
[  16] httpSession/skipToEof: continue
[  16] Server.httpSession: request body skipped, sending response
[  16] sendResponse: whenEnum: enumerating bytes
[  16] countBytes writeEnd: BEGIN
[  16] writeEnd: BEGIN
[  16] writeEnd: continue
[  16] countBytes writeEnd: continue
[  16] SimpleBackend.writeOut(13): got chunk with 233 bytes
[  16] SimpleBackend.writeOut(13): wrote 233 bytes, last 10="ead killed"


So, I'm not sure what that means.  rateLimit caught the thread kill,
but I don't see anything snap-related that caused it.  That rateLimit
message is the rateLimit seeing an error, and not rateLimit causing
one, right?
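
The distinction between the place that logs a ThreadKilled and the place
that sent it can be reproduced in a few lines.  This is a sketch with
hypothetical names (observeKill, started, seen): the worker's handler
reports the exception, but the killThread call lives in a different
thread entirely.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (AsyncException, catch)

-- | Fork a worker that sleeps inside an exception handler, kill it from
-- the calling thread, and return what the worker's handler observed.
observeKill :: IO String
observeKill = do
  started <- newEmptyMVar
  seen    <- newEmptyMVar
  tid <- forkIO $
    -- Signal readiness from inside the handler's scope, then sleep.
    (putMVar started () >> threadDelay 10000000)
      `catch` \e ->
        putMVar seen ("worker caught: " ++ show (e :: AsyncException))
  takeMVar started -- the handler is installed once this returns
  killThread tid   -- the sender: this thread, not the worker
  takeMVar seen
```

The result is "worker caught: thread killed".  A log line of that shape
only says where the target thread happened to be when the exception
arrived, which is consistent with reading the rateLimit message as a
report rather than a cause.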



Re: [Haskell-cafe] thread killed

2012-04-04 Thread tsuraan
> This is a long shot, but it's easy to test - turn off GHC's RTS timer,
> +RTS -V0 -RTS.  That removes a source of SIGALRM interrupts.

Awesome, I'll give that a try.  It's worth a shot, anyhow :)



Re: [Haskell-cafe] thread killed

2012-04-04 Thread tsuraan
> That's probably not where the threadKill is being sent *from*, it's where
> your thread received it.

Yeah, it's definitely where my thread received it.  It's just sort of
crazy, because when I get a ThreadKilled, it's almost always in
Tiger.update.  My handler does much slower things, such as connecting
to a database and doing operations that can take tens of milliseconds,
but somehow the ThreadKilled nearly always emanates from my
Tiger.update.  I even went so far as to wrap my Tiger.update in an IO
operation that catches the ThreadKilled and tries the update again,
and that "fixed" upwards of 90% of my thread deaths.  Crazy...
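
A retry wrapper of the kind described above might look like the following.
It is a sketch only; the actual code was not posted, and
retryOnThreadKilled, demo, and the retry limit are hypothetical.

```haskell
import Control.Exception (AsyncException (ThreadKilled), throwIO, try)
import Data.IORef (atomicModifyIORef', newIORef)

-- | Run an action; if it is interrupted by ThreadKilled, run it again,
-- up to the given number of retries.  Other asynchronous exceptions are
-- rethrown unchanged.
retryOnThreadKilled :: Int -> IO a -> IO a
retryOnThreadKilled retriesLeft action = do
  result <- try action
  case result of
    Right x -> pure x
    Left ThreadKilled
      | retriesLeft > 0 -> retryOnThreadKilled (retriesLeft - 1) action
    Left e -> throwIO e

-- | Simulated flaky action: fails with ThreadKilled on its first two
-- calls and returns the call count on the third.
demo :: IO Int
demo = do
  calls <- newIORef (0 :: Int)
  let flaky = do
        n <- atomicModifyIORef' calls (\k -> (k + 1, k + 1))
        if n < 3 then throwIO ThreadKilled else pure n
  retryOnThreadKilled 5 flaky
```

Two cautions apply.  Re-running is only safe when the action can be
repeated without side effects, and swallowing ThreadKilled hides the
symptom while also defeating whoever legitimately called killThread, so
this is a diagnostic aid rather than a fix.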

> It's hard to rule Snap timeouts out; try building snap-core with the
> "-fdebug" flag and running your app with "DEBUG=1", you'll get a spew of
> debugging output from Snap on stderr.

I'll give that a try, but whenever I've added any sort of printing to
my handler to try to track things down, the issue goes away entirely.
My toolkit for debugging race conditions is pretty weak; I usually
have been able to think real hard and then fix them intuitively, but
my intuition about Haskell is still weak enough that my normal
approach isn't working :)



Re: [Haskell-cafe] thread killed

2012-04-03 Thread Donn Cave
On Wed, Apr 4, 2012 at 6:37 AM, tsuraan  wrote:

> What sorts of things can cause a thread to get an asynchronous "thread
> killed" exception?  I've been seeing rare, inexplicable "thread
> killed" messages in my Snap handlers for a long time, but they aren't
> from Snap's timeout code.

This is a long shot, but it's easy to test - turn off GHC's RTS timer,
+RTS -V0 -RTS.  That removes a source of SIGALRM interrupts.

Donn



Re: [Haskell-cafe] thread killed

2012-04-03 Thread Gregory Collins
On Wed, Apr 4, 2012 at 6:37 AM, tsuraan  wrote:

> What sorts of things can cause a thread to get an asynchronous "thread
> killed" exception?  I've been seeing rare, inexplicable "thread
> killed" messages in my Snap handlers for a long time, but they aren't
> from Snap's timeout code.  I recently upgraded to ghc 7.4.1, and that
> caused the kills to happen a lot more often, but also gave me some
> traceback capabilities.  I tracked the most common kills down to
> cryptohash's Crypto.Hash.Tiger.update function, but that's about as
>

That's probably not where the threadKill is being sent *from*, it's where
your thread received it.



> pure as an FFI function can be, so I don't know how that would be causing
> anything weird to happen.  I also sometimes get the kills in the
> Tiger.finalize function, and I get other ones in functions that I
> haven't been able to track down yet.  Given that the thread kills
> aren't from Snap's timeout code (they happen in under a second, and I
> have snap's timeouts turned to an insanely high value), what sort of
> other things cause ThreadKilled exceptions?
>

It's hard to rule Snap timeouts out; try building snap-core with the
"-fdebug" flag and running your app with "DEBUG=1", you'll get a spew of
debugging output from Snap on stderr.

G
-- 
Gregory Collins 


[Haskell-cafe] thread killed

2012-04-03 Thread tsuraan
What sorts of things can cause a thread to get an asynchronous "thread
killed" exception?  I've been seeing rare, inexplicable "thread
killed" messages in my Snap handlers for a long time, but they aren't
from Snap's timeout code.  I recently upgraded to ghc 7.4.1, and that
caused the kills to happen a lot more often, but also gave me some
traceback capabilities.  I tracked the most common kills down to
cryptohash's Crypto.Hash.Tiger.update function, but that's about as
pure as an FFI function can be, so I don't know how that would be causing
anything weird to happen.  I also sometimes get the kills in the
Tiger.finalize function, and I get other ones in functions that I
haven't been able to track down yet.  Given that the thread kills
aren't from Snap's timeout code (they happen in under a second, and I
have snap's timeouts turned to an insanely high value), what sort of
other things cause ThreadKilled exceptions?

Thanks for any help; this is really driving me mad :-/
