Re: [Haskell-cafe] thread killed
>> I think I might know what your problem is. You're accepting file uploads
>> using handleMultipart, yes? Snap kills uploads that are going too slow,
>> otherwise you would be vulnerable to slowloris
>> (http://ha.ckers.org/slowloris/) DoS attacks. What's probably happening here
>> is that you're doing slow work inside the "Iteratee IO a" handler you pass
>> to that function, which makes Snap think the client is trickling bytes to
>> you. If that's the case, either finish the iteratee more quickly and do the
>> slow work back in the Snap handler (preferable), or disable the minimum
>> upload rate guard (although that's not recommended on a server talking to
>> the public internet.)

Ok, so I butchered Snap by replacing all of snap-server's killThread
calls with putStrLn calls, and the putStrLn that is triggered by
Snap.Internal.Http.Server.SimpleBackend's runSession (line 163 in
snap-server 0.8.0.1) seems to be the culprit. Is that a rate limiter,
or is that something else? Anyhow, I think there's a bug in there
somewhere. I'll be poking at it a bit more, but that seems to be the
top-level source of the errors.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] thread killed
>> I think I might know what your problem is. You're accepting file uploads
>> using handleMultipart, yes? Snap kills uploads that are going too slow,
>> otherwise you would be vulnerable to slowloris
>> (http://ha.ckers.org/slowloris/) DoS attacks. What's probably happening here
>> is that you're doing slow work inside the "Iteratee IO a" handler you pass
>> to that function, which makes Snap think the client is trickling bytes to
>> you. If that's the case, either finish the iteratee more quickly and do the
>> slow work back in the Snap handler (preferable), or disable the minimum
>> upload rate guard (although that's not recommended on a server talking to
>> the public internet.)

I tried adding a "setMinimumUploadRate 0" to my handleMultipart and
doing the upload, and it's still getting killed. The uploads are
pretty fast; a 150MB file takes around 10s. I really don't think the
problem is with Snap, but pulling my code out of Snap would be pretty
painful. I have some pretty nasty crap going on with what I'm doing
anyhow, with threads communicating asynchronously, sockets being held
open between requests, that sort of horrid ugliness. I'm going to try
to get rid of that, and then see if any bugs I had in there were
causing the problem. Hopefully I'll be done with that by the end of
the weekend, and I'll either have the problem fixed, or have a
reasonable way to reproduce the errors.

Thanks for the idea, anyhow. And, if setMinimumUploadRate 0 doesn't
actually disable the rate limiter, I'd be happy to try something more
correct.
Re: [Haskell-cafe] thread killed
On Thu, Apr 5, 2012 at 12:05 PM, Gregory Collins wrote:
> +haskell-cafe, oops
>
> On Thu, Apr 5, 2012 at 11:04 AM, Gregory Collins wrote:
>>
>> On Wed, Apr 4, 2012 at 10:09 PM, tsuraan wrote:
>>>
>>> > It's hard to rule Snap timeouts out; try building snap-core with the
>>> > "-fdebug" flag and running your app with "DEBUG=1", you'll get a spew
>>> > of debugging output from Snap on stderr.
>>>
>>> Heh, that was quite a spew. I normally get the exceptions tens of MB
>>> into files that are hundreds of MB, and I sometimes don't get them at
>>> all, so printing out the entire request body was a bit slow :) After
>>> commenting out some of the more talkative debug statements, I got the
>>> exception to happen, and it looks generally like this:
>>
>>
>> I think I might know what your problem is. You're accepting file uploads
>> using handleMultipart, yes? Snap kills uploads that are going too slow,
>> otherwise you would be vulnerable to slowloris
>> (http://ha.ckers.org/slowloris/) DoS attacks. What's probably happening here
>> is that you're doing slow work inside the "Iteratee IO a" handler you pass
>> to that function, which makes Snap think the client is trickling bytes to
>> you. If that's the case, either finish the iteratee more quickly and do the
>> slow work back in the Snap handler (preferable), or disable the minimum
>> upload rate guard (although that's not recommended on a server talking to
>> the public internet.)

Wouldn't it make more sense to pause the timeout handler when running
user code? That's what we do in Warp.

Michael
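The pause idea above can be sketched as a small state machine. This is only an illustration under stated assumptions, not Warp's or Snap's actual code: the names State, sweep, tickle, pause, resume and withPaused are hypothetical, and the reaper is driven by hand here rather than by a timer thread so that the behaviour is easy to follow.

```haskell
import Control.Exception (bracket_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, writeIORef)

-- Hypothetical per-connection timeout state.
data State = Active | Inactive | Paused
  deriving (Eq, Show)

-- One pass of the reaper. Returns True when the connection has timed out,
-- i.e. no activity was recorded since the previous pass.
sweep :: IORef State -> IO Bool
sweep ref = atomicModifyIORef' ref step
  where
    step Active   = (Inactive, False) -- saw activity; allow one more period
    step Inactive = (Inactive, True)  -- idle for a whole period; time out
    step Paused   = (Paused, False)   -- user code is running; never fire

-- Record network activity, suspend the timeout, or re-arm it.
tickle, pause, resume :: IORef State -> IO ()
tickle ref = writeIORef ref Active
pause  ref = writeIORef ref Paused
resume ref = writeIORef ref Active

-- Run user code with the timeout suspended, re-arming it afterwards even if
-- the user code throws.
withPaused :: IORef State -> IO a -> IO a
withPaused ref = bracket_ (pause ref) (resume ref)

main :: IO ()
main = do
  ref <- newIORef Active
  -- Two reaper passes while user code runs: the timeout must not fire.
  firedWhilePaused <- withPaused ref $ do
    a <- sweep ref
    b <- sweep ref
    return (a || b)
  print firedWhilePaused
  -- Once resumed, two idle passes in a row do fire.
  a <- sweep ref
  b <- sweep ref
  print (a, b)
```

In a real server a reaper thread would call something like sweep on a fixed interval and throw an exception to the connection's thread whenever it returns True. With the pause in place, time spent in user code would no longer count against the client.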
Re: [Haskell-cafe] thread killed
+haskell-cafe, oops

On Thu, Apr 5, 2012 at 11:04 AM, Gregory Collins wrote:

> On Wed, Apr 4, 2012 at 10:09 PM, tsuraan wrote:
>
>> > It's hard to rule Snap timeouts out; try building snap-core with the
>> > "-fdebug" flag and running your app with "DEBUG=1", you'll get a spew of
>> > debugging output from Snap on stderr.
>>
>> Heh, that was quite a spew. I normally get the exceptions tens of MB
>> into files that are hundreds of MB, and I sometimes don't get them at
>> all, so printing out the entire request body was a bit slow :) After
>> commenting out some of the more talkative debug statements, I got the
>> exception to happen, and it looks generally like this:
>>
>
> I think I might know what your problem is. You're accepting file uploads
> using handleMultipart, yes? Snap kills uploads that are going too slow,
> otherwise you would be vulnerable to slowloris
> (http://ha.ckers.org/slowloris/) DoS attacks. What's probably happening
> here is that you're doing slow work inside the "Iteratee IO a" handler you
> pass to that function, which makes Snap think the client is trickling bytes
> to you. If that's the case, either finish the iteratee more quickly and do
> the slow work back in the Snap handler (preferable), or disable the minimum
> upload rate guard (although that's not recommended on a server talking to
> the public internet.)
>
> G
> --
> Gregory Collins
>

--
Gregory Collins
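To make the explanation above concrete, here is a minimal sketch of a minimum-rate guard. It is not Snap's implementation: tooSlow and its parameters are hypothetical, and the only assumption is that the guard compares the bytes consumed so far against the wall-clock time since the upload started. Under that assumption, time the handler spends working on a chunk is indistinguishable from time spent waiting on a slow client.

```haskell
-- Hypothetical rate guard: True when the average rate since the start of the
-- upload has dropped below the minimum, once a grace period has passed.
tooSlow
  :: Double -- minimum acceptable rate, in bytes per second
  -> Double -- grace period, in seconds, before the guard applies
  -> Double -- seconds elapsed since the upload started
  -> Int    -- bytes consumed so far
  -> Bool
tooSlow minRate grace elapsed bytes =
  elapsed > grace && fromIntegral bytes / elapsed < minRate

main :: IO ()
main = do
  -- A fast client whose bytes are consumed promptly: 150 MB in 10 s.
  print (tooSlow 1000 5 10 150000000)
  -- A consumer that has spent 60 s on a single 8192-byte chunk.
  print (tooSlow 1000 5 60 8192)
  -- With a minimum rate of 0 this particular guard can never trip.
  print (tooSlow 0 5 60 8192)
```

The second case trips the guard even if the client delivered its bytes instantly, which is the situation described above. The third case only says something about this sketch, not about what a minimum rate of 0 does in Snap itself.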
Re: [Haskell-cafe] thread killed
> Whenever I've deadlocked, it terminated the program with "thread
> blocked indefinitely in an MVar operation".

Well, I guess that's probably not what I'm seeing. I'm currently
trying to simplify the heck out of the code near where the thread
killed exceptions are emanating; maybe once that's done, the thread
killing will either magically go away, or at least I'll have a smaller
surface area to try to debug.
Re: [Haskell-cafe] thread killed
Whenever I've deadlocked, it terminated the program with "thread
blocked indefinitely in an MVar operation".

On Wed, Apr 4, 2012 at 5:59 PM, tsuraan wrote:
> My Snap handlers communicate with various resource pools, often
> through MVars. Is it possible that MVar deadlock would be causing the
> runtime system to kill off a contending thread, giving it a
> ThreadKilled exception? It looks like ghc does do deadlock detection,
> but I can't find any docs on how exactly it deals with deadlocks.
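The distinction above can be checked with a few lines of base-only code. This is a minimal sketch; blockForever is a hypothetical name. A thread that blocks on an MVar nobody else can reach is sent BlockedIndefinitelyOnMVar by the runtime, which is a different exception type from the AsyncException constructor ThreadKilled. Detection happens during garbage collection, so the exception may arrive after a short pause.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar)
import Control.Exception (BlockedIndefinitelyOnMVar (..), try)

-- Block on an MVar that no other thread holds a reference to, and report the
-- exception the runtime delivers once it notices the thread can never wake.
blockForever :: IO (Either BlockedIndefinitelyOnMVar ())
blockForever = do
  mv <- newEmptyMVar :: IO (MVar ())
  try (takeMVar mv)

main :: IO ()
main = do
  r <- blockForever
  case r of
    Left e   -> putStrLn ("runtime sent: " ++ show e)
    Right () -> putStrLn "takeMVar returned, which should not happen here"
```

Because try is specialised to BlockedIndefinitelyOnMVar, a ThreadKilled would not be caught by it, so a Left result here means the deadlock detector, not killThread, was responsible.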
Re: [Haskell-cafe] thread killed
My Snap handlers communicate with various resource pools, often
through MVars. Is it possible that MVar deadlock would be causing the
runtime system to kill off a contending thread, giving it a
ThreadKilled exception? It looks like ghc does do deadlock detection,
but I can't find any docs on how exactly it deals with deadlocks.
Re: [Haskell-cafe] thread killed
> This is a long shot, but it's easy to test - turn off GHC's RTS timer,
> +RTS -V0 -RTS. That removes a source of SIGALRM interrupts.

I was really hoping this one would reveal something interesting, but
it seems to have no effect. Thanks for the hint though.
Re: [Haskell-cafe] thread killed
> It's hard to rule Snap timeouts out; try building snap-core with the
> "-fdebug" flag and running your app with "DEBUG=1", you'll get a spew of
> debugging output from Snap on stderr.

Heh, that was quite a spew. I normally get the exceptions tens of MB
into files that are hundreds of MB, and I sometimes don't get them at
all, so printing out the entire request body was a bit slow :) After
commenting out some of the more talkative debug statements, I got the
exception to happen, and it looks generally like this:

[ 16] killIfTooSlow: continue
[ 16] rqBody iterator: continue
[ 16] httpSession iteratee: continue
[ 16] SimpleBackend.enumerate(13): got continue
[ 16] SimpleBackend.enumerate(13): reading from socket
[ 16] SimpleBackend.enumerate(13): got 8192 bytes from read end
[ 16] SimpleBackend.enumerate(13): sending 8192 bytes to continuation
[ 16] killIfTooSlow: continue
[ 16] rqBody iterator: continue
[ 16] httpSession iteratee: continue
[ 16] SimpleBackend.enumerate(13): got continue
[ 16] SimpleBackend.enumerate(13): reading from socket
[ 16] SimpleBackend.enumerate(13): got 8192 bytes from read end
[ 16] SimpleBackend.enumerate(13): sending 8192 bytes to continuation
[ 16] killIfTooSlow: continue
[ 16] rqBody iterator: continue
[ 16] httpSession iteratee: continue
[ 16] SimpleBackend.enumerate(13): got continue
[ 16] SimpleBackend.enumerate(13): reading from socket
[ 16] SimpleBackend.enumerate(13): got 1878 bytes from read end
[ 16] SimpleBackend.enumerate(13): sending 1878 bytes to continuation
[ 16] killIfTooSlow: continue
[ 16] rqBody iterator: continue
[ 16] rateLimit: caught thread killed
[ 16] Snap.Http.Server.Config errorHandler:
[ 16] During processing of request from 127.0.0.1:38088
< a bunch of headers snipped >
[ 16] Server.httpSession: finished running user handler
[ 16] Server.httpSession: handled, skipping request body
[ 16] httpSession/skipToEof: BEGIN
[ 16] httpSession/skipToEof: continue
[ 16] Server.httpSession: request body skipped, sending response
[ 16] sendResponse: whenEnum: enumerating bytes
[ 16] countBytes writeEnd: BEGIN
[ 16] writeEnd: BEGIN
[ 16] writeEnd: continue
[ 16] countBytes writeEnd: continue
[ 16] SimpleBackend.writeOut(13): got chunk with 233 bytes
[ 16] SimpleBackend.writeOut(13): wrote 233 bytes, last 10="ead killed"

So, I'm not sure what that means. rateLimit caught the thread kill,
but I don't see anything snap-related that caused it. That rateLimit
message is the rateLimit seeing an error, and not rateLimit causing
one, right?
Re: [Haskell-cafe] thread killed
> This is a long shot, but it's easy to test - turn off GHC's RTS timer,
> +RTS -V0 -RTS. That removes a source of SIGALRM interrupts.

Awesome, I'll give that a try. It's worth a shot, anyhow :)
Re: [Haskell-cafe] thread killed
> That's probably not where the threadKill is being sent *from*, it's where
> your thread received it.

Yeah, it's definitely where my thread received it. It's just sort of
crazy, because when I get a ThreadKilled, it's almost always in
Tiger.update. My handler does much slower things, such as connecting
to a database and doing operations that can take tens of milliseconds,
but somehow the ThreadKilled nearly always emanates from my
Tiger.update. I even went so far as to wrap my Tiger.update in an IO
operation that catches the ThreadKilled and tries the update again,
and that "fixed" upwards of 90% of my thread deaths. Crazy...

> It's hard to rule Snap timeouts out; try building snap-core with the
> "-fdebug" flag and running your app with "DEBUG=1", you'll get a spew of
> debugging output from Snap on stderr.

I'll give that a try, but whenever I've added any sort of printing to
my handler to try to track things down, the issue goes away entirely.
My toolkit for debugging race conditions is pretty weak; I usually
have been able to think real hard and then fix them intuitively, but
my intuition about Haskell is still weak enough that my normal
approach isn't working :)
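A wrapper of the kind described above might look roughly like the following. This is a sketch, not the poster's code: retryOnKill and flaky are hypothetical names, and the demonstration raises ThreadKilled synchronously with throwIO so that the run is deterministic. It assumes the wrapped action is safe to run twice.

```haskell
import Control.Exception (AsyncException (ThreadKilled), throwIO, try)
import Data.IORef (atomicModifyIORef', newIORef)

-- Run an action; if it is interrupted by ThreadKilled, run it again, up to
-- the given number of extra attempts. Any other AsyncException, and a
-- ThreadKilled once the attempts are used up, is rethrown.
retryOnKill :: Int -> IO a -> IO a
retryOnKill retries action = do
  r <- try action
  case r of
    Right x -> return x
    Left ThreadKilled | retries > 0 -> retryOnKill (retries - 1) action
    Left e -> throwIO e

main :: IO ()
main = do
  attempts <- newIORef (0 :: Int)
  -- Stand-in for the interrupted call: dies on the first attempt only.
  let flaky = do
        n <- atomicModifyIORef' attempts (\k -> (k + 1, k + 1))
        if n == 1 then throwIO ThreadKilled else return n
  retryOnKill 3 flaky >>= print
```

Swallowing ThreadKilled like this hides whoever is sending it, so it is better suited to narrowing down a bug than to production use.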
Re: [Haskell-cafe] thread killed
On Wed, Apr 4, 2012 at 6:37 AM, tsuraan wrote:
> What sorts of things can cause a thread to get an asynchronous "thread
> killed" exception? I've been seeing rare, inexplicable "thread
> killed" messages in my Snap handlers for a long time, but they aren't
> from Snap's timeout code.

This is a long shot, but it's easy to test - turn off GHC's RTS timer,
+RTS -V0 -RTS. That removes a source of SIGALRM interrupts.

Donn
Re: [Haskell-cafe] thread killed
On Wed, Apr 4, 2012 at 6:37 AM, tsuraan wrote:

> What sorts of things can cause a thread to get an asynchronous "thread
> killed" exception? I've been seeing rare, inexplicable "thread
> killed" messages in my Snap handlers for a long time, but they aren't
> from Snap's timeout code. I recently upgraded to ghc 7.4.1, and that
> caused the kills to happen a lot more often, but also gave me some
> traceback capabilities. I tracked the most common kills down to
> cryptohash's Crypto.Hash.Tiger.update function, but that's about as
>

That's probably not where the threadKill is being sent *from*, it's where
your thread received it.

> pure a FFI function can be, so I don't know how that would be causing
> anything weird to happen. I also sometimes get the kills in the
> Tiger.finalize function, and I get other ones in functions that I
> haven't been able to track down yet. Given that the thread kills
> aren't from Snap's timeout code (they happen in under a second, and I
> have snap's timeouts turned to an insanely high value), what sort of
> other things cause ThreadKilled exceptions?
>

It's hard to rule Snap timeouts out; try building snap-core with the
"-fdebug" flag and running your app with "DEBUG=1", you'll get a spew of
debugging output from Snap on stderr.

G
--
Gregory Collins
[Haskell-cafe] thread killed
What sorts of things can cause a thread to get an asynchronous "thread
killed" exception? I've been seeing rare, inexplicable "thread
killed" messages in my Snap handlers for a long time, but they aren't
from Snap's timeout code. I recently upgraded to GHC 7.4.1, and that
caused the kills to happen a lot more often, but also gave me some
traceback capabilities. I tracked the most common kills down to
cryptohash's Crypto.Hash.Tiger.update function, but that's about as
pure as an FFI function can be, so I don't know how that would be
causing anything weird to happen. I also sometimes get the kills in the
Tiger.finalize function, and I get other ones in functions that I
haven't been able to track down yet. Given that the thread kills
aren't from Snap's timeout code (they happen in under a second, and I
have Snap's timeouts turned to an insanely high value), what sort of
other things cause ThreadKilled exceptions?

Thanks for any help; this is really driving me mad :-/
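For reference on the question above: in base, ThreadKilled is the exception that killThread delivers, and killThread tid is shorthand for throwTo tid ThreadKilled. The sketch below uses only base (observeKill is a hypothetical name) and shows the target thread observing the exception at whatever point it happens to be executing, which says nothing about which thread sent it.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (AsyncException (ThreadKilled), try)

-- Fork a worker that sleeps, kill it from the parent, and return what the
-- worker itself observed.
observeKill :: IO (Either AsyncException ())
observeKill = do
  started <- newEmptyMVar
  result  <- newEmptyMVar
  tid <- forkIO $ do
    -- Signal readiness from inside the try, so the kill cannot arrive
    -- before the handler is installed.
    r <- try (putMVar started () >> threadDelay 10000000)
    putMVar result r
  takeMVar started
  killThread tid -- equivalent to: throwTo tid ThreadKilled
  takeMVar result

main :: IO ()
main = do
  r <- observeKill
  case r of
    Left ThreadKilled -> putStrLn "worker received ThreadKilled"
    Left other        -> putStrLn ("worker received: " ++ show other)
    Right ()          -> putStrLn "worker finished without being killed"
```

Any code holding the worker's ThreadId can do this, so a library that runs handlers on its own threads remains a candidate source even when the exception surfaces inside unrelated code.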