Hi, I have similar observations: after a while the server keeps accepting requests, but they all time out and nothing is returned in the response. I'm using Akka HTTP 2.4.2 and streams to create a simple server that handles requests and returns files from S3.
In my case I don't even need high load; sending requests one after another is enough to hang the server. I played with the max connections parameter, and increasing it makes the app process more requests, but eventually it gets stuck anyway. From my observation, the issue is in the HTTP connection pool, which is not properly releasing connections: when a new request comes in, a runnable graph is created, but the sinks and sources are not properly connected to start the flow. During my tests I see the following (not sure this is related, though). This error is logged four times in a row:

ERROR [datastore-rest-api-akka.actor.default-dispatcher-5] - Error in stage [One2OneBidi]: Inner stream finished before inputs completed. Outputs might have been truncated.
akka.http.impl.util.One2OneBidiFlow$OutputTruncationException$: Inner stream finished before inputs completed. Outputs might have been truncated.
INFO [datastore-rest-api-akka.actor.default-dispatcher-5] - Message [akka.io.Tcp$ResumeReading$] from Actor[akka://datastore-rest-api/user/StreamSupervisor-0/$$a#1262265379] to Actor[akka://datastore-rest-api/system/IO-TCP/selectors/$a/3#-1262857800] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
INFO [datastore-rest-api-akka.actor.default-dispatcher-5] - Message [akka.io.Tcp$ResumeReading$] from Actor[akka://datastore-rest-api/user/StreamSupervisor-0/$$e#585879533] to Actor[akka://datastore-rest-api/system/IO-TCP/selectors/$a/7#1750981790] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

On Saturday, March 5, 2016 at 6:03:57 AM UTC-6, Giovanni Alberto Caporaletti wrote:
>
> Hi,
> I'll try to explain what I'm experiencing in my akka-http app.
> (I found this issue but it's not been updated for almost a year and I'm
> not sure it's relevant: https://github.com/akka/akka/issues/17395)
>
> I noticed that under load a lot of connections (~1-2%) were dropped or
> timed out.
> I started investigating, tuning os and akka params and trimming
> down my sample app until I got this:
>
>
> //N.B.: this is a test
>
> implicit val system = ActorSystem()
> implicit val mat: ActorMaterializer = ActorMaterializer()
> implicit val ec = system.dispatcher
>
> val binding: Future[ServerBinding] = Http().bind("0.0.0.0", 1104).map { conn ⇒
>   val promise = Promise[Unit]()
>   // I don't even wait for the end of the flow
>   val handler = Flow[HttpRequest].map { _ ⇒ promise.success(()); HttpResponse() }
>
>   // to be sure it's not a mapAsync(1) problem I use map and block here, same result
>   val t0 = System.currentTimeMillis()
>   println(s"${Thread.currentThread().getName} start")
>
>   conn handleWith handler
>
>   Await.result(promise.future, 10.seconds)
>   println(s"${Thread.currentThread().getName} end ${System.currentTimeMillis() - t0}ms");
> }.to(Sink.ignore).run()
>
> Await.result(binding, 10.seconds)
>
>
>
> When I run a small test using ab with something like "-c 1000" concurrent
> connections or more (even if I'm handling one at a time here), some of the
> requests immediately start getting unusual delays:
>
> default-akka.actor.default-dispatcher-3 start
> default-akka.actor.default-dispatcher-3 end 2015ms -> gets bigger
>
> This keeps getting worse. After a while I can kill ab, wait some minutes
> and make a single request and it either gets refused or times out. The
> server is basically *dead*
>
>
> *I get the exact same result with this, if you're wondering why I did all
> that blocking and printing stuff above:*
>
> val handler = Flow[HttpRequest].map(_ ⇒ HttpResponse()).alsoToMat(Sink.ignore)(Keep.right)
>
> val binding: Future[ServerBinding] = Http().bind("0.0.0.0", 1104).mapAsync(1) { conn ⇒
>   conn handleWith handler
> }.to(Sink.ignore).run()
>
> and the same happens if I use bindAndHandle with a simple route.
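For readers following along, the bindAndHandle variant mentioned in the quoted text might look roughly like the sketch below. This is only an illustration: the object name, the actor system name, and the route itself are assumptions, since the thread does not show the actual route. Only the host "0.0.0.0" and port 1104 come from the quoted code.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.Http.ServerBinding
import akka.http.scaladsl.model.HttpResponse
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route
import akka.stream.ActorMaterializer
import scala.concurrent.Future

object BindAndHandleSketch {
  // Names below are hypothetical; they do not appear in the thread.
  implicit val system: ActorSystem = ActorSystem("bind-and-handle-sketch")
  implicit val mat: ActorMaterializer = ActorMaterializer()

  // Assumed "simple route": any GET on "/" answers with an empty 200 response,
  // mirroring the HttpResponse() used by the quoted handler.
  val route: Route = pathSingleSlash {
    get {
      complete(HttpResponse())
    }
  }

  // The Route is converted implicitly to a Flow[HttpRequest, HttpResponse, _].
  def start(port: Int = 1104): Future[ServerBinding] =
    Http().bindAndHandle(route, "0.0.0.0", port)
}
```

Calling `BindAndHandleSketch.start()` binds the server; with this form Akka HTTP materializes the handler flow once per accepted connection instead of the caller doing so via `handleWith`.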
>
>
> In my standard setup (bindAndHandle, any number of concurrent connections
> (1k to 10k tried) and keepalive for the requests) I see a number of
> connections between 1 and 3% failing.
> This is what I get calling a simple route with bindAndHandle,
> MaxConnections(10000) and connection keepalive enabled on the client:
> lots of timeouts after just 10k calls already:
>
> Concurrency Level:      4000
> Time taken for tests:   60.605 seconds
> Complete requests:      10000
> Failed requests:        261
>    (Connect: 0, Receive: 87, Length: 87, Exceptions: 87)
> Keep-Alive requests:    9913
> ...
>
> Connection Times (ms)
>               min  mean[+/-sd] median   max
> Connect:        0    7   31.3      0     191
> Processing:     0  241 2780.8      5   60396
> *Waiting*:      0   92 1270.8      5   *60396*
> Total:          0  248 2783.5      5   60459
>
> Percentage of the requests served within a certain time (ms)
> ...
>   90%     13
>   95%    255
>   98%   2061
>   99%   3911
>  100%  60459 (longest request)
>
> It looks like it does the same on my local machine (mac) but I'm not 100%
> sure. I'm doing the tests on an ubuntu 8-core 24GB ram vm
> I really don't know what to do, I'm trying every possible combination of
> system parameters and akka config but I keep getting the same result.
> Basically everything I tried (changing /etc/security/limits.conf, changing
> sysctl params, changing akka concurrent connections, backlog, dispatchers
> etc) led to the same result, that is: *connections doing nothing and
> timing out.* As if the execution were queued somehow
>
>
> Is there something I'm missing? Some tuning parameter/config/something
> else?
> It looks like the piece of code that times out is conn handleWith handler
> even if 'handler' does nothing, and it keeps doing it even after the load
> stops. I.e. the connection is established correctly, but the processing is
> stuck.
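For reference, the Akka-side knobs mentioned above (concurrent connections and backlog) can be set programmatically as in the sketch below. It assumes the settings meant are `akka.http.server.max-connections` and `akka.http.server.backlog`; the thread does not name the exact keys. The value 10000 comes from the quoted "MaxConnections(10000)", while the backlog value and all object and system names are made up for illustration. According to the quoted message, tuning of this kind did not resolve the problem.

```scala
import akka.actor.ActorSystem
import com.typesafe.config.{Config, ConfigFactory}

object TunedServerConfig {
  // Assumed keys; only the value 10000 appears in the thread.
  // The backlog value is purely illustrative.
  val config: Config = ConfigFactory
    .parseString(
      """
        |akka.http.server.max-connections = 10000
        |akka.http.server.backlog = 1024
      """.stripMargin)
    .withFallback(ConfigFactory.load()) // keep all other defaults

  // Hypothetical helper: creates an actor system that uses the overrides.
  def newSystem(): ActorSystem = ActorSystem("tuned-sketch", config)
}
```

The same two keys could equally go into `application.conf`; building the `Config` in code just makes the overrides explicit in a test setup.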
>
>
> this is my ulimit -a:
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 96360
> max locked memory       (kbytes, -l) unlimited
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 100000
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 32768
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> vm.swappiness = 0
>
>
> Cheers
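A note on the connection-pool hypothesis in the reply at the top of this message: one known way for an Akka HTTP client pool to stop releasing connections is when the entity of an upstream response is never consumed, because the unread stream back-pressures the connection it arrived on. Whether that is what happens here is not established by this thread. The sketch below assumes the S3 files are fetched with `Http().singleRequest`, which the reply does not state; the object, method, and system names are hypothetical.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, HttpResponse, StatusCodes, Uri}
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Sink
import scala.concurrent.{ExecutionContext, Future}

object UpstreamRelaySketch {
  // Hypothetical names; not taken from the thread.
  implicit val system: ActorSystem = ActorSystem("upstream-relay-sketch")
  implicit val mat: ActorMaterializer = ActorMaterializer()
  implicit val ec: ExecutionContext = system.dispatcher

  // Every upstream response entity is either streamed on to the caller
  // or explicitly drained, so it is never left unread.
  def relay(upstream: HttpResponse): Future[HttpResponse] =
    if (upstream.status.isSuccess())
      // Success: hand the entity stream through to our own response.
      Future.successful(HttpResponse(entity = upstream.entity))
    else
      // Failure: drain the body before answering with 502.
      upstream.entity.dataBytes
        .runWith(Sink.ignore)
        .map(_ => HttpResponse(StatusCodes.BadGateway))

  // Assumed fetch path: a pooled client request to the (hypothetical) file URI.
  def fetch(uri: Uri): Future[HttpResponse] =
    Http().singleRequest(HttpRequest(uri = uri)).flatMap(relay)
}
```

If a server of this shape still hangs with every upstream entity consumed, the cause is more likely on the server side, as described in the quoted message.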
