Re: [akka-user][deprecated] Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2018-10-09 Thread Johannes Rudolph
That the entity directive is part of the picture could be a hint that
streaming requests are indeed the cause of this. In spray, request
streaming was not enabled by default: the engine collected the complete
request into a buffer and dispatched it to the app only after everything
had been received. This has changed in akka-http, where streaming is on by
default whenever the complete request wasn't received from the network in
one go. The streaming case is actually more likely to occur on low-traffic
servers on a real network, where network packets are not aggregated at
lower levels but are processed immediately as they arrive.

The question is still whether the 200ms is really added latency in
akka-http or just an artifact of how request processing time is measured.
There's definitely *some* overhead to processing a request in streaming
fashion, but it's not 200ms. I haven't checked seriously, but it seems that
Kamon might be measuring something other than what you think it is in
akka-http: it appears to start measuring from the moment the request is
dispatched to your app, and at that point the request body might not have
been received in full. That means that whenever the HTTP client is slow to
send a request, for whatever reason, that delay will show up in your
request processing times.
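
For illustration, a minimal sketch of how the old buffering behaviour could
be restored (this assumes a 10.0.x release that ships the `toStrictEntity`
directive; the path, timeout and payload handling are placeholders):

import scala.concurrent.duration._
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route

object StrictEntityExample {
  // Buffering the whole request body before the inner route runs roughly
  // restores spray's collect-then-dispatch behaviour: by the time a tracing
  // block inside the inner route starts, the entity is already in memory, so
  // a slow client no longer shows up as "request processing" time.
  val route: Route =
    path("foos") {
      toStrictEntity(2.seconds) {      // wait for the complete request body first
        entity(as[String]) { body =>   // now a strict, in-memory entity
          complete(s"received ${body.length} characters")
        }
      }
    }
}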

Johannes

On Mon, Oct 8, 2018 at 10:42 PM Gary Malouf  wrote:

> We ultimately decided to roll out despite this glitch.  Not happy about it,
> and hoping whatever is causing this gets resolved in a future release.  My
> hunch is that it's a fixed cost being paid, one that would become
> unnoticeable if thousands more requests/second were sent to the app.
>
>
>
> On Sun, Oct 7, 2018 at 11:18 AM Avshalom Manevich 
> wrote:
>
>> Hi Gary,
>>
>> Did you end up finding a solution to this?
>>
>> We're hitting a similar issue with Akka HTTP (10.0.11) and a low-load
>> server.
>>
>> Average latency is great but 99th percentile is horrible (~200ms).
>>
>> Appreciate your input.
>>
>> Regards,
>> Avshalom
>>
>>
>> I wonder if you could start a timer when you enter the trace block and
>>> then e.g. after 200ms trigger one or multiple stack dumps (using JMX or
>>> just by printing out the result of `Thread.getAllStackTraces`). It's not
>>> super likely that something will turn up but it seems like a simple enough
>>> thing to try.
>>>
>>> Johannes
>>>
>>> On Thursday, November 16, 2017 at 1:28:23 PM UTC+1, Gary Malouf wrote:
>>>
 Hi Johannes,

 Yes; we are seeing 2-3 requests/second (only in production) with the
 latency spikes.  We found no correlation between the gc times and these
 request latencies, nor between the size/type of requests.

 We had to pause the migration effort for 2 weeks because of the time
 being taken, but just jumped back on it the other day.

 Our current strategy is to implement this with the low level api to see
 if we get the same results.

 Gary

 On Nov 16, 2017 6:57 AM,  wrote:

 Hi Gary,

 did you find out what's going on by now? If I understand correctly, you
 get latency spikes as soon as you use the `entity[as[String]]` directive?
 Could you narrow down if there's anything special to those requests? I
 guess you monitor your GC times?

 Johannes


 On Wednesday, November 1, 2017 at 8:56:50 PM UTC+1, Gary Malouf wrote:

> So the only way I was able to successfully identify the suspicious
> code was to route a percentage of my production traffic to a stubbed route
> that I incrementally added back pieces of our implementation into.  What I
> found was that we started getting spikes when the entity(as[Case
> ClassFromJson]) stubbed was added back in.  To figure out if it was
> the json parsing or 'POST' entity consumption itself, I replaced that 
> class
> with a string - turns out we experience the latency spikes with that as
> well (on low traffic as noted earlier in this thread).
>
> I by no means have a deep understanding of streams, but it makes me
> wonder if the way I have our code consuming the entity is not correct.
>
> On Monday, October 30, 2017 at 4:27:13 PM UTC-4, Gary Malouf wrote:
>
>> Hi Roland - thank you for the tip.  We shrunk the thread pool size
>> down to 1, but were disheartened to still see the latency spikes.  Using
>> Kamon's tracing library (which we validated with various tests to ensure
>> it's own numbers are most likely correct), we could not find anything in
>> our code within the route that was causing the latency (it all appeared 
>> to
>> be classified to be that route but no code segments within it).
>>
>> As mentioned earlier, running loads of 100-1000 requests/second
>> completely hides the issue (save for the max latency) as everything 
>> through
>> 99th percentiles is under a few milliseconds.
>>
>> On Tuesday, October 

Re: [akka-user][deprecated] Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2018-10-08 Thread Gary Malouf
We ultimately decided to roll out despite this glitch.  Not happy about it,
and hoping whatever is causing this gets resolved in a future release.  My
hunch is that it's a fixed cost being paid, one that would become
unnoticeable if thousands more requests/second were sent to the app.



On Sun, Oct 7, 2018 at 11:18 AM Avshalom Manevich 
wrote:

> Hi Gary,
>
> Did you end up finding a solution to this?
>
> We're hitting a similar issue with Akka HTTP (10.0.11) and a low-load
> server.
>
> Average latency is great but 99th percentile is horrible (~200ms).
>
> Appreciate your input.
>
> Regards,
> Avshalom
>
>
> I wonder if you could start a timer when you enter the trace block and
>> then e.g. after 200ms trigger one or multiple stack dumps (using JMX or
>> just by printing out the result of `Thread.getAllStackTraces`). It's not
>> super likely that something will turn up but it seems like a simple enough
>> thing to try.
>>
>> Johannes
>>
>> On Thursday, November 16, 2017 at 1:28:23 PM UTC+1, Gary Malouf wrote:
>>
>>> Hi Johannes,
>>>
>>> Yes; we are seeing 2-3 requests/second (only in production) with the
>>> latency spikes.  We found no correlation between the gc times and these
>>> request latencies, nor between the size/type of requests.
>>>
>>> We had to pause the migration effort for 2 weeks because of the time
>>> being taken, but just jumped back on it the other day.
>>>
>>> Our current strategy is to implement this with the low level api to see
>>> if we get the same results.
>>>
>>> Gary
>>>
>>> On Nov 16, 2017 6:57 AM,  wrote:
>>>
>>> Hi Gary,
>>>
>>> did you find out what's going on by now? If I understand correctly, you
>>> get latency spikes as soon as you use the `entity[as[String]]` directive?
>>> Could you narrow down if there's anything special to those requests? I
>>> guess you monitor your GC times?
>>>
>>> Johannes
>>>
>>>
>>> On Wednesday, November 1, 2017 at 8:56:50 PM UTC+1, Gary Malouf wrote:
>>>
 So the only way I was able to successfully identify the suspicious code
 was to route a percentage of my production traffic to a stubbed route that
 I incrementally added back pieces of our implementation into.  What I found
 was that we started getting spikes when the entity(as[CaseClassFromJson
 ]) stubbed was added back in.  To figure out if it was the json
 parsing or 'POST' entity consumption itself, I replaced that class with a
 string - turns out we experience the latency spikes with that as well (on
 low traffic as noted earlier in this thread).

 I by no means have a deep understanding of streams, but it makes me
 wonder if the way I have our code consuming the entity is not correct.

 On Monday, October 30, 2017 at 4:27:13 PM UTC-4, Gary Malouf wrote:

> Hi Roland - thank you for the tip.  We shrunk the thread pool size
> down to 1, but were disheartened to still see the latency spikes.  Using
> Kamon's tracing library (which we validated with various tests to ensure
> it's own numbers are most likely correct), we could not find anything in
> our code within the route that was causing the latency (it all appeared to
> be classified to be that route but no code segments within it).
>
> As mentioned earlier, running loads of 100-1000 requests/second
> completely hides the issue (save for the max latency) as everything 
> through
> 99th percentiles is under a few milliseconds.
>
> On Tuesday, October 24, 2017 at 2:23:07 AM UTC-4, rkuhn wrote:
>
>> You could try to decrease your thread pool size to 1 to exclude
>> wakeup latencies when things (like CPU cores) have gone to sleep.
>>
>> Regards, Roland
>>
>> Sent from my iPhone
>>
>> On 23. Oct 2017, at 22:49, Gary Malouf  wrote:
>>
>> Yes, it gets parsed using entity(as[]) with spray-json support.
>> Under a load test of say 1000 requests/second these latencies are not
>> visible in the percentiles - they are easy to see because this web server
>> is getting 10-20 requests/second currently.  Trying to brainstorm if a
>> dispatcher needed to be tuned or something of that sort but have yet to 
>> see
>> evidence supporting that.
>>
>> path("foos") {
>> traceName("FooSelection") {
>>
>> entity(as[ExternalPageRequest]) { pr =>
>> val spr = toSelectionPageRequest(pr)
>> shouldTracePageId(spr.pageId).fold(
>> Tracer.currentContext.withNewSegment(s"Page-${pr.pageId}", "PageTrace
>> ", "kamon") {
>> processPageRequestAndComplete(pr, spr)
>> },
>> processPageRequestAndComplete(pr, spr)
>> )
>> }
>> }
>>
>>
>> }
>>
>> On Mon, Oct 23, 2017 at 4:42 PM, Viktor Klang 
>> wrote:
>>
>>> And you consume the entityBytes I presume?
>>>
>>> On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf 
>>> wrote:
>>>
 It is from when I start the Kamon trace (just inside of my

[akka-user][deprecated] Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2018-10-07 Thread Avshalom Manevich
Hi Gary,

Did you end up finding a solution to this?

We're hitting a similar issue with Akka HTTP (10.0.11) and a low-load 
server.

Average latency is great but 99th percentile is horrible (~200ms).

Appreciate your input.

Regards,
Avshalom 


I wonder if you could start a timer when you enter the trace block and then 
> e.g. after 200ms trigger one or multiple stack dumps (using JMX or just by 
> printing out the result of `Thread.getAllStackTraces`). It's not super 
> likely that something will turn up but it seems like a simple enough thing 
> to try.
>
> Johannes
>
> On Thursday, November 16, 2017 at 1:28:23 PM UTC+1, Gary Malouf wrote:
>
>> Hi Johannes,
>>
>> Yes; we are seeing 2-3 requests/second (only in production) with the 
>> latency spikes.  We found no correlation between the gc times and these 
>> request latencies, nor between the size/type of requests.
>>
>> We had to pause the migration effort for 2 weeks because of the time 
>> being taken, but just jumped back on it the other day.  
>>
>> Our current strategy is to implement this with the low level api to see 
>> if we get the same results.
>>
>> Gary
>>
>> On Nov 16, 2017 6:57 AM,  wrote:
>>
>> Hi Gary,
>>
>> did you find out what's going on by now? If I understand correctly, you 
>> get latency spikes as soon as you use the `entity[as[String]]` directive? 
>> Could you narrow down if there's anything special to those requests? I 
>> guess you monitor your GC times?
>>
>> Johannes
>>
>>
>> On Wednesday, November 1, 2017 at 8:56:50 PM UTC+1, Gary Malouf wrote:
>>
>>> So the only way I was able to successfully identify the suspicious code 
>>> was to route a percentage of my production traffic to a stubbed route that 
>>> I incrementally added back pieces of our implementation into.  What I found 
>>> was that we started getting spikes when the entity(as[CaseClassFromJson
>>> ]) stubbed was added back in.  To figure out if it was the json parsing 
>>> or 'POST' entity consumption itself, I replaced that class with a string - 
>>> turns out we experience the latency spikes with that as well (on low 
>>> traffic as noted earlier in this thread).  
>>>
>>> I by no means have a deep understanding of streams, but it makes me 
>>> wonder if the way I have our code consuming the entity is not correct.
>>>
>>> On Monday, October 30, 2017 at 4:27:13 PM UTC-4, Gary Malouf wrote:
>>>
 Hi Roland - thank you for the tip.  We shrunk the thread pool size down 
 to 1, but were disheartened to still see the latency spikes.  Using 
 Kamon's 
 tracing library (which we validated with various tests to ensure it's own 
 numbers are most likely correct), we could not find anything in our code 
 within the route that was causing the latency (it all appeared to be 
 classified to be that route but no code segments within it).  

 As mentioned earlier, running loads of 100-1000 requests/second 
 completely hides the issue (save for the max latency) as everything 
 through 
 99th percentiles is under a few milliseconds.

 On Tuesday, October 24, 2017 at 2:23:07 AM UTC-4, rkuhn wrote:

> You could try to decrease your thread pool size to 1 to exclude wakeup 
> latencies when things (like CPU cores) have gone to sleep.
>
> Regards, Roland 
>
> Sent from my iPhone
>
> On 23. Oct 2017, at 22:49, Gary Malouf  wrote:
>
> Yes, it gets parsed using entity(as[]) with spray-json support.  Under 
> a load test of say 1000 requests/second these latencies are not visible 
> in 
> the percentiles - they are easy to see because this web server is getting 
> 10-20 requests/second currently.  Trying to brainstorm if a dispatcher 
> needed to be tuned or something of that sort but have yet to see evidence 
> supporting that.
>
> path("foos") { 
> traceName("FooSelection") {
>
> entity(as[ExternalPageRequest]) { pr => 
> val spr = toSelectionPageRequest(pr) 
> shouldTracePageId(spr.pageId).fold( 
> Tracer.currentContext.withNewSegment(s"Page-${pr.pageId}", "PageTrace", 
> "kamon") { 
> processPageRequestAndComplete(pr, spr) 
> }, 
> processPageRequestAndComplete(pr, spr) 
> ) 
> }
> } 
>
>
> }
>
> On Mon, Oct 23, 2017 at 4:42 PM, Viktor Klang  
> wrote:
>
>> And you consume the entityBytes I presume?
>>
>> On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf  
>> wrote:
>>
>>> It is from when I start the Kamon trace (just inside of my 
>>> path("myawesomepath") declaration until (theoretically) a 'complete' 
>>> call 
>>> is made.  
>>>
>>> path("myawesomepath") {
>>>   traceName("CoolStory") {
>>> ///do some stuff
>>>  complete("This is great")
>>> } }
>>>
>>> For what it's worth, this route is a 'POST' call.
>>>
>>> On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang  
>>> wrote:
>>>
 No, I 

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-11-16 Thread johannes . rudolph
I wonder if you could start a timer when you enter the trace block and then 
e.g. after 200ms trigger one or multiple stack dumps (using JMX or just by 
printing out the result of `Thread.getAllStackTraces`). It's not super 
likely that something will turn up but it seems like a simple enough thing 
to try.
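
A rough sketch of that idea (the 200 ms delay, the single-thread scheduler
and the requestId parameter are arbitrary choices for illustration; only
plain JDK classes are used):

import java.util.concurrent.{Executors, ScheduledFuture, TimeUnit}
import scala.collection.JavaConverters._

object SlowRequestStackDump {
  private val scheduler = Executors.newSingleThreadScheduledExecutor()

  // Arm this at the start of the trace block and cancel the returned handle
  // once the request completes; if the request is still in flight after
  // 200 ms, all thread stacks are dumped so the state of the JVM at that
  // moment can be inspected.
  def arm(requestId: String): ScheduledFuture[_] =
    scheduler.schedule(new Runnable {
      def run(): Unit = {
        val dump = Thread.getAllStackTraces.asScala
          .map { case (thread, frames) =>
            s"--- ${thread.getName} (${thread.getState})\n${frames.mkString("\n")}"
          }
          .mkString("\n")
        println(s"[slow request $requestId]\n$dump")
      }
    }, 200, TimeUnit.MILLISECONDS)
}

Cancelling the handle with cancel(false) in the completion path keeps the
dumps limited to the genuinely slow requests.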

Johannes

On Thursday, November 16, 2017 at 1:28:23 PM UTC+1, Gary Malouf wrote:
>
> Hi Johannes,
>
> Yes; we are seeing 2-3 requests/second (only in production) with the 
> latency spikes.  We found no correlation between the gc times and these 
> request latencies, nor between the size/type of requests.
>
> We had to pause the migration effort for 2 weeks because of the time being 
> taken, but just jumped back on it the other day.  
>
> Our current strategy is to implement this with the low level api to see if 
> we get the same results.
>
> Gary
>
> On Nov 16, 2017 6:57 AM,  wrote:
>
> Hi Gary,
>
> did you find out what's going on by now? If I understand correctly, you 
> get latency spikes as soon as you use the `entity[as[String]]` directive? 
> Could you narrow down if there's anything special to those requests? I 
> guess you monitor your GC times?
>
> Johannes
>
>
> On Wednesday, November 1, 2017 at 8:56:50 PM UTC+1, Gary Malouf wrote:
>>
>> So the only way I was able to successfully identify the suspicious code 
>> was to route a percentage of my production traffic to a stubbed route that 
>> I incrementally added back pieces of our implementation into.  What I found 
>> was that we started getting spikes when the entity(as[CaseClassFromJson
>> ]) stubbed was added back in.  To figure out if it was the json parsing 
>> or 'POST' entity consumption itself, I replaced that class with a string - 
>> turns out we experience the latency spikes with that as well (on low 
>> traffic as noted earlier in this thread).  
>>
>> I by no means have a deep understanding of streams, but it makes me 
>> wonder if the way I have our code consuming the entity is not correct.
>>
>> On Monday, October 30, 2017 at 4:27:13 PM UTC-4, Gary Malouf wrote:
>>>
>>> Hi Roland - thank you for the tip.  We shrunk the thread pool size down 
>>> to 1, but were disheartened to still see the latency spikes.  Using Kamon's 
>>> tracing library (which we validated with various tests to ensure it's own 
>>> numbers are most likely correct), we could not find anything in our code 
>>> within the route that was causing the latency (it all appeared to be 
>>> classified to be that route but no code segments within it).  
>>>
>>> As mentioned earlier, running loads of 100-1000 requests/second 
>>> completely hides the issue (save for the max latency) as everything through 
>>> 99th percentiles is under a few milliseconds.
>>>
>>> On Tuesday, October 24, 2017 at 2:23:07 AM UTC-4, rkuhn wrote:

 You could try to decrease your thread pool size to 1 to exclude wakeup 
 latencies when things (like CPU cores) have gone to sleep.

 Regards, Roland 

 Sent from my iPhone

 On 23. Oct 2017, at 22:49, Gary Malouf  wrote:

 Yes, it gets parsed using entity(as[]) with spray-json support.  Under 
 a load test of say 1000 requests/second these latencies are not visible in 
 the percentiles - they are easy to see because this web server is getting 
 10-20 requests/second currently.  Trying to brainstorm if a dispatcher 
 needed to be tuned or something of that sort but have yet to see evidence 
 supporting that.

 path("foos") { 
 traceName("FooSelection") {
 entity(as[ExternalPageRequest]) { pr => 
 val spr = toSelectionPageRequest(pr) 
 shouldTracePageId(spr.pageId).fold( 
 Tracer.currentContext.withNewSegment(s"Page-${pr.pageId}", "PageTrace", 
 "kamon") { 
 processPageRequestAndComplete(pr, spr) 
 }, 
 processPageRequestAndComplete(pr, spr) 
 ) 
 }
 } 

 }

 On Mon, Oct 23, 2017 at 4:42 PM, Viktor Klang  
 wrote:

> And you consume the entityBytes I presume?
>
> On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf  
> wrote:
>
>> It is from when I start the Kamon trace (just inside of my 
>> path("myawesomepath") declaration until (theoretically) a 'complete' 
>> call 
>> is made.  
>>
>> path("myawesomepath") {
>>   traceName("CoolStory") {
>> ///do some stuff
>>  complete("This is great")
>> } }
>>
>> For what it's worth, this route is a 'POST' call.
>>
>> On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang  
>> wrote:
>>
>>> No, I mean, is it from first-byte-received to last-byte-sent or what?
>>>
>>> On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf  
>>> wrote:
>>>
 We are using percentiles computed via Kamon 0.6.8.  In a very low 
 request rate environment like this, 

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-11-16 Thread Gary Malouf
Hi Johannes,

Yes; we are seeing 2-3 requests/second (only in production) with the
latency spikes.  We found no correlation between GC times and these
request latencies, nor any correlation with the size or type of the requests.

We had to pause the migration effort for 2 weeks because of the time it was
taking, but just jumped back on it the other day.

Our current strategy is to implement this with the low-level API to see if
we get the same results.
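
(For reference, a minimal sketch of what that low-level variant might look
like against the 10.0.x API; the path and responses are placeholders:)

import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model._
import akka.stream.ActorMaterializer

object LowLevelProbe extends App {
  implicit val system = ActorSystem("low-level-probe")
  implicit val materializer = ActorMaterializer()

  // A synchronous handler: no routing DSL and no directives, just a function
  // from HttpRequest to HttpResponse, so any remaining tail latency would
  // have to come from the server infrastructure rather than the route layer.
  val handler: HttpRequest => HttpResponse = {
    case req @ HttpRequest(HttpMethods.POST, Uri.Path("/foos"), _, _, _) =>
      req.discardEntityBytes() // drain the streamed body
      HttpResponse(entity = "ok")
    case req =>
      req.discardEntityBytes()
      HttpResponse(StatusCodes.NotFound)
  }

  Http().bindAndHandleSync(handler, interface = "0.0.0.0", port = 8080)
}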

Gary

On Nov 16, 2017 6:57 AM,  wrote:

Hi Gary,

did you find out what's going on by now? If I understand correctly, you get
latency spikes as soon as you use the `entity[as[String]]` directive? Could
you narrow down if there's anything special to those requests? I guess you
monitor your GC times?

Johannes


On Wednesday, November 1, 2017 at 8:56:50 PM UTC+1, Gary Malouf wrote:
>
> So the only way I was able to successfully identify the suspicious code
> was to route a percentage of my production traffic to a stubbed route that
> I incrementally added back pieces of our implementation into.  What I found
> was that we started getting spikes when the entity(as[CaseClassFromJson]) 
> stubbed
> was added back in.  To figure out if it was the json parsing or 'POST'
> entity consumption itself, I replaced that class with a string - turns out
> we experience the latency spikes with that as well (on low traffic as noted
> earlier in this thread).
>
> I by no means have a deep understanding of streams, but it makes me wonder
> if the way I have our code consuming the entity is not correct.
>
> On Monday, October 30, 2017 at 4:27:13 PM UTC-4, Gary Malouf wrote:
>>
>> Hi Roland - thank you for the tip.  We shrunk the thread pool size down
>> to 1, but were disheartened to still see the latency spikes.  Using Kamon's
>> tracing library (which we validated with various tests to ensure it's own
>> numbers are most likely correct), we could not find anything in our code
>> within the route that was causing the latency (it all appeared to be
>> classified to be that route but no code segments within it).
>>
>> As mentioned earlier, running loads of 100-1000 requests/second
>> completely hides the issue (save for the max latency) as everything through
>> 99th percentiles is under a few milliseconds.
>>
>> On Tuesday, October 24, 2017 at 2:23:07 AM UTC-4, rkuhn wrote:
>>>
>>> You could try to decrease your thread pool size to 1 to exclude wakeup
>>> latencies when things (like CPU cores) have gone to sleep.
>>>
>>> Regards, Roland
>>>
>>> Sent from my iPhone
>>>
>>> On 23. Oct 2017, at 22:49, Gary Malouf  wrote:
>>>
>>> Yes, it gets parsed using entity(as[]) with spray-json support.  Under a
>>> load test of say 1000 requests/second these latencies are not visible in
>>> the percentiles - they are easy to see because this web server is getting
>>> 10-20 requests/second currently.  Trying to brainstorm if a dispatcher
>>> needed to be tuned or something of that sort but have yet to see evidence
>>> supporting that.
>>>
>>> path("foos") {
>>> traceName("FooSelection") {
>>> entity(as[ExternalPageRequest]) { pr =>
>>> val spr = toSelectionPageRequest(pr)
>>> shouldTracePageId(spr.pageId).fold(
>>> Tracer.currentContext.withNewSegment(s"Page-${pr.pageId}", "PageTrace",
>>> "kamon") {
>>> processPageRequestAndComplete(pr, spr)
>>> },
>>> processPageRequestAndComplete(pr, spr)
>>> )
>>> }
>>> }
>>>
>>> }
>>>
>>> On Mon, Oct 23, 2017 at 4:42 PM, Viktor Klang 
>>> wrote:
>>>
 And you consume the entityBytes I presume?

 On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf 
 wrote:

> It is from when I start the Kamon trace (just inside of my
> path("myawesomepath") declaration until (theoretically) a 'complete' call
> is made.
>
> path("myawesomepath") {
>   traceName("CoolStory") {
> ///do some stuff
>  complete("This is great")
> } }
>
> For what it's worth, this route is a 'POST' call.
>
> On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang 
> wrote:
>
>> No, I mean, is it from first-byte-received to last-byte-sent or what?
>>
>> On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf 
>> wrote:
>>
>>> We are using percentiles computed via Kamon 0.6.8.  In a very low
>>> request rate environment like this, it takes roughly 1 super slow
>>> request/second to throw off the percentiles (which is what I think is
>>> happening).
>>>
>>>
>>>
>>> On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang 
>>> wrote:
>>>
 What definition of latency are you using? (i.e. how is it derived)

 On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf 
 wrote:

> Hi Konrad,
>
> Our real issue is that we can not reproduce the results.  The web
> server we are having latency 

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-11-16 Thread johannes . rudolph
Hi Gary,

did you find out what's going on by now? If I understand correctly, you get 
latency spikes as soon as you use the `entity[as[String]]` directive? Could 
you narrow down if there's anything special to those requests? I guess you 
monitor your GC times?

Johannes

On Wednesday, November 1, 2017 at 8:56:50 PM UTC+1, Gary Malouf wrote:
>
> So the only way I was able to successfully identify the suspicious code 
> was to route a percentage of my production traffic to a stubbed route that 
> I incrementally added back pieces of our implementation into.  What I found 
> was that we started getting spikes when the entity(as[CaseClassFromJson]) 
> stubbed 
> was added back in.  To figure out if it was the json parsing or 'POST' 
> entity consumption itself, I replaced that class with a string - turns out 
> we experience the latency spikes with that as well (on low traffic as noted 
> earlier in this thread).  
>
> I by no means have a deep understanding of streams, but it makes me wonder 
> if the way I have our code consuming the entity is not correct.
>
> On Monday, October 30, 2017 at 4:27:13 PM UTC-4, Gary Malouf wrote:
>>
>> Hi Roland - thank you for the tip.  We shrunk the thread pool size down 
>> to 1, but were disheartened to still see the latency spikes.  Using Kamon's 
>> tracing library (which we validated with various tests to ensure it's own 
>> numbers are most likely correct), we could not find anything in our code 
>> within the route that was causing the latency (it all appeared to be 
>> classified to be that route but no code segments within it).  
>>
>> As mentioned earlier, running loads of 100-1000 requests/second 
>> completely hides the issue (save for the max latency) as everything through 
>> 99th percentiles is under a few milliseconds.
>>
>> On Tuesday, October 24, 2017 at 2:23:07 AM UTC-4, rkuhn wrote:
>>>
>>> You could try to decrease your thread pool size to 1 to exclude wakeup 
>>> latencies when things (like CPU cores) have gone to sleep.
>>>
>>> Regards, Roland 
>>>
>>> Sent from my iPhone
>>>
>>> On 23. Oct 2017, at 22:49, Gary Malouf  wrote:
>>>
>>> Yes, it gets parsed using entity(as[]) with spray-json support.  Under a 
>>> load test of say 1000 requests/second these latencies are not visible in 
>>> the percentiles - they are easy to see because this web server is getting 
>>> 10-20 requests/second currently.  Trying to brainstorm if a dispatcher 
>>> needed to be tuned or something of that sort but have yet to see evidence 
>>> supporting that.
>>>
>>> path("foos") { 
>>> traceName("FooSelection") {
>>> entity(as[ExternalPageRequest]) { pr => 
>>> val spr = toSelectionPageRequest(pr) 
>>> shouldTracePageId(spr.pageId).fold( 
>>> Tracer.currentContext.withNewSegment(s"Page-${pr.pageId}", "PageTrace", 
>>> "kamon") { 
>>> processPageRequestAndComplete(pr, spr) 
>>> }, 
>>> processPageRequestAndComplete(pr, spr) 
>>> ) 
>>> }
>>> } 
>>>
>>> }
>>>
>>> On Mon, Oct 23, 2017 at 4:42 PM, Viktor Klang  
>>> wrote:
>>>
 And you consume the entityBytes I presume?

 On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf  
 wrote:

> It is from when I start the Kamon trace (just inside of my 
> path("myawesomepath") declaration until (theoretically) a 'complete' call 
> is made.  
>
> path("myawesomepath") {
>   traceName("CoolStory") {
> ///do some stuff
>  complete("This is great")
> } }
>
> For what it's worth, this route is a 'POST' call.
>
> On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang  
> wrote:
>
>> No, I mean, is it from first-byte-received to last-byte-sent or what?
>>
>> On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf  
>> wrote:
>>
>>> We are using percentiles computed via Kamon 0.6.8.  In a very low 
>>> request rate environment like this, it takes roughly 1 super slow 
>>> request/second to throw off the percentiles (which is what I think is 
>>> happening).  
>>>
>>>
>>>
>>> On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang  
>>> wrote:
>>>
 What definition of latency are you using? (i.e. how is it derived)

 On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf  
 wrote:

> Hi Konrad,
>
> Our real issue is that we can not reproduce the results.  The web 
> server we are having latency issues with is under peak load of 10-15 
> requests/second - obviously not much to deal with.  
>
> When we use load tests (https://github.com/apigee/apib), it's 
> easy for us to throw a few thousand requests/second at it and get 
> latencies 
> in the ~ 3 ms range.  We use kamon to track internal metrics - what 
> we see 
> is that our 95th and 99th percentiles only look bad under the 

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-11-01 Thread Gary Malouf
The only way I was able to identify the suspicious code was to route a
percentage of my production traffic to a stubbed route into which I
incrementally added back pieces of our implementation.  What I found was
that we started getting spikes when the entity(as[CaseClassFromJson]) stub
was added back in.  To figure out whether it was the JSON parsing or the
'POST' entity consumption itself, I replaced that case class with a String -
it turns out we experience the latency spikes with that as well (on low
traffic, as noted earlier in this thread).

I by no means have a deep understanding of streams, but it makes me wonder
whether the way our code consumes the entity is correct.
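
A sketch of that isolation step (paths and responses are invented for
illustration): one stub consumes the body as a String, the other never
touches the entity, so a difference in their tail latencies would point at
entity consumption rather than JSON unmarshalling.

import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route

object StubRoutes {
  // If "/stub-entity" shows the spikes and "/stub-no-entity" does not, the
  // cost sits in reading the streamed request body rather than in the JSON
  // layer (neither stub parses JSON). In a real handler the untouched entity
  // of the second stub should still be drained.
  val routes: Route =
    post {
      path("stub-entity") {
        entity(as[String]) { _ =>   // consumes the streamed body as a String
          complete("ok")
        }
      } ~
      path("stub-no-entity") {
        complete("ok")              // completes without touching the entity stream
      }
    }
}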

On Monday, October 30, 2017 at 4:27:13 PM UTC-4, Gary Malouf wrote:
>
> Hi Roland - thank you for the tip.  We shrunk the thread pool size down to 
> 1, but were disheartened to still see the latency spikes.  Using Kamon's 
> tracing library (which we validated with various tests to ensure it's own 
> numbers are most likely correct), we could not find anything in our code 
> within the route that was causing the latency (it all appeared to be 
> classified to be that route but no code segments within it).  
>
> As mentioned earlier, running loads of 100-1000 requests/second completely 
> hides the issue (save for the max latency) as everything through 99th 
> percentiles is under a few milliseconds.
>
> On Tuesday, October 24, 2017 at 2:23:07 AM UTC-4, rkuhn wrote:
>>
>> You could try to decrease your thread pool size to 1 to exclude wakeup 
>> latencies when things (like CPU cores) have gone to sleep.
>>
>> Regards, Roland 
>>
>> Sent from my iPhone
>>
>> On 23. Oct 2017, at 22:49, Gary Malouf  wrote:
>>
>> Yes, it gets parsed using entity(as[]) with spray-json support.  Under a 
>> load test of say 1000 requests/second these latencies are not visible in 
>> the percentiles - they are easy to see because this web server is getting 
>> 10-20 requests/second currently.  Trying to brainstorm if a dispatcher 
>> needed to be tuned or something of that sort but have yet to see evidence 
>> supporting that.
>>
>> path("foos") { 
>> traceName("FooSelection") {
>> entity(as[ExternalPageRequest]) { pr => 
>> val spr = toSelectionPageRequest(pr) 
>> shouldTracePageId(spr.pageId).fold( 
>> Tracer.currentContext.withNewSegment(s"Page-${pr.pageId}", "PageTrace", "
>> kamon") { 
>> processPageRequestAndComplete(pr, spr) 
>> }, 
>> processPageRequestAndComplete(pr, spr) 
>> ) 
>> }
>> } 
>>
>> }
>>
>> On Mon, Oct 23, 2017 at 4:42 PM, Viktor Klang  
>> wrote:
>>
>>> And you consume the entityBytes I presume?
>>>
>>> On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf  
>>> wrote:
>>>
 It is from when I start the Kamon trace (just inside of my 
 path("myawesomepath") declaration until (theoretically) a 'complete' call 
 is made.  

 path("myawesomepath") {
   traceName("CoolStory") {
 ///do some stuff
  complete("This is great")
 } }

 For what it's worth, this route is a 'POST' call.

 On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang  
 wrote:

> No, I mean, is it from first-byte-received to last-byte-sent or what?
>
> On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf  
> wrote:
>
>> We are using percentiles computed via Kamon 0.6.8.  In a very low 
>> request rate environment like this, it takes roughly 1 super slow 
>> request/second to throw off the percentiles (which is what I think is 
>> happening).  
>>
>>
>>
>> On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang  
>> wrote:
>>
>>> What definition of latency are you using? (i.e. how is it derived)
>>>
>>> On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf  
>>> wrote:
>>>
 Hi Konrad,

 Our real issue is that we can not reproduce the results.  The web 
 server we are having latency issues with is under peak load of 10-15 
 requests/second - obviously not much to deal with.  

 When we use load tests (https://github.com/apigee/apib), it's easy 
 for us to throw a few thousand requests/second at it and get latencies 
 in 
 the ~ 3 ms range.  We use kamon to track internal metrics - what we 
 see is 
 that our 95th and 99th percentiles only look bad under the production 
 traffic but not under load tests.  

 I've since used kamon to print out the actual requests trying to 
 find any pattern in them to hint at what's wrong in my own code, but 
 they 
 seem to be completely random.  What we do know is that downgrading to 
 spray 
 gets us 99.9th percentile latencies under 2ms, so something related to 
 the 
 upgrade is allowing this.


Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-30 Thread Gary Malouf
Hi Roland - thank you for the tip.  We shrank the thread pool size down to
1, but were disheartened to still see the latency spikes.  Using Kamon's
tracing library (which we validated with various tests to ensure its own
numbers are most likely correct), we could not find anything in our code
within the route that was causing the latency (the time was all attributed
to the route itself, but to no code segment within it).

As mentioned earlier, running loads of 100-1000 requests/second completely
hides the issue (save for the max latency), as everything through the 99th
percentile is under a few milliseconds.

On Tuesday, October 24, 2017 at 2:23:07 AM UTC-4, rkuhn wrote:
>
> You could try to decrease your thread pool size to 1 to exclude wakeup 
> latencies when things (like CPU cores) have gone to sleep.
>
> Regards, Roland 
>
> Sent from my iPhone
>
> On 23. Oct 2017, at 22:49, Gary Malouf  
> wrote:
>
> Yes, it gets parsed using entity(as[]) with spray-json support.  Under a 
> load test of say 1000 requests/second these latencies are not visible in 
> the percentiles - they are easy to see because this web server is getting 
> 10-20 requests/second currently.  Trying to brainstorm if a dispatcher 
> needed to be tuned or something of that sort but have yet to see evidence 
> supporting that.
>
> path("foos") { 
> traceName("FooSelection") {
> entity(as[ExternalPageRequest]) { pr => 
> val spr = toSelectionPageRequest(pr) 
> shouldTracePageId(spr.pageId).fold( 
> Tracer.currentContext.withNewSegment(s"Page-${pr.pageId}", "PageTrace", "
> kamon") { 
> processPageRequestAndComplete(pr, spr) 
> }, 
> processPageRequestAndComplete(pr, spr) 
> ) 
> }
> } 
>
> }
>
> On Mon, Oct 23, 2017 at 4:42 PM, Viktor Klang  > wrote:
>
>> And you consume the entityBytes I presume?
>>
>> On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf > > wrote:
>>
>>> It is from when I start the Kamon trace (just inside of my 
>>> path("myawesomepath") declaration until (theoretically) a 'complete' call 
>>> is made.  
>>>
>>> path("myawesomepath") {
>>>   traceName("CoolStory") {
>>> ///do some stuff
>>>  complete("This is great")
>>> } }
>>>
>>> For what it's worth, this route is a 'POST' call.
>>>
>>> On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang >> > wrote:
>>>
 No, I mean, is it from first-byte-received to last-byte-sent or what?

 On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf  wrote:

> We are using percentiles computed via Kamon 0.6.8.  In a very low 
> request rate environment like this, it takes roughly 1 super slow 
> request/second to throw off the percentiles (which is what I think is 
> happening).  
>
>
>
> On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang  > wrote:
>
>> What definition of latency are you using? (i.e. how is it derived)
>>
>> On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf > > wrote:
>>
>>> Hi Konrad,
>>>
>>> Our real issue is that we can not reproduce the results.  The web 
>>> server we are having latency issues with is under peak load of 10-15 
>>> requests/second - obviously not much to deal with.  
>>>
>>> When we use load tests (https://github.com/apigee/apib), it's easy 
>>> for us to throw a few thousand requests/second at it and get latencies 
>>> in 
>>> the ~ 3 ms range.  We use kamon to track internal metrics - what we see 
>>> is 
>>> that our 95th and 99th percentiles only look bad under the production 
>>> traffic but not under load tests.  
>>>
>>> I've since used kamon to print out the actual requests trying to 
>>> find any pattern in them to hint at what's wrong in my own code, but 
>>> they 
>>> seem to be completely random.  What we do know is that downgrading to 
>>> spray 
>>> gets us 99.9th percentile latencies under 2ms, so something related to 
>>> the 
>>> upgrade is allowing this.
>>>
>>> Thanks,
>>>
>>> Gary
>>>
>>> On Tuesday, October 17, 2017 at 12:07:51 PM UTC-4, Konrad Malawski 
>>> wrote:

 Step 1 – don’t panic ;-)
 Step 2 – as I already asked for, please share actual details of the 
 benchmarks. It is not good to discuss benchmarks without any insight 
 into 
 what / how exactly you’re measuring.

 -- 
 Cheers,
 Konrad 'ktoso' Malawski
 Akka @ Lightbend

 On October 12, 2017 at 15:31:19, Gary Malouf (malou...@gmail.com) 
 wrote:

 We have a web service that we just finished migrating from spray 
 1.3 to Akka-Http 10.0.9.  While in most cases it is performing well, 
 we are 
 seeing terrible 99th percentile latencies 

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-24 Thread Roland Kuhn
You could try to decrease your thread pool size to 1 to exclude wakeup 
latencies when things (like CPU cores) have gone to sleep.
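
One possible way to try that (a sketch; it assumes the route runs on akka's
default fork-join dispatcher rather than on a custom one):

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

object SingleThreadedSystem {
  // Pin the default fork-join dispatcher to one thread so that core wake-up
  // latency can be ruled in or out as a cause of the spikes.
  private val config = ConfigFactory.parseString(
    """
      |akka.actor.default-dispatcher.fork-join-executor {
      |  parallelism-min = 1
      |  parallelism-max = 1
      |}
    """.stripMargin)

  val system = ActorSystem("latency-test", config.withFallback(ConfigFactory.load()))
}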

Regards, Roland 

Sent from my iPhone

> On 23. Oct 2017, at 22:49, Gary Malouf  wrote:
> 
> Yes, it gets parsed using entity(as[]) with spray-json support.  Under a load 
> test of say 1000 requests/second these latencies are not visible in the 
> percentiles - they are easy to see because this web server is getting 10-20 
> requests/second currently.  Trying to brainstorm if a dispatcher needed to be 
> tuned or something of that sort but have yet to see evidence supporting that.
> 
> path("foos") {
> traceName("FooSelection") {
> 
> entity(as[ExternalPageRequest]) { pr =>
> val spr = toSelectionPageRequest(pr)
> shouldTracePageId(spr.pageId).fold(
> Tracer.currentContext.withNewSegment(s"Page-${pr.pageId}", "PageTrace", 
> "kamon") {
> processPageRequestAndComplete(pr, spr)
> },
> processPageRequestAndComplete(pr, spr)
> )
> }
> }
> 
> }
> 
>> On Mon, Oct 23, 2017 at 4:42 PM, Viktor Klang  wrote:
>> And you consume the entityBytes I presume?
>> 
>>> On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf  wrote:
>>> It is from when I start the Kamon trace (just inside of my 
>>> path("myawesomepath") declaration until (theoretically) a 'complete' call 
>>> is made.  
>>> 
>>> path("myawesomepath") {
>>>   traceName("CoolStory") {
>>> ///do some stuff
>>>  complete("This is great")
>>> } }
>>> 
>>> For what it's worth, this route is a 'POST' call.
>>> 
 On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang  
 wrote:
 No, I mean, is it from first-byte-received to last-byte-sent or what?
 
> On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf  
> wrote:
> We are using percentiles computed via Kamon 0.6.8.  In a very low request 
> rate environment like this, it takes roughly 1 super slow request/second 
> to throw off the percentiles (which is what I think is happening).  
> 
> 
> 
>> On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang  
>> wrote:
>> What definition of latency are you using? (i.e. how is it derived)
>> 
>>> On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf  
>>> wrote:
>>> Hi Konrad,
>>> 
>>> Our real issue is that we can not reproduce the results.  The web 
>>> server we are having latency issues with is under peak load of 10-15 
>>> requests/second - obviously not much to deal with.  
>>> 
>>> When we use load tests (https://github.com/apigee/apib), it's easy for 
>>> us to throw a few thousand requests/second at it and get latencies in 
>>> the ~ 3 ms range.  We use kamon to track internal metrics - what we see 
>>> is that our 95th and 99th percentiles only look bad under the 
>>> production traffic but not under load tests.  
>>> 
>>> I've since used kamon to print out the actual requests trying to find 
>>> any pattern in them to hint at what's wrong in my own code, but they 
>>> seem to be completely random.  What we do know is that downgrading to 
>>> spray gets us 99.9th percentile latencies under 2ms, so something 
>>> related to the upgrade is allowing this.
>>> 
>>> Thanks,
>>> 
>>> Gary
>>> 
 On Tuesday, October 17, 2017 at 12:07:51 PM UTC-4, Konrad Malawski 
 wrote:
 Step 1 – don’t panic ;-)
 Step 2 – as I already asked for, please share actual details of the 
 benchmarks. It is not good to discuss benchmarks without any insight 
 into what / how exactly you’re measuring.
 
 -- 
 Cheers,
 Konrad 'ktoso' Malawski
 Akka @ Lightbend
 
> On October 12, 2017 at 15:31:19, Gary Malouf (malou...@gmail.com) 
> wrote:
> 
> We have a web service that we just finished migrating from spray 1.3 
> to Akka-Http 10.0.9.  While in most cases it is performing well, we 
> are seeing terrible 99th percentile latencies (300-450ms range) 
> starting from a very low request rate (10/second) on an ec2 m3.large. 
>  
> 
> Our service does not do anything complicated - it does a few Map 
> lookups and returns a response to a request.  In spray, even 99th 
> percentile latencies were on the order of 1-3 ms, so we are 
> definitely concerned.  Connections as with many pixel-type servers 
> are short-lived -> we actually pass the Connection: Close header 
> intentionally in our responses.  
> 
> Is there any obvious tuning that should be done on the server 
> configuration that others have found?

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-23 Thread Gary Malouf
Yes, it gets parsed using entity(as[]) with spray-json support.  Under a
load test of, say, 1000 requests/second these latencies are not visible in
the percentiles - they are easy to see only because this web server is
currently getting 10-20 requests/second.  I've been trying to brainstorm
whether a dispatcher needs to be tuned or something of that sort, but have
yet to see evidence supporting that.

path("foos") {
traceName("FooSelection") {
entity(as[ExternalPageRequest]) { pr =>
val spr = toSelectionPageRequest(pr)
shouldTracePageId(spr.pageId).fold(
Tracer.currentContext.withNewSegment(s"Page-${pr.pageId}", "PageTrace", "
kamon") {
processPageRequestAndComplete(pr, spr)
},
processPageRequestAndComplete(pr, spr)
)
}
}

}
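
(For context, the JSON side of entity(as[ExternalPageRequest]) presumably
boils down to something like the following, with SprayJsonSupport imported
where the route is defined; the field names here are invented:)

import spray.json.DefaultJsonProtocol._
import spray.json.RootJsonFormat

object PageRequestJsonProtocol {
  // entity(as[ExternalPageRequest]) needs a FromEntityUnmarshaller, which
  // akka-http's SprayJsonSupport derives from an implicit RootJsonFormat
  // for the case class.
  final case class ExternalPageRequest(pageId: String, referrer: Option[String])

  implicit val externalPageRequestFormat: RootJsonFormat[ExternalPageRequest] =
    jsonFormat2(ExternalPageRequest)
}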

On Mon, Oct 23, 2017 at 4:42 PM, Viktor Klang 
wrote:

> And you consume the entityBytes I presume?
>
> On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf 
> wrote:
>
>> It is from when I start the Kamon trace (just inside of my
>> path("myawesomepath") declaration until (theoretically) a 'complete' call
>> is made.
>>
>> path("myawesomepath") {
>>   traceName("CoolStory") {
>> ///do some stuff
>>  complete("This is great")
>> } }
>>
>> For what it's worth, this route is a 'POST' call.
>>
>> On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang 
>> wrote:
>>
>>> No, I mean, is it from first-byte-received to last-byte-sent or what?
>>>
>>> On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf 
>>> wrote:
>>>
 We are using percentiles computed via Kamon 0.6.8.  In a very low
 request rate environment like this, it takes roughly 1 super slow
 request/second to throw off the percentiles (which is what I think is
 happening).



 On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang 
 wrote:

> What definition of latency are you using? (i.e. how is it derived)
>
> On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf 
> wrote:
>
>> Hi Konrad,
>>
>> Our real issue is that we can not reproduce the results.  The web
>> server we are having latency issues with is under peak load of 10-15
>> requests/second - obviously not much to deal with.
>>
>> When we use load tests (https://github.com/apigee/apib), it's easy
>> for us to throw a few thousand requests/second at it and get latencies in
>> the ~ 3 ms range.  We use kamon to track internal metrics - what we see 
>> is
>> that our 95th and 99th percentiles only look bad under the production
>> traffic but not under load tests.
>>
>> I've since used kamon to print out the actual requests trying to find
>> any pattern in them to hint at what's wrong in my own code, but they seem
>> to be completely random.  What we do know is that downgrading to spray 
>> gets
>> us 99.9th percentile latencies under 2ms, so something related to the
>> upgrade is allowing this.
>>
>> Thanks,
>>
>> Gary
>>
>> On Tuesday, October 17, 2017 at 12:07:51 PM UTC-4, Konrad Malawski
>> wrote:
>>>
>>> Step 1 – don’t panic ;-)
>>> Step 2 – as I already asked for, please share actual details of the
>>> benchmarks. It is not good to discuss benchmarks without any insight 
>>> into
>>> what / how exactly you’re measuring.
>>>
>>> --
>>> Cheers,
>>> Konrad 'ktoso' Malawski
>>> Akka @ Lightbend
>>>
>>> On October 12, 2017 at 15:31:19, Gary Malouf (malou...@gmail.com)
>>> wrote:
>>>
>>> We have a web service that we just finished migrating from spray 1.3
>>> to Akka-Http 10.0.9.  While in most cases it is performing well, we are
>>> seeing terrible 99th percentile latencies (300-450ms range) starting 
>>> from a
>>> very low request rate (10/second) on an ec2 m3.large.
>>>
>>> Our service does not do anything complicated - it does a few Map
>>> lookups and returns a response to a request.  In spray, even 99th
>>> percentile latencies were on the order of 1-3 ms, so we are definitely
>>> concerned.  Connections as with many pixel-type servers are short-lived 
>>> ->
>>> we actually pass the Connection: Close header intentionally in our
>>> responses.
>>>
>>> Is there any obvious tuning that should be done on the server
>>> configuration that others have found?

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-23 Thread Viktor Klang
And you consume the entityBytes I presume?
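
(If not, a sketch of explicitly draining it; Sink.ignore is just the
simplest possible consumer, and the path is a placeholder:)

import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route
import akka.stream.scaladsl.Sink

object DrainEntityExample {
  // Explicitly consuming the streamed request body; onSuccess keeps the
  // response from being completed before the bytes have actually been read.
  val route: Route =
    post {
      extractRequest { req =>
        extractMaterializer { implicit mat =>
          onSuccess(req.entity.dataBytes.runWith(Sink.ignore)) { _ =>
            complete("drained")
          }
        }
      }
    }
}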

On Mon, Oct 23, 2017 at 10:35 PM, Gary Malouf  wrote:

> It is from when I start the Kamon trace (just inside of my
> path("myawesomepath") declaration until (theoretically) a 'complete' call
> is made.
>
> path("myawesomepath") {
>   traceName("CoolStory") {
> ///do some stuff
>  complete("This is great")
> } }
>
> For what it's worth, this route is a 'POST' call.
>
> On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang 
> wrote:
>
>> No, I mean, is it from first-byte-received to last-byte-sent or what?
>>
>> On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf 
>> wrote:
>>
>>> We are using percentiles computed via Kamon 0.6.8.  In a very low
>>> request rate environment like this, it takes roughly 1 super slow
>>> request/second to throw off the percentiles (which is what I think is
>>> happening).
>>>
>>>
>>>
>>> On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang 
>>> wrote:
>>>
 What definition of latency are you using? (i.e. how is it derived)

 On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf 
 wrote:

> Hi Konrad,
>
> Our real issue is that we can not reproduce the results.  The web
> server we are having latency issues with is under peak load of 10-15
> requests/second - obviously not much to deal with.
>
> When we use load tests (https://github.com/apigee/apib), it's easy
> for us to throw a few thousand requests/second at it and get latencies in
> the ~ 3 ms range.  We use kamon to track internal metrics - what we see is
> that our 95th and 99th percentiles only look bad under the production
> traffic but not under load tests.
>
> I've since used kamon to print out the actual requests trying to find
> any pattern in them to hint at what's wrong in my own code, but they seem
> to be completely random.  What we do know is that downgrading to spray 
> gets
> us 99.9th percentile latencies under 2ms, so something related to the
> upgrade is allowing this.
>
> Thanks,
>
> Gary
>
> On Tuesday, October 17, 2017 at 12:07:51 PM UTC-4, Konrad Malawski
> wrote:
>>
>> Step 1 – don’t panic ;-)
>> Step 2 – as I already asked for, please share actual details of the
>> benchmarks. It is not good to discuss benchmarks without any insight into
>> what / how exactly you’re measuring.
>>
>> --
>> Cheers,
>> Konrad 'ktoso' Malawski
>> Akka @ Lightbend
>>
>> On October 12, 2017 at 15:31:19, Gary Malouf (malou...@gmail.com)
>> wrote:
>>
>> We have a web service that we just finished migrating from spray 1.3
>> to Akka-Http 10.0.9.  While in most cases it is performing well, we are
>> seeing terrible 99th percentile latencies (300-450ms range) starting from 
>> a
>> very low request rate (10/second) on an ec2 m3.large.
>>
>> Our service does not do anything complicated - it does a few Map
>> lookups and returns a response to a request.  In spray, even 99th
>> percentile latencies were on the order of 1-3 ms, so we are definitely
>> concerned.  Connections as with many pixel-type servers are short-lived 
>> ->
>> we actually pass the Connection: Close header intentionally in our
>> responses.
>>
>> Is there any obvious tuning that should be done on the server
>> configuration that others have found?

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-23 Thread Gary Malouf
It is from when I start the Kamon trace (just inside of my
path("myawesomepath") declaration) until (theoretically) a 'complete' call
is made.

path("myawesomepath") {
  traceName("CoolStory") {
///do some stuff
 complete("This is great")
} }

For what it's worth, this route is a 'POST' call.

On Mon, Oct 23, 2017 at 4:30 PM, Viktor Klang 
wrote:

> No, I mean, is it from first-byte-received to last-byte-sent or what?
>
> On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf 
> wrote:
>
>> We are using percentiles computed via Kamon 0.6.8.  In a very low request
>> rate environment like this, it takes roughly 1 super slow request/second to
>> throw off the percentiles (which is what I think is happening).
>>
>>
>>
>> On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang 
>> wrote:
>>
>>> What definition of latency are you using? (i.e. how is it derived)
>>>
>>> On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf 
>>> wrote:
>>>
 Hi Konrad,

 Our real issue is that we can not reproduce the results.  The web
 server we are having latency issues with is under peak load of 10-15
 requests/second - obviously not much to deal with.

 When we use load tests (https://github.com/apigee/apib), it's easy for
 us to throw a few thousand requests/second at it and get latencies in the ~
 3 ms range.  We use kamon to track internal metrics - what we see is that
 our 95th and 99th percentiles only look bad under the production traffic
 but not under load tests.

 I've since used kamon to print out the actual requests trying to find
 any pattern in them to hint at what's wrong in my own code, but they seem
 to be completely random.  What we do know is that downgrading to spray gets
 us 99.9th percentile latencies under 2ms, so something related to the
 upgrade is allowing this.

 Thanks,

 Gary

 On Tuesday, October 17, 2017 at 12:07:51 PM UTC-4, Konrad Malawski
 wrote:
>
> Step 1 – don’t panic ;-)
> Step 2 – as I already asked for, please share actual details of the
> benchmarks. It is not good to discuss benchmarks without any insight into
> what / how exactly you’re measuring.
>
> --
> Cheers,
> Konrad 'ktoso' Malawski
> Akka @ Lightbend
>
> On October 12, 2017 at 15:31:19, Gary Malouf (malou...@gmail.com)
> wrote:
>
> We have a web service that we just finished migrating from spray 1.3
> to Akka-Http 10.0.9.  While in most cases it is performing well, we are
> seeing terrible 99th percentile latencies (300-450ms range) starting from a
> very low request rate (10/second) on an ec2 m3.large.
>
> Our service does not do anything complicated - it does a few Map
> lookups and returns a response to a request.  In spray, even 99th
> percentile latencies were on the order of 1-3 ms, so we are definitely
> concerned.  Connections as with many pixel-type servers are short-lived ->
> we actually pass the Connection: Close header intentionally in our
> responses.
>
> Is there any obvious tuning that should be done on the server
> configuration that others have found?

>>>
>>>
>>>
>>> --
>>> Cheers,
>>> √
>>>

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-23 Thread Viktor Klang
No, I mean, is it from first-byte-received to last-byte-sent or what?

On Mon, Oct 23, 2017 at 10:22 PM, Gary Malouf  wrote:

> We are using percentiles computed via Kamon 0.6.8.  In a very low request
> rate environment like this, it takes roughly 1 super slow request/second to
> throw off the percentiles (which is what I think is happening).
>
>
>
> On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang 
> wrote:
>
>> What definition of latency are you using? (i.e. how is it derived)
>>
>> On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf 
>> wrote:
>>
>>> Hi Konrad,
>>>
>>> Our real issue is that we can not reproduce the results.  The web server
>>> we are having latency issues with is under peak load of 10-15
>>> requests/second - obviously not much to deal with.
>>>
>>> When we use load tests (https://github.com/apigee/apib), it's easy for
>>> us to throw a few thousand requests/second at it and get latencies in the ~
>>> 3 ms range.  We use kamon to track internal metrics - what we see is that
>>> our 95th and 99th percentiles only look bad under the production traffic
>>> but not under load tests.
>>>
>>> I've since used kamon to print out the actual requests trying to find
>>> any pattern in them to hint at what's wrong in my own code, but they seem
>>> to be completely random.  What we do know is that downgrading to spray gets
>>> us 99.9th percentile latencies under 2ms, so something related to the
>>> upgrade is causing this.
>>>
>>> Thanks,
>>>
>>> Gary
>>>
>>> On Tuesday, October 17, 2017 at 12:07:51 PM UTC-4, Konrad Malawski wrote:

 Step 1 – don’t panic ;-)
 Step 2 – as I already asked for, please share actual details of the
 benchmarks. It is not good to discuss benchmarks without any insight into
 what / how exactly you’re measuring.

 --
 Cheers,
 Konrad 'ktoso' Malawski
 Akka @ Lightbend

 On October 12, 2017 at 15:31:19, Gary Malouf (malou...@gmail.com)
 wrote:

 We have a web service that we just finished migrating from spray 1.3 to
 Akka-Http 10.0.9.  While in most cases it is performing well, we are seeing
 terrible 99th percentile latencies (300-450ms range) starting from a very
 low request rate (10/second) on an ec2 m3.large.

 Our service does not do anything complicated - it does a few Map
 lookups and returns a response to a request.  In spray, even 99th
 percentile latencies were on the order of 1-3 ms, so we are definitely
 concerned.  Connections as with many pixel-type servers are short-lived ->
 we actually pass the Connection: Close header intentionally in our
 responses.

 Is there any obvious tuning that should be done on the server
 configuration that others have found?

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-23 Thread Gary Malouf
We are using percentiles computed via Kamon 0.6.8.  In a very low request
rate environment like this, it takes roughly 1 super slow request/second to
throw off the percentiles (which is what I think is happening).
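
To put numbers on that (a standalone sketch, nothing to do with our actual
Kamon setup, and the latency values are made up for illustration): at ~10
requests/second, one ~300ms outlier per second is already about 10% of the
traffic, so the 95th and 99th percentiles jump straight to the outlier value
while the median stays at a couple of milliseconds.

  object PercentileSkew extends App {
    // 60 seconds of simulated traffic at 10 req/s:
    // nine "normal" 2 ms requests and one slow 300 ms request each second.
    val latenciesMs: Vector[Double] =
      Vector.fill(60)(Vector.fill(9)(2.0) :+ 300.0).flatten

    // Nearest-rank percentile over a sorted sample.
    def percentile(sorted: Vector[Double], p: Double): Double =
      sorted((p * (sorted.size - 1)).round.toInt)

    val sorted = latenciesMs.sorted
    println(f"p50 = ${percentile(sorted, 0.50)}%.1f ms") // ~2 ms
    println(f"p95 = ${percentile(sorted, 0.95)}%.1f ms") // 300 ms
    println(f"p99 = ${percentile(sorted, 0.99)}%.1f ms") // 300 ms
  }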



On Mon, Oct 23, 2017 at 4:20 PM, Viktor Klang 
wrote:

> What definition of latency are you using? (i.e. how is it derived)
>
> On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf 
> wrote:
>
>> Hi Konrad,
>>
>> Our real issue is that we can not reproduce the results.  The web server
>> we are having latency issues with is under peak load of 10-15
>> requests/second - obviously not much to deal with.
>>
>> When we use load tests (https://github.com/apigee/apib), it's easy for
>> us to throw a few thousand requests/second at it and get latencies in the ~
>> 3 ms range.  We use kamon to track internal metrics - what we see is that
>> our 95th and 99th percentiles only look bad under the production traffic
>> but not under load tests.
>>
>> I've since used kamon to print out the actual requests trying to find any
>> pattern in them to hint at what's wrong in my own code, but they seem to be
>> completely random.  What we do know is that downgrading to spray gets us
>> 99.9th percentile latencies under 2ms, so something related to the upgrade
>> is causing this.
>>
>> Thanks,
>>
>> Gary
>>
>> On Tuesday, October 17, 2017 at 12:07:51 PM UTC-4, Konrad Malawski wrote:
>>>
>>> Step 1 – don’t panic ;-)
>>> Step 2 – as I already asked for, please share actual details of the
>>> benchmarks. It is not good to discuss benchmarks without any insight into
>>> what / how exactly you’re measuring.
>>>
>>> --
>>> Cheers,
>>> Konrad 'ktoso' Malawski
>>> Akka @ Lightbend
>>>
>>> On October 12, 2017 at 15:31:19, Gary Malouf (malou...@gmail.com) wrote:
>>>
>>> We have a web service that we just finished migrating from spray 1.3 to
>>> Akka-Http 10.0.9.  While in most cases it is performing well, we are seeing
>>> terrible 99th percentile latencies (300-450ms range) starting from a very
>>> low request rate (10/second) on an ec2 m3.large.
>>>
>>> Our service does not do anything complicated - it does a few Map lookups
>>> and returns a response to a request.  In spray, even 99th percentile
>>> latencies were on the order of 1-3 ms, so we are definitely concerned.
>>> Connections as with many pixel-type servers are short-lived -> we actually
>>> pass the Connection: Close header intentionally in our responses.
>>>
>>> Is there any obvious tuning that should be done on the server
>>> configuration that others have found?

Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-23 Thread Viktor Klang
What definition of latency are you using? (i.e. how is it derived)

On Mon, Oct 23, 2017 at 10:11 PM, Gary Malouf  wrote:

> Hi Konrad,
>
> Our real issue is that we can not reproduce the results.  The web server
> we are having latency issues with is under peak load of 10-15
> requests/second - obviously not much to deal with.
>
> When we use load tests (https://github.com/apigee/apib), it's easy for us
> to throw a few thousand requests/second at it and get latencies in the ~ 3
> ms range.  We use kamon to track internal metrics - what we see is that our
> 95th and 99th percentiles only look bad under the production traffic but
> not under load tests.
>
> I've since used kamon to print out the actual requests trying to find any
> pattern in them to hint at what's wrong in my own code, but they seem to be
> completely random.  What we do know is that downgrading to spray gets us
> 99.9th percentile latencies under 2ms, so something related to the upgrade
> is causing this.
>
> Thanks,
>
> Gary
>
> On Tuesday, October 17, 2017 at 12:07:51 PM UTC-4, Konrad Malawski wrote:
>>
>> Step 1 – don’t panic ;-)
>> Step 2 – as I already asked for, please share actual details of the
>> benchmarks. It is not good to discuss benchmarks without any insight into
>> what / how exactly you’re measuring.
>>
>> --
>> Cheers,
>> Konrad 'ktoso' Malawski
>> Akka @ Lightbend
>>
>> On October 12, 2017 at 15:31:19, Gary Malouf (malou...@gmail.com) wrote:
>>
>> We have a web service that we just finished migrating from spray 1.3 to
>> Akka-Http 10.0.9.  While in most cases it is performing well, we are seeing
>> terrible 99th percentile latencies (300-450ms range) starting from a very
>> low request rate (10/second) on an ec2 m3.large.
>>
>> Our service does not do anything complicated - it does a few Map lookups
>> and returns a response to a request.  In spray, even 99th percentile
>> latencies were on the order of 1-3 ms, so we are definitely concerned.
>> Connections as with many pixel-type servers are short-lived -> we actually
>> pass the Connection: Close header intentionally in our responses.
>>
>> Is there any obvious tuning that should be done on the server
>> configuration that others have found?



-- 
Cheers,
√



Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-23 Thread Gary Malouf
Hi Konrad,

Our real issue is that we can not reproduce the results.  The web server we 
are having latency issues with is under peak load of 10-15 requests/second 
- obviously not much to deal with.  

When we use load tests (https://github.com/apigee/apib), it's easy for us 
to throw a few thousand requests/second at it and get latencies in the ~ 3 
ms range.  We use kamon to track internal metrics - what we see is that our 
95th and 99th percentiles only look bad under the production traffic but 
not under load tests.  

I've since used kamon to print out the actual requests trying to find any 
pattern in them to hint at what's wrong in my own code, but they seem to be 
completely random.  What we do know is that downgrading to spray gets us 
99.9th percentile latencies under 2ms, so something related to the upgrade 
is causing this.
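
If it helps anyone reproduce this, below is roughly the kind of wrapper that
could be used to time the route itself, independent of Kamon. It is only a
sketch: the timed directive is hypothetical (not something shipped with
akka-http or Kamon), and it measures from the moment the route sees the
request until the HttpResponse value is built, which is not the same thing as
first-byte-received to last-byte-sent on the wire.

  import akka.http.scaladsl.server.{Directive0, Route}
  import akka.http.scaladsl.server.Directives._

  object TimedRoute {
    // Logs how long server-side handling took, from the moment the route sees
    // the request until the HttpResponse value is produced. A streamed
    // response entity may still be flowing to the client after this point.
    val timed: Directive0 =
      extractLog.flatMap { log =>
        val startNanos = System.nanoTime()
        mapResponse { response =>
          val elapsedMs = (System.nanoTime() - startNanos) / 1e6
          log.info(f"handled in $elapsedMs%.1f ms -> ${response.status}")
          response
        }
      }

    val route: Route =
      timed {
        path("track") {
          get {
            complete("ok")
          }
        }
      }
  }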

Thanks,

Gary

On Tuesday, October 17, 2017 at 12:07:51 PM UTC-4, Konrad Malawski wrote:
>
> Step 1 – don’t panic ;-)
> Step 2 – as I already asked for, please share actual details of the 
> benchmarks. It is not good to discuss benchmarks without any insight into 
> what / how exactly you’re measuring.
>
> -- 
> Cheers,
> Konrad 'ktoso' Malawski
> Akka @ Lightbend
>
> On October 12, 2017 at 15:31:19, Gary Malouf (malou...@gmail.com) wrote:
>
> We have a web service that we just finished migrating from spray 1.3 to 
> Akka-Http 10.0.9.  While in most cases it is performing well, we are seeing 
> terrible 99th percentile latencies (300-450ms range) starting from a very
> low request rate (10/second) on an ec2 m3.large.  
>
> Our service does not do anything complicated - it does a few Map lookups 
> and returns a response to a request.  In spray, even 99th percentile 
> latencies were on the order of 1-3 ms, so we are definitely concerned.  
> Connections as with many pixel-type servers are short-lived -> we actually 
> pass the Connection: Close header intentionally in our responses.  
>
> Is there any obvious tuning that should be done on the server 
> configuration that others have found?



Re: [akka-user] Spray->Akka-Http Migration - seeing high 99th percentile latencies post-migration

2017-10-17 Thread Konrad “ktoso” Malawski
Step 1 – don’t panic ;-)
Step 2 – as I already asked for, please share actual details of the
benchmarks. It is not good to discuss benchmarks without any insight into
what / how exactly you’re measuring.

-- 
Cheers,
Konrad 'ktoso' Malawski
Akka @ Lightbend

On October 12, 2017 at 15:31:19, Gary Malouf (malouf.g...@gmail.com) wrote:

We have a web service that we just finished migrating from spray 1.3 to
Akka-Http 10.0.9.  While in most cases it is performing well, we are seeing
terrible 99th percentile latencies (300-450ms range) starting from a very
low request rate (10/second) on an ec2 m3.large.

Our service does not do anything complicated - it does a few Map lookups
and returns a response to a request.  In spray, even 99th percentile
latencies were on the order of 1-3 ms, so we are definitely concerned.
Connections as with many pixel-type servers are short-lived -> we actually
pass the Connection: Close header intentionally in our responses.
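
For context, the handler is essentially of this shape (a minimal sketch with
hypothetical names such as PixelRoute and campaigns, not the actual code): a
path match, an in-memory Map lookup, and the Connection: close header added
via respondWithHeader.

  import akka.http.scaladsl.model.StatusCodes
  import akka.http.scaladsl.model.headers.Connection
  import akka.http.scaladsl.server.Route
  import akka.http.scaladsl.server.Directives._

  object PixelRoute {
    // Hypothetical stand-in for the real lookup tables.
    private val campaigns: Map[String, String] = Map("abc123" -> "campaign-1")

    val route: Route =
      respondWithHeader(Connection("close")) { // keep connections short-lived
        path("pixel" / Segment) { id =>
          get {
            campaigns.get(id) match {
              case Some(_) => complete(StatusCodes.OK)
              case None    => complete(StatusCodes.NotFound)
            }
          }
        }
      }
  }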

Is there any obvious tuning that should be done on the server configuration
that others have found?
