Right, you should definitely test and see what the results are. My first 
inclination was also to increase max_concurrent_requests, but because all 
those concurrent requests then had higher latency, the actual QPS per 
instance decreased! Lowering max_concurrent_requests reduced request 
latency, so each instance could process more requests per second.
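To make the effect concrete, here is a rough Little's Law sketch of the trade-off. The latency numbers are hypothetical illustrations, not measurements from this thread; the point is only that throughput is concurrency divided by latency, so a lower concurrency setting can win if it cuts latency enough.

```python
# Hypothetical illustration: assume raising concurrency from 2 to 8
# pushes average request latency from 150 ms to 900 ms on one instance.

def qps_per_instance(concurrency, avg_latency_s):
    """Steady-state throughput of one instance (Little's Law: L = lambda * W)."""
    return concurrency / avg_latency_s

low_setting = qps_per_instance(2, 0.150)   # ~13.3 req/s
high_setting = qps_per_instance(8, 0.900)  # ~8.9 req/s
print(low_setting > high_setting)          # lower concurrency wins here
```

Whether the lower setting actually wins depends on how sharply your latency grows with concurrency, which is why measuring your own application is the only reliable answer.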

We use F1 instances because we do not need the additional memory, and our 
requests mostly perform RPCs. In our testing, faster instance classes do 
process requests faster, but they also cost significantly more. F1s provide 
the best performance/cost ratio for us. This could be a Python thing; I'm 
not sure. Again, you should really test and figure out what is best for 
your application and runtime.
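For reference, the two settings discussed in this thread both live in app.yaml on App Engine standard. The values below are a sketch of the configuration described above, not a recommendation; tune them against your own traffic:

```yaml
# Sketch of the settings discussed in this thread; values are examples.
instance_class: F1
automatic_scaling:
  max_concurrent_requests: 2
```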


On Thursday, June 30, 2016 at 10:09:56 AM UTC+2, Trevor wrote:
>
> Thanks for the advice! I hadn't considered setting it that low because we 
> are on F4 instances and I would really, *really* hope that they could 
> handle at least the default 8 concurrent requests. That said, I will 
> apply those settings to the front-end this evening and monitor until noon 
> tomorrow. My only worry is the number of instances increasing rapidly and 
> blowing out our monthly costs. We serve 10 page views per second during 
> low-traffic periods and up to 30-35 per second during peak hours. That 
> means that to maintain our current costs (3-5 billed F4 instances) we 
> would need a sub-300ms request-completion time during peak, if my muddled 
> math is correct. I suppose that with fewer requests going to each 
> instance, we could drop down to a lower class, perhaps?
>
>
> <https://lh3.googleusercontent.com/-WzVtyzcqOVE/V3TTyx9sLrI/AAAAAAAAQV4/Eu-qJGXoJTYlvchNEvOMGdBonNAAjv6SACLcB/s1600/Screen%2BShot%2B2016-06-30%2Bat%2B17.04.26.png>
>
>
> What instance class do you run, if I may ask?
>
> On Thursday, June 30, 2016 at 4:53:53 PM UTC+9, troberti wrote:
>>
>> I recommend trying to set max_concurrent_requests to 2. As you said, 
>> higher values only make latency (much) worse. In our experience, a value 
>> of 2 gets the best latency results without an increase in cost.
>>
>> We still have some of those pauses mid-request in a trace, and I really 
>> would like to know where they come from (and get rid of them), but they 
>> seem much shorter with a lower max_concurrent_requests value.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.