Hey Folks,

As for the number of concurrent requests an instance can handle: it depends 
on the CPU usage on the instance, while the distribution of requests across 
instances depends on statistical trends in latency. It's therefore possible 
to see variable concurrent-request performance depending on how requests use 
up the resources of the given instance class 
<https://cloud.google.com/appengine/docs/about-the-standard-environment?authuser=0#instance_classes>
and on the latency statistics of requests on an instance.

I have one small recommendation regarding the mysterious gaps of time in 
requests. Surround calls to complex libraries, or any calls that require 
network activity, with System.currentTimeMillis() 
<https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#currentTimeMillis()>
calls (this is for Java, but other runtimes have equivalent system calls), 
and you should be able to determine exactly what is taking up that time. It 
might be something optimizable, or it might be a network issue. Depending on 
the nature of the network call itself, that too could be optimizable.
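A minimal sketch of that timing approach (the method name slowExternalCall 
is a hypothetical stand-in for whatever library or network call you suspect):

```java
// Bracket a suspect call with System.currentTimeMillis() and log the
// elapsed wall-clock time so the gap shows up in your request logs.
public class RequestTimer {

    // Hypothetical stand-in for a complex library call or network round-trip.
    static void slowExternalCall() throws InterruptedException {
        Thread.sleep(100); // simulate ~100 ms of external latency
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        slowExternalCall();
        long elapsedMs = System.currentTimeMillis() - start;
        // On App Engine you would typically write this via the logging API
        // so it appears alongside the request in the logs viewer.
        System.out.println("slowExternalCall took " + elapsedMs + " ms");
    }
}
```

If the logged time accounts for the gap, you've found your culprit; if not, 
move the brackets until it does.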

Regards,

Nick
Cloud Platform Community Support

On Friday, July 1, 2016 at 3:28:51 AM UTC-4, Thomas Taschauer wrote:
>
> One thing I noticed is that the first request(s?) served by a fresh 
> instance will always be really slow. Not that they stay in the request 
> queue for a longer time (which is expected behaviour of course), but they 
> have long "pauses" in the middle of the request as you mentioned before, 
> usually up to 5 seconds in my case.
>
> What I'm going to test next is upgrading to F2 - hoping for smaller pauses 
> due to a faster CPU - and reverting other scaling options to their defaults 
> (I used max_concurrent_requests and max_idle_instances before), hoping the 
> App Engine scaler will figure it out by itself. :)
>
> On Thursday, June 30, 2016 at 1:13:42 PM UTC+2, troberti wrote:
>>
>> Great to hear that it helps. Actually, if you are using F4s, I might try 
>> a slightly higher max_concurrent_requests, say 4. Again, test and compare 
>> to be sure.
>>
>> Finally, to reduce costs, I would recommend setting max_idle_instances to 
>> 1. Keep min_idle_instances at whatever your application needs. For us 
>> this reduces cost significantly without any apparent drawbacks.
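>> A minimal app.yaml sketch of the settings above, assuming the standard 
>> environment (the values are just the ones discussed here; test and tune 
>> for your own app):
>>
>> ```yaml
>> instance_class: F4
>> automatic_scaling:
>>   max_concurrent_requests: 4
>>   min_idle_instances: 1
>>   max_idle_instances: 1
>> ```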
>>
>> On Thursday, June 30, 2016 at 11:44:34 AM UTC+2, Trevor wrote:
>>>
>>> Well, I have to say thank you very, very much. Thanks to your advice we 
>>> have our lowest latency in 3 years! Sub-300ms average. As expected though, 
>>> we are now sitting on 21 billed F4 instances, which will potentially cost 
>>> us on the order of 3x our current bill ($30-40 -> $100+), but we will tweak 
>>> that from tomorrow onwards. Peak hour is about to hit, so we are going to 
>>> see if the system can keep sub-300ms at the current "automatic" setting for 
>>> scaling. But yes, once again, thank you for solving in 5 minutes what I 
>>> have been working on for 2 weeks (my tears are from joy and sadness all at 
>>> once).
>>>
>>>
>>> <https://lh3.googleusercontent.com/-eEUuw3hSLYU/V3Tox-bhe6I/AAAAAAAAQWM/zPzgBRJkRHcoBSPmVrP2xsmN2FDK6Yl_wCLcB/s1600/Screen%2BShot%2B2016-06-30%2Bat%2B18.37.20.png>
>>>
>>>
>>> <https://lh3.googleusercontent.com/-4c7xvBsQ_tk/V3TpBfSBUWI/AAAAAAAAQWU/0tgD4v43X44D5Q-gBULBeQu11KIApRPYQCLcB/s1600/Screen%2BShot%2B2016-06-30%2Bat%2B18.39.51.png>
>>>
>>>
>>> On Thursday, June 30, 2016 at 6:03:23 PM UTC+9, troberti wrote:
>>>>
>>>> Right, you should definitely test and see what the results are. My 
>>>> first inclination was also to increase max_concurrent_requests, but 
>>>> because 
>>>> then all those requests have increased latency, the actual QPS per 
>>>> instance 
>>>> decreased! Lowering max_concurrent_requests decreased request latency, so 
>>>> each instance could process more requests/second.
>>>>
>>>> We use F1 instances, because we do not need the additional memory, and 
>>>> our requests perform mostly RPCs. In our testing, faster instance classes 
>>>> do process requests faster, but also cost significantly more.  F1s provide 
>>>> the best performance/cost ratio for us. This could be a Python thing, not 
>>>> sure. Again, you should really test and figure out what is the best for 
>>>> your application+runtime.
>>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.