Hello,

According to the official App Engine standard documentation [1], you can't configure max-concurrent-requests for manual scaling. Manual scaling supports only the B1, B2, B4, B4_1G, and B8 instance classes, whereas max-concurrent-requests <https://cloud.google.com/appengine/docs/standard/java/config/appref#automatic_scaling_max_concurrent_requests> requires instance class F2 or higher, and the F2 or higher instance classes [2] don't support manual scaling.
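As a rough illustration of the automatic-scaling alternative, here is a minimal sketch of an appengine-web.xml that pins the service to a single F2 instance. It reuses only elements that already appear in your configuration; the value 15 for max-concurrent-requests is an assumed starting point for testing, not a recommended final value, and the <service> element is left out because its value was elided in your post.

```xml
<?xml version="1.0" encoding="utf-8"?>
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <runtime>java8</runtime>
  <threadsafe>true</threadsafe>

  <!-- F-class instance: used with automatic scaling, not manual scaling -->
  <instance-class>F2</instance-class>

  <automatic-scaling>
    <!-- min = max = 1 keeps exactly one instance, similar to manual scaling -->
    <min-instances>1</min-instances>
    <max-instances>1</max-instances>
    <!-- Assumed starting value; raise it step by step while watching latency -->
    <max-concurrent-requests>15</max-concurrent-requests>
  </automatic-scaling>
</appengine-web-app>
```

Treat this as a starting configuration to measure against, and change one value at a time so that any change in latency can be attributed to it.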
You also mentioned that you tried automatic scaling with max-concurrent-requests and saw 14 seconds of latency between the first request entering the queue and the application writing its first log entry. This is expected behavior: the App Engine documentation [3] notes that "You might experience increased API latency if the max-concurrent-request setting is too high", and in your case it is set to the maximum value (80). The max-concurrent-requests setting does not guarantee how fast requests are processed; rather, it sets how many concurrent requests an instance accepts before the scheduler spawns a new instance.

To find the optimal value, you will need to monitor your application's performance under different max-concurrent-requests values. Start with a value of 15, then increase it gradually while monitoring the application until you find the best fit for you.

Please let me know if you have more questions regarding this subject and I will be happy to assist you further.

=======================
[1] https://cloud.google.com/appengine/docs/standard/java/config/appref#instance_class
[2] https://cloud.google.com/appengine/docs/standard#instance_classes
[3] https://cloud.google.com/appengine/docs/standard#instance_classes

On Tuesday, November 3, 2020 at 9:47:43 AM UTC+1 [email protected] wrote:

> How to change AppEngine standard max concurrent request for manual scaling?
>
> Current appengine-web.xml:
>
> <?xml version="1.0" encoding="utf-8"?>
> <appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
>   <service>...</service>
>   <runtime>java8</runtime>
>   <threadsafe>true</threadsafe>
>   <manual-scaling>
>     <instances>1</instances>
>   </manual-scaling>
>   <instance-class>B2</instance-class>
>
>   <precompilation-enabled>true</precompilation-enabled>
>   <sessions-enabled>false</sessions-enabled>
>   <warmup-requests-enabled>false</warmup-requests-enabled>
> </appengine-web-app>
>
> What we have noticed is that a manual-scaling instance handles at most 10
> concurrent requests and queues any requests beyond that.
>
> In the App Engine logs, the request is recorded when it comes in, but it
> enters the application code only when a request handler becomes available.
>
> For a request that is not queued, the timestamps of the request entering
> the system and of the first application code point are very close to each
> other:
>
> 2020-11-03 09:34:36.247 EET GET 204 69 B 3.1 s Chrome 86 /valueflow/ping
> 2020-11-03 09:34:36.356 EET com.koivusolutions.shared.commons.logging.JavaLogger log: First filter:72 (JavaLogger.java:17)
>
> For a request that is queued because of too much concurrency, the
> timestamps of the request entering the system and of the first application
> code point are far apart:
>
> 2020-11-03 09:34:36.247 EET GET 204 69 B 6 s Chrome 86 /valueflow/ping
> 2020-11-03 09:34:39.277 EET com.koivusolutions.shared.commons.logging.JavaLogger log: First filter:75 (JavaLogger.java:17)
>
> There is a max-concurrent-requests setting, but it is only for
> automatic-scaling. How to do the same for manual-scaling?
>
> We tried to "emulate" manual-scaling by using automatic-scaling with the
> min and max instances both set to one, like this:
>
> <automatic-scaling>
>   <max-concurrent-requests>80</max-concurrent-requests>
>   <target-throughput-utilization>.95</target-throughput-utilization>
>   <max-instances>1</max-instances>
>   <min-instances>1</min-instances>
> </automatic-scaling>
> <instance-class>F2</instance-class>
>
> It allows more concurrency, but sometimes there is still a very long
> queuing time even when the level of concurrency is nowhere near the max:
>
> 2020-11-03 09:54:58.553 EET GET 204 69 B 17.2 s Chrome 86 /valueflow/ping?sleep=3
> 2020-11-03 09:55:12.742 EET com.koivusolutions.shared.commons.logging.JavaLogger log: First filter:83 (JavaLogger.java:17)
