As mentioned in the documentation [1], for the 'min_instances' element to function properly, the application must also handle warmup requests, so please verify that your implementation does so. See [2] for guidelines on configuring warmup requests.

Since Google Groups is intended for high-level conceptual discussions, I recommend posting your question on Stack Overflow [3], where technical issues can be troubleshot. If, after troubleshooting, you believe the product is not working as intended, you can report it on Issue Tracker [4].

[1] https://cloud.google.com/appengine/docs/standard/python/config/appref#automatic_scaling_min_instances
[2] https://cloud.google.com/appengine/docs/standard/python/configuring-warmup-requests
[3] https://stackoverflow.com/questions/tagged/google-app-engine
[4] https://issuetracker.google.com/issues/new?component=187191&template=1162848

On Monday, April 6, 2020 at 9:35:46 AM UTC-4, Bala Subramanian Sutherson wrote:
>
> Our project runs on the Google App Engine standard environment with
> autoscaling configured as shown below. Warmup requests are enabled in
> the app and we are using the Google Endpoints service. However, I am
> seeing latency issues in several scenarios.
>
> Environment: *Java 8*, Instance type: *F4_1G*
>
> Configuration for autoscaling:
> *min-instances: 2*
> *max-concurrent-requests: 80*
> *min-pending-latency: 6s*
> *max-pending-latency: 10s*
>
> I tested with JMeter, sending *85 asynchronous requests* with a
> *ramp-up period of 10 seconds*. From the application logs I can see
> that App Engine takes a long time to serve the requests. My questions
> are below.
>
> 1. Most of the requests fail because they exceed the time limit. In
>    image 1, the request takes 88.2 seconds. I know that App Engine
>    autoscaling has a 60-second request timeout. But we have configured
>    autoscaling with a minimum of 2 instances and no restriction on
>    max-instances. Either the existing instances should handle the
>    request or App Engine should scale up to handle it. *Why is this
>    not happening?*
>
> 2. While scaling up, the request takes 43.6 seconds. In image 2, the
>    request arrives at 20:27:01:663 IST and the first line of API
>    execution starts at 20:27:40:407 IST. *What is happening in
>    between? Can I get a log for this period?*
>
> 3. After the scale-up, subsequent requests also take a very long time
>    to serve. An API request usually completes within 2 seconds. In
>    image 3, the API takes 42.4 s without a loading request: the
>    request arrives at 20:27:01:728 IST and the first line of API
>    execution starts at 20:27:40:708 IST. *What is happening in
>    between?*
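
The gaps asked about in questions 2 and 3 can be computed directly from the quoted timestamps. The following is only a minimal sketch: it assumes the timestamps use the `HH:mm:ss:SSS` layout shown in the message and that both fall on the same day. The class name `PendingGap` and the method `gapMillis` are hypothetical helpers, not part of any App Engine API.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

class PendingGap {
    // Assumption: timestamps are written as HH:mm:ss:SSS (colon before the milliseconds),
    // matching how they are quoted in the message.
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("HH:mm:ss:SSS");

    // Milliseconds between request arrival and the first line of API execution.
    // Both timestamps are assumed to be on the same day.
    static long gapMillis(String arrived, String started) {
        LocalTime a = LocalTime.parse(arrived, FMT);
        LocalTime s = LocalTime.parse(started, FMT);
        return Duration.between(a, s).toMillis();
    }

    public static void main(String[] args) {
        System.out.println(gapMillis("20:27:01:663", "20:27:40:407")); // image 2: prints 38744
        System.out.println(gapMillis("20:27:01:728", "20:27:40:708")); // image 3: prints 38980
    }
}
```

By this calculation, roughly 38.7 s of the 43.6 s request in image 2, and roughly 39.0 s of the 42.4 s request in image 3, elapsed before the first line of API code ran.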
