Hi Takashi,

I ran some experiments with an instance that had requests pending only from my own scripts (no user-facing traffic at all).
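A test of this kind can be reproduced with a small paced request sender. The following is only a minimal sketch in Python, not the script used for the experiments above; `send_paced`, `fetch_status`, and the URL in the usage note are hypothetical names chosen for illustration. It sends one request per interval on a fixed schedule and records how long each request took, so that slow loading requests stand out.

```python
import time
import urllib.request


def fetch_status(url):
    """Issue one GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return resp.status


def send_paced(url, count, interval=1.0, fetch=fetch_status,
               clock=time.monotonic, sleep=time.sleep):
    """Send `count` requests, one every `interval` seconds.

    Returns a list of (status, latency_seconds) tuples.
    `fetch`, `clock` and `sleep` are injectable so the pacing
    logic can be exercised without network access.
    """
    results = []
    start = clock()
    for i in range(count):
        # Wait until the scheduled send time of request i, so the
        # spacing stays regular even if earlier requests were slow.
        delay = start + i * interval - clock()
        if delay > 0:
            sleep(delay)
        t0 = clock()
        status = fetch(url)
        results.append((status, clock() - t0))
    return results
```

Calling `send_paced("http://your-app-id.appspot.com/", 60)` (placeholder URL) would send 60 requests at about 1 req/sec; latencies of tens of seconds in the result would hint at a request that waited for a new instance to load.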
What I found was that sending requests at about 1 req/sec, regularly spaced, caused GAE to spin up new instances at random. If I set the min instances setting to anything but "automatic", the very first request would cause a new instance to spin up. This was true even if min instances was some high number, like 8, and I waited for all 8 instances to finish launching before sending a request, so in that case the instance count went to 9 on the very first request.

The only solution I found for this behavior was to package the entire app as a backend.

- Kris

On Friday, August 24, 2012 11:43:23 AM UTC-7, Takashi Matsuo (Google) wrote:
>
> Hi Mos,
>
> On Sat, Aug 25, 2012 at 1:39 AM, Mos <[email protected]> wrote:
>
>> Hello Takashi,
>>
>> > Actually there were almost 8 requests in a second. So App Engine likely
>> > needed more than one instance at this particular moment.
>>
>> I thought this is why GAE has the concept of "pending latency" (which we
>> discussed below).
>> Meaning: incoming requests may wait up to 15 seconds before a new
>> instance is started. Therefore 8 requests in one second should not
>> mean that more instances need to be started, especially if
>> there is no other traffic in that minute, as seen in my example.
>> Otherwise it would be a very bad implementation:
>> starting a new instance means around 30 s of waiting time. Serving 8
>> parallel requests from one instance would result in a maximum of
>> 8 seconds for the last request (assuming that each request takes around 1
>> second).
>> There is no reason in this concrete example to fire up more instances
>> and let requests wait more than 30 seconds until a new instance is loaded.
>
> Do you really read my e-mail?
>
> Setting Max Pending Latency doesn't force requests to stay in the pending
> queue for the specified time. Please use Min Pending Latency instead.
> Can you try this first? If it doesn't work, then try 2 min idle instances.
>
>> > ...
>> > here is what you've seen in the past weeks.
>> >
>> > * You have almost always used the 'Automatic-2' idle instance setting.
>> > * More than 3 weeks ago, the number of loading requests was very small.
>> > * Recently you have seen more loading requests than before.
>>
>> That's right! To be even more concrete: on August 16 the problems
>> got significantly worse. Please check especially the period from August 16
>> until today.
>>
>> > First of all, it seems that you deployed 2 new versions on Aug 1 and
>> > Aug 2. Can you describe what kind of changes are in those versions?
>>
>> I checked it in our version control. As I wrote, no related changes were
>> made! Just HTML/CSS stuff:
>> * One picture upload
>> * One HTML change
>> * One JavaScript change
>> * One CSS change
>>
>> > And, to be fair, we didn't think of any change in our scheduler around
>> > 3 weeks ago which can cause this issue.
>>
>> And around August 16?
>
> Sigh... isn't it a waste of time? What is the reason you picked that date?
>
>> > More than 3 weeks before, those 2 idle instances might have had longer
>> > lives than now, but it was not a concrete behavior. Please think this way:
>> > you were just kind of lucky.
>>
>> That shouldn't be luck! If GAE is not able to start Java instances in
>> 5 to 10 seconds, there needs to be a guarantee that instances have longer
>> lives. Otherwise Java applications on GAE are unusable because users would
>> have a lot of 30-second waits (--> "failed requests"). (See also the next
>> comment regarding resident instances.)
>>
>> > If you want some instances always active, please set min idle instances.
>>
>> I tried this some days ago. I had one resident instance. But that
>> changed nothing. Instances get started and stopped as before. I assumed
>> that requests would go to the resident instance first. But that was not the
>> case. The resident instance was idle, but a dynamic instance got started and
>> the request waited 30 s.
>> Please check other discussions on this list and issues that reported
>> similar observations.
>
> So I'd say please try 2. If you still see user-facing loading
> requests, you need more resident instances to eliminate them.
>
>> > As you can see, I'm still not convinced that the scheduler
>> > is misbehaving. I understand that you're having experiences which are a bit
>> > worse than 3 weeks ago, and understand your feeling that you want to tell
>> > us 'fix it', but I'd say it's still something in the line of 'expected
>> > behavior' at least for now.
>> > If you feel differently, please let me know.
>>
>> Yes, I do feel differently (please see answers above).
>>
>> Please accept
>> http://code.google.com/p/googleappengine/issues/detail?id=8004
>
> So what is your expected behavior and actual result? Nobody in our
> team can do anything if you just keep saying "the setting that used to work
> doesn't work anymore" without trying my suggestion.
>
> I think my answer is clear at least for some points. 1) You'd better use
> 'min pending latency' instead of 'max pending latency' to prevent new
> instances from spinning up as much as possible. 2) If you need longer instance
> lives, set an appropriate number of min idle instances.
>
> -- Takashi
>
>> Thanks
>> Mos
>> http://www.mosbase.com
>>
>> On Fri, Aug 24, 2012 at 4:22 PM, Takashi Matsuo <[email protected]> wrote:
>>
>>> Hi Mos,
>>>
>>> On Fri, Aug 24, 2012 at 6:05 PM, Mos <[email protected]> wrote:
>>>
>>>> > A possible explanation could be that the traffic pattern had changed.
>>>>
>>>> No. It's the same. Check for example the Request/Seconds statistics of
>>>> my application for the last 30 days!
>>>>
>>>> >> It's very obvious that one instance should be enough for my
>>>> >> application. And that was almost the case the last months!
>>>> > Actually it's not true.
>>>> > In particular, check this log:
>>>>
>>>> That's one exception where one client did 8 requests in a minute (+ one
>>>> Pingdom). Nothing else in that minute.
>>>> In those exceptional cases it could be OK if a second instance starts.
>>>> (Nevertheless, can't one instance
>>>> handle 8 requests a minute?)
>>>
>>> The issue here is not 8 requests in a minute. Actually there were almost
>>> 8 requests in a second. So App Engine likely needed more than one instance
>>> at this particular moment. Anyway, as you say, probably it's just a reason
>>> for one of the loading requests you're seeing, and this is not a very
>>> important thing in this topic.
>>>
>>> It's kind of digressing, but at first glance, the Requests/Seconds
>>> stat seems an appropriate data source to discuss how many instances are
>>> actually needed, but in fact, it's not. The real traffic is not spread
>>> evenly.
>>>
>>>> As I described: instances are started and stopped without reason, even
>>>> when there is less traffic per minute!
>>>
>>> Okay. As far as I understand, here is what you've seen in the past weeks.
>>>
>>> * You have almost always used the 'Automatic-2' idle instance setting.
>>> * More than 3 weeks ago, the number of loading requests was very small.
>>> * Recently you have seen more loading requests than before.
>>>
>>> First of all, it seems that you deployed 2 new versions on Aug 1 and Aug
>>> 2. Can you describe what kind of changes are in those versions?
>>> I'd like to make sure that there are no changes that can cause the
>>> scheduler/app server to behave differently.
>>>
>>> Especially, if you want me to escalate this issue to our engineering
>>> team, you should provide the exact information. You say 'My application is
>>> unchanged', but in fact you deployed the new version on the day when you
>>> described the issue started. I need to make sure that there is no big
>>> change which can cause something bad.
>>>
>>> And, to be fair, we didn't think of any change in our scheduler around 3
>>> weeks ago which can cause this issue.
>>>
>>> Secondly, you're setting max idle instances = 2. It does not guarantee
>>> that you always have 2 instances. It just guarantees that we will never
>>> charge you for more than 2 idle instances at any time.
>>>
>>> More than 3 weeks before, those 2 idle instances might have had longer
>>> lives than now, but it was not a concrete behavior. Please think this way:
>>> you were just kind of lucky. Now, presumably one or two of those instances
>>> are occasionally killed for some reason (there should be certain legitimate
>>> reasons, but those are something you don't need to care about).
>>>
>>> If you want some instances always active, please set min idle instances.
>>> Certainly it will cost you a bit more, and you will lose the pending
>>> queue, but considering the access pattern of your app (no bursty traffic
>>> except for a few accesses from the iPhone browser), I would recommend trying
>>> this setting in order to achieve what you want here. I'd recommend 2 idle
>>> instances in this case, but you should decide the number.
>>>
>>>> > * What is the purpose of max-pending-latency = 14.9 setting?
>>>>
>>>> " is high App Engine will allow requests to wait rather than start new
>>>> Instances to process them"
>>>> --> One attempt to stop GAE from creating unnecessary instances.
>>>
>>> I think you should set min pending latency instead of max pending
>>> latency if you want to prevent new instances from spinning up. However, if you're
>>> going to set min idle instances, this setting will almost lose its effect. If
>>> you don't want to set any min idle instances for whatever reason, please
>>> consider setting min pending latency instead of max pending latency.
>>>
>>>> > * Can you try automatic-automatic for idle instances setting?
>>>>
>>>> I played around with this the last days and nothing changed.
>>>> As I wrote: I had this configuration for months and it worked fine 3-4 weeks
>>>> ago!
>>>>
>>>> > * What is the purpose of those pingdom check? What happens if you
>>>> > stop that?
>>>>
>>>> To be alerted if GAE is down again. "What happens if you stop that?"
>>>> --> I wouldn't be angry anymore because I wouldn't notice downtimes of
>>>> my GAE application. ;)
>>>>
>>>> Please forward
>>>> http://code.google.com/p/googleappengine/issues/detail?id=8004 to the
>>>> relevant GAE department.
>>>
>>> As you can see, I'm still not convinced that the scheduler is
>>> misbehaving. I understand that you're having experiences which are a bit
>>> worse than 3 weeks ago, and understand your feeling that you want to tell
>>> us 'fix it', but I'd say it's still something in the line of 'expected
>>> behavior' at least for now.
>>>
>>> If you feel differently, please let me know.
>>>
>>> Regards,
>>>
>>> -- Takashi
>>>
>>>> Thanks!
>>>>
>>>> On Fri, Aug 24, 2012 at 1:39 AM, Takashi Matsuo <[email protected]> wrote:
>>>>
>>>>> Hi Mos,
>>>>>
>>>>> On Thu, Aug 23, 2012 at 4:58 AM, Mos <[email protected]> wrote:
>>>>>
>>>>>> Does anybody else experience abnormal behavior of the
>>>>>> instance scheduler in the last three weeks (in the last 7 days it got even
>>>>>> worse)? (Java / HRD)
>>>>>> Or does anybody have profound knowledge about it?
>>>>>>
>>>>>> Background: My application is unchanged for weeks, configuration not
>>>>>> changed and application's traffic is constant.
>>>>>> Traffic: One request per minute from Pingdom and around 200
>>>>>> additional pageviews the day (== around 1500 pageviews the day). The
>>>>>> peak is not more than 3-4 requests per minute.
>>>>>
>>>>> A possible explanation could be that the traffic pattern had changed.
>>>>>
>>>>>> It's very obvious that one instance should be enough for my
>>>>>> application.
>>>>>> And that was almost the case the last months!
>>>>>
>>>>> Actually it's not true. In particular, check this log:
>>>>>
>>>>> https://appengine.google.com/logs?app_id=s~krisen-talk&version_id=1-0.360912144269287698&severity_level_override=1&severity_level=3&tz=Europe%2FBerlin&filter=&filter_type=regex&date_type=datetime&date=2012-08-23&time=23%3A57%3A00&limit=20&view=Search
>>>>>
>>>>> You can see the iPhone client repeatedly requesting your dynamic
>>>>> resources in a very short amount of time. Presumably it's due to some
>>>>> kind of 'prefetch' feature of that device. Are you aware of those accesses,
>>>>> and that this access pattern can cause a new instance to start?
>>>>>
>>>>> I don't think this is the only reason, but this can explain that some
>>>>> portion of your loading requests are expected behavior.
>>>>>
>>>>> Now I'd like to ask you some questions.
>>>>>
>>>>> * What is the purpose of max-pending-latency = 14.9 setting?
>>>>> * Can you try automatic-automatic for idle instances setting?
>>>>> * What is the purpose of those pingdom check? What happens if you stop
>>>>> that?
>>>>>
>>>>>> But now GAE creates most of the time 3 instances, whereby one has a
>>>>>> long lifetime of days and the other ones are restarted around
>>>>>> 10 to 30 times a day.
>>>>>> Because a loading request takes between 30 s and 40 s and requests are
>>>>>> waiting for loading instances, there are many requests that
>>>>>> fail (Users and Pingdom agree: *A request that takes more than a
>>>>>> couple of seconds is a failed request!*)
>>>>>>
>>>>>> Please check the attached screenshots that show the behavior!
>>>>>>
>>>>>> Note:
>>>>>> - Killing instances manually did not help
>>>>>> - Idle Instances were (Automatic – 2). Changing it to anything else
>>>>>> did not change anything; e.g. (Automatic – 4)
>>>>>>
>>>>>> Thanks and Cheers
>>>>>>
>>>>>> Mos
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Google App Engine" group.
>>>>>> To post to this group, send email to [email protected].
>>>>>> To unsubscribe from this group, send email to [email protected].
>>>>>> For more options, visit this group at
>>>>>> http://groups.google.com/group/google-appengine?hl=en.
>>>>>
>>>>> --
>>>>> Takashi Matsuo | Developers Advocate | [email protected]
>>>
>>> --
>>> Takashi Matsuo | Developers Advocate | [email protected]
>
> --
> Takashi Matsuo | Developers Advocate | [email protected]

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/YIzxpRbmyHMJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
