Hi Takashi, I created a new GAE app to test this and found that I'm not getting the same instance tuning controls in the new app ID that I am getting in my current one.
In my current app, I can set both min and max idle instances, and min and max pending latency. In the new app, I can set only max idle instances and min pending latency. Any ideas why this would be the case? It complicates the process of setting up a good testbed for this.

- Kris

On Friday, August 24, 2012 7:59:17 PM UTC-7, Takashi Matsuo (Google) wrote:
>
> On Sat, Aug 25, 2012 at 5:24 AM, Mos <[email protected]> wrote:
>
>> > Setting Max Pending Latency doesn't force requests to be in the pending queue for the specified time. Please use Min Pending Latency instead.
>>
>> As you know, my setting for "Min Pending Latency" was automatic. The expectation is that GAE takes a reasonable default latency if it is "automatic".
>> And you say: Every parallel request starts a new instance if it is "automatic"? That would be a "Min Pending Latency" of zero and not "automatic".
>>
>> > If it doesn't work, try 2 min idle instances then
>>
>> Please check the responses of other users in this thread. This feature is totally broken and cannot be used.
>>
>> >> And around the 16th August?
>> > Sigh... isn't it a waste of time? What is the reason you picked that date?
>>
>> Did you see/study my pictures from the first post of this thread?
>> The statistic shows that on this date the instance creation gets crazy.
>> I double checked it with the Pingdom reports.
>> Starting on this day there were even more downtimes.
>>
>> > So I'd say please try 2. If you still saw the user-facing loading requests, you need more resident instances to eliminate the user-facing loading requests.
>>
>> Again: As I wrote in my post before, that does not work. Check the responses from Kristopher and Jeff on this thread.
>
> Yeah, it's very nice to hear concrete examples from Kristopher and Jeff, other than just saying "I've tried that, but it didn't work".
>
>> > So what is your expected behavior and actual result?
>> > Nobody in our team can do anything if you just keep saying "the setting that used to work doesn't work anymore" without trying my suggestion.
>> > I think my answer is clear at least for some points. 1) You'd better use 'min pending latency' instead of 'max pending latency' to prevent new instances from spinning up as much as possible. 2) If you need longer instance lives, set an appropriate number of min idle instances.
>>
>> As I wrote: I tried different settings. So have many other people in this group.
>> Other people and I are reporting: The settings are broken!
>> It's very easy to reproduce. Please set up an application, send one request per minute (or second), configure 1 or 2 or 3 min idle instances and check what is happening. You will see that new instances are started although resident instances are available.
>
> It's nice if we have a complete reproducible case. I've just started the experiment you mentioned. This time, it's just a helloworld application, and I set 1 min idle instance and a 1-minute cron.
>
> Presumably it will just work fine. Then I will try with slightly different conditions. That way, I hope I can determine what kind of condition could be the culprit or not. What do you think? Can you provide some simple projects for that experiment?
>
>> Please take it seriously and let one of the engineers check this!
>
> (I'm one of the engineers btw) A reproducible case is always the best thing to get engineers' attention.
>
> Regards,
>
> -- Takashi
>
>> Cheers
>> Mos
>>
>> On Fri, Aug 24, 2012 at 8:43 PM, Takashi Matsuo <[email protected]> wrote:
>>
>>> Hi Mos,
>>>
>>> On Sat, Aug 25, 2012 at 1:39 AM, Mos <[email protected]> wrote:
>>>
>>>> Hello Takashi,
>>>>
>>>> > Actually there were almost 8 requests in a second. So App Engine likely needed more than one instance at this particular moment.
>>>>
>>>> I thought this is why GAE has the concept of "pending-latency" (which we discussed below).
>>>> Meaning: Incoming requests may wait up to 15 seconds before starting a new instance. Therefore, when 8 requests occur in one second, that should not mean that more instances need to be started. Especially if there is no other traffic in this minute, as seen in my example.
>>>> Otherwise it would be a very bad implementation:
>>>> Starting a new instance means around 30s waiting time. Serving 8 parallel requests from one instance would result in a maximum of 8 seconds for the last request (assuming that each request takes around 1 second).
>>>> There is no reason for this concrete example to fire up more instances and let requests wait more than 30 seconds until a new instance is loaded.
>>>
>>> Do you really read my e-mail?
>>>
>>> Setting Max Pending Latency doesn't force requests to be in the pending queue for the specified time. Please use Min Pending Latency instead.
>>> Can you try this first? If it doesn't work, try 2 min idle instances then.
>>>
>>>> > ... here is what you've seen in the past weeks.
>>>> >
>>>> > * You have been almost always set 'Automatic-2' idle instance setting.
>>>> > * More than 3 weeks ago, number of loading requests were very few.
>>>> > * Recently you have seen more loading requests than before.
>>>>
>>>> That's right! To be even more concrete: On the 16th of August the problems got significantly worse. Please check especially the time range from the 16th of August until today.
>>>>
>>>> > First of all, it seems that you deployed 2 new versions on Aug 1 and Aug 2. Can you describe what kind of changes in those versions?
>>>>
>>>> I checked it in our version control. As I wrote, no related changes were made!
>>>> Just HTML/CSS stuff:
>>>> * One picture upload
>>>> * One HTML change
>>>> * One JavaScript change
>>>> * One CSS change
>>>>
>>>> > And, to be fair, we didn't think of any change in our scheduler around 3 weeks ago which can cause this issue.
>>>>
>>>> And around the 16th August?
>>>
>>> Sigh... isn't it a waste of time? What is the reason you picked that date?
>>>
>>>> > More than 3 weeks before, those 2 idle instances might have had longer lives than now, but it was not a concrete behavior. Please think this way: you were just kind of lucky.
>>>>
>>>> That shouldn't be luck! If GAE is not able to start Java instances in 5 to 10 seconds, there needs to be a guarantee that instances have longer lives. Otherwise Java applications on GAE are unusable because users would have a lot of 30-second wait times (--> "failed requests"). (See also next comment regarding resident instances)
>>>>
>>>> > If you want some instances always active, please set min idle instances.
>>>>
>>>> I tried this some days ago. I had one resident instance. But that changed nothing. Instances get started and stopped as before. I assumed that requests would go to the resident instance first. But that was not the case. The resident instance was idle, but a dynamic instance got started and the request waited 30 seconds.
>>>> Please check other discussions on this list and issues that reported similar observations.
>>>
>>> So I'd say please try 2. If you still saw the user-facing loading requests, you need more resident instances to eliminate the user-facing loading requests.
>>>
>>>> > As you can see, I'm still not convinced to believe that the scheduler is misbehaving.
>>>> > I understand that you're having experiences which are a bit worse than 3 weeks ago, and understand your feeling that you want to tell us 'fix it', but I'd say it's still something in the line of 'expected behavior' at least for now.
>>>> > If you feel differently, please let me know.
>>>>
>>>> Yes, I do feel differently (please see answers above).
>>>>
>>>> Please accept http://code.google.com/p/googleappengine/issues/detail?id=8004
>>>
>>> So what is your expected behavior and actual result? Nobody in our team can do anything if you just keep saying "the setting that used to work doesn't work anymore" without trying my suggestion.
>>>
>>> I think my answer is clear at least for some points. 1) You'd better use 'min pending latency' instead of 'max pending latency' to prevent new instances from spinning up as much as possible. 2) If you need longer instance lives, set an appropriate number of min idle instances.
>>>
>>> -- Takashi
>>>
>>>> Thanks
>>>> Mos
>>>> http://www.mosbase.com
>>>>
>>>> On Fri, Aug 24, 2012 at 4:22 PM, Takashi Matsuo <[email protected]> wrote:
>>>>
>>>>> Hi Mos,
>>>>>
>>>>> On Fri, Aug 24, 2012 at 6:05 PM, Mos <[email protected]> wrote:
>>>>>
>>>>>> > A possible explanation could be that the traffic pattern had changed.
>>>>>>
>>>>>> No. It's the same. Check for example the Requests/Second statistics of my application for the last 30 days!
>>>>>>
>>>>>> >> It's very obvious that one instance should be enough for my application. And that was almost the case the last months!
>>>>>> > Actually it's not true. In particular, check this log:
>>>>>>
>>>>>> That's one exception where one client did 8 requests in a minute (+ one Pingdom). Nothing else this minute.
>>>>>> In those exceptional cases it could be ok if a second instance starts.
>>>>>> (Nevertheless, can't one instance handle 8 requests a minute?)
>>>>>
>>>>> The issue here is not 8 requests in a minute. Actually there were almost 8 requests in a second. So App Engine likely needed more than one instance at this particular moment. Anyway, as you say, probably it's just a reason for one of the loading requests you're seeing, and this is not a very important thing in this topic.
>>>>>
>>>>> It's kind of digressing, but at first glance, the Requests/Second stat seems an appropriate data source to discuss how many instances are actually needed, but in fact, it's not. The real traffic is not spread equally.
>>>>>
>>>>>> As I described: Instances are started and stopped without reason, even if less traffic per minute is available!
>>>>>
>>>>> Okay. As far as I understand, here is what you've seen in the past weeks.
>>>>>
>>>>> * You have almost always set the 'Automatic-2' idle instance setting.
>>>>> * More than 3 weeks ago, the number of loading requests was very small.
>>>>> * Recently you have seen more loading requests than before.
>>>>>
>>>>> First of all, it seems that you deployed 2 new versions on Aug 1 and Aug 2. Can you describe what kind of changes were in those versions?
>>>>> I'd like to make sure that there are no changes that can cause the scheduler/app server to behave differently.
>>>>>
>>>>> Especially, if you want me to escalate this issue to our engineering team, you should provide the exact information. You say 'My application is unchanged', but in fact you deployed the new version on that day when you described the issue started. I need to make sure that there is no big change which can cause something bad.
>>>>>
>>>>> And, to be fair, we didn't think of any change in our scheduler around 3 weeks ago which can cause this issue.
>>>>>
>>>>> Secondly, you're setting max idle instances = 2.
>>>>> It does not guarantee that you always have 2 instances. It just guarantees that we will never charge you for more than 2 idle instances at any time.
>>>>>
>>>>> More than 3 weeks before, those 2 idle instances might have had longer lives than now, but it was not a concrete behavior. Please think this way: you were just kind of lucky. Now, presumably one or two of those instances are occasionally killed for some reasons (there should be certain legitimate reasons, but those are something you don't need to care about).
>>>>>
>>>>> If you want some instances always active, please set min idle instances. Certainly it will cost you a bit more, and you will lose the pending queue, but considering the access pattern of your app (no bursty traffic except for a few accesses from the iPhone browser), I would recommend trying this setting in order to achieve what you want here. I'd recommend 2 idle instances in this case, but you should decide the number.
>>>>>
>>>>>> > * What is the purpose of max-pending-latency = 14.9 setting?
>>>>>>
>>>>>> " is high App Engine will allow requests to wait rather than start new Instances to process them"
>>>>>> --> One attempt to stop GAE from creating unnecessary instances.
>>>>>
>>>>> I think you should set min pending latency instead of max pending latency if you want to prevent new instances from spinning up. However, if you're going to set min idle instances, this setting will almost lose its effect. If you don't want to set any min idle instances for whatever reason, please consider setting min pending latency instead of max pending latency.
>>>>>
>>>>>> > * Can you try automatic-automatic for idle instances setting?
>>>>>>
>>>>>> I played around with this the last days and nothing changed. As I wrote: I had this configuration for months and it worked fine 3-4 weeks ago!
>>>>>>
>>>>>> > * What is the purpose of those pingdom check? What happens if you stop that?
>>>>>>
>>>>>> To be alerted if GAE is down again. "What happens if you stop that?" --> I wouldn't be angry anymore because I wouldn't notice downtimes of my GAE application. ;)
>>>>>>
>>>>>> Please forward http://code.google.com/p/googleappengine/issues/detail?id=8004 to the relevant GAE department.
>>>>>
>>>>> As you can see, I'm still not convinced to believe that the scheduler is misbehaving. I understand that you're having experiences which are a bit worse than 3 weeks ago, and understand your feeling that you want to tell us 'fix it', but I'd say it's still something in the line of 'expected behavior' at least for now.
>>>>>
>>>>> If you feel differently, please let me know.
>>>>>
>>>>> Regards,
>>>>>
>>>>> -- Takashi
>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Fri, Aug 24, 2012 at 1:39 AM, Takashi Matsuo <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Mos,
>>>>>>>
>>>>>>> On Thu, Aug 23, 2012 at 4:58 AM, Mos <[email protected]> wrote:
>>>>>>>
>>>>>>>> Does anybody else experience abnormal behavior of the instance-scheduler the last three weeks (the last 7 days it got even worse)? (Java / HRD)
>>>>>>>> Or does anybody have profound knowledge about it?
>>>>>>>>
>>>>>>>> Background: My application is unchanged for weeks, configuration not changed and application's traffic is constant.
>>>>>>>> Traffic: One request per minute from Pingdom and around 200 additional pageviews the day (== around 1500 pageviews the day). The peak is not more than 3-4 requests per minute.
>>>>>>>
>>>>>>> A possible explanation could be that the traffic pattern had changed.
>>>>>>>
>>>>>>>> It's very obvious that one instance should be enough for my application.
>>>>>>>> And that was almost the case the last months!
>>>>>>>
>>>>>>> Actually it's not true. In particular, check this log:
>>>>>>>
>>>>>>> https://appengine.google.com/logs?app_id=s~krisen-talk&version_id=1-0.360912144269287698&severity_level_override=1&severity_level=3&tz=Europe%2FBerlin&filter=&filter_type=regex&date_type=datetime&date=2012-08-23&time=23%3A57%3A00&limit=20&view=Search
>>>>>>>
>>>>>>> You can see the iPhone client repeatedly requests your dynamic resources in a very short amount of time. Presumably it's due to some kind of 'prefetch' feature of that device. Are you aware of those accesses, and that this access pattern can cause a new instance to start?
>>>>>>>
>>>>>>> I don't think this is the only reason, but this can explain that some portion of your loading requests are expected behavior.
>>>>>>>
>>>>>>> Now I'd like to ask you some questions.
>>>>>>>
>>>>>>> * What is the purpose of max-pending-latency = 14.9 setting?
>>>>>>> * Can you try automatic-automatic for idle instances setting?
>>>>>>> * What is the purpose of those pingdom check? What happens if you stop that?
>>>>>>>
>>>>>>>> But now GAE creates 3 instances most of the time, whereby one has a long lifetime of days and the other ones are restarted around 10 to 30 times a day.
>>>>>>>> Because a loading request takes between 30s and 40s and requests are waiting for loading instances, there are many requests that fail (Users and Pingdom agree: *A request that takes more than a couple of seconds is a failed request!*)
>>>>>>>>
>>>>>>>> Please check the attached screenshots that show the behavior!
>>>>>>>>
>>>>>>>> Note:
>>>>>>>> - Killing instances manually did not help
>>>>>>>> - Idle Instances were ( Automatic – 2 ). Changing it to whatever did not change anything; e.g. ( Automatic – 4 )
>>>>>>>>
>>>>>>>> Thanks and Cheers
>>>>>>>>
>>>>>>>> Mos
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google Groups "Google App Engine" group.
>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>> To unsubscribe from this group, send email to [email protected].
>>>>>>>> For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
>>>>>>>
>>>>>>> --
>>>>>>> Takashi Matsuo | Developers Advocate | [email protected]

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/vypbs4jA5cgJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
