Hi Mos,

I have experienced very similar issues for the last week. The loading time of my website suddenly increased from 2-3 seconds to 10-15 seconds. When it happened I tried changing the app settings: I switched the instance class from F1 to F2 and increased the idle instances setting, but nothing helped.
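For anyone who wants to check whether they are seeing the same jump in loading time, a small probe along these lines may help. It is only a sketch: the URL, the one-minute interval and the 5-second "slow" threshold are placeholders I picked for illustration, not anything GAE or Pingdom defines.

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=60.0):
    """Return the wall-clock seconds taken to fetch `url` once."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.monotonic() - start


def summarize(latencies, slow_threshold=5.0):
    """Summarize latencies in seconds; 'slow' counts those above the threshold."""
    slow = [t for t in latencies if t > slow_threshold]
    return {
        "count": len(latencies),
        "median": statistics.median(latencies),
        "max": max(latencies),
        "slow": len(slow),
    }


def probe(url, samples=10, interval=60.0):
    """Fetch `url` `samples` times, sleeping `interval` seconds between requests."""
    latencies = []
    for i in range(samples):
        latencies.append(time_request(url))
        if i < samples - 1:
            time.sleep(interval)
    return summarize(latencies)
```

Calling something like `probe("https://your-app-id.appspot.com/", samples=5)` against a dynamic page (the hostname is hypothetical) gives a rough count of how many requests were slow enough to look like loading requests. It does not handle HTTP errors, so a failing request will raise.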
I spent the week playing with the settings, with no results. Then, today, the issue suddenly resolved itself. I want to stress that the problem was still there yesterday, and I haven't changed any settings since then. It means that the problem was on Google's side, and they fixed it silently. It's a shame that Google doesn't admit its mistakes and keeps saying that it's our fault because we didn't configure our applications the right way. I will never deploy any new application on GAE.

On Friday, August 24, 2012 5:28:01 PM UTC-4, Mos wrote:
>
> Thanks Johan. I read the post some days ago.
>
> As often discussed on the mailing list before, and as Jeff said in this thread: it's the combination of "Requests should never be sent to cold instances." and(!) the behavior of min idle instances which doesn't make any sense.
>
> Please check the last comment of http://code.google.com/p/googleappengine/issues/detail?id=8004 where I wrote down the problems from my point of view.
>
> Senior Java developers on this list who have many months of experience with GAE have stated again and again that there is a big issue around instance handling.
> I think you have to trust your power users and assign a team to work on this!
>
> On Fri, Aug 24, 2012 at 10:58 PM, Johan Euphrosine wrote:
>
>> Hi all,
>>
>> Please review the following thread, where the lead engineer working on the scheduler (Jon McAlister) took the time to explain in great detail the behavior of min idle instances.
>> https://groups.google.com/d/msg/google-appengine/nRtzGtG9790/hLS16qux_04J
>>
>> Once you have read this, we can discuss whether what you're experiencing is really a bug, or whether you want the scheduler to behave differently from its current implementation, in which case the more constructive way out of this discussion is to file a feature request and get it starred by your peers.
>> On Aug 24, 2012 10:24 PM, "Mos" wrote:
>>
>>> > Setting Max Pending Latency doesn't force requests to be in the pending queue for the specified time. Please use Min Pending Latency instead.
>>>
>>> As you know, my "Min Pending Latency" setting was automatic. The expectation is that GAE picks a reasonable default latency if it is "automatic".
>>> And you say every parallel request starts a new instance if it is "automatic"? That would be a "Min Pending Latency" of zero, not "automatic".
>>>
>>> > If it doesn't work, try 2 min idle instances then
>>>
>>> Please check the responses of other users in this thread. This feature is totally broken and cannot be used.
>>>
>>> >> And around the 16th of August?
>>> > Sigh... isn't it a waste of time? What is the reason you picked that date?
>>>
>>> Did you see/study the pictures from the first post of this thread?
>>> The statistics show that on this date the instance creation goes crazy.
>>> I double-checked it against the Pingdom reports.
>>> Starting on this day there were even more downtimes.
>>>
>>> > So I'd say please try 2. If you still saw the user-facing loading requests, you need more resident instances to eliminate the user-facing loading requests.
>>>
>>> Again: as I wrote in my previous post, that does not work. Check the responses from Kristopher and Jeff in this thread.
>>>
>>> > So what is your expected behavior and actual result? Nobody in our team can do anything if you just keep saying "the setting that used to work doesn't work anymore" without trying my suggestion.
>>> > I think my answer is clear at least for some points. 1) You'd better use 'min pending latency' instead of 'max pending latency' to prevent new instances from spinning up as much as possible. 2) If you need longer instance lives, set an appropriate number of min idle instances.
>>>
>>> As I wrote: I tried different settings.
>>> As have many other people in this group.
>>> Other people and I are reporting: the settings are broken!
>>> It's very easy to reproduce. Please set up an application, send one request per minute (or second), configure 1, 2 or 3 min idle instances and check what happens. You will see that new instances are started although resident instances are available.
>>>
>>> Please take it seriously and let one of the engineers check this!
>>>
>>> Cheers
>>> Mos
>>>
>>> On Fri, Aug 24, 2012 at 8:43 PM, Takashi Matsuo wrote:
>>>
>>>> Hi Mos,
>>>>
>>>> On Sat, Aug 25, 2012 at 1:39 AM, Mos wrote:
>>>>
>>>>> Hello Takashi,
>>>>>
>>>>> > Actually there were almost 8 requests in a second. So App Engine likely needed more than one instance at this particular moment.
>>>>>
>>>>> I thought this is why GAE has the concept of "pending latency" (which we discussed below).
>>>>> Meaning: incoming requests may wait up to 15 seconds before a new instance is started. Therefore 8 requests in one second should not mean that more instances need to be started, especially if there is no other traffic in that minute, as seen in my example.
>>>>> Otherwise it would be a very bad implementation:
>>>>> Starting a new instance means around 30s of waiting time. Serving 8 parallel requests from one instance would result in a maximum of 8 seconds for the last request (assuming that each request takes around 1 second).
>>>>> There is no reason in this concrete example to fire up more instances and let requests wait more than 30 seconds until a new instance is loaded.
>>>>
>>>> Do you really read my e-mail?
>>>>
>>>> Setting Max Pending Latency doesn't force requests to be in the pending queue for the specified time. Please use Min Pending Latency instead.
>>>> Can you try this first?
>>>> If it doesn't work, try 2 min idle instances then.
>>>>
>>>>> > ... here is what you've seen in the past weeks.
>>>>> >
>>>>> > * You have almost always used the 'Automatic-2' idle instance setting.
>>>>> > * More than 3 weeks ago, the number of loading requests was very small.
>>>>> > * Recently you have seen more loading requests than before.
>>>>>
>>>>> That's right! To be even more concrete: on the 16th of August the problems got significantly worse. Please check especially the period from the 16th of August until today.
>>>>>
>>>>> > First of all, it seems that you deployed 2 new versions on Aug 1 and Aug 2. Can you describe what kind of changes were in those versions?
>>>>>
>>>>> I checked it in our version control. As I wrote, no related changes were made! Just HTML/CSS stuff:
>>>>> * One picture upload
>>>>> * One HTML change
>>>>> * One JavaScript change
>>>>> * One CSS change
>>>>>
>>>>> > And, to be fair, we didn't think of any change in our scheduler around 3 weeks ago which can cause this issue.
>>>>>
>>>>> And around the 16th of August?
>>>>
>>>> Sigh... isn't it a waste of time? What is the reason you picked that date?
>>>>
>>>>> > More than 3 weeks before, those 2 idle instances might have had longer lives than now, but it was not a concrete behavior. Please think this way: you were just kind of lucky.
>>>>>
>>>>> That shouldn't be luck! If GAE is not able to start Java instances in 5 to 10 seconds, there needs to be a guarantee that instances have longer lives. Otherwise Java applications on GAE are unusable, because users would see a lot of 30-second waits (--> "failed requests"). (See also the next comment regarding resident instances.)
>>>>>
>>>>> > If you want some instances always active, please set min idle instances.
>>>>>
>>>>> I tried this some days ago. I had one resident instance. But that changed nothing.
>>>>> Instances get started and stopped as before. I assumed that requests would go to the resident instance first. But that was not the case. The resident instance was idle, but a dynamic instance got started and the request waited 30 seconds.
>>>>> Please check other discussions on this list and issues that reported similar observations.
>>>>
>>>> So I'd say please try 2. If you still saw the user-facing loading requests, you need more resident instances to eliminate the user-facing loading requests.
>>>>
>>>>> > As you can see, I'm still not convinced to believe that the scheduler is misbehaving. I understand that you're having experiences which are a bit worse than 3 weeks ago, and understand your feeling that you want to tell us 'fix it', but I'd say it's still something in the line of 'expected behavior' at least for now.
>>>>> > If you feel differently, please let me know.
>>>>>
>>>>> Yes, I do feel differently (please see the answers above).
>>>>>
>>>>> Please accept http://code.google.com/p/googleappengine/issues/detail?id=8004
>>>>
>>>> So what is your expected behavior and actual result? Nobody in our team can do anything if you just keep saying "the setting that used to work doesn't work anymore" without trying my suggestion.
>>>>
>>>> I think my answer is clear at least for some points. 1) You'd better use 'min pending latency' instead of 'max pending latency' to prevent new instances from spinning up as much as possible. 2) If you need longer instance lives, set an appropriate number of min idle instances.
>>>>
>>>> -- Takashi
>>>>
>>>>> Thanks
>>>>> Mos
>>>>> http://www.mosbase.com
>>>>>
>>>>> On Fri, Aug 24, 2012 at 4:22 PM, Takashi Matsuo wrote:
>>>>>
>>>>>> Hi Mos,
>>>>>>
>>>>>> On Fri, Aug 24, 2012 at 6:05 PM, Mos wrote:
>>>>>>
>>>>>>> > A possible explanation could be that the traffic pattern had changed.
>>>>>>>
>>>>>>> No. It's the same. Check for example the Requests/Second statistics of my application for the last 30 days!
>>>>>>>
>>>>>>> >> It's very obvious that one instance should be enough for my application. And that was almost the case the last months!
>>>>>>> > Actually it's not true. In particular, check this log:
>>>>>>>
>>>>>>> That's one exception, where one client made 8 requests in a minute (+ one from Pingdom). Nothing else in this minute.
>>>>>>> In those exceptional cases it could be OK if a second instance starts. (Nevertheless, can't one instance handle 8 requests a minute?)
>>>>>>
>>>>>> The issue here is not 8 requests in a minute. Actually there were almost 8 requests in a second. So App Engine likely needed more than one instance at this particular moment. Anyway, as you say, it's probably just the reason for one of the loading requests you're seeing, and this is not a very important thing in this topic.
>>>>>>
>>>>>> It's kind of digressing, but at first glance the Requests/Second stat seems an appropriate data source to discuss how many instances are actually needed. In fact, it's not. The real traffic is not spread equally.
>>>>>>
>>>>>>> As I described: instances are started and stopped without reason, even when there is less traffic per minute!
>>>>>>
>>>>>> Okay. As far as I understand, here is what you've seen in the past weeks.
>>>>>>
>>>>>> * You have almost always used the 'Automatic-2' idle instance setting.
>>>>>> * More than 3 weeks ago, the number of loading requests was very small.
>>>>>> * Recently you have seen more loading requests than before.
>>>>>>
>>>>>> First of all, it seems that you deployed 2 new versions on Aug 1 and Aug 2. Can you describe what kind of changes were in those versions?
>>>>>> I'd like to make sure that there are no changes that can cause the scheduler/app server to behave differently.
>>>>>>
>>>>>> Especially if you want me to escalate this issue to our engineering team, you should provide the exact information. You say 'My application is unchanged', but in fact you deployed a new version on the day you described the issue as starting. I need to make sure that there is no big change which can cause something bad.
>>>>>>
>>>>>> And, to be fair, we didn't think of any change in our scheduler around 3 weeks ago which can cause this issue.
>>>>>>
>>>>>> Secondly, you're setting max idle instances = 2. It does not guarantee that you always have 2 instances. It just guarantees that we will never charge you for more than 2 idle instances at any time.
>>>>>>
>>>>>> More than 3 weeks before, those 2 idle instances might have had longer lives than now, but it was not a concrete behavior. Please think this way: you were just kind of lucky. Now, presumably one or two of those instances are occasionally killed for some reason (there should be certain legitimate reasons, but those are something you don't need to care about).
>>>>>>
>>>>>> If you want some instances always active, please set min idle instances.
>>>>>> Certainly it will cost you a bit more, and you will lose the pending queue, but considering the access pattern of your app (no bursty traffic except for a few accesses from the iPhone browser), I would recommend trying this setting in order to achieve what you want here. I'd recommend 2 idle instances in this case, but you should decide the number.
>>>>>>
>>>>>>> > * What is the purpose of max-pending-latency = 14.9 setting?
>>>>>>>
>>>>>>> " is high App Engine will allow requests to wait rather than start new Instances to process them"
>>>>>>> --> One attempt to stop GAE from creating unnecessary instances.
>>>>>>
>>>>>> I think you should set min pending latency instead of max pending latency if you want to prevent new instances from spinning up. However, if you're going to set min idle instances, this setting will almost lose its effect. If you don't want to set any min idle instances for whatever reason, please consider setting min pending latency instead of max pending latency.
>>>>>>
>>>>>>> > * Can you try automatic-automatic for idle instances setting?
>>>>>>>
>>>>>>> I played around with this over the last days and nothing changed. As I wrote: I had this configuration for months and it worked fine 3-4 weeks ago!
>>>>>>>
>>>>>>> > * What is the purpose of those pingdom check? What happens if you stop that?
>>>>>>>
>>>>>>> To be alerted if GAE is down again. "What happens if you stop that?" --> I wouldn't be angry anymore because I wouldn't notice the downtimes of my GAE application. ;)
>>>>>>>
>>>>>>> Please forward http://code.google.com/p/googleappengine/issues/detail?id=8004 to the relevant GAE department.
>>>>>>
>>>>>> As you can see, I'm still not convinced to believe that the scheduler is misbehaving.
>>>>>> I understand that you're having experiences which are a bit worse than 3 weeks ago, and understand your feeling that you want to tell us 'fix it', but I'd say it's still something in the line of 'expected behavior' at least for now.
>>>>>>
>>>>>> If you feel differently, please let me know.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> -- Takashi
>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> On Fri, Aug 24, 2012 at 1:39 AM, Takashi Matsuo wrote:
>>>>>>>
>>>>>>>> Hi Mos,
>>>>>>>>
>>>>>>>> On Thu, Aug 23, 2012 at 4:58 AM, Mos wrote:
>>>>>>>>
>>>>>>>>> Does anybody else experience abnormal behavior of the instance scheduler over the last three weeks (in the last 7 days it got even worse)? (Java / HRD)
>>>>>>>>> Or does anybody have profound knowledge about it?
>>>>>>>>>
>>>>>>>>> Background: My application has been unchanged for weeks, the configuration has not changed and the application's traffic is constant.
>>>>>>>>> Traffic: One request per minute from Pingdom and around 200 additional pageviews a day (== around 1500 pageviews a day). The peak is not more than 3-4 requests per minute.
>>>>>>>>
>>>>>>>> A possible explanation could be that the traffic pattern had changed.
>>>>>>>>
>>>>>>>>> It's very obvious that one instance should be enough for my application. And that was almost the case the last months!
>>>>>>>>
>>>>>>>> Actually it's not true.
>>>>>>>> In particular, check this log:
>>>>>>>>
>>>>>>>> https://appengine.google.com/logs?app_id=s~krisen-talk&version_id=1-0.360912144269287698&severity_level_override=1&severity_level=3&tz=Europe%2FBerlin&filter=&filter_type=regex&date_type=datetime&date=2012-08-23&time=23%3A57%3A00&limit=20&view=Search
>>>>>>>>
>>>>>>>> You can see the iPhone client repeatedly requests your dynamic resources in a very short amount of time. Presumably it's due to some kind of 'prefetch' feature of that device. Are you aware of those accesses, and that this access pattern can cause a new instance to start?
>>>>>>>>
>>>>>>>> I don't think this is the only reason, but this can explain why some portion of your loading requests is expected behavior.
>>>>>>>>
>>>>>>>> Now I'd like to ask you some questions.
>>>>>>>>
>>>>>>>> * What is the purpose of the max-pending-latency = 14.9 setting?
>>>>>>>> * Can you try automatic-automatic for the idle instances setting?
>>>>>>>> * What is the purpose of those Pingdom checks? What happens if you stop them?
>>>>>>>>
>>>>>>>>> But now GAE creates 3 instances most of the time, whereby one has a long lifetime of days and the other ones are restarted around 10 to 30 times a day.
>>>>>>>>> Because a loading request takes between 30s and 40s and requests are waiting for loading instances, there are many requests that fail (Users and Pingdom agree: *A request that takes more than a couple of seconds is a failed request!*)
>>>>>>>>>
>>>>>>>>> Please check the attached screenshots that show the behavior!
>>>>>>>>>
>>>>>>>>> Note:
>>>>>>>>> - Killing instances manually did not help
>>>>>>>>> - Idle Instances were ( Automatic – 2 ). Changing it to whatever did not change anything; e.g. ( Automatic – 4 )
>>>>>>>>>
>>>>>>>>> Thanks and Cheers
>>>>>>>>>
>>>>>>>>> Mos
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google Groups "Google App Engine" group.
>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>> To unsubscribe from this group, send email to [email protected].
>>>>>>>>> For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Takashi Matsuo | Developers Advocate

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/CxcnspcZJfsJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
