On Fri, Aug 24, 2012 at 11:12 PM, Armen Danielyan <[email protected]> wrote:
> Hi Mos,
>
> I have experienced very similar issues for the last week. The loading time
> of my website increased from 2-3 seconds to 10-15 seconds all of a sudden.
> When it happened I tried to change the app settings, changed it from F1 to
> F2, increased the idle instances setting, but nothing helped.
>
> During the week I was playing with settings with no results. Suddenly the
> issue resolved itself today. I want to stress that yesterday the
> problem was still there, and I haven't changed any settings since then.
>
> It means that the problem was on Google's side, and they solved it silently.
> It's a shame Google doesn't accept their mistakes, and keeps saying that it's
> our fault because we didn't configure our applications in the right way. I
> will never deploy any new application on GAE.
>
Hi Armen,

Are you affected by
http://code.google.com/p/googleappengine/issues/detail?id=7706?

The engineering team is working on reducing the high performance variance for
apps that need to load a lot of code on a loading request (typically Java apps
with a "big" dependency like Spring or Guice, or that depend on a lot of jars).

If your app is hit by this problem, please star this issue and comment with
your application id.

Thanks in advance.

>
> On Friday, August 24, 2012 5:28:01 PM UTC-4, Mos wrote:
>>
>> Thanks Johan. I read the post some days ago.
>>
>> As often discussed on the mailing list before, and as Jeff said in this
>> thread:
>> It's the combination of "Requests should never be sent to cold instances."
>> and(!) the behavior of min idle instances which doesn't make any sense.
>>
>> Please check the last comment of
>> http://code.google.com/p/googleappengine/issues/detail?id=8004 where I wrote
>> down the problems from my point of view.
>>
>> Senior Java developers on this list who have many months of experience
>> with GAE have stated again and again that there is a big issue around
>> instance handling.
>> I think you have to trust your power users and assign a team to work on
>> this!
>>
>> On Fri, Aug 24, 2012 at 10:58 PM, Johan Euphrosine <[email protected]>
>> wrote:
>>>
>>> Hi all,
>>>
>>> Please review the following thread where the lead engineer working on the
>>> scheduler (Jon McAlister) took the time to explain in great detail the
>>> behavior of min idle instances.
>>> https://groups.google.com/d/msg/google-appengine/nRtzGtG9790/hLS16qux_04J
>>>
>>> Once you have read this, we can discuss whether what you're experiencing is
>>> really a bug, or whether you want the scheduler to behave differently from
>>> its current implementation, in which case the more constructive way out of
>>> this discussion is to file a feature request and get it starred by your
>>> peers.
>>>
>>> On Aug 24, 2012 10:24 PM, "Mos" <[email protected]> wrote:
>>>>
>>>> > Setting Max Pending Latency doesn't force requests to be in the
>>>> > pending queue for the specified time. Please use Min Pending Latency
>>>> > instead.
>>>>
>>>> As you know, my setting for "Min Pending Latency" was automatic. The
>>>> expectation is that GAE takes a reasonable default latency if it is
>>>> "automatic".
>>>> And you say: Every parallel request starts a new instance if it is
>>>> "automatic"? That would be a "Min Pending Latency" of zero and not
>>>> "automatic".
>>>>
>>>> > If it doesn't work, try 2 min idle instances then
>>>>
>>>> Please check the responses of other users in this thread. This feature
>>>> is totally broken and cannot be used.
>>>>
>>>> >> And around the 16th of August?
>>>> > Sigh... isn't it a waste of time? What is the reason you picked that
>>>> > date?
>>>>
>>>> Did you look at my pictures from the first post of this thread?
>>>> The statistics show that on this date the instance creation goes crazy.
>>>> I double-checked it with the Pingdom reports.
>>>> Starting on this day there were even more downtimes.
>>>>
>>>> > So I'd say please try 2. If you still saw the user-facing loading
>>>> > requests, you need more resident instances to eliminate the user-facing
>>>> > loading requests.
>>>>
>>>> Again: As I wrote in my post before, that does not work. Check the
>>>> responses from Kristopher and Jeff on this thread.
>>>>
>>>> > So what is your expected behavior and actual result? Nobody in our
>>>> > team can do anything if you just keep saying "the setting that used to
>>>> > work doesn't work anymore" without trying my suggestion.
>>>> > I think my answer is clear at least for some points. 1) You'd better
>>>> > use 'min pending latency' instead of 'max pending latency' to prevent new
>>>> > instances from spinning up as much as possible. 2) If you need longer
>>>> > instance lives, set an appropriate number of min idle instances.
>>>>
>>>> As I wrote: I tried different settings. So have many other people in this
>>>> group.
>>>> Other people and I are reporting: The settings are broken!
>>>> It's very easy to reproduce. Please set up an application, send one
>>>> request per minute (or second), configure 1 or 2 or 3 min idle instances
>>>> and check what is happening. You will see that new instances are started
>>>> although resident instances are available.
>>>>
>>>> Please take it seriously and let one of the engineers check this!
>>>>
>>>> Cheers
>>>> Mos
>>>>
>>>>
>>>> On Fri, Aug 24, 2012 at 8:43 PM, Takashi Matsuo <[email protected]>
>>>> wrote:
>>>>>
>>>>>
>>>>> Hi Mos,
>>>>>
>>>>> On Sat, Aug 25, 2012 at 1:39 AM, Mos <[email protected]> wrote:
>>>>>>
>>>>>> Hello Takashi,
>>>>>>
>>>>>>
>>>>>> > Actually there were almost 8 requests in a second. So App Engine
>>>>>> > likely needed more than one instance at this particular moment.
>>>>>>
>>>>>> I thought this is why GAE has the concept of "pending-latency" (which
>>>>>> we discussed below).
>>>>>> Meaning: Incoming requests may wait up to 15 seconds before starting
>>>>>> a new instance. Therefore, when 8 requests occur in one second, that
>>>>>> should not mean that more instances need to be started. Especially if
>>>>>> there is no other traffic in this minute, as seen in my example.
>>>>>> Otherwise it would be a very bad implementation:
>>>>>> Starting a new instance means around 30s waiting time. Serving 8
>>>>>> parallel requests from one instance would result in a maximum of
>>>>>> 8 seconds for the last request (assuming that each request takes
>>>>>> around 1 second).
>>>>>> There is no reason for this concrete example to fire up more instances
>>>>>> and let requests wait more than 30 seconds until a new instance is
>>>>>> loaded.
>>>>>
>>>>>
>>>>> Did you really read my e-mail?
>>>>>
>>>>> Setting Max Pending Latency doesn't force requests to be in the pending
>>>>> queue for the specified time.
Please use Min Pending Latency instead. >>>>> Can you try this first? If it doesn't work, try 2 min idle instances >>>>> then. >>>>> >>>>>> >>>>>> >>>>>> > ... here is what you've seen in the past weeks. >>>>>> > >>>>>> >* You have been almost always set 'Automatic-2' idle instance >>>>>> > setting. >>>>>> >* More than 3 weeks ago, number of loading requests were very few. >>>>>> > * Recently you have seen more loading requests than before. >>>>>> >>>>>> That, right! To be even more concrete: At the 16. august the problems >>>>>> got significant worse. Please check especially the time area from 16. >>>>>> august >>>>>> until today. >>>>>> >>>>>> > First of all, it seems that you deployed 2 new versions on Aug 1 and >>>>>> > Aug 2. Can you describe what kind of changes in those versions? >>>>>> >>>>>> I checked it in our version control. As I wrote no related changes >>>>>> were made! Just Html/Css stuff: >>>>>> * One picture upload >>>>>> * One html change >>>>>> * One JavaScript change >>>>>> * One css change >>>>>> >>>>>> >>>>>> > And, to be fair, we didn't think of any change in our scheduler >>>>>> > around 3 weeks ago which can cause this issue. >>>>>> >>>>>> And around the 16th august? >>>>> >>>>> >>>>> Sigh... isn't it a waist of time? What is the reason you picked that >>>>> date? >>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> > More than 3 weeks before, those 2 idle instances might have had >>>>>> > longer lives than now, but it was not a concrete behavior. Please >>>>>> > think this >>>>>> > way: you were just kind of lucky. >>>>>> >>>>>> That shouldn't be luck! If GAE is not able to start Java instances in >>>>>> 5sec to 10 second, there needs be a guarantee that instances have longer >>>>>> lives. Otherwise Java applications on GAE are unusable because user >>>>>> would >>>>>> have a lot of 30seconds wait time (--> "failed requests"). 
(See also >>>>>> next >>>>>> comment regarding resistant instances) >>>>>> >>>>>> >>>>>> > If you want some instances always active, please set min idle >>>>>> > instances. >>>>>> >>>>>> I tried this some days ago. I had one resistant instance. But that >>>>>> changed nothing. Instances get started and stopped as before. I assumed >>>>>> that requests would go to the resistant instance first. But that was no >>>>>> the >>>>>> case. Resistant instance was idle, but a dynamic instance got started and >>>>>> the request waits 30sec. >>>>>> >>>>>> Please check other discussion on this list and issues that reported >>>>>> similar observations. >>>>> >>>>> >>>>> So I'd say please try 2. If you still saw the user-facing loading >>>>> requests, you need more resident instance to eliminate the user-facing >>>>> loading requests. >>>>> >>>>>> >>>>>> >>>>>> > As you can see, I'm still not convinced to believe that the >>>>>> > scheduler is misbehaving. I understand that you're having experiences >>>>>> > which >>>>>> > are bit worse than 3 weeks ago, and understand your feeling that you >>>>>> > want to >>>>>> > tell us 'fix it', but I'd say it's > >still something in the line of >>>>>> > 'expected behavior' at least for now. >>>>>> > If you feel differently, please let me know. >>>>>> >>>>>> Yes I do feel differently (please see answers above). >>>>>> >>>>>> Please accept >>>>>> http://code.google.com/p/googleappengine/issues/detail?id=8004 >>>>> >>>>> >>>>> So what is your expected behavior and actual result? Nobody in our team >>>>> can do anything if you just keep saying "the setting that used to work >>>>> doesn't work anymore" without trying mu suggestion. >>>>> >>>>> I think my answer is clear at least for some points. 1) You'd better >>>>> use 'min pending latency' instead of 'max pending latency' to prevent new >>>>> instances to spin up as much as possible. 2) If you need longer instance >>>>> lives, set appropriate number of min idle instances. 
>>>>> >>>>> -- Takashi >>>>> >>>>>> >>>>>> >>>>>> >>>>>> Thanks >>>>>> Mos >>>>>> http://www.mosbase.com >>>>>> >>>>>> >>>>>> On Fri, Aug 24, 2012 at 4:22 PM, Takashi Matsuo <[email protected]> >>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> Hi Mos, >>>>>>> >>>>>>> On Fri, Aug 24, 2012 at 6:05 PM, Mos <[email protected]> wrote: >>>>>>>> >>>>>>>> > A possible explanation could be that the traffic pattern had >>>>>>>> > changed. >>>>>>>> >>>>>>>> No. It's the same. Check for example the Request/Seconds statistics >>>>>>>> of my application for the last 30 days! >>>>>>>> >>>>>>>> >>>>>>>> >> It's very obvious that one instance should be enough for my >>>>>>>> >> application. And that was almost the case the last months! >>>>>>>> > Actually it's not true. In particular, check this log: >>>>>>>> >>>>>>>> That's one expection where one client did 8 request in a minute (+ >>>>>>>> one pingdom). Nothing else this minute. >>>>>>>> In those exceptional cases it could be ok if a second instance >>>>>>>> starts. (Nevertheless can't one instance not >>>>>>>> handle 8 requests a minute?) >>>>>>> >>>>>>> >>>>>>> The issue here is not 8 requests in a minute. Actually there were >>>>>>> almost 8 requests in a second. So App Engine likely needed more than one >>>>>>> instance at this particular moment. Anyway, as you say, probably it's >>>>>>> just a >>>>>>> reason for one of the loading requests you're seeing, and this is not >>>>>>> very >>>>>>> important thing in this topic. >>>>>>> >>>>>>> It's kind of digressing, but at a first glance, the Requests/Seconds >>>>>>> stat seems an appropriate data source to discuss how many instances are >>>>>>> actually needed, but in fact, it's not. The real traffic is not >>>>>>> spreading >>>>>>> equally. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> As I described: Instances are started and stopped without reason, >>>>>>>> even if less traffic per minute is available! >>>>>>> >>>>>>> >>>>>>> Okay. 
As far as I understand, here is what you've seen in the past >>>>>>> weeks. >>>>>>> >>>>>>> * You have been almost always set 'Automatic-2' idle instance >>>>>>> setting. >>>>>>> * More than 3 weeks ago, number of loading requests were very few. >>>>>>> * Recently you have seen more loading requests than before. >>>>>>> >>>>>>> First of all, it seems that you deployed 2 new versions on Aug 1 and >>>>>>> Aug 2. Can you describe what kind of changes in those versions? >>>>>>> I'd like to make sure that there is no changes that can cause the >>>>>>> scheduler/app server behaving differently. >>>>>>> >>>>>>> Especially, if you want me to escalate this issue to our engineering >>>>>>> team, you should provide the exact information. You say 'My application >>>>>>> is >>>>>>> unchanged', but in fact you deployed the new version on that day when >>>>>>> you >>>>>>> described the issue started. I need to make sure that there is no big >>>>>>> change >>>>>>> which can cause something bad. >>>>>>> >>>>>>> And, to be fair, we didn't think of any change in our scheduler >>>>>>> around 3 weeks ago which can cause this issue. >>>>>>> >>>>>>> Secondly, you're setting max idle instances = 2. It does not >>>>>>> guarantee that you have always 2 instances. It just guarantees that we >>>>>>> will >>>>>>> never charge you for more than 2 idle instances at any time. >>>>>>> >>>>>>> More than 3 weeks before, those 2 idle instances might have had >>>>>>> longer lives than now, but it was not a concrete behavior. Please think >>>>>>> this >>>>>>> way: you were just kind of lucky. Now, presumably one or two of those >>>>>>> instances are occasionally killed for some reasons(there should be >>>>>>> certain >>>>>>> legitimate reasons, but those are something you don't need to care). >>>>>>> >>>>>>> If you want some instances always active, please set min idle >>>>>>> instances. 
Certainly it will cost you a bit more, and you will loose the >>>>>>> pending queue, but considering the access pattern of your app(no bursty >>>>>>> traffic except for few access from the iPhone browser), I would >>>>>>> recommend >>>>>>> trying this setting in order to achieve what you want here. I'd >>>>>>> recommend 2 >>>>>>> idle instances in this case, but you should decide the number. >>>>>>> >>>>>>>> >>>>>>>> > * What is the purpose of max-pending-latency = 14.9 setting? >>>>>>>> >>>>>>>> " is high App Engine will allow requests to wait rather than start >>>>>>>> new Instances to process them" >>>>>>>> --> One attempt to stop GAE to create unnecessary instances. >>>>>>> >>>>>>> >>>>>>> I think you should set min pending latency instead of max pending >>>>>>> latency if you want to prevent new instance to spin up. However, if >>>>>>> you're >>>>>>> going to set min idle instances, this setting will almost loose effect. >>>>>>> If >>>>>>> you don't want to set any min idle instances for whatever reason, please >>>>>>> consider setting min pending latency instead of max pending latency. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> > * Can you try automatic-automatic for idle instances setting? >>>>>>>> >>>>>>>> I played around with this the last days and nothing changed. As I >>>>>>>> wrote: I had those configuration for months and it worked fine 3-4 >>>>>>>> weeks >>>>>>>> ago! >>>>>>>> >>>>>>>> >>>>>>>> > * What is the purpose of those pingdom check? What happens if you >>>>>>>> > stop that? >>>>>>>> >>>>>>>> To be alerted if GAE is down a again. "What happens if you stop >>>>>>>> that?" --> I wouldn't be angry anymore because I wouldn't recognize >>>>>>>> downtime's of my GAE application. ;) >>>>>>>> >>>>>>>> >>>>>>>> Please forward >>>>>>>> http://code.google.com/p/googleappengine/issues/detail?id=8004 to the >>>>>>>> relevant GAE deparment. >>>>>>> >>>>>>> >>>>>>> As you can see, I'm still not convinced to believe that the scheduler >>>>>>> is misbehaving. 
I understand that you're having experiences which are >>>>>>> bit >>>>>>> worse than 3 weeks ago, and understand your feeling that you want to >>>>>>> tell us >>>>>>> 'fix it', but I'd say it's still something in the line of 'expected >>>>>>> behavior' at least for now. >>>>>>> >>>>>>> If you feel differently, please let me know. >>>>>>> >>>>>>> Regards, >>>>>>> >>>>>>> -- Takashi >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks! >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Aug 24, 2012 at 1:39 AM, Takashi Matsuo <[email protected]> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Mos, >>>>>>>>> >>>>>>>>> On Thu, Aug 23, 2012 at 4:58 AM, Mos <[email protected]> wrote: >>>>>>>>>> >>>>>>>>>> Does anybody else experience abnormal behavior of the >>>>>>>>>> instance-scheduler the last three weeks (the last 7 days it got even >>>>>>>>>> worse)? >>>>>>>>>> (Java / HRD) >>>>>>>>>> Or does anybody has profound knowledge about it? >>>>>>>>>> >>>>>>>>>> Background: My application is unchanged for weeks, configuration >>>>>>>>>> not changed and application's traffic is constant. >>>>>>>>>> Traffic: One request per minute from Pingdom and around 200 >>>>>>>>>> additional pageviews the day (== around 1500 pageviews the day). The >>>>>>>>>> peek is >>>>>>>>>> not more then 3-4 request per minute. >>>>>>>>> >>>>>>>>> >>>>>>>>> A possible explanation could be that the traffic pattern had >>>>>>>>> changed. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> It's very obvious that one instance should be enough for my >>>>>>>>>> application. And that was almost the case the last months! >>>>>>>>> >>>>>>>>> >>>>>>>>> Actually it's not true. 
In particular, check this log: >>>>>>>>> >>>>>>>>> https://appengine.google.com/logs?app_id=s~krisen-talk&version_id=1-0.360912144269287698&severity_level_override=1&severity_level=3&tz=Europe%2FBerlin&filter=&filter_type=regex&date_type=datetime&date=2012-08-23&time=23%3A57%3A00&limit=20&view=Search >>>>>>>>> >>>>>>>>> You can see the iPhone client repeatedly requests your dynamic >>>>>>>>> resources in a very short amount of time. Presumably it's due to some >>>>>>>>> kind >>>>>>>>> of 'prefetch' feature of that device. Are you aware of those >>>>>>>>> accesses, and >>>>>>>>> that this access pattern can cause a new instance starting? >>>>>>>>> >>>>>>>>> I don't think this is the only reason, but this can explain that >>>>>>>>> some portion of your loading requests are expected behavior. >>>>>>>>> >>>>>>>>> Now I'd like to ask you some questions. >>>>>>>>> >>>>>>>>> >>>>>>>>> * What is the purpose of max-pending-latency = 14.9 setting? >>>>>>>>> * Can you try automatic-automatic for idle instances setting? >>>>>>>>> * What is the purpose of those pingdom check? What happens if you >>>>>>>>> stop that? >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> But now GAE creates most of the time 3 instances, whereby on has a >>>>>>>>>> long life-time for days and the other ones are restarted around >>>>>>>>>> 10 to 30 times the day. >>>>>>>>>> Because load request takes between 30s to 40s and requests are >>>>>>>>>> waiting for loading instances, there are many request that >>>>>>>>>> fail (Users and Pingdom agree: A request that takes more then a >>>>>>>>>> couple of seconds is a failed request!) >>>>>>>>>> >>>>>>>>>> Please check the attached screenshots that show the behavior! >>>>>>>>>> >>>>>>>>>> Note: >>>>>>>>>> - Killing instances manually did not help >>>>>>>>>> - Idle Instances were ( Automatic – 2 ) . Changing it to whatever >>>>>>>>>> didn't not change anything; e.g. 
like ( Automatic – 4 )
>>>>>>>>>>
>>>>>>>>>> Thanks and Cheers
>>>>>>>>>>
>>>>>>>>>> Mos
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>>> Groups "Google App Engine" group.
>>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>>> To unsubscribe from this group, send email to
>>>>>>>>>> [email protected].
>>>>>>>>>>
>>>>>>>>>> For more options, visit this group at
>>>>>>>>>> http://groups.google.com/group/google-appengine?hl=en.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Takashi Matsuo | Developers Advocate | [email protected]
--
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations
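
For reference, the arithmetic Mos uses in the thread for the 8-request burst can be written down as a small sketch. This is only a toy model, not how the App Engine scheduler actually decides: it assumes a single warm instance that serves requests strictly one after another, about 1 second per request, and about 30 seconds for a loading request (the figures quoted in the thread). All function and variable names are made up for illustration.

```python
def worst_case_queued(n_requests, service_s):
    """Completion time of the last request in a burst when one warm
    instance serves the requests one at a time (toy assumption)."""
    return n_requests * service_s


def cold_start_latency(loading_s, service_s):
    """Latency of a request that is routed to a new instance and has to
    wait for the loading request before being served (toy assumption)."""
    return loading_s + service_s


# Figures taken from the thread: 8 requests in the burst, roughly 1 s per
# request, roughly 30 s to load a new Java instance.
burst_size = 8
service_s = 1.0
loading_s = 30.0

queued_worst_case = worst_case_queued(burst_size, service_s)
cold_start = cold_start_latency(loading_s, service_s)

print(f"last request on the warm instance: {queued_worst_case:.0f} s")
print(f"request sent to a cold instance:   {cold_start:.0f} s")
```

Under these assumptions the last queued request finishes after 8 seconds, while a request sent to a freshly started instance takes about 31 seconds, which is the comparison behind the complaint. Real traffic, concurrent request handling, and the scheduler's pending-latency settings are not modelled here.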
