I don't buy this. Even if the checkout queue is 15s deep and it takes 15s to bring a cashier online, there's still no value in sending Bob to wait by the register. For one thing, it doesn't actually reduce Bob's wait time by a material amount (he can move to the register in a few milliseconds). For another thing, we're only *speculating* that the register will be open in 15s - sometimes it's 30s or 45s because the cashier is hungover and keeps dropping the keys.
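To put rough numbers on the second point, here is a minimal sketch. Only the 15s queue depth and the 15s/30s/45s startup times come from the message above. The probabilities and every name in the code are made up for illustration. It compares pinning a request to an instance that is still starting against leaving it in the existing queue and moving it only once the new instance is actually ready.

```python
# Illustrative only: the probabilities below are assumptions, not measurements.
QUEUE_WAIT_S = 15.0  # depth of the existing checkout queue, in seconds

# (startup time in seconds, assumed probability of that startup time)
STARTUP_DISTRIBUTION = [(15.0, 0.7), (30.0, 0.2), (45.0, 0.1)]


def expected_wait_pinned(distribution):
    """Expected wait when the request is sent to stand by the unopened register."""
    return sum(seconds * prob for seconds, prob in distribution)


def expected_wait_in_queue(distribution, queue_wait_s):
    """Expected wait when the request stays queued and switches only if the
    new instance becomes ready before the queue drains."""
    return sum(min(seconds, queue_wait_s) * prob for seconds, prob in distribution)


def worst_case_pinned(distribution):
    """Longest wait a pinned request can see under the assumed distribution."""
    return max(seconds for seconds, _ in distribution)


if __name__ == "__main__":
    print("pinned, expected:", expected_wait_pinned(STARTUP_DISTRIBUTION))
    print("pinned, worst case:", worst_case_pinned(STARTUP_DISTRIBUTION))
    print("queued, expected:", expected_wait_in_queue(STARTUP_DISTRIBUTION, QUEUE_WAIT_S))
```

Under these assumed probabilities the pinned request expects to wait about 21s and can wait as long as 45s, while the queued request never waits longer than the 15s queue. Different probabilities change the averages, but the queued request's wait is still capped by the queue depth.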
Sure, it's a lot easier to balance a 50-instance problem because a load that size will have a lot more inertia and be less spiky. But I still want to know what percentage of your user-facing requests are hitting cold starts, and how many idle instances you need to keep that from happening.

I have an e-commerce site. It doesn't get a lot of traffic but each request is very important - people are digging out their credit cards and buying things. It's *never* ok for a 20s pause to interrupt this process; people don't like giving money to sites that seem broken.

Jeff

On Tue, Jul 17, 2012 at 9:55 PM, Drake <[email protected]> wrote:
> Jeff,
>
> Check the archive there are several check out lane analogies that I have
> posted.
>
> I agree that the Queue is sub optimal, but it is more sub optimal the
> smaller you are. When you get to 50 instances it is amazing how well the
> load balancing works. On the climb up to peak new instances spin up on
> requests rather than causing cascading failures or dramatic spin ups. And on
> the way down instances de-utilized and end of life gracefully.
>
> Using your grocery store analogy, imagine that you are optimizing for a
> guarantee that you will be checked out with in 30 seconds of entering the
> queue. The ideal scenario is that when you get to a spot where you know you
> are 15 seconds from being checked out, and it takes 15 seconds to "open a
> new lane" you want to send users to go stand in line while the register
> opens.
>
> Your goal is to never have to pay on that guarantee, not to serve the
> highest percentage in the least time. When this is your ideal QoS the
> current load balancing does really well. It does better if it has 10
> registers and can open 2 at a time, rather than when it has 1 register and
> needs to decide if it is going to double capacity.
>
> -Brandon
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
