I've been thinking about this use case for a DHT-like design. I think I
want to do what others have alluded to here and try to intercept
problematic requests like this one in some sort of "pre-send to
ring-segment" stage. In that case the pre-stage could decide to hand the
request off to a scheduler that has a more complete view of the world.
Alternatively, instead of making a single request for 50 instances, just
send 50 requests for one instance each? Is that a viable approach for this
use case?
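
A minimal sketch of that alternative (hypothetical names, not Nova's actual
API): fan a request for N instances out into N single-instance requests in
a "pre-stage" before they are hashed onto the ring, so that no single ring
segment has to hold capacity for the whole batch.

    import hashlib
    import uuid


    def ring_segment(request_id, num_segments):
        # Hash the request id onto one of the ring segments.
        digest = hashlib.sha1(request_id.encode("utf-8")).hexdigest()
        return int(digest, 16) % num_segments


    def pre_dispatch(boot_request, num_segments):
        # Fan a request for N instances out into N requests for one
        # instance, each landing on its own segment.
        for _ in range(boot_request["num_instances"]):
            single = dict(boot_request, num_instances=1,
                          request_id=str(uuid.uuid4()))
            yield ring_segment(single["request_id"], num_segments), single


    if __name__ == "__main__":
        req = {"num_instances": 50, "flavor": "m1.small"}
        for segment, r in pre_dispatch(req, num_segments=8):
            print(segment, r["request_id"])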

-Mike


On Tue, Nov 19, 2013 at 7:03 PM, Joshua Harlow <harlo...@yahoo-inc.com> wrote:

> At Yahoo, at least, 50+ simultaneous instances will be the common case
> (maybe we are special).
>
> Think of what happens on www.yahoo.com during, say, the Olympics:
> news.yahoo.com could need 50+ instances very quickly (especially if a gold
> medal is won by some famous person). So I wouldn't discount that being the
> common case (it may not be common for some, but it is common for others).
> In fact, any website with spiky, unpredictable traffic will have the same
> desire, so it might be a target use case for website-like companies (or
> ones that can't predict spikes up front).
>
> Overall, though, I think what you said about 'don't fill it up' is good
> general knowledge. Filling things up beyond a certain threshold is
> dangerous in general (one should only push the limits so far before
> madness).
>
> On 11/19/13 4:08 PM, "Clint Byrum" <cl...@fewbar.com> wrote:
>
> >Excerpts from Chris Friesen's message of 2013-11-19 12:18:16 -0800:
> >> On 11/19/2013 01:51 PM, Clint Byrum wrote:
> >> > Excerpts from Chris Friesen's message of 2013-11-19 11:37:02 -0800:
> >> >> On 11/19/2013 12:35 PM, Clint Byrum wrote:
> >> >>
> >> >>> Each scheduler process can own a different set of resources. If they
> >> >>> each grab instance requests in a round-robin fashion, then they will
> >> >>> fill their resources up in a relatively well balanced way until one
> >> >>> scheduler's resources are exhausted. At that time it should bow out of
> >> >>> taking new instances. If it can't fit a request in, it should kick the
> >> >>> request out for retry on another scheduler.
> >> >>>
> >> >>> In this way, they only need to be in sync in that they need a way to
> >> >>> agree on who owns which resources. A distributed hash table that gets
> >> >>> refreshed whenever schedulers come and go would be fine for that.
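
As a concrete sketch of the "agree on who owns which resources" part
(hypothetical code, not an existing Nova module): a small hash ring that
maps compute hosts to scheduler processes and is simply rebuilt whenever
the set of live schedulers changes.

    import bisect
    import hashlib


    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


    class SchedulerRing(object):
        def __init__(self, schedulers, replicas=64):
            # Place each scheduler on the ring several times for balance.
            self.ring = sorted((_hash("%s-%d" % (s, i)), s)
                               for s in schedulers
                               for i in range(replicas))
            self.keys = [h for h, _ in self.ring]

        def owner(self, compute_host):
            # Walk clockwise to the first scheduler at or after the
            # host's position on the ring.
            idx = bisect.bisect(self.keys, _hash(compute_host)) % len(self.ring)
            return self.ring[idx][1]


    ring = SchedulerRing(["sched-1", "sched-2", "sched-3"])
    print(ring.owner("compute-042"))  # which scheduler owns this host

Because the remaining schedulers keep their positions on the ring, a
membership change only moves the hosts owned by the scheduler that came or
went, which keeps the refresh step cheap.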
> >> >>
> >> >> That has some potential, but at high occupancy you could end up
> >> >> refusing to schedule something because no one scheduler has
> >> >> sufficient resources even if the cluster as a whole does.
> >> >>
> >> >
> >> > I'm not sure what you mean here. What resource spans multiple compute
> >> > hosts?
> >>
> >> Imagine the cluster is running close to full occupancy and each
> >> scheduler has room for 40 more instances. Now I come along and issue a
> >> single request to boot 50 instances. The cluster has room for that, but
> >> none of the schedulers do.
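
To put toy numbers on that (assuming, say, five schedulers purely for
illustration):

    # Five schedulers with 40 free slots each (200 total) still cannot
    # place a single request for 50 on any one of them.
    free_per_scheduler = [40, 40, 40, 40, 40]
    request = 50

    print(sum(free_per_scheduler) >= request)             # True: the cluster fits it
    print(any(f >= request for f in free_per_scheduler))  # False: no single owner does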
> >>
> >
> >You're assuming that all 50 come in at once. That is only one use case
> >and not at all the most common.
> >
> >> >> This gets worse once you start factoring in things like heat and
> >> >> instance groups that will want to schedule whole sets of resources
> >> >> (instances, IP addresses, network links, cinder volumes, etc.) at
> >> >> once with constraints on where they can be placed relative to each
> >> >> other.
> >>
> >> > Actually that is rather simple. Such requests have to be serialized
> >> > into a work-flow. So if you say "give me 2 instances in 2 different
> >> > locations" then you allocate 1 instance, and then another one with
> >> > 'not_in_location(1)' as a condition.
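
A minimal sketch of that serialized work-flow (hypothetical helpers, not
Nova code): allocate the first instance, then allocate the second with a
"not in the first one's location" condition.

    def pick_host(hosts, excluded_locations=()):
        # Stand-in for one scheduling step: first host whose location
        # is not excluded by an earlier allocation.
        for host in hosts:
            if host["location"] not in excluded_locations:
                return host
        raise RuntimeError("no host satisfies the constraint")


    hosts = [{"name": "c1", "location": "rack-a"},
             {"name": "c2", "location": "rack-a"},
             {"name": "c3", "location": "rack-b"}]

    first = pick_host(hosts)
    second = pick_host(hosts, excluded_locations={first["location"]})
    print(first["name"], second["name"])  # two instances, two different locations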
> >>
> >> Actually, you don't want to serialize it; you want to hand the whole
> >> set of resource requests and constraints to the scheduler all at once.
> >>
> >> If you do them one at a time, then early decisions made with
> >> less-than-complete knowledge can result in later scheduling requests
> >> failing due to being unable to meet constraints, even if there are
> >> actually sufficient resources in the cluster.
> >>
> >> The "VM ensembles" document at
> >> https://docs.google.com/document/d/1bAMtkaIFn4ZSMqqsXjs_riXofuRvApa--qo4UTwsmhw/edit?pli=1
> >> has a good example of how one-at-a-time scheduling can cause spurious
> >> failures.
> >>
> >> And if you're handing the whole set of requests to a scheduler all at
> >> once, then you want the scheduler to have access to as many resources
> >> as possible so that it has the highest likelihood of being able to
> >> satisfy the request given the constraints.
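
A toy illustration (made-up numbers) of how one-at-a-time placement can
strand a later request even though a placement for the whole set exists:

    hosts = {"host-a": 4, "host-b": 2}   # free RAM per host, in GB
    requests = [2, 4]                    # instances to place, in GB

    # One-at-a-time, "most free RAM first" greedy placement.
    free = dict(hosts)
    placed = []
    for ram in requests:
        best = max(free, key=free.get)
        if free[best] >= ram:
            free[best] -= ram
            placed.append((ram, best))
        else:
            placed.append((ram, None))   # stranded: nothing left that fits
    print(placed)                        # the 4 GB instance ends up unplaced

    # Whole-set placement would put the 4 GB instance on host-a and the
    # 2 GB instance on host-b, and both would fit.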
> >
> >This use case is real and valid, which is why I think there is room for
> >multiple approaches. For instance, the situation you describe can also be
> >dealt with by just having the cloud stay under-utilized and accepting
> >that once you go over a certain percentage utilization, spurious failures
> >will happen. We have a similar solution in the ext3 filesystem on Linux:
> >don't fill it up, or suffer a huge performance penalty.
> >
