On 19/05/2017 12:19, John Garbutt wrote:
> On 19 May 2017 at 10:03, Sylvain Bauza <sba...@redhat.com> wrote:
>>
>> On 19/05/2017 10:02, Sylvain Bauza wrote:
>>>
>>> On 19/05/2017 02:55, Matt Riedemann wrote:
>>>> The etherpad for this session is here [1]. The goal for this
>>>> session was to inform operators and get feedback on the plan for
>>>> what we're doing with moving claims from the computes to the
>>>> control layer (scheduler or conductor).
>>>>
>>>> We mostly talked about retries, which also came up in the cells v2
>>>> session that Dan Smith led [2] and will recap later.
>>>>
>>>> Without getting into too many details, in the cells v2 session we
>>>> came to a compromise on build retries and said that we could pass
>>>> hosts down to the cell so that the cell-level conductor could
>>>> retry if needed (even though we expect doing claims at the top
>>>> will fix the majority of reasons you'd have a reschedule in the
>>>> first place).
>>>>
>>> And during that session, we said that because cell-local conductors
>>> (when there is a reschedule) can't upcall the global (for all
>>> cells) scheduler, we agreed to have the conductor call the
>>> Placement API for allocations.
>>>
>>>> During the claims in the scheduler session, a new wrinkle came up,
>>>> which is that the hosts the scheduler returns to the top-level
>>>> conductor may be in different cells. So if we have two cells, A
>>>> and B, with hosts x and y in cell A and host z in cell B, we can't
>>>> send z to A for retries, or x or y to B for retries. So we need
>>>> some kind of post-filter/weigher filtering such that hosts are
>>>> grouped by cell and can then be sent to the cells for retries as
>>>> necessary.
>>>>
>>> That's already proposed for review in
>>> https://review.openstack.org/#/c/465175/
>>>
>>>> There was also some side discussion asking if we somehow regressed
>>>> pack-first strategies by using Placement in Ocata. John Garbutt
>>>> and Dan Smith have the context on this (I think) so I'm hoping
>>>> they can clarify whether we really need to fix something in Ocata
>>>> at this point, or whether this is more a case of closing a
>>>> loophole.
>>>>
>>> The problem is that the scheduler doesn't verify the cells when
>>> trying to find a destination for an instance; it just uses weights
>>> for packing.
>>>
>>> So, for example, say I have N hosts and 2 cells: the best-weighted
>>> host could be in cell1 while the second-best could be in cell2.
>>> Then, even if the operator uses the weighers for packing, a
>>> RequestSpec with num_instances=2 could push one instance into
>>> cell1 and the other into cell2.
>>>
>>> From a scheduler point of view, I think we could add a CellWeigher
>>> that would help pack instances within the same cell. Anyway, that's
>>> not related to the claims series, so we could hopefully backport it
>>> to Ocata.
>>>
>> Melanie actually made a good point about the current logic based on
>> the `host_subset_size` config option. If you leave it at its default
>> of 1, in theory every instance coming through the scheduler gets the
>> same weight-sorted list of hosts and picks the first one (i.e. all
>> instances are packed onto the same host), which is good for packing
>> (except of course for a user request that fills all the space on
>> that host, where spreading across multiple hosts by shuffling would
>> be better).
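To make the per-cell grouping Matt describes above a bit more
concrete, here is a minimal standalone sketch. It is not the actual
code proposed in https://review.openstack.org/#/c/465175/, and the
(host, cell) pairs are invented for the example:

    import collections

    # Hypothetical (host, cell) pairs as the scheduler might return
    # them, already sorted best-weight-first across all cells.
    selected = [('x', 'cellA'), ('z', 'cellB'), ('y', 'cellA')]

    # Group hosts by cell, preserving the weight order within each
    # cell, so each cell conductor only ever sees same-cell alternates.
    by_cell = collections.defaultdict(list)
    for host, cell in selected:
        by_cell[cell].append(host)

    # by_cell == {'cellA': ['x', 'y'], 'cellB': ['z']}
    # Send ['x', 'y'] down to cell A and ['z'] down to cell B, so a
    # reschedule inside a cell can never pick a host from another cell.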
>>
>> So, while I began deprecating that option because I thought the race
>> condition would be fixed by conductor claims, I think we should keep
>> it for the time being, until we clearly identify whether it's still
>> necessary.
>>
>> Everything I said earlier still stands, though. In a world where two
>> hosts come back as the best weighted, we could send instances from
>> the same user request to different cells, but that reduces the
>> problem to a multi-instance boot problem, which is far less
>> impactful.
>
> FWIW, I think we need to keep this.
>
> If you have *lots* of contention when picking your host, increasing
> host_subset_size should help reduce that contention (and maybe help
> increase the throughput). I haven't written a simulator to test it
> out, but it feels like we will still need to keep the fuzzy select.
> That might just be a different way to say the same thing Mel was
> saying, not sure.
>
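As an illustration of the fuzzy select John mentions, here is a
minimal sketch of the subset-based choice that host_subset_size
drives. This is simplified (the host names and weights are invented),
not nova's actual filter scheduler code:

    import random

    # Hosts sorted best-weight-first, as the weighers would emit them.
    weighed_hosts = ['host1', 'host2', 'host3', 'host4']

    # host_subset_size=1 always picks the top host: perfect packing,
    # but every concurrent request races for the same host.
    # host_subset_size=3 picks randomly among the top three, trading a
    # little packing for a lot less contention.
    host_subset_size = 3

    subset = weighed_hosts[:max(1, host_subset_size)]
    chosen = random.choice(subset)

With lots of concurrent builds, each request still lands on a
well-weighted host, but they no longer all collide on the first one.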
Yup, agreed, thanks to Mel. That's why I'm providing a new revision
that no longer removes this conf opt. Melanie, very good point!

-Sylvain

> Thanks,
> johnthetubaguy

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev