Hi Tim,
this is indeed one of the main issues we are facing, and the approach
you suggested is the same one we are investigating.
Could you provide some details about Heat's proxy renewal mechanism?
Thank you very much for your feedback.
Cheers,
Lisa
On 01/07/2014 08:46, Tim Bell wrote:
Eric,
Thanks for sharing your work; it looks like an interesting development.
I was wondering how Keystone token expiry is handled, since tokens generally
have a one-day validity. If a request stays queued for more than one day, it
would no longer have a valid token by the time it is scheduled. We have
similar scenarios with Kerberos/AFS credentials in the CERN batch system.
There are some interesting proxy renewal approaches used by Heat to obtain
tokens at a later date which may be useful for this problem.
$ nova credentials
+-----------+-----------------------------------------------------------------+
| Token | Value |
+-----------+-----------------------------------------------------------------+
| expires | 2014-07-02T06:39:59Z |
| id | 1a819279121f4235a8d85c694dea5e9e |
| issued_at | 2014-07-01T06:39:59.385417 |
| tenant | {"id": "841615a3-ece9-4622-9fa0-fdc178ed34f8", "enabled": true, |
| | "description": "Personal Project for user timbell", "name": |
| | "Personal timbell"} |
+-----------+-----------------------------------------------------------------+
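For reference, the Heat approach is based on Keystone v3 trusts: the user
delegates a role to a service user, which can later consume the trust to get
a fresh token well after the original one has expired. A rough sketch with
python-keystoneclient (the auth URL, user/project IDs and role name below are
placeholders, not real values):

from keystoneclient.v3 import client as keystone_v3

# Placeholder identifiers; replace with real values.
AUTH_URL = 'http://keystone:5000/v3'
user_id = 'TRUSTOR_USER_ID'
service_user_id = 'TRUSTEE_SERVICE_USER_ID'
project_id = 'PROJECT_ID'

# 1) The end user (trustor) authenticates and delegates a role to the
#    scheduler's service user (trustee) via a trust; impersonation lets
#    the trustee act as the trustor.
user_ks = keystone_v3.Client(username='enduser', password='...',
                             project_id=project_id, auth_url=AUTH_URL)
trust = user_ks.trusts.create(trustor_user=user_id,
                              trustee_user=service_user_id,
                              project=project_id,
                              role_names=['member'],  # illustrative role
                              impersonation=True)

# 2) Later -- even after the user's original token has expired -- the
#    service user authenticates with its own credentials plus the trust
#    id and receives a fresh trust-scoped token.
svc_ks = keystone_v3.Client(username='scheduler-svc', password='...',
                            trust_id=trust.id, auth_url=AUTH_URL)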
Tim
-----Original Message-----
From: Eric Frizziero [mailto:[email protected]]
Sent: 30 June 2014 16:05
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [nova][scheduler] Proposal: FairShareScheduler.
Hi All,
we have analyzed the nova-scheduler component (FilterScheduler) in our
OpenStack installation, which is used by some scientific teams.
In our scenario, cloud resources need to be distributed among the teams by
considering the predefined share (e.g. quota) assigned to each team, the
portion of the resources currently in use, and the resources each team has
already consumed.
We have observed that:
1) User requests are processed sequentially (FIFO scheduling), i.e.
FilterScheduler doesn't provide any dynamic priority algorithm;
2) User requests that cannot be satisfied (e.g. because resources are not
available) fail and are lost, i.e. in that scenario nova-scheduler doesn't
provide any queuing of requests;
3) OpenStack simply provides a static partitioning of resources among the
various projects/teams (via quotas). If project/team 1 systematically
underutilizes its quota over a period while project/team 2 systematically
saturates its own, the only way to give more resources to project/team 2 is a
manual change to the related quotas, to be done by the admin.
The need for a more effective scheduling approach in OpenStack becomes more
and more evident as the number of user requests to be handled grows
significantly. This is a well-known problem that has already been solved by
batch systems.
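For instance, a classic batch-system fairshare factor (the one used by Slurm)
decays a team's priority exponentially in its usage-to-share ratio: each
additional multiple of the share consumed halves the factor. A simplified
illustration in Python (this is the generic batch-system formula, not
necessarily the exact algorithm of our prototype):

def fairshare_factor(normalized_usage, normalized_share):
    # normalized_usage: the team's (time-decayed) fraction of consumed
    # resources; normalized_share: the team's allocated fraction.
    return 2.0 ** (-normalized_usage / normalized_share)

# An idle team gets factor 1.0, a team that consumed exactly its share
# gets 0.5, and a team at twice its share gets 0.25.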
In order to solve those issues in our OpenStack usage scenario, we have
developed a prototype of a pluggable scheduler, named FairShareScheduler,
with the objective of extending the existing OpenStack scheduler
(FilterScheduler) by integrating a batch-like dynamic priority algorithm.
The architecture of the FairShareScheduler is explicitly designed for a high
level of scalability. Every user request is assigned a priority value
calculated from the share allocated to the user by the administrator and from
the effective resource usage consumed in the recent past. All requests are
inserted in a priority queue and processed in parallel by a configurable pool
of workers without breaking the priority order. Moreover, all significant
information (e.g. the priority queue) is stored in a persistence layer to
provide fault tolerance, while a proper logging system records all relevant
events, useful for auditing purposes.
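As a minimal sketch of this queue/worker structure (standard-library Python
only; schedule() stands in for the real filtering + weighting phase and
NUM_WORKERS for the configurable pool size):

import queue
import threading

NUM_WORKERS = 4                        # configurable pool size (illustrative)
request_queue = queue.PriorityQueue()

def schedule(request):
    # Placeholder for the real filtering + weighting phase.
    print("scheduling", request)

def worker():
    while True:
        # PriorityQueue pops the smallest item first, so priorities are
        # stored negated: the highest-priority request dequeues next.
        neg_priority, request = request_queue.get()
        try:
            schedule(request)
        finally:
            request_queue.task_done()

for _ in range(NUM_WORKERS):
    threading.Thread(target=worker, daemon=True).start()

# Enqueue a request with its fairshare priority (higher = sooner).
request_queue.put((-42, "boot request for team A"))
request_queue.join()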
In more detail, some features of the FairShareScheduler are:
a) it dynamically assigns the proper priority to every new user request;
b) the priority of the queued requests is recalculated periodically using the
fairshare algorithm. This guarantees that usage of the cloud resources is
distributed among users and groups by considering both the portion of the
cloud resources allocated to them (i.e. their share) and the resources they
have already consumed;
c) all user requests are inserted in a (persistent) priority queue and then
processed asynchronously by the dedicated process (filtering + weighting
phase) when compute resources become available;
d) from the client's point of view, queued requests remain in the
"Scheduling" state until compute resources are available. No new states are
added, which prevents any possible interaction issues with the OpenStack
clients;
e) user requests are dequeued by a configurable pool of WorkerThreads, i.e.
there is no sequential processing of the requests;
f) requests that fail the filtering + weighting phase may be re-inserted in
the queue up to n times (configurable), as sketched below.
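Building on the queue in the previous sketch, the bounded re-queueing of
failed requests could look like this, with the retry count carried alongside
each queued request (MAX_RETRIES, NoValidHost and mark_failed() are
illustrative names, not the prototype's actual ones):

MAX_RETRIES = 3          # the configurable "n-times" limit (illustrative)

def process_one():
    neg_priority, retries, request = request_queue.get()
    try:
        schedule(request)                 # filtering + weighting phase
    except NoValidHost:                   # e.g. no compute resources free
        if retries < MAX_RETRIES:
            # Re-insert the request instead of losing it.
            request_queue.put((neg_priority, retries + 1, request))
        else:
            mark_failed(request)          # give up after n attempts
    finally:
        request_queue.task_done()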
We have integrated the FairShareScheduler in our OpenStack installation
(Havana release). We are now working to adapt the FairShareScheduler to the
new Icehouse release.
Does anyone have experience with the issues we found in our cloud scenario?
Could the FairShareScheduler be useful for the OpenStack community?
If so, we'll be happy to share our work.
Any feedback or comment is welcome!
Cheers,
Eric.
_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev