 Vish,

 I don't think I fully understand your statement. Unless we use different hostnames, (hostname, hypervisor_hostname) must be the same for all bare-metal nodes under a bare-metal nova-compute.
 Could you elaborate on the following statement a little bit more?

> You would just have to use a little more than hostname. Perhaps
> (hostname, hypervisor_hostname) could be used to update the entry?

 Thanks,
 David

----- Original Message -----
> I would investigate changing the capabilities to key off of something
> other than hostname. It looks from the table structure like
> compute_nodes could have a many-to-one relationship with services.
> You would just have to use a little more than hostname. Perhaps
> (hostname, hypervisor_hostname) could be used to update the entry?
>
> Vish
>
> On Aug 24, 2012, at 11:23 AM, David Kang <dk...@isi.edu> wrote:
>
> >
> > Vish,
> >
> > I've tested your code and done some more testing.
> > There are a couple of problems.
> > 1. The host name must be unique. If it is not, repeated updates of new
> >    capabilities under the same host name simply overwrite one another.
> > 2. We cannot generate arbitrary host names on the fly.
> >    The scheduler (I tested the filter scheduler) gets host names from the db,
> >    so if a host name is not in the 'services' table, it is not considered
> >    by the scheduler at all.
> >
> > So, to make your suggestion possible, nova-compute would need to register
> > N different host names in the 'services' table,
> > and N corresponding entries in the 'compute_nodes' table.
> > Here is an example:
> >
> > mysql> select id, host, binary, topic, report_count, disabled, availability_zone from services;
> > +----+-------------+----------------+-----------+--------------+----------+-------------------+
> > | id | host        | binary         | topic     | report_count | disabled | availability_zone |
> > +----+-------------+----------------+-----------+--------------+----------+-------------------+
> > |  1 | bespin101   | nova-scheduler | scheduler |        17145 |        0 | nova              |
> > |  2 | bespin101   | nova-network   | network   |        16819 |        0 | nova              |
> > |  3 | bespin101-0 | nova-compute   | compute   |        16405 |        0 | nova              |
> > |  4 | bespin101-1 | nova-compute   | compute   |            1 |        0 | nova              |
> > +----+-------------+----------------+-----------+--------------+----------+-------------------+
> >
> > mysql> select id, service_id, hypervisor_hostname from compute_nodes;
> > +----+------------+------------------------+
> > | id | service_id | hypervisor_hostname    |
> > +----+------------+------------------------+
> > |  1 |          3 | bespin101.east.isi.edu |
> > |  2 |          4 | bespin101.east.isi.edu |
> > +----+------------+------------------------+
> >
> > Then, the nova db (compute_nodes table) has entries for all bare-metal
> > nodes.
> > What do you think of this approach?
> > Do you have a better approach?
> >
> > Thanks,
> > David
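For illustration, a rough sketch (not actual nova code) of the per-node registration described above: one 'nova-compute' service row plus one compute_nodes row per bare-metal node. It assumes db helpers shaped like nova.db.api's service_create()/compute_node_create(); the per-node dict keys ('address', 'cpus', ...) are invented for the example.

from nova import db

def register_baremetal_nodes(context, compute_host, baremetal_nodes):
    """Create one service/compute_node pair per bare-metal node."""
    for i, node in enumerate(baremetal_nodes):
        # Synthesized service host names, e.g. 'bespin101-0', 'bespin101-1', ...
        service = db.service_create(context,
                                    {'host': '%s-%d' % (compute_host, i),
                                     'binary': 'nova-compute',
                                     'topic': 'compute',
                                     'report_count': 0})
        # A real compute_nodes row needs more columns (hypervisor_type,
        # cpu_info, *_used, ...); only the scheduling-relevant ones are shown.
        db.compute_node_create(context,
                               {'service_id': service['id'],
                                'hypervisor_hostname': node['address'],
                                'vcpus': node['cpus'],
                                'memory_mb': node['memory_mb'],
                                'local_gb': node['local_gb']})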
> >
> > ----- Original Message -----
> >> To elaborate, something like the below. I'm not absolutely sure you
> >> need to be able to set service_name and host, but this gives you the
> >> option to do so if needed.
> >>
> >> diff --git a/nova/manager.py b/nova/manager.py
> >> index c6711aa..c0f4669 100644
> >> --- a/nova/manager.py
> >> +++ b/nova/manager.py
> >> @@ -217,6 +217,8 @@ class SchedulerDependentManager(Manager):
> >>
> >>      def update_service_capabilities(self, capabilities):
> >>          """Remember these capabilities to send on next periodic update."""
> >> +        if not isinstance(capabilities, list):
> >> +            capabilities = [capabilities]
> >>          self.last_capabilities = capabilities
> >>
> >>      @periodic_task
> >> @@ -224,5 +226,8 @@
> >>          """Pass data back to the scheduler at a periodic interval."""
> >>          if self.last_capabilities:
> >>              LOG.debug(_('Notifying Schedulers of capabilities ...'))
> >> -            self.scheduler_rpcapi.update_service_capabilities(context,
> >> -                    self.service_name, self.host, self.last_capabilities)
> >> +            for capability_item in self.last_capabilities:
> >> +                name = capability_item.get('service_name', self.service_name)
> >> +                host = capability_item.get('host', self.host)
> >> +                self.scheduler_rpcapi.update_service_capabilities(context,
> >> +                        name, host, capability_item)
> >>
> >> On Aug 21, 2012, at 1:28 PM, David Kang <dk...@isi.edu> wrote:
> >>
> >>>
> >>> Hi Vish,
> >>>
> >>> We are trying to change our code according to your comment.
> >>> I want to ask a question.
> >>>
> >>>>>> a) modify driver.get_host_stats to be able to return a list of host
> >>>>>> stats instead of just one. Report the whole list back to the
> >>>>>> scheduler. We could modify the receiving end to accept a list as well
> >>>>>> or just make multiple calls to
> >>>>>> self.update_service_capabilities(capabilities)
> >>>
> >>> Modifying driver.get_host_stats to return a list of host stats is easy.
> >>> Making multiple calls to self.update_service_capabilities(capabilities)
> >>> doesn't seem to work, because 'capabilities' is overwritten each time.
> >>>
> >>> Modifying the receiving end to accept a list seems easy.
> >>> However, since 'capabilities' is assumed to be a dictionary by all the
> >>> other scheduler routines, it looks like we would have to change all of
> >>> them to handle 'capabilities' as a list of dictionaries.
> >>>
> >>> If my understanding is correct, this would affect many parts of the
> >>> scheduler.
> >>> Is that what you recommended?
> >>>
> >>> Thanks,
> >>> David
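As a sketch of how the patch above might be used (the class, its node-dict keys, and the 'baremetal' hypervisor_type string are illustrative assumptions, not the real bare-metal driver), driver.get_host_stats() could return one capability dict per physical node, each carrying its own 'host' key for the patched periodic task to pick up:

class BareMetalDriverSketch(object):
    """Sketch only -- not the real nova.virt.baremetal driver."""

    def __init__(self, nodes):
        self.nodes = nodes    # list of dicts describing the physical nodes

    def get_host_stats(self, refresh=False):
        """Return one capability dict per bare-metal node."""
        caps = []
        for node in self.nodes:
            caps.append({
                'host': node['service_host'],            # e.g. 'bespin101-0'
                'hypervisor_hostname': node['address'],  # e.g. 'bespin101.east.isi.edu'
                'hypervisor_type': 'baremetal',
                'vcpus': node['cpus'],
                'memory_mb': node['memory_mb'],
                'local_gb': node['local_gb'],
            })
        return caps

# The compute manager would then report the whole list at once, e.g.
#     self.update_service_capabilities(self.driver.get_host_stats(refresh=True))
# and the patched _publish_service_capabilities() publishes each entry under
# its own host name instead of overwriting a single capabilities dict.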
> >>>
> >>> ----- Original Message -----
> >>>> This was an immediate goal: the bare-metal nova-compute node could
> >>>> keep an internal database, but report capabilities through nova in the
> >>>> common way with the changes below. Then the scheduler wouldn't need
> >>>> access to the bare-metal database at all.
> >>>>
> >>>> On Aug 15, 2012, at 4:23 PM, David Kang <dk...@isi.edu> wrote:
> >>>>
> >>>>>
> >>>>> Hi Vish,
> >>>>>
> >>>>> Is this discussion about a long-term goal or about this Folsom release?
> >>>>>
> >>>>> We still believe that the bare-metal database is needed,
> >>>>> because there is no automated way for bare-metal nodes to report
> >>>>> their capabilities to their bare-metal nova-compute node.
> >>>>>
> >>>>> Thanks,
> >>>>> David
> >>>>>
> >>>>>>
> >>>>>> I am interested in finding a solution that enables bare-metal and
> >>>>>> virtualized requests to be serviced through the same scheduler, where
> >>>>>> the compute_nodes table has a full view of schedulable resources. This
> >>>>>> would seem to simplify the end-to-end flow while opening up some
> >>>>>> additional use cases (e.g. dynamic allocation of a node from
> >>>>>> bare-metal to hypervisor and back).
> >>>>>>
> >>>>>> One approach would be to have a proxy running a single nova-compute
> >>>>>> daemon fronting the bare-metal nodes. That nova-compute daemon would
> >>>>>> report up many HostState objects (one per bare-metal node) to become
> >>>>>> entries in the compute_nodes table and accessible through the
> >>>>>> scheduler HostManager object.
> >>>>>>
> >>>>>> The HostState object would set cpu_info, vcpus, memory_mb and local_gb
> >>>>>> values to be used for scheduling, with the hypervisor_host field
> >>>>>> holding the bare-metal machine address (e.g. for IPMI-based commands)
> >>>>>> and hypervisor_type = NONE. The bare-metal Flavors are created with an
> >>>>>> extra_spec of hypervisor_type = NONE, and the corresponding
> >>>>>> compute_capabilities_filter would reduce the available hosts to those
> >>>>>> bare-metal nodes. The scheduler would need to understand that
> >>>>>> hypervisor_type = NONE means you need an exact-fit (or best-fit) host
> >>>>>> rather than weighting them (perhaps through the multi-scheduler). The
> >>>>>> scheduler would cast out the message to <topic>.<service-hostname>
> >>>>>> (the code today uses the HostState hostname), with the compute driver
> >>>>>> having to understand if it must be serviced elsewhere (but this does
> >>>>>> not break any existing implementations since it is 1 to 1).
> >>>>>>
> >>>>>> Does this solution seem workable? Anything I missed?
> >>>>>>
> >>>>>> The bare-metal driver is already proxying for the other nodes, so it
> >>>>>> sounds like we need a couple of things to make this happen:
> >>>>>>
> >>>>>> a) modify driver.get_host_stats to be able to return a list of host
> >>>>>> stats instead of just one. Report the whole list back to the
> >>>>>> scheduler. We could modify the receiving end to accept a list as well,
> >>>>>> or just make multiple calls to
> >>>>>> self.update_service_capabilities(capabilities)
> >>>>>>
> >>>>>> b) make a few minor changes to the scheduler to make sure filtering
> >>>>>> still works. Note the changes here may be very helpful:
> >>>>>>
> >>>>>> https://review.openstack.org/10327
> >>>>>>
> >>>>>> c) we have to make sure that instances launched on those nodes take up
> >>>>>> the entire host state somehow. We could probably do this by making
> >>>>>> sure that the instance_type ram, mb, gb etc. matches what the node
> >>>>>> has, but we may want a new boolean field "used" if those aren't
> >>>>>> sufficient.
> >>>>>>
> >>>>>> This approach seems pretty good. We could potentially get rid of the
> >>>>>> shared bare_metal_node table. I guess the only other concern is how
> >>>>>> you populate the capabilities that the bare-metal nodes are reporting.
> >>>>>> Perhaps an api extension that rpcs to a baremetal node to add the
> >>>>>> node. Maybe someday this could be autogenerated by the bare-metal host
> >>>>>> looking in its arp table for dhcp requests! :)
> >>>>>>
> >>>>>> Vish
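As a minimal sketch of the exact-fit behavior described above (not an actual nova scheduler filter; the host_state and instance_type field names are assumptions), a bare-metal host with hypervisor_type = NONE would only pass if it is unused and matches the flavor exactly:

# Sketch of an exact-fit check for bare-metal (hypervisor_type = NONE) hosts.
# Field names on host_state and instance_type are assumed, not real
# HostState attributes.
def baremetal_host_passes(host_state, instance_type):
    if host_state.hypervisor_type != 'NONE':
        return True                     # normal hypervisors are weighted as usual
    if host_state.num_instances > 0:    # node already consumed whole (point c)
        return False
    # Exact fit: the flavor must match the node's resources.
    return (host_state.vcpus_total == instance_type['vcpus'] and
            host_state.total_usable_ram_mb == instance_type['memory_mb'] and
            host_state.total_usable_disk_gb == instance_type['root_gb'])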