[openstack-dev] [Neutron] Improving dhcp agent scheduling interface
Hi folks, I'd like to raise a discussion kept in irc and in gerrit recently: https://review.openstack.org/#/c/131944/ The intention of the patch is to clean up particular scheduling method/interface: schedule_network. Let me clarify why I think it needs to be done (beside code api consistency reasons): Scheduling process is ultimately just a two steps: 1) choosing appropriate agent for the network 2) adding binding between the agent and the network To perform those two steps one doesn't need network object, network_id is satisfactory for this need. However, there is a concern, that having full dict (or full network object) could allow us to do more flexible things in step 1 like deciding, whether network should be scheduled at all. See the TODO for the reference: https://github.com/openstack/neutron/blob/master/neutron/scheduler/dhcp_agent_scheduler.py#L64 However, this just puts an unnecessary (and actually, incorrect) requirement on the caller, to provide the network dict, mainly because caller doesn't know what content of the dict the callee (scheduler driver) expects. Currently scheduler is only interested in ID, if there is another scheduling driver, it may now require additional parameters (like list of full subnet dicts) in the dict which may or may not be provided by the calling code. Instead of making assumptions about what is in the dict, it's better to go with simpler and clearer interface that will allow scheduling driver to do whatever makes sense to it. In other words: caller provides id, driver fetches everything it needs using the id. For existing scheduling drivers it's a no-op. I think l3 scheduling is an example of interface done in the more right way; to me it looks clearer and more consistent. Thanks, Eugene. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] Improving dhcp agent scheduling interface
Hi Eugene thanks for bringing this up for discussion. My comments inline. Thanks, Armando On 5 November 2014 12:07, Eugene Nikanorov enikano...@mirantis.com wrote: Hi folks, I'd like to raise a discussion kept in irc and in gerrit recently: https://review.openstack.org/#/c/131944/ The intention of the patch is to clean up particular scheduling method/interface: schedule_network. Let me clarify why I think it needs to be done (beside code api consistency reasons): Scheduling process is ultimately just a two steps: 1) choosing appropriate agent for the network 2) adding binding between the agent and the network To perform those two steps one doesn't need network object, network_id is satisfactory for this need. I would argue that it isn't, actually. You may need to know the state of the network to make that placement decision. Just passing the id may cause the scheduling logic to issue an extra DB query that can be easily avoided if the right interface between the caller of a scheduler and the scheduler itself was in place. For instance we cannot fix [1] (as you pointed out) today because the method only accepts a dict that holds just a partial representation of the network. If we had the entire DB object we would avoid that and just passing the id is going in the opposite direction IMO However, there is a concern, that having full dict (or full network object) could allow us to do more flexible things in step 1 like deciding, whether network should be scheduled at all. That's the whole point of scheduling, is it not? If you are arguing that we should split the schedule method into two separate steps (get_me_available_agent and bind_network_to_agent), and make the caller of the schedule method carry out the two step process by itself, I think it could be worth exploring that, but at this point I don't believe this is the right refactoring. See the TODO for the reference: [1] https://github.com/openstack/neutron/blob/master/neutron/scheduler/dhcp_agent_scheduler.py#L64 However, this just puts an unnecessary (and actually, incorrect) requirement on the caller, to provide the network dict, mainly because caller doesn't know what content of the dict the callee (scheduler driver) expects. Why is it incorrect? We should move away from dictionaries and passing objects so that they can be reused where it makes sense without incurring in the overhead of re-fetching the object associated to the uuid when needed. We can even hide the complexity of refreshing the copy of the object every time it is accessed if needed. With information hiding and encapsulation we can wrap this logic in one place without scattering it around everywhere in the code base, like it's done today. Currently scheduler is only interested in ID, if there is another scheduling driver, No, the scheduler needs to know about the state of the network to do proper placement. It's a side-effect of the default scheduling (i.e. random). If we want to do more intelligent placement we need the state of the network. it may now require additional parameters (like list of full subnet dicts) in the dict which may or may not be provided by the calling code. Instead of making assumptions about what is in the dict, it's better to go with simpler and clearer interface that will allow scheduling driver to do whatever makes sense to it. In other words: caller provides id, driver fetches everything it needs using the id. For existing scheduling drivers it's a no-op. Again, the problem lies with the fact that we're passing dictionaries around. I think l3 scheduling is an example of interface done in the more right way; to me it looks clearer and more consistent. I may argue that the l3 scheduling api is the bad example for the above mentioned reasons. Thanks, Eugene. At this point I am still not convinced by the arguments provided that the patch 131944 https://review.openstack.org/#/c/131944/ should go forward as it is. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] Improving dhcp agent scheduling interface
My comments inline: I would argue that it isn't, actually. You may need to know the state of the network to make that placement decision. Yes, I could agree with that - and that's just a particular scheduling implementation, not a requirement for the interface. Just passing the id may cause the scheduling logic to issue an extra DB query that can be easily avoided if the right interface between the caller of a scheduler and the scheduler itself was in place. Yes, i may cause scheduling logic to issue a query, *iff* it needs it. For instance we cannot fix [1] (as you pointed out) today because the method only accepts a dict that holds just a partial representation of the network. If we had the entire DB object we would avoid that and just passing the id is going in the opposite direction IMO And here is another issue, I think. Requiring an object is something not quite clear at this point: if scheduling needs to be aware of subnets - network object is not enough then, and that's why I think we only need to provide ids that allow scheduling logic to act on it's own. However, there is a concern, that having full dict (or full network object) could allow us to do more flexible things in step 1 like deciding, whether network should be scheduled at all. That's the whole point of scheduling, is it not? Right, and we are just arguing, who should prepare the data needed to make the scheduling decision. I just think that scheduling logic may potentially require more than just network object. In my concrete example, i want to schedule a network which my code moves from a dead agent to some alive agent. I only have a network id during that operation. I'd like to avoid issuing DB query as well - just as you. My first thought was something like: self.schedule_network(context, {'id': network_id}) - which is clearly a dirty hack! But that's what the interface is forcing me to do. Or, it forces me to fetch the network which I'd like to avoid as well. That's why I want scheduling decide, whether it needs additional data or not. https://github.com/openstack/neutron/blob/master/neutron/scheduler/dhcp_agent_scheduler.py#L64 However, this just puts an unnecessary (and actually, incorrect) requirement on the caller, to provide the network dict, mainly because caller doesn't know what content of the dict the callee (scheduler driver) expects. Why is it incorrect? We should move away from dictionaries and passing objects Passing objects is for sure much stronger api contract, however I think it leads to the same level of overhead, if not worse! For instance, will network object include the collection of its subnet objects? Will they, in turn, include ipallocations and such? If the answer is No (and my opinion that it *must* be No), then we don't win much with object approach. If the answer is yes - we're fetching way too much from the DB to create network object; it's much bigger overhead then additional db query in a scheduling driver. No, the scheduler needs to know about the state of the network to do proper placement. It's a side-effect of the default scheduling (i.e. random). If we want to do more intelligent placement we need the state of the network. That's for sure, the question is only about who prepares the data: caller or the scheduler. Thanks, Eugene. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev