Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-22 Thread Sylvain Bauza



On 22/04/2016 16:14, Matt Riedemann wrote:

On 4/22/2016 2:48 AM, Sylvain Bauza wrote:



On 22/04/2016 02:49, Jay Pipes wrote:

On 04/20/2016 06:40 PM, Matt Riedemann wrote:
Note that I think the only time Nova gets details about ports in 
the API

during a server create request is when doing the network request
validation, and that's only if there is a fixed IP address or specific
port(s) in the request, otherwise Nova just gets the networks. [1]

[1]
https://github.com/openstack/nova/blob/ee7a01982611cdf8012a308fa49722146c51497f/nova/network/neutronv2/api.py#L1123 





Actually, nova.network.neutronv2.api.API.allocate_for_instance() is
*never* called by the Compute API service (though, strangely,
deallocate_for_instance() *is* called by the Compute API service).

allocate_for_instance() is *only* ever called in the nova-compute
service:

https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/compute/manager.py#L1388 




I was actually on a hangout today with Carl, Miguel and Dan Smith
talking about just this particular section of code with regards to
routed networks IPAM handling.

What I believe we'd like to do is move to a model where we call out to
Neutron here in the conductor:

https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/conductor/manager.py#L397 




and ask Neutron to give us as much information about available subnet
allocation pools and segment IDs as it can *before* we end up calling
the scheduler here:

https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/conductor/manager.py#L415 




Not only will the segment IDs allow us to more properly use network
affinity in placement decisions, but doing this kind of "probing" for
network information in the conductor is inherently more scalable than
doing this all in allocate_for_instance() on the compute node while
holding the giant COMPUTE_NODE_SEMAPHORE lock.


I totally agree with that plan. I never replied to Ajo's point (thanks
Matt for doing that) but I was struggling to figure out an allocation
call in the Compute API service. Thanks Jay for clarifying this.

Funny, we do *deallocate* if an exception is raised when trying to find
a destination in the conductor, but since the port is not allocated yet,
I guess it's a no-op at the moment.

https://github.com/openstack/nova/blob/d57a4e8be9147bd79be12d3f5adccc9289a375b6/nova/conductor/manager.py#L423-L424 



Is this here for rebuilds where we set up networks on a compute node 
but something else failed, maybe setting up block devices? Although we 
have a lot of checks in the build flow in the compute manager for 
deallocating the network on failure.
Yeah, after git blaming, the reason is given in the commit message: 
https://review.openstack.org/#/c/243477/


Fair enough, I just think it's another good reason to discuss where and 
when we should allocate and deallocate networks, because I'm not super 
comfortable with the above. Another option would be to track whether a 
port has already been allocated for a specific instance and skip the 
deallocation if it hasn't happened yet, instead of just doing what was 
done there: 
https://review.openstack.org/#/c/269462/1/nova/conductor/manager.py ?


-Sylvain





Clarifying the above and making the conductor responsible for placing
calls to Neutron is something I'd love to see before moving further with
the routed networks and the QoS specs, and yes doing that in the
conductor seems to me the best fit.

-Sylvain




Best,
-jay



Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-22 Thread Matt Riedemann

On 4/22/2016 2:48 AM, Sylvain Bauza wrote:



On 22/04/2016 02:49, Jay Pipes wrote:

On 04/20/2016 06:40 PM, Matt Riedemann wrote:

Note that I think the only time Nova gets details about ports in the API
during a server create request is when doing the network request
validation, and that's only if there is a fixed IP address or specific
port(s) in the request, otherwise Nova just gets the networks. [1]

[1]
https://github.com/openstack/nova/blob/ee7a01982611cdf8012a308fa49722146c51497f/nova/network/neutronv2/api.py#L1123



Actually, nova.network.neutronv2.api.API.allocate_for_instance() is
*never* called by the Compute API service (though, strangely,
deallocate_for_instance() *is* called by the Compute API service).

allocate_for_instance() is *only* ever called in the nova-compute
service:

https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/compute/manager.py#L1388


I was actually on a hangout today with Carl, Miguel and Dan Smith
talking about just this particular section of code with regards to
routed networks IPAM handling.

What I believe we'd like to do is move to a model where we call out to
Neutron here in the conductor:

https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/conductor/manager.py#L397


and ask Neutron to give us as much information about available subnet
allocation pools and segment IDs as it can *before* we end up calling
the scheduler here:

https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/conductor/manager.py#L415


Not only will the segment IDs allow us to more properly use network
affinity in placement decisions, but doing this kind of "probing" for
network information in the conductor is inherently more scalable than
doing this all in allocate_for_instance() on the compute node while
holding the giant COMPUTE_NODE_SEMAPHORE lock.


I totally agree with that plan. I never replied to Ajo's point (thanks
Matt for doing that) but I was struggling to figure out an allocation
call in the Compute API service. Thanks Jay for clarifying this.

Funny, we do *deallocate* if an exception is raised when trying to find
a destination in the conductor, but since the port is not allocated yet,
I guess it's a no-op at the moment.

https://github.com/openstack/nova/blob/d57a4e8be9147bd79be12d3f5adccc9289a375b6/nova/conductor/manager.py#L423-L424


Is this here for rebuilds where we set up networks on a compute node but 
something else failed, maybe setting up block devices? Although we have 
a lot of checks in the build flow in the compute manager for 
deallocating the network on failure.






Clarifying the above and making the conductor responsible for placing
calls to Neutron is something I'd love to see before moving further with
the routed networks and the QoS specs, and yes doing that in the
conductor seems to me the best fit.

-Sylvain




Best,
-jay





--

Thanks,

Matt Riedemann




Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-22 Thread Sylvain Bauza



On 22/04/2016 02:49, Jay Pipes wrote:

On 04/20/2016 06:40 PM, Matt Riedemann wrote:

Note that I think the only time Nova gets details about ports in the API
during a server create request is when doing the network request
validation, and that's only if there is a fixed IP address or specific
port(s) in the request, otherwise Nova just gets the networks. [1]

[1]
https://github.com/openstack/nova/blob/ee7a01982611cdf8012a308fa49722146c51497f/nova/network/neutronv2/api.py#L1123 



Actually, nova.network.neutronv2.api.API.allocate_for_instance() is 
*never* called by the Compute API service (though, strangely, 
deallocate_for_instance() *is* called by the Compute API service).


allocate_for_instance() is *only* ever called in the nova-compute 
service:


https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/compute/manager.py#L1388 



I was actually on a hangout today with Carl, Miguel and Dan Smith 
talking about just this particular section of code with regards to 
routed networks IPAM handling.


What I believe we'd like to do is move to a model where we call out to 
Neutron here in the conductor:


https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/conductor/manager.py#L397 



and ask Neutron to give us as much information about available subnet 
allocation pools and segment IDs as it can *before* we end up calling 
the scheduler here:


https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/conductor/manager.py#L415 



Not only will the segment IDs allow us to more properly use network 
affinity in placement decisions, but doing this kind of "probing" for 
network information in the conductor is inherently more scalable than 
doing this all in allocate_for_instance() on the compute node while 
holding the giant COMPUTE_NODE_SEMAPHORE lock.


I totally agree with that plan. I never replied to Ajo's point (thanks 
Matt for doing that) but I was struggling to figure out an allocation 
call in the Compute API service. Thanks Jay for clarifying this.


Funny, we do *deallocate* if an exception is raised when trying to find 
a destination in the conductor, but since the port is not allocated yet, 
I guess it's a no-op at the moment.


https://github.com/openstack/nova/blob/d57a4e8be9147bd79be12d3f5adccc9289a375b6/nova/conductor/manager.py#L423-L424


Clarifying the above and making the conductor responsible for placing 
calls to Neutron is something I'd love to see before moving further with 
the routed networks and the QoS specs, and yes doing that in the 
conductor seems to me the best fit.


-Sylvain




Best,
-jay



Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-21 Thread Jay Pipes

On 04/20/2016 06:40 PM, Matt Riedemann wrote:

Note that I think the only time Nova gets details about ports in the API
during a server create request is when doing the network request
validation, and that's only if there is a fixed IP address or specific
port(s) in the request, otherwise Nova just gets the networks. [1]

[1]
https://github.com/openstack/nova/blob/ee7a01982611cdf8012a308fa49722146c51497f/nova/network/neutronv2/api.py#L1123


Actually, nova.network.neutronv2.api.API.allocate_for_instance() is 
*never* called by the Compute API service (though, strangely, 
deallocate_for_instance() *is* called by the Compute API service).


allocate_for_instance() is *only* ever called in the nova-compute service:

https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/compute/manager.py#L1388

I was actually on a hangout today with Carl, Miguel and Dan Smith 
talking about just this particular section of code with regards to 
routed networks IPAM handling.


What I believe we'd like to do is move to a model where we call out to 
Neutron here in the conductor:


https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/conductor/manager.py#L397

and ask Neutron to give us as much information about available subnet 
allocation pools and segment IDs as it can *before* we end up calling 
the scheduler here:


https://github.com/openstack/nova/blob/7be945b53944a44b26e49892e8a685815bf0cacb/nova/conductor/manager.py#L415

Not only will the segment IDs allow us to more properly use network 
affinity in placement decisions, but doing this kind of "probing" for 
network information in the conductor is inherently more scalable than 
doing this all in allocate_for_instance() on the compute node while 
holding the giant COMPUTE_NODE_SEMAPHORE lock.


Best,
-jay



Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-20 Thread Matt Riedemann

On 4/20/2016 8:25 AM, Miguel Angel Ajo Pelayo wrote:

Inline update.

On Mon, Apr 11, 2016 at 4:22 PM, Miguel Angel Ajo Pelayo wrote:

On Mon, Apr 11, 2016 at 1:46 PM, Jay Pipes  wrote:

On 04/08/2016 09:17 AM, Miguel Angel Ajo Pelayo wrote:

[...]

Yes, Nova's conductor gathers information about the requested networks
*before* asking the scheduler where to place hosts:

https://github.com/openstack/nova/blob/stable/mitaka/nova/conductor/manager.py#L362


 That would require identifying that the port has a "qos_policy_id"
attached to it, and then, asking neutron for the specific QoS policy
  [3], then look out for a minimum bandwidth rule (still to be defined),
and extract the required bandwidth from it.



Yep, exactly correct.


That moves, again some of the responsibility to examine and
understand external resources to nova.



Yep, it does. The alternative is more retries for placement decisions
because accurate decisions cannot be made until the compute node is already
selected and the claim happens on the compute node.


 Could it make sense to make that part pluggable via stevedore?, so
we would provide something that takes the "resource id" (for a port in
this case) and returns the requirements translated to resource classes
(NIC_BW_KB in this case).



Not sure Stevedore makes sense in this context. Really, we want *less*
extensibility and *more* consistency. So, I would envision rather a system
where Nova would call to Neutron before scheduling when it has received a
port or network ID in the boot request and ask Neutron whether the port or
network has any resource constraints on it. Neutron would return a
standardized response containing each resource class and the amount
requested in a dictionary (or better yet, an os_vif.objects.* object,
serialized). Something like:

{
  'resources': {
'<port_uuid>': {
  'NIC_BW_KB': 2048,
  'IPV4_ADDRESS': 1
}
  }
}



Oh, true, that's a great idea: having some API that translates a
neutron resource to scheduling constraints. The external call will
still be required, but the coupling issue is removed.
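
To make the consumption side concrete, here is a rough sketch (illustrative
only, not an existing nova or neutron API) of folding such per-port responses,
shaped like Jay's example above, into a single set of requested amounts for
the scheduler:

import collections


def merge_port_resources(port_resource_responses):
    """Sum the requested amount per resource class across all ports."""
    totals = collections.Counter()
    for response in port_resource_responses:
        for _port_id, amounts in response['resources'].items():
            for resource_class, amount in amounts.items():
                totals[resource_class] += amount
    return dict(totals)


example = {
    'resources': {
        'some-port-uuid': {
            'NIC_BW_KB': 2048,
            'IPV4_ADDRESS': 1,
        }
    }
}

# -> {'NIC_BW_KB': 2048, 'IPV4_ADDRESS': 1}; these totals would then be
# combined with the flavor-derived amounts (VCPU, MEMORY_MB, ...) in the
# request handed to the scheduler.
print(merge_port_resources([example]))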





I had a talk yesterday with @iharchys, @dansmith, and @sbauzas about
this, and we believe the synthesis of resource usage / scheduling
constraints from neutron makes sense.

We should probably look into providing those details in a read only
dictionary during port creation/update/show in general, in that way,
we would not be adding an extra API call to neutron from the nova
scheduler to figure out any of those details. That extra optimization
is something we may need to discuss with the neutron community.


Note that I think the only time Nova gets details about ports in the API 
during a server create request is when doing the network request 
validation, and that's only if there is a fixed IP address or specific 
port(s) in the request, otherwise Nova just gets the networks. [1]







In the case of the NIC_BW_KB resource class, Nova's scheduler would look for
compute nodes that had a NIC with that amount of bandwidth still available.
In the case of the IPV4_ADDRESS resource class, Nova's scheduler would use
the generic-resource-pools interface to find a resource pool of IPV4_ADDRESS
resources (i.e. a Neutron routed network or subnet allocation pool) that has
available IP space for the request.



Not sure about the IPV4_ADDRESS part because I haven't yet looked at
how routed networks are resolved with this new framework, but for the
other constraints it makes perfect sense to me.


Best,
-jay



Best regards,
Miguel Ángel Ajo


[1]

http://lists.openstack.org/pipermail/openstack-dev/2016-February/086371.html
[2] https://bugs.launchpad.net/neutron/+bug/1560963
[3]
http://developer.openstack.org/api-ref-networking-v2-ext.html#showPolicy





[1] 
https://github.com/openstack/nova/blob/ee7a01982611cdf8012a308fa49722146c51497f/nova/network/neutronv2/api.py#L1123


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-20 Thread Irena Berezovsky
On Wed, Apr 20, 2016 at 4:25 PM, Miguel Angel Ajo Pelayo <
majop...@redhat.com> wrote:

> Inline update.
>
> On Mon, Apr 11, 2016 at 4:22 PM, Miguel Angel Ajo Pelayo
>  wrote:
> > On Mon, Apr 11, 2016 at 1:46 PM, Jay Pipes  wrote:
> >> On 04/08/2016 09:17 AM, Miguel Angel Ajo Pelayo wrote:
> [...]
> >> Yes, Nova's conductor gathers information about the requested networks
> >> *before* asking the scheduler where to place hosts:
> >>
> >>
> https://github.com/openstack/nova/blob/stable/mitaka/nova/conductor/manager.py#L362
> >>
> >>>  That would require identifying that the port has a "qos_policy_id"
> >>> attached to it, and then, asking neutron for the specific QoS policy
> >>>   [3], then look out for a minimum bandwidth rule (still to be
> defined),
> >>> and extract the required bandwidth from it.
> >>
> >>
> >> Yep, exactly correct.
> >>
> >>> That moves, again some of the responsibility to examine and
> >>> understand external resources to nova.
> >>
> >>
> >> Yep, it does. The alternative is more retries for placement decisions
> >> because accurate decisions cannot be made until the compute node is
> already
> >> selected and the claim happens on the compute node.
> >>
> >>>  Could it make sense to make that part pluggable via stevedore?, so
> >>> we would provide something that takes the "resource id" (for a port in
> >>> this case) and returns the requirements translated to resource classes
> >>> (NIC_BW_KB in this case).
> >>
> >>
> >> Not sure Stevedore makes sense in this context. Really, we want *less*
> >> extensibility and *more* consistency. So, I would envision rather a
> system
> >> where Nova would call to Neutron before scheduling when it has received
> a
> >> port or network ID in the boot request and ask Neutron whether the port
> or
> >> network has any resource constraints on it. Neutron would return a
> >> standardized response containing each resource class and the amount
> >> requested in a dictionary (or better yet, an os_vif.objects.* object,
> >> serialized). Something like:
> >>
> >> {
> >>   'resources': {
> >> '<port_uuid>': {
> >>   'NIC_BW_KB': 2048,
> >>   'IPV4_ADDRESS': 1
> >> }
> >>   }
> >> }
> >>
> >
> > Oh, true, that's a great idea, having some API that translates a
> > neutron resource, to scheduling constraints. The external call will be
> > still required, but the coupling issue is removed.
> >
> >
>
>
> I had a talk yesterday with @iharchys, @dansmith, and @sbauzas about
> this, and we believe the synthesis of resource usage / scheduling
> constraints from neutron makes sense.
>
> We should probably look into providing those details in a read only
> dictionary during port creation/update/show in general, in that way,
> we would not be adding an extra API call to neutron from the nova
> scheduler to figure out any of those details. That extra optimization
> is something we may need to discuss with the neutron community.
>
What about the caller context?
I believe these details should be visible to admin users only.

>
>

> >> In the case of the NIC_BW_KB resource class, Nova's scheduler would
> look for
> >> compute nodes that had a NIC with that amount of bandwidth still
> available.
> >> In the case of the IPV4_ADDRESS resource class, Nova's scheduler would
> use
> >> the generic-resource-pools interface to find a resource pool of
> IPV4_ADDRESS
> >> resources (i.e. a Neutron routed network or subnet allocation pool)
> that has
> >> available IP space for the request.
> >>
> >
> > Not sure about the IPV4_ADDRESS part because I still didn't look on
> > how they resolve routed networks with this new framework, but for
> > other constraints makes perfect sense to me.
> >
> >> Best,
> >> -jay
> >>
> >>
> >>> Best regards,
> >>> Miguel Ángel Ajo
> >>>
> >>>
> >>> [1]
> >>>
> >>>
> http://lists.openstack.org/pipermail/openstack-dev/2016-February/086371.html
> >>> [2] https://bugs.launchpad.net/neutron/+bug/1560963
> >>> [3]
> >>>
> http://developer.openstack.org/api-ref-networking-v2-ext.html#showPolicy
>


Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-20 Thread Miguel Angel Ajo Pelayo
Inline update.

On Mon, Apr 11, 2016 at 4:22 PM, Miguel Angel Ajo Pelayo wrote:
> On Mon, Apr 11, 2016 at 1:46 PM, Jay Pipes  wrote:
>> On 04/08/2016 09:17 AM, Miguel Angel Ajo Pelayo wrote:
[...]
>> Yes, Nova's conductor gathers information about the requested networks
>> *before* asking the scheduler where to place hosts:
>>
>> https://github.com/openstack/nova/blob/stable/mitaka/nova/conductor/manager.py#L362
>>
>>>  That would require identifying that the port has a "qos_policy_id"
>>> attached to it, and then, asking neutron for the specific QoS policy
>>>   [3], then look out for a minimum bandwidth rule (still to be defined),
>>> and extract the required bandwidth from it.
>>
>>
>> Yep, exactly correct.
>>
>>> That moves, again some of the responsibility to examine and
>>> understand external resources to nova.
>>
>>
>> Yep, it does. The alternative is more retries for placement decisions
>> because accurate decisions cannot be made until the compute node is already
>> selected and the claim happens on the compute node.
>>
>>>  Could it make sense to make that part pluggable via stevedore?, so
>>> we would provide something that takes the "resource id" (for a port in
>>> this case) and returns the requirements translated to resource classes
>>> (NIC_BW_KB in this case).
>>
>>
>> Not sure Stevedore makes sense in this context. Really, we want *less*
>> extensibility and *more* consistency. So, I would envision rather a system
>> where Nova would call to Neutron before scheduling when it has received a
>> port or network ID in the boot request and ask Neutron whether the port or
>> network has any resource constraints on it. Neutron would return a
>> standardized response containing each resource class and the amount
>> requested in a dictionary (or better yet, an os_vif.objects.* object,
>> serialized). Something like:
>>
>> {
>>   'resources': {
>> '<port_uuid>': {
>>   'NIC_BW_KB': 2048,
>>   'IPV4_ADDRESS': 1
>> }
>>   }
>> }
>>
>
> Oh, true, that's a great idea, having some API that translates a
> neutron resource, to scheduling constraints. The external call will be
> still required, but the coupling issue is removed.
>
>


I had a talk yesterday with @iharchys, @dansmith, and @sbauzas about
this, and we believe the synthesis of resource usage / scheduling
constraints from neutron makes sense.

We should probably look into providing those details in a read-only
dictionary during port creation/update/show in general; that way, we
would not be adding an extra API call from the nova scheduler to
neutron to figure out any of those details. That extra optimization
is something we may need to discuss with the neutron community.



>> In the case of the NIC_BW_KB resource class, Nova's scheduler would look for
>> compute nodes that had a NIC with that amount of bandwidth still available.
>> In the case of the IPV4_ADDRESS resource class, Nova's scheduler would use
>> the generic-resource-pools interface to find a resource pool of IPV4_ADDRESS
>> resources (i.e. a Neutron routed network or subnet allocation pool) that has
>> available IP space for the request.
>>
>
> Not sure about the IPV4_ADDRESS part because I still didn't look on
> how they resolve routed networks with this new framework, but for
> other constraints makes perfect sense to me.
>
>> Best,
>> -jay
>>
>>
>>> Best regards,
>>> Miguel Ángel Ajo
>>>
>>>
>>> [1]
>>>
>>> http://lists.openstack.org/pipermail/openstack-dev/2016-February/086371.html
>>> [2] https://bugs.launchpad.net/neutron/+bug/1560963
>>> [3]
>>> http://developer.openstack.org/api-ref-networking-v2-ext.html#showPolicy



Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-11 Thread Miguel Angel Ajo Pelayo
On Mon, Apr 11, 2016 at 1:46 PM, Jay Pipes  wrote:
> Hi Miguel Angel, comments/answers inline :)
>
> On 04/08/2016 09:17 AM, Miguel Angel Ajo Pelayo wrote:
>>
>> Hi!,
>>
>> In the context of [1] (generic resource pools / scheduling in nova)
>> and [2] (minimum bandwidth guarantees -egress- in neutron), I had a talk
>> a few weeks ago with Jay Pipes,
>>
>> The idea was leveraging the generic resource pools and scheduling
>> mechanisms defined in [1] to find the right hosts and track the total
>> available bandwidth per host (and per host "physical network"),
>> something in neutron (still to be defined where) would notify the new
>> API about the total amount of "NIC_BW_KB" available on every host/physnet.
>
>
> Yes, what we discussed was making it initially per host, meaning the host
> would advertise a total aggregate bandwidth amount for all NICs that it uses
> for the data plane as a single amount.
>
> The other way to track this resource class (NIC_BW_KB) would be to make the
> NICs themselves be resource providers and then the scheduler could pick a
> specific NIC to bind the port to based on available NIC_BW_KB on a
> particular NIC.
>
> The former method makes things conceptually easier at the expense of
> introducing greater potential for retrying placement decisions (since the
> specific NIC to bind a port to wouldn't be known until the claim is made on
> the compute host). The latter method adds complexity to the filtering and
> scheduler in order to make more accurate placement decisions that would
> result in fewer retries.
>
>> That part is quite clear to me,
>>
>> From [1] I'm not sure which blueprint introduces the ability to
>> schedule based on the resource allocation/availability itself,
>> ("resource-providers-scheduler" seems more like an optimization to the
>> schedule/DB interaction, right?)
>
>
> Yes, you are correct about the above blueprint; it's only for moving the
> Python-side filters to be a DB query.
>
> The resource-providers-allocations blueprint:
>
> https://review.openstack.org/300177
>
> Is the one where we convert the various consumed resource amount fields to
> live in the single allocations table that may be queried for usage
> information.
>
> We aim to use the ComputeNode object as a facade that hides the migration of
> these data fields as much as possible so that the scheduler actually does
> not need to know that the schema has changed underneath it. Of course, this
> only works for *existing* resource classes, like vCPU, RAM, etc. It won't
> work for *new* resource classes like the discussed NET_BW_KB because,
> clearly, we don't have an existing field in the instance_extra or other
> tables that contain that usage amount and therefore can't use ComputeNode
> object as a facade over a non-existing piece of data.
>
> Eventually, the intent is to change the ComputeNode object to return a new
> AllocationList object that would contain all of the compute node's resources
> in a tabular format (mimicking the underlying allocations table):
>
> https://review.openstack.org/#/c/282442/20/nova/objects/resource_provider.py
>
> Once this is done, the scheduler can be fitted to query this AllocationList
> object to make resource usage and placement decisions in the Python-side
> filters.
>
> We are still debating on the resource-providers-scheduler-db-filters
> blueprint:
>
> https://review.openstack.org/#/c/300178/
>
> Whether to change the existing FilterScheduler or create a brand new
> scheduler driver. I could go either way, frankly. If we made a brand new
> scheduler driver, it would do a query against the compute_nodes table in the
> DB directly. The legacy FilterScheduler would manipulate the AllocationList
> object returned by the ComputeNode.allocations attribute. Either way we get
> to where we want to go: representing all quantitative resources in a
> standardized and consistent fashion.
>
>>  And, that brings me to another point: at the moment of filtering
>> hosts, nova  I guess, will have the neutron port information, it has to
>> somehow identify if the port is tied to a minimum bandwidth QoS policy.
>
>
> Yes, Nova's conductor gathers information about the requested networks
> *before* asking the scheduler where to place hosts:
>
> https://github.com/openstack/nova/blob/stable/mitaka/nova/conductor/manager.py#L362
>
>>  That would require identifying that the port has a "qos_policy_id"
>> attached to it, and then, asking neutron for the specific QoS policy
>>   [3], then look out for a minimum bandwidth rule (still to be defined),
>> and extract the required bandwidth from it.
>
>
> Yep, exactly correct.
>
>> That moves, again some of the responsibility to examine and
>> understand external resources to nova.
>
>
> Yep, it does. The alternative is more retries for placement decisions
> because accurate decisions cannot be made until the compute node is already
> selected and the claim happens on the compute node.

Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-11 Thread Jay Pipes

Hi Miguel Angel, comments/answers inline :)

On 04/08/2016 09:17 AM, Miguel Angel Ajo Pelayo wrote:

Hi!,

In the context of [1] (generic resource pools / scheduling in nova)
and [2] (minimum bandwidth guarantees -egress- in neutron), I had a talk
a few weeks ago with Jay Pipes,

The idea was leveraging the generic resource pools and scheduling
mechanisms defined in [1] to find the right hosts and track the total
available bandwidth per host (and per host "physical network"),
something in neutron (still to be defined where) would notify the new
API about the total amount of "NIC_BW_KB" available on every host/physnet.


Yes, what we discussed was making it initially per host, meaning the 
host would advertise a total aggregate bandwidth amount for all NICs 
that it uses for the data plane as a single amount.


The other way to track this resource class (NIC_BW_KB) would be to make 
the NICs themselves be resource providers and then the scheduler could 
pick a specific NIC to bind the port to based on available NIC_BW_KB on 
a particular NIC.


The former method makes things conceptually easier at the expense of 
introducing greater potential for retrying placement decisions (since 
the specific NIC to bind a port to wouldn't be known until the claim is 
made on the compute host). The latter method adds complexity to the 
filtering and scheduler in order to make more accurate placement 
decisions that would result in fewer retries.
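
Purely as an illustration of the two shapes being compared (names and numbers
made up), the tracked inventory would differ roughly like this:

# Option 1: one aggregate NIC_BW_KB amount advertised per compute host.
per_host_inventory = {
    'compute-1': {'NIC_BW_KB': 20000000},  # e.g. 2 x 10 Gbit/s, in kbit/s
}

# Option 2: each NIC is its own resource provider, so the scheduler can
# pick the specific NIC to bind the port to at placement time.
per_nic_inventory = {
    'compute-1': {
        'enp129s0': {'NIC_BW_KB': 10000000},
        'enp130s0': {'NIC_BW_KB': 10000000},
    },
}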



That part is quite clear to me,

From [1] I'm not sure which blueprint introduces the ability to
schedule based on the resource allocation/availability itself,
("resource-providers-scheduler" seems more like an optimization to the
schedule/DB interaction, right?)


Yes, you are correct about the above blueprint; it's only for moving the 
Python-side filters to be a DB query.


The resource-providers-allocations blueprint:

https://review.openstack.org/300177

Is the one where we convert the various consumed resource amount fields 
to live in the single allocations table that may be queried for usage 
information.


We aim to use the ComputeNode object as a facade that hides the 
migration of these data fields as much as possible so that the scheduler 
actually does not need to know that the schema has changed underneath 
it. Of course, this only works for *existing* resource classes, like 
vCPU, RAM, etc. It won't work for *new* resource classes like the 
discussed NET_BW_KB because, clearly, we don't have an existing field in 
the instance_extra or other tables that contain that usage amount and 
therefore can't use ComputeNode object as a facade over a non-existing 
piece of data.


Eventually, the intent is to change the ComputeNode object to return a 
new AllocationList object that would contain all of the compute node's 
resources in a tabular format (mimicking the underlying allocations table):


https://review.openstack.org/#/c/282442/20/nova/objects/resource_provider.py

Once this is done, the scheduler can be fitted to query this 
AllocationList object to make resource usage and placement decisions in 
the Python-side filters.
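
Purely as an illustration of that tabular shape (field names approximate the
allocations table; the NIC_BW_KB row is hypothetical), the data such a facade
would expose looks roughly like:

import collections

Allocation = collections.namedtuple(
    'Allocation',
    ['resource_provider_id', 'consumer_id', 'resource_class', 'used'])

allocations = [
    Allocation(1, 'instance-uuid-1', 'VCPU', 4),
    Allocation(1, 'instance-uuid-1', 'MEMORY_MB', 8192),
    # A new class such as NIC_BW_KB is just another row; no new column in
    # instance_extra or compute_nodes is needed.
    Allocation(1, 'instance-uuid-1', 'NIC_BW_KB', 2048),
]

# A Python-side filter could then sum usage per resource class:
used_by_class = collections.Counter()
for alloc in allocations:
    used_by_class[alloc.resource_class] += alloc.used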


We are still debating on the resource-providers-scheduler-db-filters 
blueprint:


https://review.openstack.org/#/c/300178/

Whether to change the existing FilterScheduler or create a brand new 
scheduler driver. I could go either way, frankly. If we made a brand new 
scheduler driver, it would do a query against the compute_nodes table in 
the DB directly. The legacy FilterScheduler would manipulate the 
AllocationList object returned by the ComputeNode.allocations attribute. 
Either way we get to where we want to go: representing all quantitative 
resources in a standardized and consistent fashion.



 And, that brings me to another point: at the moment of filtering
hosts, nova  I guess, will have the neutron port information, it has to
somehow identify if the port is tied to a minimum bandwidth QoS policy.


Yes, Nova's conductor gathers information about the requested networks 
*before* asking the scheduler where to place hosts:


https://github.com/openstack/nova/blob/stable/mitaka/nova/conductor/manager.py#L362


 That would require identifying that the port has a "qos_policy_id"
attached to it, and then, asking neutron for the specific QoS policy
  [3], then look out for a minimum bandwidth rule (still to be defined),
and extract the required bandwidth from it.


Yep, exactly correct.


That moves, again some of the responsibility to examine and
understand external resources to nova.


Yep, it does. The alternative is more retries for placement decisions 
because accurate decisions cannot be made until the compute node is 
already selected and the claim happens on the compute node.



 Could it make sense to make that part pluggable via stevedore?, so
we would provide something that takes the "resource id" (for a port in
this case) and returns the requirements translated to resource classes
(NIC_BW_KB in this case).

Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-11 Thread Miguel Angel Ajo Pelayo
On Sun, Apr 10, 2016 at 10:07 AM, Moshe Levi <mosh...@mellanox.com> wrote:

>
>
>
>
> *From:* Miguel Angel Ajo Pelayo [mailto:majop...@redhat.com]
> *Sent:* Friday, April 08, 2016 4:17 PM
> *To:* OpenStack Development Mailing List (not for usage questions) <
> openstack-dev@lists.openstack.org>
> *Subject:* [openstack-dev] [neutron] [nova] scheduling bandwidth
> resources / NIC_BW_KB resource class
>
>
>
>
>
> Hi!,
>
>
>
>In the context of [1] (generic resource pools / scheduling in nova) and
> [2] (minimum bandwidth guarantees -egress- in neutron), I had a talk a few
> weeks ago with Jay Pipes,
>
>
>
>The idea was leveraging the generic resource pools and scheduling
> mechanisms defined in [1] to find the right hosts and track the total
> available bandwidth per host (and per host "physical network"), something
> in neutron (still to be defined where) would notify the new API about the
> total amount of "NIC_BW_KB" available on every host/physnet.
>



> I believe that NIC bandwidth can be taken from libvirt (see [4]), and the
> only piece that is missing is to tell nova the mapping of physnet to
> network interface name. (In the case of SR-IOV this is already known.)
>
> I see bandwidth (speed) as one of many capabilities of a NIC, therefore I
> think we should take all of them in the same way, in this case from libvirt.
> I was thinking of adding the NIC as a new resource to nova.
>

Yes, at the low level, that's one way to do it. We may need neutron agents
or plugins to collect such information, since in some cases one device
will be tied to one physical network, other devices will be tied to other
physical networks, or several devices could even be connected to the same
physnet. In some cases, connectivity depends on L3 tunnels, and in that
case bandwidth calculation is more complicated (depending on routes, etc.;
I'm not even looking at that case yet).



>
>
> [4] - <device>
>   <name>net_enp129s0_e4_1d_2d_2d_8c_41</name>
>   <path>/sys/devices/pci0000:80/0000:80:01.0/0000:81:00.0/net/enp129s0</path>
>   <parent>pci_0000_81_00_0</parent>
>   <capability type='net'>
>     <interface>enp129s0</interface>
>     <address>e4:1d:2d:2d:8c:41</address>
>     <!-- link/feature capability elements not recoverable from the archive -->
>   </capability>
> </device>
> 
>
>
>
>That part is quite clear to me,
>
>
>
>From [1] I'm not sure which blueprint introduces the ability to
> schedule based on the resource allocation/availability itself,
> ("resource-providers-scheduler" seems more like an optimization to the
> schedule/DB interaction, right?)
>
> My understanding is that the resource provider blueprint is just a rough
> filter of compute nodes before passing them to the scheduler filters. The
> existing filters here [6] will do the accurate filtering of resources.
>
> see [5]
>
>
>
> [5] -
> http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2016-04-04.log.html#t2016-04-04T16:24:10
>
>
> [6] - http://docs.openstack.org/developer/nova/filter_scheduler.html
>
>
>

Thanks, yes, if those filters can operate on the generic resource pools,
then, great, we will just need to write the right filters.



> And, that brings me to another point: at the moment of filtering
> hosts, nova  I guess, will have the neutron port information, it has to
> somehow identify if the port is tied to a minimum bandwidth QoS policy.
>
>
>
> That would require identifying that the port has a "qos_policy_id"
> attached to it, and then, asking neutron for the specific QoS policy  [3],
> then look out for a minimum bandwidth rule (still to be defined), and
> extract the required bandwidth from it.
>
> I am not sure if that is the correct way to do it, but you can create a NIC
> bandwidth filter (or NIC capabilities filter) and in it you can implement
> the way to retrieve QoS policy information by using the neutron client.
>

That's my concern: that logic would have to live on the nova side, again,
and it's tightly coupled to the neutron models. I'd be glad to find a way to
uncouple nova from that as much as possible. And, even better, if we could
find a way to avoid the need for nova to retrieve policies as it discovers
ports.


>
>
>That moves, again some of the responsibility to examine and understand
> external resources to nova.
>
>
>
> Could it make sense to make that part pluggable via stevedore?, so we
> would provide something that takes the "resource id" (for a port in this
> case) and returns the requirements translated to resource classes
> (NIC_BW_KB in this case).
>
>
>
>
>
> Best regards,
>
> Miguel Ángel Ajo

Re: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-10 Thread Moshe Levi


From: Miguel Angel Ajo Pelayo [mailto:majop...@redhat.com]
Sent: Friday, April 08, 2016 4:17 PM
To: OpenStack Development Mailing List (not for usage questions) 
<openstack-dev@lists.openstack.org>
Subject: [openstack-dev] [neutron] [nova] scheduling bandwidth resources / 
NIC_BW_KB resource class


Hi!,

   In the context of [1] (generic resource pools / scheduling in nova) and [2] 
(minimum bandwidth guarantees -egress- in neutron), I had a talk a few weeks 
ago with Jay Pipes,

   The idea was leveraging the generic resource pools and scheduling mechanisms 
defined in [1] to find the right hosts and track the total available bandwidth 
per host (and per host "physical network"), something in neutron (still to be 
defined where) would notify the new API about the total amount of "NIC_BW_KB" 
available on every host/physnet.
I believe that NIC bandwidth can be taken from libvirt (see [4]), and the only 
piece that is missing is to tell nova the mapping of physnet to network 
interface name. (In the case of SR-IOV this is already known.)
I see bandwidth (speed) as one of many capabilities of a NIC, therefore I think 
we should take all of them in the same way, in this case from libvirt. I was 
thinking of adding the NIC as a new resource to nova.

[4] - <device>
  <name>net_enp129s0_e4_1d_2d_2d_8c_41</name>
  <path>/sys/devices/pci0000:80/0000:80:01.0/0000:81:00.0/net/enp129s0</path>
  <parent>pci_0000_81_00_0</parent>
  <capability type='net'>
    <interface>enp129s0</interface>
    <address>e4:1d:2d:2d:8c:41</address>
    <!-- link/feature capability elements not recoverable from the archive -->
  </capability>
</device>
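
As a minimal sketch of reading that information with libvirt-python:
nodeDeviceLookupByName() and XMLDesc() are real calls, but not every driver
reports a <link speed=.../> element, and the device name below is just the
example above.

import xml.etree.ElementTree as ET

import libvirt


def nic_link_speed_mbps(dev_name='net_enp129s0_e4_1d_2d_2d_8c_41'):
    """Return the reported link speed in Mbps, or None if not exposed."""
    conn = libvirt.open('qemu:///system')
    try:
        dev = conn.nodeDeviceLookupByName(dev_name)
        root = ET.fromstring(dev.XMLDesc(0))
        link = root.find("./capability[@type='net']/link")
        if link is not None and link.get('speed'):
            return int(link.get('speed'))
        return None
    finally:
        conn.close()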


   That part is quite clear to me,

   From [1] I'm not sure which blueprint introduces the ability to schedule 
based on the resource allocation/availability itself, 
("resource-providers-scheduler" seems more like an optimization to the 
schedule/DB interaction, right?)
My understanding is that the resource provider blueprint is just a rough filter 
of compute nodes before passing them to the scheduler filters. The existing 
filters here [6] will do the accurate filtering of resources.
see [5]

[5] - 
http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2016-04-04.log.html#t2016-04-04T16:24:10
[6] - http://docs.openstack.org/developer/nova/filter_scheduler.html

And, that brings me to another point: at the moment of filtering hosts, 
nova  I guess, will have the neutron port information, it has to somehow 
identify if the port is tied to a minimum bandwidth QoS policy.

That would require identifying that the port has a "qos_policy_id" attached 
to it, and then, asking neutron for the specific QoS policy  [3], then look out 
for a minimum bandwidth rule (still to be defined), and extract the required 
bandwidth from it.
I am not sure if that is the correct way to do it, but you can create a NIC 
bandwidth filter (or NIC capabilities filter) and in it you can implement the 
way to retrieve QoS policy information by using the neutron client.

   That moves, again some of the responsibility to examine and understand 
external resources to nova.

Could it make sense to make that part pluggable via stevedore?, so we would 
provide something that takes the "resource id" (for a port in this case) and 
returns the requirements translated to resource classes (NIC_BW_KB in this 
case).


Best regards,
Miguel Ángel Ajo


[1] http://lists.openstack.org/pipermail/openstack-dev/2016-February/086371.html
[2] https://bugs.launchpad.net/neutron/+bug/1560963
[3] http://developer.openstack.org/api-ref-networking-v2-ext.html#showPolicy


[openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

2016-04-08 Thread Miguel Angel Ajo Pelayo
Hi!,

   In the context of [1] (generic resource pools / scheduling in nova) and
[2] (minimum bandwidth guarantees -egress- in neutron), I had a talk a few
weeks ago with Jay Pipes,

   The idea was leveraging the generic resource pools and scheduling
mechanisms defined in [1] to find the right hosts and track the total
available bandwidth per host (and per host "physical network"), something
in neutron (still to be defined where) would notify the new API about the
total amount of "NIC_BW_KB" available on every host/physnet.
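
The Neutron-side reporting mechanism is still undefined at this point, so
purely as an illustration, a per-host/per-physnet report could be as simple as
the following (field names, and the unit assumed here to be kbit/s, are made
up):

# Hypothetical payload a Neutron-side agent/plugin might report per host.
nic_bw_report = {
    'host': 'compute-1',
    'resources': {
        'physnet0': {'NIC_BW_KB': 10000000},  # one 10 Gbit/s NIC
        'physnet1': {'NIC_BW_KB': 20000000},  # two bonded 10 Gbit/s NICs
    },
}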

   That part is quite clear to me,

   From [1] I'm not sure which blueprint introduces the ability to schedule
based on the resource allocation/availability itself,
("resource-providers-scheduler" seems more like an optimization to the
schedule/DB interaction, right?)

And, that brings me to another point: at the moment of filtering hosts,
nova  I guess, will have the neutron port information, it has to somehow
identify if the port is tied to a minimum bandwidth QoS policy.

That would require identifying that the port has a "qos_policy_id"
attached to it, then asking neutron for the specific QoS policy [3],
then looking for a minimum bandwidth rule (still to be defined) and
extracting the required bandwidth from it.
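
As a minimal sketch of that lookup, assuming python-neutronclient: show_port()
is a real call, show_qos_policy() assumes the QoS extension bindings, and
since the minimum bandwidth rule is still to be defined, the rule type and
field names below are hypothetical.

from neutronclient.v2_0 import client as neutron_client


def required_nic_bw_kb(neutron, port_id):
    """Return the minimum bandwidth (kbps) requested by the port's QoS
    policy, or 0 if no policy / no minimum-bandwidth rule is attached."""
    port = neutron.show_port(port_id)['port']
    policy_id = port.get('qos_policy_id')
    if not policy_id:
        return 0
    policy = neutron.show_qos_policy(policy_id)['policy']
    for rule in policy.get('rules', []):
        # Hypothetical type/field names for the yet-to-be-defined rule.
        if rule.get('type') == 'minimum_bandwidth':
            return rule.get('min_kbps', 0)
    return 0


# Usage (auth details omitted):
#   neutron = neutron_client.Client(session=keystone_session)
#   nic_bw_kb = required_nic_bw_kb(neutron, port['id'])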

   That again moves some of the responsibility for examining and understanding
external resources to nova.

Could it make sense to make that part pluggable via stevedore, so we
would provide something that takes the "resource id" (for a port in this
case) and returns the requirements translated to resource classes
(NIC_BW_KB in this case)?


Best regards,
Miguel Ángel Ajo


[1]
http://lists.openstack.org/pipermail/openstack-dev/2016-February/086371.html
[2] https://bugs.launchpad.net/neutron/+bug/1560963
[3] http://developer.openstack.org/api-ref-networking-v2-ext.html#showPolicy