Re: [openstack-dev] problems in pypi.openstack.org

2013-07-19 Thread Jeremy Stanley
On 2013-07-20 10:02:34 +0800 (+0800), Gareth wrote:
[...]
> xattr can't be installed correctly.
[...]
> BTW, Swift and Glance are affected.

One of xattr's dependencies was cached in a broken state, but that should
be resolved as of the past couple of hours.
-- 
Jeremy Stanley

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] pip requirements externally host (evil evil stab stab stab)

2013-07-19 Thread Monty Taylor
Hey guys!

PyPI is moving toward getting people to stop hosting stuff via external
links. External hosting has been bad for us in the past and is one of the
reasons our mirror exists. pip 1.4 has an option to disallow following
external links, and in 1.5 that will be the default behavior.

Looking forward, we have 5 pip packages that host their stuff externally.
If we have any pull with their authors, we should get them to actually
upload their releases to PyPI. If we don't, we should strongly consider our
use of these packages. As soon as pip 1.4 comes out, I would like to,
moving forward, restrict the addition of NEW requirements that do not host
on PyPI. (All 5 of these host insecurely as well, fwiw.)

The culprits are:

dnspython, lockfile, netifaces, psutil, pysendfile
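
For reference, a quick way to check which of these actually publish release
files on PyPI itself (a sketch only, using the requests library and PyPI's
JSON API):

    # Sketch: report whether each package's latest release has files hosted
    # on PyPI itself. Packages with an empty "urls" list are externally hosted.
    import requests

    CULPRITS = ["dnspython", "lockfile", "netifaces", "psutil", "pysendfile"]

    for name in CULPRITS:
        resp = requests.get("https://pypi.org/pypi/%s/json" % name, timeout=10)
        resp.raise_for_status()
        files_on_pypi = resp.json().get("urls", [])
        print("%-12s %s" % (name,
                            "hosted on PyPI" if files_on_pypi
                            else "NO files on PyPI"))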

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] lbaas installation guide

2013-07-19 Thread Qing He
By the way, I'm wondering if lbaas has a separate doc somewhere else?

From: Anne Gentle [mailto:a...@openstack.org]
Sent: Friday, July 19, 2013 6:33 PM
To: OpenStack Development Mailing List
Subject: [openstack-dev] [Neutron] lbaas installation guide

Thanks for bringing it to the attention of the list -- I've logged this doc 
bug. https://bugs.launchpad.net/openstack-manuals/+bug/1203230 Hopefully a 
Neutron team member can pick it up and investigate.

Anne

On Fri, Jul 19, 2013 at 7:35 PM, Qing He <qing...@radisys.com> wrote:

In the network installation guide( 
http://docs.openstack.org/grizzly/openstack-network/admin/content/install_ubuntu.html
 ) there is a sentence "quantum-lbaas-agent, etc (see below for more 
information about individual services agents)." in the plugin installation 
section. However, lbaas is never mentioned again after that in the doc.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] lbaas installation guide

2013-07-19 Thread Qing He
Thanks Anne!

From: Anne Gentle [mailto:a...@openstack.org]
Sent: Friday, July 19, 2013 6:33 PM
To: OpenStack Development Mailing List
Subject: [openstack-dev] [Neutron] lbaas installation guide

Thanks for bringing it to the attention of the list -- I've logged this doc 
bug. https://bugs.launchpad.net/openstack-manuals/+bug/1203230 Hopefully a 
Neutron team member can pick it up and investigate.

Anne

On Fri, Jul 19, 2013 at 7:35 PM, Qing He <qing...@radisys.com> wrote:

In the network installation guide( 
http://docs.openstack.org/grizzly/openstack-network/admin/content/install_ubuntu.html
 ) there is a sentence "quantum-lbaas-agent, etc (see below for more 
information about individual services agents)." in the plugin installation 
section. However, lbaas is never mentioned again after that in the doc.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] generate_sample.sh

2013-07-19 Thread Matt Riedemann
Looks like it's complaining because you changed nova.conf.sample.  Based 
on the readme:

https://github.com/openstack/nova/tree/master/tools/conf 

Did you run ./tools/conf/analyze_opts.py?  I'm assuming you need to 
run the tools and if there are issues you have to resolve them before 
pushing up your changes.  I've personally never run this though.



Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States




From:   Gary Kotton 
To: "OpenStack Development Mailing List 
(openstack-dev@lists.openstack.org)" , 
Date:   07/19/2013 07:03 PM
Subject:[openstack-dev] [nova] generate_sample.sh



Hi,
I have run into a problem with pep8 for 
https://review.openstack.org/#/c/37539/. The issue is that I have run the 
script in the subject and pep8 fails.
Any ideas?
Thanks
Gary
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] problems in pypi.openstack.org

2013-07-19 Thread Gareth
Hi

https://jenkins.openstack.org/job/gate-glance-python27/4896/console

xattr can't be installed correctly.

And in http://pypi.openstack.org/openstack/xattr/

xattr-0.7.1 was just updated several hours ago. Are there any problems with it?

BTW, Swift and Glance are affected.

--
Gareth

*Cloud Computing, OpenStack, Fitness, Basketball*
*OpenStack contributor*
*Company: UnitedStack *
*My promise: if you find any spelling or grammar mistakes in my email from
Mar 1 2013, notify me *
*and I'll donate $1 or ¥1 to an open organization you specify.*
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Revert Pass instance host-id to Quantum using port bindings extension.

2013-07-19 Thread Kyle Mestery (kmestery)
On Jul 19, 2013, at 6:01 PM, Aaron Rosen  wrote:
> 
> On Fri, Jul 19, 2013 at 3:37 PM, Ian Wells  wrote:
> > [arosen] - sure, in this case though then we'll have to add even more
> > queries between nova-compute and quantum as nova-compute will need to query
> > quantum for ports matching the device_id to see if the port was already
> > created and if not try to create them.
> 
> The cleanup job doesn't look like a job for nova-compute regardless of the 
> rest.
> 
> > Moving the create may for other reasons be a good idea (because compute
> > would *always* deal with ports and *never* with networks - a simpler API) -
> > but it's nothing to do with solving this problem.
> >
> > [arosen] - It does solve this issue because it moves the quantum port-create
> > calls outside of the retry schedule logic on that compute node. Therefore if
> > the port fails to create the instance goes to error state.  Moving networks
> > out of the nova-api will also solve this issue for us as the client then
> > won't rely on nova anymore to create the port. I'm wondering if creating an
> > additional network_api_class like nova.network.quantumv2.api.NoComputeAPI is
> > the way to prove this out. Most of the code in there would inherit from
> > nova.network.quantumv2.api.API .
> 
> OK, so if we were to say that:
> 
> - nova-api creates the port with an expiry timestamp to catch orphaned
> autocreated ports
> 
> I don't think we want to put a timestamp there. We can figure out which ports 
> are orphaned by checking if a port's device_id in quantum is still an active 
> instance_id in nova (which currently isn't true but would be if the 
> port-create is moved out of compute to api) and that device_owner is 
> nova:compute. 
>  
> - nova-compute always uses port-update (or, better still, have a
> distinct call that for now works like port-update but clearly
> represents an attach or detach and not a user-initiated update,
> improving the plugin division of labour, but that can be a separate
> proposal) and *never* creates a port; attaching to an
> apparently-attached port attached to the same instance should ensure
> that a previous attachment is destroyed, which should cover the
> multiple-schedule lost-reply case
> 
> agree 
> - nova-compute is always talked to in terms of ports, and never in
> terms of networks (a big improvement imo)
> 
> agree!
> 
> - nova-compute attempts to remove autocreated ports on detach 
> - a cleanup job in nova-api (or nova-conductor?) cleans up expired
> autocreated ports with no attachment or a broken attachment (which
> would catch failed detachments as well as failed schedules)  
> 
> how does that work for people?  It seems to improve the internal
> interface and the transactionality, it means that there's not the
> slightly nasty (and even faintly race-prone) create-update logic in
> nova-compute, it even simplifies the nova-compute interface - though
> we would need to consider how an upgrade path would work, there; newer
> API with older compute should work fine, the reverse not so much.
> 
> I agree, ensuring backwards compatibility if the compute nodes are updated 
> and not the api nodes would be slightly tricky. I'd hope we could get away 
> with release noting that you need to update the api nodes first. 
>  

This all sounds good to me as well. And release noting the update procedure 
seems reasonable for upgrades as well.
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Neutron] lbaas installation guide

2013-07-19 Thread Anne Gentle
Thanks for bringing it to the attention of the list -- I've logged this doc
bug. https://bugs.launchpad.net/openstack-manuals/+bug/1203230 Hopefully a
Neutron team member can pick it up and investigate.

Anne


On Fri, Jul 19, 2013 at 7:35 PM, Qing He  wrote:

>  In the network installation guide(
> http://docs.openstack.org/grizzly/openstack-network/admin/content/install_ubuntu.html)
>  there is a sentence “quantum-lbaas-agent, etc (see below for more
> information about individual services agents).” in the plugin installation
> section. However, lbaas is never mentioned again after that in the doc.
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] generate_sample.sh

2013-07-19 Thread Gary Kotton
Hi,
I have run into a problem with pep8 for 
https://review.openstack.org/#/c/37539/. The issue is that I have run the script 
in the subject and pep8 fails.
Any ideas?
Thanks
Gary
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] lbaas installation guide

2013-07-19 Thread Qing He
In the network installation guide( 
http://docs.openstack.org/grizzly/openstack-network/admin/content/install_ubuntu.html
 ) there is a sentence “quantum-lbaas-agent, etc (see below for more 
information about individual services agents).” in the plugin installation 
section. However, lbaas is never mentioned again after that in the doc.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] A simple way to improve nova scheduler

2013-07-19 Thread Joe Gordon
On Fri, Jul 19, 2013 at 3:13 PM, Sandy Walsh wrote:

>
>
> On 07/19/2013 05:36 PM, Boris Pavlovic wrote:
> > Sandy,
> >
> > I don't think that we have such problems here.
> > Because scheduler doesn't poll compute_nodes.
> > The situation is another compute_nodes notify scheduler about their
> > state. (instead of updating their state in DB)
> >
> > So for example if scheduler send request to compute_node, compute_node
> > is able to run rpc call to schedulers immediately (not after 60sec).
> >
> > So there is almost no races.
>
> There are races that occur between the eventlet request threads. This is
> why the scheduler has been switched to single threaded and we can only
> run one scheduler.
>
> This problem may have been eliminated with the work that Chris Behrens
> and Brian Elliott were doing, but I'm not sure.
>


Speaking of Chris Behrens: "Relying on anything but the DB for current
memory free, etc, is just too laggy… so we need to stick with it, IMO."
http://lists.openstack.org/pipermail/openstack-dev/2013-June/010485.html

Although there is some elegance to the proposal here I have some concerns.

If just using RPC broadcasts from compute to schedulers to keep track of
things, we get two issues:

* How do you bring a new scheduler up in an existing deployment and make it
get the full state of the system?
* Broadcasting RPC updates from compute nodes to the scheduler means every
scheduler has to process  the same RPC message.  And if a deployment hits
the point where the number of compute updates is consuming 99 percent of
the scheduler's time just adding another scheduler won't fix anything as it
will get bombarded too.

Also OpenStack is already deeply invested in using the central DB model for
the state of the 'world' and while I am not against changing that, I think
we should evaluate that switch in a larger context.
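
To make the fan-out and bootstrap concerns above concrete, here is a minimal,
self-contained sketch (plain Python, no RPC library, all names illustrative)
of the in-memory host-state cache such a design implies; note that a freshly
started scheduler knows nothing until every compute node has reported at
least once:

    # Sketch of an in-memory host-state cache fed by compute-node updates.
    # handle_update() stands in for the fanout consumer callback that every
    # scheduler would have to run for every compute node's report.
    import time


    class HostStateCache(object):
        """Most recent resource report from each compute node."""

        def __init__(self, stale_after=120):
            self.hosts = {}              # hostname -> (timestamp, state dict)
            self.stale_after = stale_after

        def handle_update(self, hostname, state):
            self.hosts[hostname] = (time.time(), state)

        def usable_hosts(self):
            now = time.time()
            return dict((h, s) for h, (ts, s) in self.hosts.items()
                        if now - ts < self.stale_after)


    cache = HostStateCache()
    print(cache.usable_hosts())   # {} -- a new scheduler starts empty
    cache.handle_update("compute-1", {"free_ram_mb": 2048, "free_disk_gb": 40})
    print(cache.usable_hosts())   # {'compute-1': {'free_ram_mb': 2048, ...}}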



>
> But certainly, the old approach of having the compute node broadcast
> status every N seconds is not suitable and was eliminated a long time ago.
>
> >
> >
> > Best regards,
> > Boris Pavlovic
> >
> > Mirantis Inc.
> >
> >
> >
> > On Sat, Jul 20, 2013 at 12:23 AM, Sandy Walsh  > > wrote:
> >
> >
> >
> > On 07/19/2013 05:01 PM, Boris Pavlovic wrote:
> > > Sandy,
> > >
> > > Hm I don't know that algorithm. But our approach doesn't have
> > > exponential exchange.
> > > I don't think that in 10k nodes cloud we will have a problems with
> 150
> > > RPC call/sec. Even in 100k we will have only 1.5k RPC call/sec.
> > > More then (compute nodes update their state in DB through conductor
> > > which produce the same count of RPC calls).
> > >
> > > So I don't see any explosion here.
> >
> > Sorry, I was commenting on Soren's suggestion from way back
> (essentially
> > listening on a separate exchange for each unique flavor ... so no
> > scheduler was needed at all). It was a great idea, but fell apart
> rather
> > quickly.
> >
> > The existing approach the scheduler takes is expensive (asking the db
> > for state of all hosts) and polling the compute nodes might be
> do-able,
> > but you're still going to have latency problems waiting for the
> > responses (the states are invalid nearly immediately, especially if a
> > fill-first scheduling algorithm is used). We ran into this problem
> > before in an earlier scheduler implementation. The round-tripping
> kills.
> >
> > We have a lot of really great information on Host state in the form
> of
> > notifications right now. I think having a service (or notification
> > driver) listening for these and keeping an the HostState
> incrementally
> > updated (and reported back to all of the schedulers via the fanout
> > queue) would be a better approach.
> >
> > -S
> >
> >
> > >
> > > Best regards,
> > > Boris Pavlovic
> > >
> > > Mirantis Inc.
> > >
> > >
> > > On Fri, Jul 19, 2013 at 11:47 PM, Sandy Walsh
> > mailto:sandy.wa...@rackspace.com>
> > >  > >> wrote:
> > >
> > >
> > >
> > > On 07/19/2013 04:25 PM, Brian Schott wrote:
> > > > I think Soren suggested this way back in Cactus to use MQ
> > for compute
> > > > node state rather than database and it was a good idea then.
> > >
> > > The problem with that approach was the number of queues went
> > exponential
> > > as soon as you went beyond simple flavors. Add Capabilities or
> > other
> > > criteria and you get an explosion of exchanges to listen to.
> > >
> > >
> > >
> > > > On Jul 19, 2013, at 10:52 AM, Boris Pavlovic
> > mailto:bo...@pavlovic.me>
> > > >
> > > > 
> > 

Re: [openstack-dev] Revert Pass instance host-id to Quantum using port bindings extension.

2013-07-19 Thread Aaron Rosen
On Fri, Jul 19, 2013 at 3:37 PM, Ian Wells  wrote:

> > [arosen] - sure, in this case though then we'll have to add even more
> > queries between nova-compute and quantum as nova-compute will need to
> query
> > quantum for ports matching the device_id to see if the port was already
> > created and if not try to create them.
>
> The cleanup job doesn't look like a job for nova-compute regardless of the
> rest.
>
> > Moving the create may for other reasons be a good idea (because compute
> > would *always* deal with ports and *never* with networks - a simpler
> API) -
> > but it's nothing to do with solving this problem.
> >
> > [arosen] - It does solve this issue because it moves the quantum
> port-create
> > calls outside of the retry schedule logic on that compute node.
> Therefore if
> > the port fails to create the instance goes to error state.  Moving
> networks
> > out of the nova-api will also solve this issue for us as the client then
> > won't rely on nova anymore to create the port. I'm wondering if creating
> an
> > additional network_api_class like
> nova.network.quantumv2.api.NoComputeAPI is
> > the way to prove this out. Most of the code in there would inherit from
> > nova.network.quantumv2.api.API .
>
> OK, so if we were to say that:
>
> - nova-api creates the port with an expiry timestamp to catch orphaned
> autocreated ports
>

I don't think we want to put a timestamp there. We can figure out which
ports are orphaned by checking if a port's device_id in quantum is still an
active instance_id in nova (which currently isn't true but would be if the
port-create is moved out of compute to api) and that device_owner is
nova:compute.
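
A rough sketch of what that cleanup pass could look like (python-neutronclient
and python-novaclient, with illustrative credentials and the nova:compute
device_owner convention used above):

    # Sketch: delete autocreated ports whose device_id no longer maps to a
    # live nova instance. Credentials and the device_owner value are
    # illustrative only.
    from neutronclient.v2_0 import client as neutron_client
    from novaclient import exceptions as nova_exc
    from novaclient.v1_1 import client as nova_client

    AUTH_URL = "http://127.0.0.1:5000/v2.0"
    neutron = neutron_client.Client(username="admin", password="secret",
                                    tenant_name="admin", auth_url=AUTH_URL)
    nova = nova_client.Client("admin", "secret", "admin", AUTH_URL)

    for port in neutron.list_ports(device_owner="nova:compute")["ports"]:
        try:
            nova.servers.get(port["device_id"])
        except nova_exc.NotFound:
            # No active instance owns this port; it was orphaned by a failed
            # schedule or detach, so reclaim it.
            neutron.delete_port(port["id"])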


> - nova-compute always uses port-update (or, better still, have a
> distinct call that for now works like port-update but clearly
> represents an attach or detach and not a user-initiated update,
> improving the plugin division of labour, but that can be a separate
> proposal) and *never* creates a port; attaching to an
> apparently-attached port attached to the same instance should ensure
> that a previous attachment is destroyed, which should cover the
> multiple-schedule lost-reply case
>

agree

> - nova-compute is always talked to in terms of ports, and never in
> terms of networks (a big improvement imo)
>

agree!

- nova-compute attempts to remove autocreated ports on detach

- a cleanup job in nova-api (or nova-conductor?) cleans up expired
> autocreated ports with no attachment or a broken attachment (which
> would catch failed detachments as well as failed schedules)


> how does that work for people?  It seems to improve the internal
> interface and the transactionality, it means that there's not the
> slightly nasty (and even faintly race-prone) create-update logic in
> nova-compute, it even simplifies the nova-compute interface - though
> we would need to consider how an upgrade path would work, there; newer
> API with older compute should work fine, the reverse not so much.
>

I agree, ensuring backwards compatibility if the compute nodes are updated
and not the api nodes would be slightly tricky. I'd hope we could get away
with release noting that you need to update the api nodes first.


> --
> Ian.
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Revert Pass instance host-id to Quantum using port bindings extension.

2013-07-19 Thread Ian Wells
> [arosen] - sure, in this case though then we'll have to add even more
> queries between nova-compute and quantum as nova-compute will need to query
> quantum for ports matching the device_id to see if the port was already
> created and if not try to create them.

The cleanup job doesn't look like a job for nova-compute regardless of the rest.

> Moving the create may for other reasons be a good idea (because compute
> would *always* deal with ports and *never* with networks - a simpler API) -
> but it's nothing to do with solving this problem.
>
> [arosen] - It does solve this issue because it moves the quantum port-create
> calls outside of the retry schedule logic on that compute node. Therefore if
> the port fails to create the instance goes to error state.  Moving networks
> out of the nova-api will also solve this issue for us as the client then
> won't rely on nova anymore to create the port. I'm wondering if creating an
> additional network_api_class like nova.network.quantumv2.api.NoComputeAPI is
> the way to prove this out. Most of the code in there would inherit from
> nova.network.quantumv2.api.API .

OK, so if we were to say that:

- nova-api creates the port with an expiry timestamp to catch orphaned
autocreated ports
- nova-compute always uses port-update (or, better still, have a
distinct call that for now works like port-update but clearly
represents an attach or detach and not a user-initiated update,
improving the plugin division of labour, but that can be a separate
proposal) and *never* creates a port; attaching to an
apparently-attached port attached to the same instance should ensure
that a previous attachment is destroyed, which should cover the
multiple-schedule lost-reply case
- nova-compute is always talked to in terms of ports, and never in
terms of networks (a big improvement imo)
- nova-compute attempts to remove autocreated ports on detach
- a cleanup job in nova-api (or nova-conductor?) cleans up expired
autocreated ports with no attachment or a broken attachment (which
would catch failed detachments as well as failed schedules)

how does that work for people?  It seems to improve the internal
interface and the transactionality, it means that there's not the
slightly nasty (and even faintly race-prone) create-update logic in
nova-compute, it even simplifies the nova-compute interface - though
we would need to consider how an upgrade path would work, there; newer
API with older compute should work fine, the reverse not so much.
-- 
Ian.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Joshua Harlow
I remember trying to make this argument myself about a month or 2 ago. I agree 
with the thought & the splitting-up "principle", just unsure of the timing.

Taskflow (the library) I am hoping can become a useful library for making these 
complicated flows less complex. WIP of course :)

Honestly I think it's not just nova that sees this issue with flows and how to 
scale them outwards reliably. But this is one of the big challenges (changing 
the tires on the car while it's moving)...

Sent from my really tiny device...

On Jul 19, 2013, at 7:01 AM, "Day, Phil"  wrote:

> Hi Josh,
> 
> My idea's really pretty simple - make "DB proxy" and "Task workflow" separate 
> services, and allow people to co-locate them if they want to.
> 
> Cheers.
> Phil
> 
>> -Original Message-
>> From: Joshua Harlow [mailto:harlo...@yahoo-inc.com]
>> Sent: 17 July 2013 14:57
>> To: OpenStack Development Mailing List
>> Cc: OpenStack Development Mailing List
>> Subject: Re: [openstack-dev] Moving task flow to conductor - concern about
>> scale
>> 
>> Hi Phil,
>> 
>> I understand and appreciate your concern and I think everyone is trying to 
>> keep
>> that in mind. It still appears to me to be too early in this refactoring and 
>> task
>> restructuring effort to tell where it may "end up". I think that's also good 
>> news
>> since we can get these kinds of ideas (componentized conductors if u will) to
>> handle your (and mine) scaling concerns. It would be pretty neat if said
>> conductors could be scaled at different rates depending on their component,
>> although as u said we need to get much much better with handling said
>> patterns (as u said just 2 schedulers is a pita right now). I believe we can 
>> do it,
>> given the right kind of design and scaling "principles" we build in from the 
>> start
>> (right now).
>> 
>> Would like to hear more of your ideas so they get incorporated earlier rather
>> than later.
>> 
>> Sent from my really tiny device..
>> 
>> On Jul 16, 2013, at 9:55 AM, "Dan Smith"  wrote:
>> 
 In the original context of using Conductor as a database proxy then
 the number of conductor instances is directly related to the number
 of compute hosts I need them to serve.
>>> 
>>> Just a point of note, as far as I know, the plan has always been to
>>> establish conductor as a thing that sits between the api and compute
>>> nodes. However, we started with the immediate need, which was the
>>> offloading of database traffic.
>>> 
 What I not sure is that I would also want to have the same number of
 conductor instances for task control flow - historically even running
 2 schedulers has been a problem, so the thought of having 10's of
 them makes me very concerned at the moment.   However I can't see any
 way to specialise a conductor to only handle one type of request.
>>> 
>>> Yeah, I don't think the way it's currently being done allows for
>>> specialization.
>>> 
>>> Since you were reviewing actual task code, can you offer any specifics
>>> about the thing(s) that concern you? I think that scaling conductor
>>> (and its tasks) horizontally is an important point we need to achieve,
>>> so if you see something that needs tweaking, please point it out.
>>> 
>>> Based on what is there now and proposed soon, I think it's mostly
>>> fairly safe, straightforward, and really no different than what two
>>> computes do when working together for something like resize or migrate.
>>> 
 So I guess my question is, given that it may have to address two
 independent scale drivers, is putting task work flow and DB proxy
 functionality into the same service really the right thing to do - or
 should there be some separation between them.
>>> 
>>> I think that we're going to need more than one "task" node, and so it
>>> seems appropriate to locate one scales-with-computes function with
>>> another.
>>> 
>>> Thanks!
>>> 
>>> --Dan
>>> 
>>> ___
>>> OpenStack-dev mailing list
>>> OpenStack-dev@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> 
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] New DB column or new DB table?

2013-07-19 Thread Jiang, Yunhong
The  "lazy load" is , with lazy load, for example, the framework don't need 
fetch the PCI information if no PCI filter specified.

The discussion on 
'http://markmail.org/message/gxoqi6coscd2lhwo#query:+page:1+mid:7ksr6byyrpcgkqjv+state:results'
   gives a lot of information.
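
For anyone less familiar with the trade-off being referenced, here is a
minimal SQLAlchemy sketch (illustrative models, not Nova's actual schema) of
loading a related table lazily versus joining it in only for requests that
actually filter on it:

    # Illustrative only: lazy vs. joined loading of a PCI table hanging off
    # compute_nodes, which is the "lazy load" trade-off discussed above.
    from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import joinedload, relationship, sessionmaker

    Base = declarative_base()


    class ComputeNode(Base):
        __tablename__ = 'compute_nodes'
        id = Column(Integer, primary_key=True)
        host = Column(String(255))
        # lazy='select': PCI rows are only fetched if something touches them.
        pci_devices = relationship('PciDevice', lazy='select')


    class PciDevice(Base):
        __tablename__ = 'pci_devices'
        id = Column(Integer, primary_key=True)
        compute_node_id = Column(Integer, ForeignKey('compute_nodes.id'))
        vendor_id = Column(String(8))


    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    session = sessionmaker(bind=engine)()

    # No PCI filter in the request: one query, the PCI table is never read.
    nodes = session.query(ComputeNode).all()

    # A PCI filter is present: eagerly join the extra table for this request.
    nodes_with_pci = session.query(ComputeNode).options(
        joinedload(ComputeNode.pci_devices)).all()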

--jyh



From: Boris Pavlovic [mailto:bo...@pavlovic.me]
Sent: Friday, July 19, 2013 1:07 PM
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] [Nova] New DB column or new DB table?

Jiang,

I would like to reduce "magic"

1) We are already using RPC (because all compute node updates are done in the DB 
via the conductor, which means RPC calls).
So the count of RPC calls and the size of the messages will be the same.

2) There is no lazy load when you have to fetch all data about all compute 
nodes on every request to the scheduler.

3) Object models are off topic

Best regards,
Boris Pavlovic

Mirantis Inc.



On Fri, Jul 19, 2013 at 11:23 PM, Jiang, Yunhong <yunhong.ji...@intel.com> wrote:
Boris,
   I think you in fact covered two topics. One is whether to use the DB or RPC for 
communication. This has been discussed a lot, but I didn't find the conclusion. 
From the discussion, it seems the key thing is the fan-out messages. I'd suggest 
you bring this to the scheduler sub-meeting.

http://eavesdrop.openstack.org/meetings/scheduler/2013/scheduler.2013-06-11-14.59.log.html
http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg00070.html
http://comments.gmane.org/gmane.comp.cloud.openstack.devel/23

   The second topic is adding extra tables for compute nodes. I think we 
need lazy loading for the compute node, and also I think that with the object model 
we can further improve it if we utilize the compute node object.

Thanks
--jyh


From: Boris Pavlovic [mailto:bo...@pavlovic.me]
Sent: Friday, July 19, 2013 10:07 AM

To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] [Nova] New DB column or new DB table?

Hi all,

We have too many different branches about the scheduler (so I have to repeat here 
also).

I am against adding extra tables that will be joined to the compute_nodes 
table on each scheduler request (or adding large text columns),
because it makes our non-scalable scheduler even less scalable.

Also, if we just remove the DB between the scheduler and compute nodes we will get 
a really good improvement in all aspects (performance, db load, network traffic, 
scalability).
And it will also be easier to use other resource providers (cinder, 
ceilometer, e.g.) in the Nova scheduler.

And one more thing: this all could be implemented really simply in current Nova, 
without big changes:
 
https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit?usp=sharing


Best regards,
Boris Pavlovic

Mirantis Inc.

On Fri, Jul 19, 2013 at 8:44 PM, Dan Smith <d...@danplanet.com> wrote:
> IIUC, Ceilometer is currently a downstream consumer of data from
> Nova, but no functionality in Nova is a consumer of data from
> Ceilometer. This is good split from a security separation point of
> view, since the security of Nova is self-contained in this
> architecture.
>
> If Nova schedular becomes dependant on data from ceilometer, then now
> the security of Nova depends on the security of Ceilometer, expanding
> the attack surface. This is not good architecture IMHO.
Agreed.

> At the same time, I hear your concerns about the potential for
> duplication of stats collection functionality between Nova &
> Ceilometer. I don't think we neccessarily need to remove 100% of
> duplication. IMHO probably the key thing is for the virt drivers to
> expose a standard API for exporting the stats, and make sure that
> both ceilometer & nova schedular use the same APIs and ideally the
> same data feed, so we're not invoking the same APIs twice to get the
> same data.
I imagine there's quite a bit that could be shared, without dependency
between the two. Interfaces out of the virt drivers may be one, and the
code that boils numbers into useful values, as well as perhaps the
format of the JSON blobs that are getting shoved into the database.
Perhaps a ceilo-core library with some very simple primitives and
definitions could be carved out, which both nova and ceilometer could
import for consistency, without a runtime dependency?
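
Purely as an illustration of that idea (all names made up, not an existing
library), such shared primitives could be as small as:

    # Illustrative only: the kind of tiny, dependency-free primitives module
    # that both nova and ceilometer could import.
    def cpu_util_percent(stats_now, stats_prev, interval_s, num_cpus):
        """Boil two raw cputime samples (nanoseconds) into a utilisation %."""
        delta_ns = stats_now['cpu_time_ns'] - stats_prev['cpu_time_ns']
        return 100.0 * delta_ns / (interval_s * 1e9 * num_cpus)


    HOST_STATS_KEYS = ('free_ram_mb', 'free_disk_gb', 'running_vms')


    def normalize_host_stats(raw):
        """Shared JSON-blob shape for host stats, whoever produces them."""
        return dict((k, raw.get(k, 0)) for k in HOST_STATS_KEYS)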

--Dan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Joshua Harlow
This seems to me to be a good example where a library "problem" is leaking into 
the openstack architecture right? That is IMHO a bad path to go down.

I like to think of a world where this isn't a problem, design the correct 
solution there, and fix the eventlet problem instead. Other large 
applications don't fall back to rpc calls to get around database/eventlet 
scaling issues afaik. 

Honestly I would almost just want to finally fix the eventlet problem (chris b. 
I think has been working on it) and design a system that doesn't try to work 
around a library's shortcomings. But maybe that's too much idealism, idk...

This doesn't even touch on the synchronization issues that can happen when u 
start pumping db traffic over a mq. Ex, an update is now queued behind another 
update, the second one conflicts with the first; where does resolution happen 
when an async mq call is used? What about when you have X conductors doing Y 
reads and Z updates; I don't even want to think about the sync/races there (and 
so on...). Did u hit / check for any consistency issues in your tests? 
Consistency issues under high load using multiple conductors scare the bejezzus 
out of me.

Sent from my really tiny device...

On Jul 19, 2013, at 10:58 AM, "Peter Feiner"  wrote:

> On Fri, Jul 19, 2013 at 10:15 AM, Dan Smith  wrote:
>> 
>>> So rather than asking "what doesn't work / might not work in the
>>> future" I think the question should be "aside from them both being
>>> things that could be described as a conductor - what's the
>>> architectural reason for wanting to have these two separate groups of
>>> functionality in the same service ?"
>> 
>> IMHO, the architectural reason is "lack of proliferation of services and
>> the added complexity that comes with it." If one expects the
>> proxy workload to always overshadow the task workload, then making
>> these two things a single service makes things a lot simpler.
> 
> I'd like to point a low-level detail that makes scaling nova-conductor
> at the process level extremely compelling: the database driver
> blocking the eventlet thread serializes nova's database access.
> 
> Since the database connection driver is typically implemented in a
> library beyond the purview of eventlet's monkeypatching (i.e., a
> native python extension like _mysql.so), blocking database calls will
> block all eventlet coroutines. Since most of what nova-conductor does
> is access the database, a nova-conductor process's handling of
> requests is effectively serial.
> 
> Nova-conductor is the gateway to the database for nova-compute
> processes.  So permitting a single nova-conductor process would
> effectively serialize all database queries during instance creation,
> deletion, periodic instance refreshes, etc. Since these queries are
> made frequently (i.e., easily 100 times during instance creation) and
> while other global locks are held (e.g., in the case of nova-compute's
> ResourceTracker), most of what nova-compute does becomes serialized.
> 
> In parallel performance experiments I've done, I have found that
> running multiple nova-conductor processes is the best way to mitigate
> the serialization of blocking database calls. Say I am booting N
> instances in parallel (usually up to N=40). If I have a single
> nova-conductor process, the duration of each nova-conductor RPC
> increases linearly with N, which can add _minutes_ to instance
> creation time (i.e., dozens of RPCs, some taking several seconds).
> However, if I run N nova-conductor processes in parallel, then the
> duration of the nova-conductor RPCs do not increase with N; since each
> RPC is most likely handled by a different nova-conductor, serial
> execution of each process is moot.
> 
> Note that there are alternative methods for preventing the eventlet
> thread from blocking during database calls. However, none of these
> alternatives performed as well as multiple nova-conductor processes:
> 
> Instead of using the native database driver like _mysql.so, you can
> use a pure-python driver, like pymysql by setting
> sql_connection=mysql+pymysql://... in the [DEFAULT] section of
> /etc/nova/nova.conf, which eventlet will monkeypatch to avoid
> blocking. The problem with this approach is the vastly greater CPU
> demand of the pure-python driver compared to the native driver. Since
> the pure-python driver is so much more CPU intensive, the eventlet
> thread spends most of its time talking to the database, which
> effectively the problem we had before!
> 
> Instead of making database calls from eventlet's thread, you can
> submit them to eventlet's pool of worker threads and wait for the
> results. Try this by setting dbapi_use_tpool=True in the [DEFAULT]
> section of /etc/nova/nova.conf. The problem I found with this approach
> was the overhead of synchronizing with the worker threads. In
> particular, the time elapsed between the worker thread finishing and
> the waiting coroutine being resumed was typically several

Re: [openstack-dev] A simple way to improve nova scheduler

2013-07-19 Thread Sandy Walsh


On 07/19/2013 05:36 PM, Boris Pavlovic wrote:
> Sandy,
> 
> I don't think that we have such problems here.
> Because scheduler doesn't poll compute_nodes. 
> The situation is another compute_nodes notify scheduler about their
> state. (instead of updating their state in DB)
> 
> So for example if scheduler send request to compute_node, compute_node
> is able to run rpc call to schedulers immediately (not after 60sec).
> 
> So there is almost no races.

There are races that occur between the eventlet request threads. This is
why the scheduler has been switched to single threaded and we can only
run one scheduler.

This problem may have been eliminated with the work that Chris Behrens
and Brian Elliott were doing, but I'm not sure.

But certainly, the old approach of having the compute node broadcast
status every N seconds is not suitable and was eliminated a long time ago.

> 
> 
> Best regards,
> Boris Pavlovic
> 
> Mirantis Inc. 
> 
> 
> 
> On Sat, Jul 20, 2013 at 12:23 AM, Sandy Walsh wrote:
> 
> 
> 
> On 07/19/2013 05:01 PM, Boris Pavlovic wrote:
> > Sandy,
> >
> > Hm I don't know that algorithm. But our approach doesn't have
> > exponential exchange.
> > I don't think that in 10k nodes cloud we will have a problems with 150
> > RPC call/sec. Even in 100k we will have only 1.5k RPC call/sec.
> > More then (compute nodes update their state in DB through conductor
> > which produce the same count of RPC calls).
> >
> > So I don't see any explosion here.
> 
> Sorry, I was commenting on Soren's suggestion from way back (essentially
> listening on a separate exchange for each unique flavor ... so no
> scheduler was needed at all). It was a great idea, but fell apart rather
> quickly.
> 
> The existing approach the scheduler takes is expensive (asking the db
> for state of all hosts) and polling the compute nodes might be do-able,
> but you're still going to have latency problems waiting for the
> responses (the states are invalid nearly immediately, especially if a
> fill-first scheduling algorithm is used). We ran into this problem
> before in an earlier scheduler implementation. The round-tripping kills.
> 
> We have a lot of really great information on Host state in the form of
> notifications right now. I think having a service (or notification
> driver) listening for these and keeping an the HostState incrementally
> updated (and reported back to all of the schedulers via the fanout
> queue) would be a better approach.
> 
> -S
> 
> 
> >
> > Best regards,
> > Boris Pavlovic
> >
> > Mirantis Inc.
> >
> >
> > On Fri, Jul 19, 2013 at 11:47 PM, Sandy Walsh
> mailto:sandy.wa...@rackspace.com>
> >  >> wrote:
> >
> >
> >
> > On 07/19/2013 04:25 PM, Brian Schott wrote:
> > > I think Soren suggested this way back in Cactus to use MQ
> for compute
> > > node state rather than database and it was a good idea then.
> >
> > The problem with that approach was the number of queues went
> exponential
> > as soon as you went beyond simple flavors. Add Capabilities or
> other
> > criteria and you get an explosion of exchanges to listen to.
> >
> >
> >
> > > On Jul 19, 2013, at 10:52 AM, Boris Pavlovic
> mailto:bo...@pavlovic.me>
> > >
> > > 
>  > >
> > >> Hi all,
> > >>
> > >>
> > >> In Mirantis Alexey Ovtchinnikov and me are working on nova
> scheduler
> > >> improvements.
> > >>
> > >> As far as we can see the problem, now scheduler has two
> major issues:
> > >>
> > >> 1) Scalability. Factors that contribute to bad scalability
> are these:
> > >> *) Each compute node every periodic task interval (60 sec
> by default)
> > >> updates resources state in DB.
> > >> *) On every boot request scheduler has to fetch information
> about all
> > >> compute nodes from DB.
> > >>
> > >> 2) Flexibility. Flexibility perishes due to problems with:
> > >> *) Adding new complex resources (such as big lists of complex
> > objects
> > >> e.g. required by PCI Passthrough
> > >>
> >
> https://review.openstack.org/#/c/34644/5/nova/db/sqlalchemy/models.py)
> > >> *) Using different sources of data in Scheduler for example
> from
> > >> cinder or ceilometer.
> > >> (as required by Volume Affinity Filter
> > >> https://review.openstack.org/#/c/29343/)
> > >>
> > >>

Re: [openstack-dev] Revert Pass instance host-id to Quantum using port bindings extension.

2013-07-19 Thread Aaron Rosen
On Fri, Jul 19, 2013 at 8:47 AM, Kyle Mestery (kmestery)  wrote:

> On Jul 18, 2013, at 5:16 PM, Aaron Rosen  wrote:
> >
> > Hi,
> >
> > I wanted to raise another design failure of why creating the port on
> nova-compute is bad. Previously, we have encountered this bug (
> https://bugs.launchpad.net/neutron/+bug/1160442). What was causing the
> issue was that when nova-compute calls into quantum to create the port;
> quantum creates the port but fails to return the port to nova and instead
> times out. When this happens the instance is scheduled to be run on another
> compute node where another port is created with the same device_id and when
> the instance boots it will look like it has two ports. This is still a
> problem that can occur today in our current implementation (!).
> >
> > I think in order to move forward with this we'll need to compromise.
> Here is my though on how we should proceed.
> >
> > 1) Modify the quantum API so that mac addresses can now be updated via
> the api. There is no reason why we have this limitation (especially once
> the patch that uses dhcp_release is merged as it will allow us to update
> the lease for the new mac immediately).  We need to do this in order for
> bare metal support as we need to match the mac address of the port to the
> compute node.
> >
> I don't understand how this relates to creating a port through
> nova-compute. I'm not saying this is a bad idea, I just don't see how it
> relates to the original discussion point on this thread around Yong's patch.
>
> > 2) move the port-creation from nova-compute to nova-api. This will solve
> a number of issues like the one i pointed out above.
> >
> This seems like a bad idea. So now a Nova API call will implicitly create
> a Neutron port? What happens on failure here? The caller isn't aware the
> port was created in Neutron if it's implicit, so who cleans things up? Or
> if the caller is aware, than all we've done is move an API the caller would
> have done (nova-compute in this case) into nova-api, though the caller is
> now still aware of what's happening.
>

On failure here the VM will go to the ERROR state if the port fails to be
created in quantum. Then, when deleting the instance, the delete code should
also search quantum for the device_id in order to remove the port there as
well.

 The issue here is that if an instance fails to boot on a compute node
(because nova-compute did not get the port-create response from quantum and
the port was actually created) the instance gets scheduled to be booted on
another nova-compute node, where the duplicate create happens. Moving the
creation to the API node takes the port creation out of that retry
logic, which solves this.

>
> > 3)  For now, i'm okay with leaving logic on the compute node that calls
> update-port if the port binding extension is loaded. This will allow the
> vif type to be correctly set as well.
> >
> And this will also still pass in the hostname the VM was booted on?
>
> In this case there would have to be an update-port call done on the
compute node which would set the hostname (which is the same case as live
migration).


> To me, this thread seems to have diverged a bit from the original
> discussion point around Yong's patch. Yong's patch makes sense, because
> it's passing the hostname the VM is booted on during port create. It also
> updates the binding during a live migration, so that case is covered. Any
> change to this behavior should cover both those cases and not involve any
> sort of agent polling, IMHO.
>
> Thanks,
> Kyle
>
> > Thoughts/Comments?
> >
> > Thanks,
> >
> > Aaron
> >
> >
> > On Mon, Jul 15, 2013 at 2:45 PM, Aaron Rosen  wrote:
> >
> >
> >
> > On Mon, Jul 15, 2013 at 1:26 PM, Robert Kukura 
> wrote:
> > On 07/15/2013 03:54 PM, Aaron Rosen wrote:
> > >
> > >
> > >
> > > On Sun, Jul 14, 2013 at 6:48 PM, Robert Kukura  > > > wrote:
> > >
> > > On 07/12/2013 04:17 PM, Aaron Rosen wrote:
> > > > Hi,
> > > >
> > > >
> > > > On Fri, Jul 12, 2013 at 6:47 AM, Robert Kukura <
> rkuk...@redhat.com
> > > 
> > > > >> wrote:
> > > >
> > > > On 07/11/2013 04:30 PM, Aaron Rosen wrote:
> > > > > Hi,
> > > > >
> > > > > I think we should revert this patch that was added here
> > > > > (https://review.openstack.org/#/c/29767/). What this patch
> > > does is
> > > > when
> > > > > nova-compute calls into quantum to create the port it
> passes
> > > in the
> > > > > hostname on which the instance was booted on. The idea of
> the
> > > > patch was
> > > > > that providing this information would "allow hardware
> device
> > > vendors
> > > > > management stations to allow them to segment the network in
> > > a more
> > > > > precise manager (for example automatically trunk the vlan
> on the
> >

Re: [openstack-dev] Revert Pass instance host-id to Quantum using port bindings extension.

2013-07-19 Thread Kyle Mestery (kmestery)
On Jul 19, 2013, at 1:58 PM, Aaron Rosen  wrote:
> 
> 
> 
> 
> On Fri, Jul 19, 2013 at 8:47 AM, Kyle Mestery (kmestery)  
> wrote:
> On Jul 18, 2013, at 5:16 PM, Aaron Rosen  wrote:
> >
> > Hi,
> >
> > I wanted to raise another design failure of why creating the port on 
> > nova-compute is bad. Previously, we have encountered this bug 
> > (https://bugs.launchpad.net/neutron/+bug/1160442). What was causing the 
> > issue was that when nova-compute calls into quantum to create the port; 
> > quantum creates the port but fails to return the port to nova and instead 
> > times out. When this happens the instance is scheduled to be run on another 
> > compute node where another port is created with the same device_id and when 
> > the instance boots it will look like it has two ports. This is still a 
> > problem that can occur today in our current implementation (!).
> >
> > I think in order to move forward with this we'll need to compromise. Here 
> > is my though on how we should proceed.
> >
> > 1) Modify the quantum API so that mac addresses can now be updated via the 
> > api. There is no reason why we have this limitation (especially once the 
> > patch that uses dhcp_release is merged as it will allow us to update the 
> > lease for the new mac immediately).  We need to do this in order for bare 
> > metal support as we need to match the mac address of the port to the 
> > compute node.
> >
> I don't understand how this relates to creating a port through nova-compute. 
> I'm not saying this is a bad idea, I just don't see how it relates to the 
> original discussion point on this thread around Yong's patch.
> 
> > 2) move the port-creation from nova-compute to nova-api. This will solve a 
> > number of issues like the one i pointed out above.
> >
> This seems like a bad idea. So now a Nova API call will implicitly create a 
> Neutron port? What happens on failure here? The caller isn't aware the port 
> was created in Neutron if it's implicit, so who cleans things up? Or if the 
> caller is aware, than all we've done is move an API the caller would have 
> done (nova-compute in this case) into nova-api, though the caller is now 
> still aware of what's happening.
> 
> On failure here the VM will go to ERROR state if the port is failed to create 
> in quantum. Then when deleting the instance; the delete code should also 
> search quantum for the device_id in order to remove the port there as well.
> 
So, nova-compute will implicitly know the port was created by nova-api, and if 
a failure happens, it will clean up the port? That doesn't sound like a 
balanced solution to me, and seems to tie nova-compute and nova-api close 
together when it comes to launching VMs with Neutron ports.

>  The issue here is that if an instance fails to boot on a compute node 
> (because nova-compute did not get the port-create response from quantum and 
> the port was actually created) the instance gets scheduled to be booted on 
> another nova-compute node where the duplicate create happens. Moving the 
> creation to the API node removes the port from getting created in the retry 
> logic that solves this. 
> 
I think Ian's comments on your blueprint [1] address this exact problem, can 
you take a look at them there?

[1] https://blueprints.launchpad.net/nova/+spec/nova-api-quantum-create-port

> > 3)  For now, i'm okay with leaving logic on the compute node that calls 
> > update-port if the port binding extension is loaded. This will allow the 
> > vif type to be correctly set as well.
> >
> And this will also still pass in the hostname the VM was booted on?
> 
> In this case there would have to be an update-port call done on the compute 
> node which would set the hostname (which is the same case as live migration). 
>  
Just to be sure I understand, nova-compute will do this or this will be the 
responsibility of some neutron agent?

Thanks,
Kyle

> To me, this thread seems to have diverged a bit from the original discussion 
> point around Yong's patch. Yong's patch makes sense, because it's passing the 
> hostname the VM is booted on during port create. It also updates the binding 
> during a live migration, so that case is covered. Any change to this behavior 
> should cover both those cases and not involve any sort of agent polling, IMHO.
> 
> Thanks,
> Kyle
> 
> > Thoughts/Comments?
> >
> > Thanks,
> >
> > Aaron
> >
> >
> > On Mon, Jul 15, 2013 at 2:45 PM, Aaron Rosen  wrote:
> >
> >
> >
> > On Mon, Jul 15, 2013 at 1:26 PM, Robert Kukura  wrote:
> > On 07/15/2013 03:54 PM, Aaron Rosen wrote:
> > >
> > >
> > >
> > > On Sun, Jul 14, 2013 at 6:48 PM, Robert Kukura  > > > wrote:
> > >
> > > On 07/12/2013 04:17 PM, Aaron Rosen wrote:
> > > > Hi,
> > > >
> > > >
> > > > On Fri, Jul 12, 2013 at 6:47 AM, Robert Kukura  > > 
> > > > >> wrote:
> > > >
> > > > On 07/11/2013 04:3

Re: [openstack-dev] [keystone] sqlite doesn't support migrations

2013-07-19 Thread Joe Gordon
Along these lines, OpenStack is now maintaining sqlalchemy-migrate, and we
have our first patch up for better SQLite support, taken from nova,
https://review.openstack.org/#/c/37656/

Do we want to go in the direction of explicitly not supporting SQLite and not
running migrations with it, like keystone?  If so, we may not want the patch
above to get merged.


On Wed, Jul 17, 2013 at 10:40 AM, Adam Young  wrote:

> On 07/16/2013 10:58 AM, Thomas Goirand wrote:
>
>> On 07/16/2013 03:55 PM, Michael Still wrote:
>>
>>> On Tue, Jul 16, 2013 at 4:17 PM, Thomas Goirand  wrote:
>>>
>>>  Could you explain a bit more what could be done to fix it in an easy
 way, even if it's not efficient? I understand that ALTER doesn't work
 well. Though would we have the possibility to just create a new
 temporary table with the correct fields, and copy the existing content
 in it, then rename the temp table so that it replaces the original one?

>>> There are a bunch of nova migrations that already work that way...
>>> Check out the *sqlite* files in
>>> nova/db/sqlalchemy/migrate_repo/versions/
>>>
>> Why can't we do that with Keystone then? Is it too much work? It doesn't
>> seem hard to do (just probably a bit annoying and boring ...). Does it
>> represent too much work for the case of Keystone?
>>
>> Thomas
>>
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> In general, yes, the SQlite migrations have eaten up far more time than
> the benefit they provide.
>
> Sqlite migrations don't buy us much.  A live deployment should not be on
> sqlite.  It is OK to use it for testing, though.  As such, it should be
> possible to generate a sqlite schema off the model the way that the unit
> tests do.  You just will not be able to migrate it forward:  later changes
> will require regenerating the database.
>
>
Agreed, so let's change the default database away from SQLite everywhere and
we can just override the default when needed for testing.
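
A minimal sketch of the approach Adam describes, with an illustrative model
rather than Keystone's real schema: tests build the SQLite schema straight
from the models, while real deployments run the versioned migrations against
MySQL/PostgreSQL:

    # Tests create the schema directly from the SQLAlchemy models on an
    # in-memory SQLite DB, so no SQLite-specific migrations are needed.
    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()


    class User(Base):                      # stand-in for the real model module
        __tablename__ = 'user'
        id = Column(Integer, primary_key=True)
        name = Column(String(64))


    def engine_for_tests():
        engine = create_engine('sqlite://')
        Base.metadata.create_all(engine)   # schema matches the current models
        return engine

    # Real deployments (MySQL/PostgreSQL) run the migration repository
    # instead, e.g. with sqlalchemy-migrate:
    #   from migrate.versioning import api as versioning_api
    #   versioning_api.upgrade('mysql://user:pw@host/keystone', repo_path)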



> For any non-trivial deployment, we should encourage the use of MySQL or
> PostgreSQL.  Migrations for these will be fully supported.
>
> From a unit test perspective, we are going to disable the sqlite migration
> tests.  Users that need to perform testing of migrations should instead
> test them against both mysql and postgresql. the IBMers out there will also
> be responsible for keeping on top of the DB2 variations.  The
> [My/postgre]Sql migration tests will be run as part of the gate.
>
> The current crop of unit tests for the SQL backend use the module based
> approach to generate a sqlite database.  This approach is fine.  Sqlite,
> when run in memory, provides a very fast sanity check of our logic.  MySQL
> and PostgreSQL versions would take far, far too long to execute for
> everyday development. We will, however, make it possible to execute those
> body of unit tests against both postgresql and mysql as part of the gate.
>  We will also execute against the LDAP backend.  In order to be able to
> perform that, however, we need to fix a bunch of the tests so that they run
> at 100% first.
>
>
>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] A simple way to improve nova scheduler

2013-07-19 Thread Sandy Walsh


On 07/19/2013 04:25 PM, Brian Schott wrote:
> I think Soren suggested this way back in Cactus to use MQ for compute
> node state rather than database and it was a good idea then. 

The problem with that approach was the number of queues went exponential
as soon as you went beyond simple flavors. Add Capabilities or other
criteria and you get an explosion of exchanges to listen to.
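
Back-of-the-envelope (all numbers made up): with one exchange per distinct
flavor and capability subset, the count multiplies quickly:

    # One exchange per (flavor, capability subset) combination: the number of
    # capability subsets alone is 2**N, so the total blows up quickly.
    flavors = 20           # illustrative flavor count
    capabilities = 8       # gpu, ssd, trusted-boot, ... (illustrative)

    capability_subsets = 2 ** capabilities
    print(flavors * capability_subsets)    # 20 * 256 = 5120 exchanges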



> On Jul 19, 2013, at 10:52 AM, Boris Pavlovic  > wrote:
> 
>> Hi all, 
>>
>>
>> In Mirantis Alexey Ovtchinnikov and me are working on nova scheduler
>> improvements.
>>
>> As far as we can see the problem, now scheduler has two major issues:
>>
>> 1) Scalability. Factors that contribute to bad scalability are these:
>> *) Each compute node every periodic task interval (60 sec by default)
>> updates resources state in DB.
>> *) On every boot request scheduler has to fetch information about all
>> compute nodes from DB.
>>
>> 2) Flexibility. Flexibility perishes due to problems with:
>> *) Adding new complex resources (such as big lists of complex objects
>> e.g. required by PCI Passthrough
>> https://review.openstack.org/#/c/34644/5/nova/db/sqlalchemy/models.py)
>> *) Using different sources of data in Scheduler for example from
>> cinder or ceilometer.
>> (as required by Volume Affinity Filter
>> https://review.openstack.org/#/c/29343/)
>>
>>
>> We found a simple way to mitigate this issues by avoiding of DB usage
>> for host state storage.
>>
>> A more detailed discussion of the problem state and one of a possible
>> solution can be found here:
>>
>> https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit#
>>
>>
>> Best regards,
>> Boris Pavlovic
>>
>> Mirantis Inc.
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> 
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] Challenges with highly available service VMs - port and security group options.

2013-07-19 Thread Aaron Rosen
On Fri, Jul 19, 2013 at 1:55 AM, Samuel Bercovici wrote:

>  Hi,
>
> ** **
>
> I have completely missed this discussion as it does not have
> quantum/Neutron in the subject (modify it now)
>
> I think that the security group is the right place to control this.
>
> I think that this might be only allowed to admins.
>
> **
>
I think this shouldn't be admin-only: since tenants have control of their
own networks, they should be allowed to do this.

>
>
> Let me explain what we need, which is more than just disabling spoofing.
>
> 1. Be able to allow MACs which are not defined on the port
> level to transmit packets (for example VRRP MACs) == turn off MAC spoofing
>

For this it seems you would need to implement the port security extension,
which allows one to enable/disable the spoofing checks on a port.

> 
>
> 2. Be able to allow IPs which are not defined on the port level
> to transmit packets (for example, IP used for HA service that moves between
> an HA pair) == turn off IP spoofing
>

It seems like this would fit your use case perfectly:
https://blueprints.launchpad.net/neutron/+spec/allowed-address-pairs
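
As a rough sketch of how that could look from python-neutronclient once the
extension lands as proposed (the port UUID, credentials and addresses below
are made up for illustration):

    from neutronclient.v2_0 import client

    neutron = client.Client(username='admin', password='secret',
                            tenant_name='demo',
                            auth_url='http://127.0.0.1:5000/v2.0/')

    # Allow the VRRP virtual IP (and optionally its MAC) on an existing port.
    neutron.update_port(
        'PORT-UUID',
        {'port': {'allowed_address_pairs': [
            {'ip_address': '10.0.0.201'},
            {'ip_address': '10.0.0.201', 'mac_address': 'fa:16:3e:00:00:01'},
        ]}})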

> 
>
> 3. Be able to allow broadcast messages on the port (for example
> for VRRP broadcast) == allow broadcast.
>
>
>
Quantum doesn't have an abstraction for disabling this, so we already allow
it by default.

>
> Regards,
>
> -Sam.
>
>
> From: Aaron Rosen [mailto:aro...@nicira.com]
> Sent: Friday, July 19, 2013 3:26 AM
> To: OpenStack Development Mailing List
> Subject: Re: [openstack-dev] Challenges with highly available service VMs
> 
>
> ** **
>
> Yup: 
>
> I'm definitely happy to review and give hints. 
>
> Blueprint:
> https://docs.google.com/document/d/18trYtq3wb0eJK2CapktN415FRIVasr7UkTpWn9mLq5M/edit
>
> https://review.openstack.org/#/c/19279/  < patch that merged the feature;
> 
>
> Aaron
>
> ** **
>
> On Thu, Jul 18, 2013 at 5:15 PM, Ian Wells  wrote:
> 
>
> On 18 July 2013 19:48, Aaron Rosen  wrote:
> > Is there something this is missing that could be added to cover your use
> > case? I'd be curious to hear where this doesn't work for your case.  One
> > would need to implement the port_security extension if they want to
> > completely allow all ips/macs to pass and they could state which ones are
> > explicitly allowed with the allowed-address-pair extension (at least
> that is
> > my current thought).
>
> Yes - have you got docs on the port security extension?  All I've
> found so far are
>
> http://docs.openstack.org/developer/quantum/api/quantum.extensions.portsecurity.html
> and the fact that it's only the Nicira plugin that implements it.  I
> could implement it for something else, but not without a few hints...
> --
> Ian.
>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> ** **
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Neutron] Allow OVS default veth MTU to be configured.

2013-07-19 Thread Jun Cheol Park
Neutron Core Reviewers,

Could you please review the following bug fix? I have refactored the test
code.

https://review.openstack.org/#/c/27937/

Thanks,

-Jun
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] New DB column or new DB table?

2013-07-19 Thread Boris Pavlovic
Jiang,

I would like to reduce the "magic":

1) We are already using RPC, because all compute node updates are done in
the DB via the conductor (which means an RPC call).
So the count of RPC calls and the size of the messages will be the same.

2) There is no lazy loading when you have to fetch all data about all compute
nodes on every request to the scheduler.

3) Object models are off topic.

Best regards,
Boris Pavlovic

Mirantis Inc.




On Fri, Jul 19, 2013 at 11:23 PM, Jiang, Yunhong wrote:

>  Boris
>
>I think you in fact covered two topic, one is if use db or rpc for
> communication. This has been discussed a lot. But I didn’t find the
> conclusion. From the discussion,  seems the key thing is the fan out
> messages. I’d suggest you to bring this to scheduler sub meeting.
>
> ** **
>
>
> http://eavesdrop.openstack.org/meetings/scheduler/2013/scheduler.2013-06-11-14.59.log.html
> 
>
> http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg00070.html
> 
>
> http://comments.gmane.org/gmane.comp.cloud.openstack.devel/23 
>
> ** **
>
>The second topic is adding extra tables to compute nodes. I think
> we need the lazy loading for the compute node, and also I think with object
> model, we can further improve it if we utilize the compute node object.***
> *
>
> ** **
>
> Thanks
>
> --jyh
>
> ** **
>
> ** **
>
> *From:* Boris Pavlovic [mailto:bo...@pavlovic.me]
> *Sent:* Friday, July 19, 2013 10:07 AM
>
> *To:* OpenStack Development Mailing List
> *Subject:* Re: [openstack-dev] [Nova] New DB column or new DB table?
>
>  ** **
>
> Hi all, 
>
> ** **
>
> We have to much different branches about scheduler (so I have to repeat
> here also).
>
> ** **
>
> I am against to add some extra tables that will be joined to compute_nodes
> table on each scheduler request (or adding large text columns).
>
> Because it make our non scalable scheduler even less scalable. 
>
> ** **
>
> Also if we just remove DB between scheduler and compute nodes we will get
> really good improvement in all aspects (performance, db load, network
> traffic, scalability )
>
> And also it will be easily to use another resources provider (cinder,
> ceilometer e.g..) in Nova scheduler. 
>
> ** **
>
> And one more thing this all could be really simple implement in current
> Nova, without big changes 
>
>
> https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit?usp=sharing
> 
>
> ** **
>
> ** **
>
> Best regards,
>
> Boris Pavlovic 
>
> ** **
>
> Mirantis Inc.
>
> ** **
>
> On Fri, Jul 19, 2013 at 8:44 PM, Dan Smith  wrote:
>
> > IIUC, Ceilometer is currently a downstream consumer of data from
> > Nova, but no functionality in Nova is a consumer of data from
> > Ceilometer. This is good split from a security separation point of
> > view, since the security of Nova is self-contained in this
> > architecture.
> >
> > If Nova schedular becomes dependant on data from ceilometer, then now
> > the security of Nova depends on the security of Ceilometer, expanding
> > the attack surface. This is not good architecture IMHO.
>
> Agreed.
>
>
> > At the same time, I hear your concerns about the potential for
> > duplication of stats collection functionality between Nova &
> > Ceilometer. I don't think we neccessarily need to remove 100% of
> > duplication. IMHO probably the key thing is for the virt drivers to
> > expose a standard API for exporting the stats, and make sure that
> > both ceilometer & nova schedular use the same APIs and ideally the
> > same data feed, so we're not invoking the same APIs twice to get the
> > same data.
>
> I imagine there's quite a bit that could be shared, without dependency
> between the two. Interfaces out of the virt drivers may be one, and the
> code that boils numbers into useful values, as well as perhaps the
> format of the JSON blobs that are getting shoved into the database.
> Perhaps a ceilo-core library with some very simple primitives and
> definitions could be carved out, which both nova and ceilometer could
> import for consistency, without a runtime dependency?
>
> --Dan
>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> ** **
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Change in openstack/neutron[master]: Add method to get iptables traffic counters

2013-07-19 Thread Brian Haley
Hi Sylvain,

Sorry for the slow reply, I'll have to look closer next week, but I did have
some comments.

1. This isn't something a tenant should be able to do, so should be admin-only,
correct?

2. I think it would be useful for an admin to be able to add metering rules for
all tenants with a single command.  This gets back to wanting to pre-seed an ini
file with a set of subnets, then add/subtract from it later without restarting
the daemon.

3. I think it would be better if you didn't mark the packets, for performance
reasons.  If you were marking them on input to be matched by something on output
I'd feel different, but for just counting bytes we should be able to do it
another way (for example, by reading the chain counters directly, as in the
sketch below).  I can get back to you next week on figuring this out.
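
For reference, a minimal sketch of reading the counters without marking
packets; it just parses the verbose iptables listing, and the chain name used
in the example is made up:

    import subprocess

    def chain_counters(chain, table='filter'):
        """Return a list of (packets, bytes) tuples, one per rule in the chain.

        Parses ``iptables -t <table> -L <chain> -n -v -x``, whose first two
        columns are the packet and byte counters.
        """
        output = subprocess.check_output(
            ['iptables', '-t', table, '-L', chain, '-n', '-v', '-x'])
        counters = []
        for line in output.decode('utf-8').splitlines()[2:]:  # skip headers
            fields = line.split()
            if len(fields) >= 2 and fields[0].isdigit():
                counters.append((int(fields[0]), int(fields[1])))
        return counters

    # e.g. total bytes seen by the rules of one metering label chain
    total_bytes = sum(b for _p, b in chain_counters('metering-l-aef1456343'))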

Thanks,

-Brian

On 07/18/2013 04:29 AM, Sylvain Afchain wrote:
> Hi Brian,
> 
> For iptables rules, see below
> 
> Yes the only way to setup metering labels/rules is the neutronclient. I agree 
> with you about the future
> enhancement.
> 
> Regards,
> 
> Sylvain
> 
> - Original Message -
> From: "Brian Haley" 
> To: "Sylvain Afchain" 
> Cc: openstack-dev@lists.openstack.org
> Sent: Thursday, July 18, 2013 4:58:26 AM
> Subject: Re: Change in openstack/neutron[master]: Add method to get iptables 
> traffic counters
> 
>> Hi Sylvain,
>>
>> I think I've caught-up with all your reviews, but I still did have some
>> questions on the iptables rules, below.
>>
>> One other question, and maybe it's simply a future enhancement, but is the 
>> only
>> way to setup these meters using neutronclient?  I think being able to specify
>> these in an .ini file would be good as well, which is something I'd want to 
>> do
>> as a provider, such that they're always there, and actually not visible to 
>> the
>> tenant.
>>
>> On 07/11/2013 10:04 AM, Sylvain Afchain wrote:
>>> Hi Brian,
>>>
>>> You're right It could be easier with your approach to get and keep the 
>>> traffic counters.
>>>
>>> I will add a new method to get the details of traffic counters of a chain.
>>> https://review.openstack.org/35624
>>>
>>> Thoughts?
>>>
>>> Regards,
>>>
>>> Sylvain.
>>>
>>> - Original Message -
>>> From: "Sylvain Afchain" 
>>> To: "Brian Haley" 
>>> Cc: openstack-dev@lists.openstack.org
>>> Sent: Thursday, July 11, 2013 11:19:39 AM
>>> Subject: Re: Change in openstack/neutron[master]: Add method to get 
>>> iptables traffic counters
>>>
>>> Hi Brian,
>>>
>>> First thanks for the reviews and your detailed email.
>>>
>>> Second I will update the blueprint specs. as soon as possible, but for 
>>> example it will look like that:
>>>
>>> Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
>>> pkts  bytes target prot opt in out source   
>>> destination 
>>>55   245 metering-r-aef1456343  all  --  *  *   
>>> 0.0.0.0/00.0.0.0/0   /* jump to rules the label aef1456343 */   
>>>  
>>>55   245 metering-r-badf566782  all  --  *  *   
>>> 0.0.0.0/00.0.0.0/0   
> 
>> So are these two used to separate out what you don't want to count from what 
>> you
>> want to count?  Seems the jump to the r-aef1456343 will filter, then the
>> r-badf566782 will count per-subnet?  I'm just trying to understand why you're
>> splitting the two up.
> 
> No here, there are two rules only because there are two labels. In the chain 
> of each 
> label you will find the label's rules. 
> 
>>> Chain metering-l-aef1456343 (1 references)  /* the chain for the label 
>>> aef1456343 */
>>> pkts  bytes target prot opt in out source   
>>> destination 
>>>
>>> Chain metering-l-badf566782 (1 references)  /* the chain for the label 
>>> badf566782 */
>>> pkts  bytes target prot opt in out source   
>>> destination  
> 
>> These two chains aren't really doing anything, and I believe their 
>> packet/byte
>> counts would be duplicated in the calling rules, correct?  If that's the 
>> case I
>> don't see the reason to jump to them.  Our performance person always reminds 
>> me
>> when I increase path length by doing things like this, so removing 
>> unnecessary
>> things is one of my goals.  Of course we're doing more work here to count
>> things, but that needs to be done.
> 
> I recently change this(according to your remarks on iptables accounting), so 
> now there is a 
> rule which is used to count the traffic for a label. A mark is added one this 
> rule to be 
> sure to not count it twice. You can check the metering iptables drivers.
> https://review.openstack.org/#/c/36813/
> 
>>> Chain metering-r-aef1456343 (1 references)
>>> pkts  bytes target prot opt in out source   
>>> destination 
>>>20 100 RETURN all  --  *  *   0.0.0.0/0   
>>> !10.0.0.0/24  /* don't want to count this traffic */   
>>>00 RETURN all  --  *  *   0.0.0.0/0

[openstack-dev] Savanna 0.2 release details and screencast

2013-07-19 Thread Sergey Lukjanov
Hello everyone!

As you remember, we had 0.2 release this week and I’m really happy to do one 
more announcement today. We've prepared blog post for you with all new features 
appeared in our release. The most beautiful thing is that my colleague, Dmitry 
Mescheryakov made an awesome screencast with demonstration of brand new Savanna 
and you can find it in the post - 
http://www.mirantis.com/blog/savanna-0-2-released-new-hadoop-on-openstack-features/

You are welcome to read, watch and comment.

Thank you!

Sincerely yours,
Sergey Lukjanov
Savanna Technical Lead
Mirantis Inc.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Revert Pass instance host-id to Quantum using port bindings extension.

2013-07-19 Thread Aaron Rosen
On Fri, Jul 19, 2013 at 1:11 PM, Kyle Mestery (kmestery)  wrote:

> On Jul 19, 2013, at 1:58 PM, Aaron Rosen  wrote:
> >
> >
> >
> >
> > On Fri, Jul 19, 2013 at 8:47 AM, Kyle Mestery (kmestery) <
> kmest...@cisco.com> wrote:
> > On Jul 18, 2013, at 5:16 PM, Aaron Rosen  wrote:
> > >
> > > Hi,
> > >
> > > I wanted to raise another design failure of why creating the port on
> nova-compute is bad. Previously, we have encountered this bug (
> https://bugs.launchpad.net/neutron/+bug/1160442). What was causing the
> issue was that when nova-compute calls into quantum to create the port;
> quantum creates the port but fails to return the port to nova and instead
> timesout. When this happens the instance is scheduled to be run on another
> compute node where another port is created with the same device_id and when
> the instance boots it will look like it has two ports. This is still a
> problem that can occur today in our current implementation (!).
> > >
> > > I think in order to move forward with this we'll need to compromise.
> Here is my though on how we should proceed.
> > >
> > > 1) Modify the quantum API so that mac addresses can now be updated via
> the api. There is no reason why we have this limitation (especially once
> the patch that uses dhcp_release is merged as it will allow us to update
> the lease for the new mac immediately).  We need to do this in order for
> bare metal support as we need to match the mac address of the port to the
> compute node.
> > >
> > I don't understand how this relates to creating a port through
> nova-compute. I'm not saying this is a bad idea, I just don't see how it
> relates to the original discussion point on this thread around Yong's patch.
> >
> > > 2) move the port-creation from nova-compute to nova-api. This will
> solve a number of issues like the one i pointed out above.
> > >
> > This seems like a bad idea. So now a Nova API call will implicitly
> create a Neutron port? What happens on failure here? The caller isn't aware
> the port was created in Neutron if it's implicit, so who cleans things up?
> Or if the caller is aware, than all we've done is move an API the caller
> would have done (nova-compute in this case) into nova-api, though the
> caller is now still aware of what's happening.
> >
> > On failure here the VM will go to ERROR state if the port is failed to
> create in quantum. Then when deleting the instance; the delete code should
> also search quantum for the device_id in order to remove the port there as
> well.
> >
> So, nova-compute will implicitly know the port was created by nova-api,
> and if a failure happens, it will clean up the port? That doesn't sound
> like a balanced solution to me, and seems to tie nova-compute and nova-api
> close together when it comes to launching VMs with Neutron ports.
>
> >  The issue here is that if an instance fails to boot on a compute node
> (because nova-compute did not get the port-create response from quantum and
> the port was actually created) the instance gets scheduled to be booted on
> another nova-compute node where the duplicate create happens. Moving the
> creation to the API node removes the port from getting created in the retry
> logic that solves this.
> >
> I think Ian's comments on your blueprint [1] address this exact problem,
> can you take a look at them there?
>
> [1]
> https://blueprints.launchpad.net/nova/+spec/nova-api-quantum-create-port


Sure. Ian's comments are below, with my replies inline marked [arosen]:

 The issue is not the location of the call.

The issue is one of transactionality - you want to create a neutron port
implicitly while nova booting a machine, and you want all the Neutron and
Nova calls to both succeed or both fail. If you can't have transactionality
the old fashioned way with synchronous calls (and we can't) then you need
eventual consistency: a task to clean up dead ports and the understanding
that such ports may still be kicking around from previous attempts.

[arosen] - I agree.

We should create the port, *then* attempt the attach using update - the
create can succeed independently and any subsequent nova-compute attach
will succeed on the previously created port rather than making a new one
(possibly verifying that its 'attached' status, if the second call
completed but didn't return, is a lie).

So:
create fails to return but port is created
-> run on 2nd compute node won't attempt the create, port already exists;
port consumed, everything good

create returns, attach fails to return but port is attached
-> run on 2nd compute node won't attempt the create and will identify that
the attachment state is bogus and overwrite it; port consumed, everything
good
-> if last attempt, a port with a bogus attach is left hanging around in
the DB; a cleanup job has to go looking for it and remove it; optionally
anything else can spot its inconsistency and ignore or remove it. Risk of
removal during the actual scheduling, in which case the schedule pass will
fail; can set expiry time on p
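
A rough sketch of the lookup-or-create behaviour described above, seen from
the nova side; the helper name, fields and error handling are illustrative
only, not the actual proposed implementation:

    def get_or_create_port(neutron, instance_uuid, network_id):
        """Reuse a port left over from a previous attempt, or create one.

        A real implementation would also reconcile the 'attached' state and
        leave cleanup of orphaned ports to a periodic task.
        """
        existing = neutron.list_ports(device_id=instance_uuid,
                                      network_id=network_id)['ports']
        if existing:
            return existing[0]
        body = {'port': {'network_id': network_id,
                         'device_id': instance_uuid}}
        return neutron.create_port(body)['port']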

Re: [openstack-dev] A simple way to improve nova scheduler

2013-07-19 Thread Boris Pavlovic
Sandy,

Hm, I don't know that algorithm, but our approach doesn't have exponential
message exchange.
I don't think that in a 10k-node cloud we will have a problem with 150 RPC
calls/sec. Even at 100k nodes we will have only 1.5k RPC calls/sec.
Moreover, compute nodes already update their state in the DB through the
conductor, which produces the same number of RPC calls.

So I don't see any explosion here.
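
(For rough orders of magnitude, assuming one status update per compute node
per 60-second periodic interval:)

    >>> 10000 / 60.0     # 10k nodes: about 170 updates/sec
    166.66666666666666
    >>> 100000 / 60.0    # 100k nodes: about 1700 updates/sec
    1666.6666666666667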

Best regards,
Boris Pavlovic

Mirantis Inc.


On Fri, Jul 19, 2013 at 11:47 PM, Sandy Walsh wrote:

>
>
> On 07/19/2013 04:25 PM, Brian Schott wrote:
> > I think Soren suggested this way back in Cactus to use MQ for compute
> > node state rather than database and it was a good idea then.
>
> The problem with that approach was the number of queues went exponential
> as soon as you went beyond simple flavors. Add Capabilities or other
> criteria and you get an explosion of exchanges to listen to.
>
>
>
> > On Jul 19, 2013, at 10:52 AM, Boris Pavlovic  > > wrote:
> >
> >> Hi all,
> >>
> >>
> >> In Mirantis Alexey Ovtchinnikov and me are working on nova scheduler
> >> improvements.
> >>
> >> As far as we can see the problem, now scheduler has two major issues:
> >>
> >> 1) Scalability. Factors that contribute to bad scalability are these:
> >> *) Each compute node every periodic task interval (60 sec by default)
> >> updates resources state in DB.
> >> *) On every boot request scheduler has to fetch information about all
> >> compute nodes from DB.
> >>
> >> 2) Flexibility. Flexibility perishes due to problems with:
> >> *) Addiing new complex resources (such as big lists of complex objects
> >> e.g. required by PCI Passthrough
> >> https://review.openstack.org/#/c/34644/5/nova/db/sqlalchemy/models.py)
> >> *) Using different sources of data in Scheduler for example from
> >> cinder or ceilometer.
> >> (as required by Volume Affinity Filter
> >> https://review.openstack.org/#/c/29343/)
> >>
> >>
> >> We found a simple way to mitigate this issues by avoiding of DB usage
> >> for host state storage.
> >>
> >> A more detailed discussion of the problem state and one of a possible
> >> solution can be found here:
> >>
> >>
> https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit#
> >>
> >>
> >> Best regards,
> >> Boris Pavlovic
> >>
> >> Mirantis Inc.
> >>
> >> ___
> >> OpenStack-dev mailing list
> >> OpenStack-dev@lists.openstack.org
> >> 
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> >
> >
> > ___
> > OpenStack-dev mailing list
> > OpenStack-dev@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Peter Feiner
On Fri, Jul 19, 2013 at 4:36 PM, Joshua Harlow  wrote:
> This seems to me to be a good example where a library "problem" is leaking 
> into the openstack architecture right? That is IMHO a bad path to go down.
>
> I like to think of a world where this isn't a problem and design the correct 
> solution there instead and fix the eventlet problem instead. Other large 
> applications don't fallback to rpc calls to get around a database/eventlet 
> scaling issues afaik.
>
> Honestly I would almost just want to finally fix the eventlet problem (chris 
> b. I think has been working on it) and design a system that doesn't try to 
> work around a libraries lacking. But maybe that's to much idealism, idk...

Well, there are two problems that multiple nova-conductor processes
fix. One is the bad interaction between eventlet and native code. The
other is allowing multiprocessing.  That is, once nova-conductor
starts to handle enough requests, enough time will be spent holding
the GIL to make it a bottleneck; in fact I've had to scale keystone
using multiple processes because of GIL contention (i.e., keystone was
steadily at 100% CPU utilization when I was hitting OpenStack with
enough requests). So multiple processes aren't avoidable. Indeed, other
software that strives for high concurrency, such as Apache, uses
multiple processes to avoid contention for per-process kernel
resources like the mmap semaphore.

> This doesn't even touch on the synchronization issues that can happen when u 
> start pumping db traffic over a mq. Ex, an update is now queued behind 
> another update, the second one conflicts with the first, where does 
> resolution happen when an async mq call is used. What about when you have X 
> conductors doing Y reads and Z updates; I don't even want to think about the 
> sync/races there (and so on...). Did u hit / check for any consistency issues 
> in your tests? Consistency issues under high load using multiple conductors 
> scare the bejezzus out of me

If a sequence of updates needs to be atomic, then they should be made
in the same database transaction. Hence nova-conductor's interface
isn't do_some_sql(query), it's a bunch of high-level nova operations
that are implemented using transactions.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] A simple way to improve nova scheduler

2013-07-19 Thread Boris Pavlovic
Sandy,

I don't think that we have such problems here,
because the scheduler doesn't poll compute_nodes.
The situation is the other way around: compute_nodes notify the scheduler
about their state (instead of updating their state in the DB).

So, for example, if the scheduler sends a request to a compute_node, the
compute_node is able to make an RPC call back to the schedulers immediately
(not after 60 sec).

So there are almost no races.
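
To make the idea concrete, here is a toy sketch of the scheduler-side piece:
an in-memory map of host states that a fanout RPC/notification callback keeps
fresh. The names and payload fields are invented, and the wiring to the actual
RPC layer is omitted:

    import time

    class HostStateCache(object):
        """In-memory view of compute node state, fed by compute notifications."""

        def __init__(self, stale_after=120):
            self._hosts = {}
            self._stale_after = stale_after

        def update(self, host, stats):
            # Called from the fanout/notification consumer whenever a compute
            # node reports its resources (free RAM, free disk, PCI devices, ...).
            self._hosts[host] = {'stats': stats, 'updated_at': time.time()}

        def all_hosts(self):
            # Filters/weighers read from memory instead of the DB; hosts that
            # stopped reporting are treated as unavailable.
            now = time.time()
            return {h: v['stats'] for h, v in self._hosts.items()
                    if now - v['updated_at'] < self._stale_after}

    cache = HostStateCache()
    cache.update('compute-1', {'free_ram_mb': 2048, 'free_disk_gb': 80})
    print(cache.all_hosts())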


Best regards,
Boris Pavlovic

Mirantis Inc.



On Sat, Jul 20, 2013 at 12:23 AM, Sandy Walsh wrote:

>
>
> On 07/19/2013 05:01 PM, Boris Pavlovic wrote:
> > Sandy,
> >
> > Hm I don't know that algorithm. But our approach doesn't have
> > exponential exchange.
> > I don't think that in 10k nodes cloud we will have a problems with 150
> > RPC call/sec. Even in 100k we will have only 1.5k RPC call/sec.
> > More then (compute nodes update their state in DB through conductor
> > which produce the same count of RPC calls).
> >
> > So I don't see any explosion here.
>
> Sorry, I was commenting on Soren's suggestion from way back (essentially
> listening on a separate exchange for each unique flavor ... so no
> scheduler was needed at all). It was a great idea, but fell apart rather
> quickly.
>
> The existing approach the scheduler takes is expensive (asking the db
> for state of all hosts) and polling the compute nodes might be do-able,
> but you're still going to have latency problems waiting for the
> responses (the states are invalid nearly immediately, especially if a
> fill-first scheduling algorithm is used). We ran into this problem
> before in an earlier scheduler implementation. The round-tripping kills.
>
> We have a lot of really great information on Host state in the form of
> notifications right now. I think having a service (or notification
> driver) listening for these and keeping an the HostState incrementally
> updated (and reported back to all of the schedulers via the fanout
> queue) would be a better approach.
>
> -S
>
>
> >
> > Best regards,
> > Boris Pavlovic
> >
> > Mirantis Inc.
> >
> >
> > On Fri, Jul 19, 2013 at 11:47 PM, Sandy Walsh  > > wrote:
> >
> >
> >
> > On 07/19/2013 04:25 PM, Brian Schott wrote:
> > > I think Soren suggested this way back in Cactus to use MQ for
> compute
> > > node state rather than database and it was a good idea then.
> >
> > The problem with that approach was the number of queues went
> exponential
> > as soon as you went beyond simple flavors. Add Capabilities or other
> > criteria and you get an explosion of exchanges to listen to.
> >
> >
> >
> > > On Jul 19, 2013, at 10:52 AM, Boris Pavlovic  > 
> > > >> wrote:
> > >
> > >> Hi all,
> > >>
> > >>
> > >> In Mirantis Alexey Ovtchinnikov and me are working on nova
> scheduler
> > >> improvements.
> > >>
> > >> As far as we can see the problem, now scheduler has two major
> issues:
> > >>
> > >> 1) Scalability. Factors that contribute to bad scalability are
> these:
> > >> *) Each compute node every periodic task interval (60 sec by
> default)
> > >> updates resources state in DB.
> > >> *) On every boot request scheduler has to fetch information about
> all
> > >> compute nodes from DB.
> > >>
> > >> 2) Flexibility. Flexibility perishes due to problems with:
> > >> *) Addiing new complex resources (such as big lists of complex
> > objects
> > >> e.g. required by PCI Passthrough
> > >>
> >
> https://review.openstack.org/#/c/34644/5/nova/db/sqlalchemy/models.py)
> > >> *) Using different sources of data in Scheduler for example from
> > >> cinder or ceilometer.
> > >> (as required by Volume Affinity Filter
> > >> https://review.openstack.org/#/c/29343/)
> > >>
> > >>
> > >> We found a simple way to mitigate this issues by avoiding of DB
> usage
> > >> for host state storage.
> > >>
> > >> A more detailed discussion of the problem state and one of a
> possible
> > >> solution can be found here:
> > >>
> > >>
> >
> https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit#
> > >>
> > >>
> > >> Best regards,
> > >> Boris Pavlovic
> > >>
> > >> Mirantis Inc.
> > >>
> > >> ___
> > >> OpenStack-dev mailing list
> > >> OpenStack-dev@lists.openstack.org
> > 
> > >>  > >
> > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > >
> > >
> > >
> > > ___
> > > OpenStack-dev mailing list
> > > OpenStack-dev@lists.openstack.org
> > 
> > > http://lists.openstack.org/cgi-bin/mailman/li

Re: [openstack-dev] [keystone] sqlite doesn't support migrations

2013-07-19 Thread Boris Pavlovic
Hi all,

We are working on adding support for SQLite to Alembic,
and I hope we will switch to Alembic across all of OpenStack in the I cycle.
So don't touch SQLite.

Joe Gordon,

No, we would like to merge it, because at the moment we are able to run the
unit tests in Nova only against SQLite.
So it is of critical importance for at least Nova, Glance and Cinder.
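
(For reference, the recreate-and-copy workaround for SQLite's limited ALTER
TABLE, mentioned further down this thread, looks roughly like the sketch
below; the table and column names are invented:)

    import sqlite3

    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT)')
    conn.execute("INSERT INTO instances (name) VALUES ('vm-1')")

    # SQLite cannot alter a column's type or constraints in place, so the
    # "migration" recreates the table and copies the rows across.
    conn.executescript("""
        CREATE TABLE instances_new (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL DEFAULT ''
        );
        INSERT INTO instances_new (id, name) SELECT id, name FROM instances;
        DROP TABLE instances;
        ALTER TABLE instances_new RENAME TO instances;
    """)
    conn.commit()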


Best regards,
Boris Pavlovic

Mirantis Inc.






On Sat, Jul 20, 2013 at 12:48 AM, Joe Gordon  wrote:

> Along these lines, OpenStack is now maintaining sqlachemy-migrate, and we
> have our first patch up for better SQLite support, taken from nova,
> https://review.openstack.org/#/c/37656/
>
> Do we want to go the direction of explicitly not supporting sqllite and
> not running migrations with it, like keystone.  If so we may not want the
> patch above to get merged.
>
>
> On Wed, Jul 17, 2013 at 10:40 AM, Adam Young  wrote:
>
>> On 07/16/2013 10:58 AM, Thomas Goirand wrote:
>>
>>> On 07/16/2013 03:55 PM, Michael Still wrote:
>>>
 On Tue, Jul 16, 2013 at 4:17 PM, Thomas Goirand 
 wrote:

  Could you explain a bit more what could be done to fix it in an easy
> way, even if it's not efficient? I understand that ALTER doesn't work
> well. Though would we have the possibility to just create a new
> temporary table with the correct fields, and copy the existing content
> in it, then rename the temp table so that it replaces the original one?
>
 There are a bunch of nova migrations that already work that way...
 Checkout the sqlite files in
 nova/db/sqlalchemy/migrate_repo/versions/

>>> Why can't we do that with Keystone then? Is it too much work? It doesn't
>>> seem hard to do (just probably a bit annoying and boring ...). Does it
>>> represent too much work for the case of Keystone?
>>>
>>> Thomas
>>>
>>>
>>> ___
>>> OpenStack-dev mailing list
>>> OpenStack-dev@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>> In general, yes, the SQlite migrations have eaten up far more time than
>> the benefit they provide.
>>
>> Sqlite migrations don't buy us much.  A live deployment should not be on
>> sqlite.  It is OK to use it for testing, though.  As such, it should be
>> possible to generate a sqlite scheme off the model the way that the unit
>> tests do.  You just will not be able to migrate it forward:  later changes
>> will require regenerating the database.
>>
>>
> Agreed, so lets change the default database away from SQLite everywhere
> and we can just override the default when needed for testing.
>
>
>
>> For any non-trivial deployent, we should encourage the use of Mysql or
>> Postgresql.  Migrations for these will be full supported.
>>
>> From a unit test perspective, we are going to disable the sqlite
>> migration tests.  Users that need to perform testing of migrations should
>> instead test them against both mysql and postgresql. the IBMers out there
>> will also be responsible for keeping on top of the DB2 variations.  The
>> [My/postgre]Sql migration tests will be run as part of the gate.
>>
>> The current crop of unit tests for the SQL backend use the module based
>> approach to generate a sqlite database.  This approach is fine.  Sqlite,
>> when run in memory, provides a very fast sanity check of our logic.  MySQL
>> and PostgreSQL versions would take far, far too long to execute for
>> everyday development. We will, however, make it possible to execute those
>> body of unit tests against both postgresql and mysql as part of the gate.
>>  Wer will also execute against the LDAP backend.  IN order to be able to
>> perform that, however, we need to fix bunch of the tests so that they run
>> at 100% first.
>>
>>
>>
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Cinder Driver Base Requirements

2013-07-19 Thread Anne Gentle
Great idea, Mike. Should we have a section that describes the minimum docs
for a driver?


On Thu, Jul 18, 2013 at 12:20 AM, thingee  wrote:

> To avoid having a grid of what features are available by which drivers and
> which releases, the Cinder team has met and agreed on 2013-04-24 that we
> would request all current and new drivers to fulfill a list of minimum
> requirement features [1] in order to be included in new releases.
>
> There have been emails sent to the maintainers of drivers that are missing
> features in the minimum feature requirement list.
>
> If there are questions, maintainers can reply back to my email and as
> always reach out to the team on #openstack-cinder.
>
> Thanks,
> Mike Perez
>
> [1] - https://wiki.openstack.org/wiki/Cinder#Minimum_Driver_Features
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


-- 
Anne Gentle
annegen...@justwriteclick.com
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [neutron] [ml2] [devstack] ML2 devstack changes for tunneling

2013-07-19 Thread Kyle Mestery (kmestery)
I've pushed out a review [1] to enable support for setting additional ML2 
options when running with devstack. Since H2, ML2 has now added support for 
both GRE and VXLAN tunnels, and the patch below allows for this configuration 
when running with devstack. Feedback from folks on the Neutron ML2 sub-team and 
devstack developers would be appreciated!

Thanks!
Kyle

[1] https://review.openstack.org/37963
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Robert Collins
On 19 July 2013 22:55, Day, Phil  wrote:
> Hi Josh,
>
> My idea's really pretty simple - make "DB proxy" and "Task workflow" separate 
> services, and allow people to co-locate them if they want to.

+1, for all the reasons discussed in this thread. I was weirded out
when I saw non-DB-proxy work being put into the same service. One
additional reason that hasn't been discussed is security: the more
complex the code in the service that actually connects to the DB, the greater
the risk of someone who shouldn't have direct access getting it via a code bug.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Cloud Services

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] volume affinity filter for nova scheduler

2013-07-19 Thread Joe Gordon
On Fri, Jul 12, 2013 at 1:57 AM, Robert Collins
wrote:

> On 11 July 2013 02:39, Russell Bryant  wrote:
>
> >> We'll probably need something like this for Ironic with persistent
> >> volumes on machines - yes its a rare case, but when it matters, it
> >> matters a great deal.
> >
> > I believe you, but I guess I'd like to better understand how this works
> > to make sure what gets added actually solves your use case.  Is there
> > already support for Cinder managed persistent volumes that live on
> > baremetal nodes?
>
> There isn't, but we discussed it with Cinder folk in Portland.
>
> Basic intent is this:
>  - we have a cinder 'Ironic' backend.
>  - volume requests for the Ironic backend are lazy provisioned: they
> just allocate a UUID on creation
>  - nova-bm/Ironic will store a volume <-> node mapping
>  - 'nova boot' without a volume spec will only select nodes with no
> volumes associated to them.
>  - 'nova boot' with a volume spec will find an existing node with
> those volumes mapped to it, or if none exists create the volume
> mapping just-in-time
>  - the deployment ramdisk would setup the volume on the hardware
> [using hardware RAID]
>- where there isn't hardware RAID we'd let the instance take care
> of how to setup the persistent storage - because we don't have a
> translation layer in place we can't assume lvm or windows volume
> manager or veritas or.
>
> The obvious gap between intent and implementation here is that choices
> about nodes happen in the nova scheduler, so we need the scheduler to
> be able to honour four cases:
>  - baremetal flavor with no volumes requested, gets a baremetal node
> with no volumes mapped
>  - baremetal flavor with volumes requested, just one baremetal node
> with any of those volumes exist -> that node
>  - baremetal flavor with volumes requested, > one baremetal node with
> any of those volumes exist -> error
>  - baremetal flavor with volumes requested, no nodes with any of those
> volumes -> pick any node that has enough disks to supply the volume
> definitions
>
>
Wouldn't this all be possible today if we added an Ironic nova scheduler filter?
Perhaps the current design wouldn't be very performant, though. We
can even keep this filter in the Ironic repo instead of nova.
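
Something along these lines; the volume-to-host lookup is a made-up
placeholder, while the filter skeleton itself follows nova's BaseHostFilter
interface:

    from nova.scheduler import filters

    class IronicVolumeAffinityFilter(filters.BaseHostFilter):
        """Illustrative skeleton: pass only hosts holding the requested volumes."""

        def host_passes(self, host_state, filter_properties):
            hints = filter_properties.get('scheduler_hints') or {}
            requested = set(hints.get('volumes', []))
            if not requested:
                return True
            return requested.issubset(self._volumes_on_host(host_state.host))

        def _volumes_on_host(self, host):
            # Placeholder: a real filter would query the volume <-> node
            # mapping kept by Ironic / nova-bm.
            return set()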


> Writing this it seems like the nova scheduler may be a tricky fit;
> perhaps we should - again- reevaluate just how this all glues
> together?
>
> -Rob
> --
> Robert Collins 
> Distinguished Technologist
> HP Cloud Services
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] A simple way to improve nova scheduler

2013-07-19 Thread Sandy Walsh


On 07/19/2013 05:01 PM, Boris Pavlovic wrote:
> Sandy,
> 
> Hm I don't know that algorithm. But our approach doesn't have
> exponential exchange.
> I don't think that in 10k nodes cloud we will have a problems with 150
> RPC call/sec. Even in 100k we will have only 1.5k RPC call/sec.
> More then (compute nodes update their state in DB through conductor
> which produce the same count of RPC calls). 
> 
> So I don't see any explosion here.

Sorry, I was commenting on Soren's suggestion from way back (essentially
listening on a separate exchange for each unique flavor ... so no
scheduler was needed at all). It was a great idea, but fell apart rather
quickly.

The existing approach the scheduler takes is expensive (asking the db
for state of all hosts) and polling the compute nodes might be do-able,
but you're still going to have latency problems waiting for the
responses (the states are invalid nearly immediately, especially if a
fill-first scheduling algorithm is used). We ran into this problem
before in an earlier scheduler implementation. The round-tripping kills.

We have a lot of really great information on host state in the form of
notifications right now. I think having a service (or notification
driver) listening for these and keeping the HostState incrementally
updated (and reported back to all of the schedulers via the fanout
queue) would be a better approach.

-S


> 
> Best regards,
> Boris Pavlovic
> 
> Mirantis Inc.  
> 
> 
> On Fri, Jul 19, 2013 at 11:47 PM, Sandy Walsh  > wrote:
> 
> 
> 
> On 07/19/2013 04:25 PM, Brian Schott wrote:
> > I think Soren suggested this way back in Cactus to use MQ for compute
> > node state rather than database and it was a good idea then.
> 
> The problem with that approach was the number of queues went exponential
> as soon as you went beyond simple flavors. Add Capabilities or other
> criteria and you get an explosion of exchanges to listen to.
> 
> 
> 
> > On Jul 19, 2013, at 10:52 AM, Boris Pavlovic  
> > >> wrote:
> >
> >> Hi all,
> >>
> >>
> >> In Mirantis Alexey Ovtchinnikov and me are working on nova scheduler
> >> improvements.
> >>
> >> As far as we can see the problem, now scheduler has two major issues:
> >>
> >> 1) Scalability. Factors that contribute to bad scalability are these:
> >> *) Each compute node every periodic task interval (60 sec by default)
> >> updates resources state in DB.
> >> *) On every boot request scheduler has to fetch information about all
> >> compute nodes from DB.
> >>
> >> 2) Flexibility. Flexibility perishes due to problems with:
> >> *) Addiing new complex resources (such as big lists of complex
> objects
> >> e.g. required by PCI Passthrough
> >>
> https://review.openstack.org/#/c/34644/5/nova/db/sqlalchemy/models.py)
> >> *) Using different sources of data in Scheduler for example from
> >> cinder or ceilometer.
> >> (as required by Volume Affinity Filter
> >> https://review.openstack.org/#/c/29343/)
> >>
> >>
> >> We found a simple way to mitigate this issues by avoiding of DB usage
> >> for host state storage.
> >>
> >> A more detailed discussion of the problem state and one of a possible
> >> solution can be found here:
> >>
> >>
> 
> https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit#
> >>
> >>
> >> Best regards,
> >> Boris Pavlovic
> >>
> >> Mirantis Inc.
> >>
> >> ___
> >> OpenStack-dev mailing list
> >> OpenStack-dev@lists.openstack.org
> 
> >>  >
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> >
> >
> > ___
> > OpenStack-dev mailing list
> > OpenStack-dev@lists.openstack.org
> 
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> 
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Dan Smith
> Nova-conductor is the gateway to the database for nova-compute
> processes.  So permitting a single nova-conductor process would
> effectively serialize all database queries during instance creation,
> deletion, periodic instance refreshes, etc.

FWIW, I don't think anyone is suggesting a single conductor, and
especially not a single database proxy.

> Since these queries are made frequently (i.e., easily 100 times
> during instance creation) and while other global locks are held
> (e.g., in the case of nova-compute's ResourceTracker), most of what
> nova-compute does becomes serialized.

I think your numbers are a bit off. When I measured it just before
grizzly, an instance create was something like 20-30 database calls.
Unless that's changed (a lot) lately ... :)

--Dan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] New DB column or new DB table?

2013-07-19 Thread Boris Pavlovic
Hi all,

We have too many different branches about the scheduler (so I have to repeat
myself here as well).

I am against adding extra tables that will be joined to the compute_nodes
table on each scheduler request (or adding large text columns),
because it makes our non-scalable scheduler even less scalable.

Also, if we just remove the DB between the scheduler and the compute nodes we
will get a really good improvement in all aspects (performance, DB load,
network traffic, scalability).
It will also be easier to use other resource providers (e.g. Cinder,
Ceilometer) in the Nova scheduler.

And one more thing: this could all be implemented quite simply in the current
Nova, without big changes:

https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit?usp=sharing


Best regards,
Boris Pavlovic

Mirantis Inc.


On Fri, Jul 19, 2013 at 8:44 PM, Dan Smith  wrote:

> > IIUC, Ceilometer is currently a downstream consumer of data from
> > Nova, but no functionality in Nova is a consumer of data from
> > Ceilometer. This is good split from a security separation point of
> > view, since the security of Nova is self-contained in this
> > architecture.
> >
> > If Nova schedular becomes dependant on data from ceilometer, then now
> > the security of Nova depends on the security of Ceilometer, expanding
> > the attack surface. This is not good architecture IMHO.
>
> Agreed.
>
> > At the same time, I hear your concerns about the potential for
> > duplication of stats collection functionality between Nova &
> > Ceilometer. I don't think we neccessarily need to remove 100% of
> > duplication. IMHO probably the key thing is for the virt drivers to
> > expose a standard API for exporting the stats, and make sure that
> > both ceilometer & nova schedular use the same APIs and ideally the
> > same data feed, so we're not invoking the same APIs twice to get the
> > same data.
>
> I imagine there's quite a bit that could be shared, without dependency
> between the two. Interfaces out of the virt drivers may be one, and the
> code that boils numbers into useful values, as well as perhaps the
> format of the JSON blobs that are getting shoved into the database.
> Perhaps a ceilo-core library with some very simple primitives and
> definitions could be carved out, which both nova and ceilometer could
> import for consistency, without a runtime dependency?
>
> --Dan
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] New DB column or new DB table?

2013-07-19 Thread Jiang, Yunhong
Boris,
   I think you in fact covered two topics. One is whether to use the DB or RPC
for communication. This has been discussed a lot, but I didn't find a
conclusion; from the discussion, it seems the key issue is the fan-out
messages. I'd suggest you bring this to the scheduler sub-team meeting.

http://eavesdrop.openstack.org/meetings/scheduler/2013/scheduler.2013-06-11-14.59.log.html
http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg00070.html
http://comments.gmane.org/gmane.comp.cloud.openstack.devel/23

   The second topic is adding extra tables to compute_nodes. I think we need
lazy loading for the compute node, and also I think that with the object model
we can further improve it if we utilize the compute node object.

Thanks
--jyh


From: Boris Pavlovic [mailto:bo...@pavlovic.me]
Sent: Friday, July 19, 2013 10:07 AM
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] [Nova] New DB column or new DB table?

Hi all,

We have to much different branches about scheduler (so I have to repeat here 
also).

I am against to add some extra tables that will be joined to compute_nodes 
table on each scheduler request (or adding large text columns).
Because it make our non scalable scheduler even less scalable.

Also if we just remove DB between scheduler and compute nodes we will get 
really good improvement in all aspects (performance, db load, network traffic, 
scalability )
And also it will be easily to use another resources provider (cinder, 
ceilometer e.g..) in Nova scheduler.

And one more thing this all could be really simple implement in current Nova, 
without big changes
 
https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit?usp=sharing


Best regards,
Boris Pavlovic

Mirantis Inc.

On Fri, Jul 19, 2013 at 8:44 PM, Dan Smith 
mailto:d...@danplanet.com>> wrote:
> IIUC, Ceilometer is currently a downstream consumer of data from
> Nova, but no functionality in Nova is a consumer of data from
> Ceilometer. This is good split from a security separation point of
> view, since the security of Nova is self-contained in this
> architecture.
>
> If Nova schedular becomes dependant on data from ceilometer, then now
> the security of Nova depends on the security of Ceilometer, expanding
> the attack surface. This is not good architecture IMHO.
Agreed.

> At the same time, I hear your concerns about the potential for
> duplication of stats collection functionality between Nova &
> Ceilometer. I don't think we neccessarily need to remove 100% of
> duplication. IMHO probably the key thing is for the virt drivers to
> expose a standard API for exporting the stats, and make sure that
> both ceilometer & nova schedular use the same APIs and ideally the
> same data feed, so we're not invoking the same APIs twice to get the
> same data.
I imagine there's quite a bit that could be shared, without dependency
between the two. Interfaces out of the virt drivers may be one, and the
code that boils numbers into useful values, as well as perhaps the
format of the JSON blobs that are getting shoved into the database.
Perhaps a ceilo-core library with some very simple primitives and
definitions could be carved out, which both nova and ceilometer could
import for consistency, without a runtime dependency?

--Dan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] A simple way to improve nova scheduler

2013-07-19 Thread Brian Schott
I think Soren suggested this way back in Cactus: using MQ for compute node
state rather than the database. It was a good idea then.

On Jul 19, 2013, at 10:52 AM, Boris Pavlovic  wrote:

> Hi all, 
> 
> 
> In Mirantis Alexey Ovtchinnikov and me are working on nova scheduler 
> improvements.
> 
> As far as we can see the problem, now scheduler has two major issues:
> 
> 1) Scalability. Factors that contribute to bad scalability are these:
>   *) Each compute node every periodic task interval (60 sec by default) 
> updates resources state in DB.
>   *) On every boot request scheduler has to fetch information about all 
> compute nodes from DB. 
> 
> 2) Flexibility. Flexibility perishes due to problems with:
>   *) Addiing new complex resources (such as big lists of complex objects e.g. 
> required by PCI Passthrough 
> https://review.openstack.org/#/c/34644/5/nova/db/sqlalchemy/models.py)
>   *) Using different sources of data in Scheduler for example from cinder or 
> ceilometer.
> (as required by Volume Affinity Filter 
> https://review.openstack.org/#/c/29343/)
> 
> 
> We found a simple way to mitigate this issues by avoiding of DB usage for 
> host state storage.
> 
> A more detailed discussion of the problem state and one of a possible 
> solution can be found here:
> 
> https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit#
> 
> 
> Best regards,
> Boris Pavlovic
> 
> Mirantis Inc. 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] A vision for Keystone

2013-07-19 Thread Brad Topol
Adam,

Your essay below is outstanding!  Any chance part of it could be included 
within the Keystone project documentation?  I think having it in the 
project and at folks' fingertips would really help people who are trying 
to get up to speed with Keystone!

Thanks again for writing this up!

--Brad

Brad Topol, Ph.D.
IBM Distinguished Engineer
OpenStack
(919) 543-0646
Internet:  bto...@us.ibm.com
Assistant: Cindy Willman (919) 268-5296



From:   Adam Young 
To: OpenStack Development Mailing List 

Date:   07/18/2013 02:21 PM
Subject:[openstack-dev] A vision for Keystone



I wrote up an essay that, I hope, explains where Keystone is headed as 
far as token management.

http://adam.younglogic.com/2013/07/a-vision-for-keystone/

It is fairly long (2000 words) but I attempted to make it readable, and 
to provide the context for what we are doing.

There are several blueprints for this work, many of which have already 
been implemented. There is at least one that I still need to write up.

This is not new stuff.  It is just an attempt to cleanly lay out the 
story.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] New DB column or new DB table?

2013-07-19 Thread Dan Smith
> IIUC, Ceilometer is currently a downstream consumer of data from
> Nova, but no functionality in Nova is a consumer of data from
> Ceilometer. This is good split from a security separation point of
> view, since the security of Nova is self-contained in this
> architecture.
> 
> If Nova schedular becomes dependant on data from ceilometer, then now
> the security of Nova depends on the security of Ceilometer, expanding
> the attack surface. This is not good architecture IMHO.

Agreed.
 
> At the same time, I hear your concerns about the potential for
> duplication of stats collection functionality between Nova &
> Ceilometer. I don't think we neccessarily need to remove 100% of
> duplication. IMHO probably the key thing is for the virt drivers to
> expose a standard API for exporting the stats, and make sure that
> both ceilometer & nova schedular use the same APIs and ideally the
> same data feed, so we're not invoking the same APIs twice to get the
> same data.

I imagine there's quite a bit that could be shared, without dependency
between the two. Interfaces out of the virt drivers may be one, and the
code that boils numbers into useful values, as well as perhaps the
format of the JSON blobs that are getting shoved into the database.
Perhaps a ceilo-core library with some very simple primitives and
definitions could be carved out, which both nova and ceilometer could
import for consistency, without a runtime dependency?

--Dan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling

2013-07-19 Thread Sandy Walsh


On 07/19/2013 12:30 PM, Sean Dague wrote:
> On 07/19/2013 10:37 AM, Murray, Paul (HP Cloud Services) wrote:
>> If we agree that "something like capabilities" should go through Nova,
>> what do you suggest should be done with the change that sparked this
>> debate: https://review.openstack.org/#/c/35760/
>>
>> I would be happy to use it or a modified version.
> 
> CPU sys, user, idle, iowait time isn't capabilities though. That's a
> dynamically changing value. I also think the current approach where this
> is point in time sampling, because we only keep a single value, is going
> to cause some oddly pathologic behavior if you try to use it as
> scheduling criteria.
> 
> I'd really appreciate the views of more nova core folks on this thread,
> as it looks like these blueprints have seen pretty minimal code review
> at this point. H3 isn't that far away, and there is a lot of high
> priority things ahead of this, and only so much coffee and review time
> in a day.

You really need to have a moving-window average of these meters in order
to have anything sensible. Some sort of view into the pipeline of
scheduler requests (what's coming up?) would also help.

Capabilities are only really used in the host filtering phase. The host
weighing phase is where these measurements would be applied.
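
A toy illustration of why the windowing matters; the window size and the
meter are arbitrary choices here:

    import collections

    class MovingAverage(object):
        """Keep a sliding window of samples and expose their mean."""

        def __init__(self, window=10):
            self._samples = collections.deque(maxlen=window)

        def add(self, value):
            self._samples.append(float(value))

        @property
        def value(self):
            if not self._samples:
                return 0.0
            return sum(self._samples) / len(self._samples)

    iowait = MovingAverage(window=6)
    for sample in (1.0, 3.0, 2.5, 40.0, 2.0, 1.5):  # one transient spike
        iowait.add(sample)
    print(iowait.value)  # the spike is damped instead of dominating the weight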


> -Sean
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Host evacuation

2013-07-19 Thread Endre Karlson
Would it be an idea to make host evacuation use the scheduler to pick
where the VMs are supposed to go, to address the note in
http://sebastien-han.fr/blog/2013/07/19/openstack-instance-evacuation-goes-to-host/?
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Long-term, how do we make heat image/flavor name agnostic?

2013-07-19 Thread Adrian Otto
Gabriel,

On Jul 18, 2013, at 5:18 PM, Gabriel Hurley  wrote:

> Generally spot-on with what Adrian said, but I have one question from that 
> email:
> 
>> Mappings is one of the high level concepts in CFN that I think can be
>> completely eliminated with auto-discovery.
> 
> What do you mean by this? What kind of autodiscovery, and where? I'm all for 
> eliminating mappings someday if possible, but I haven't heard this proposal 
> before. Enlighten me?

The concept works by using a set of declared requirements, which are matched by 
the model interpreter in the orchestration system to resources/services that 
provide them. For example, one template may declare that it needs "linux", and 
the resource/service Nova can provide that.
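
As a toy illustration of that matching idea (not Heat or TOSCA code; all names
are invented):

    # Toy sketch of declared-requirement matching; names are invented.
    def discover(requirements, catalog):
        """Return the catalog entries whose capabilities satisfy the requirements."""
        return [name for name, caps in catalog.items()
                if all(caps.get(k) == v for k, v in requirements.items())]

    catalog = {'nova-server': {'os': 'linux'},
               'trove-database': {'engine': 'mysql'}}
    print(discover({'os': 'linux'}, catalog))   # ['nova-server']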

This is something that has been worked on quite a bit in the open standards 
community. Namely, both the TOSCA and CAMP spec drafts have these concepts to 
some extent.

AFAIK, Heat does not yet have a sophisticated model interpreter, so this 
concept is something that we could iterate towards over time. What I am sure 
exists today is that resources have specific names that you can use to order 
any of the OpenStack services. You can declare you want a particular resource 
name when authoring your template. This does work within a single cloud, but it 
does not eliminate the need for mappings in order to make anything portable, 
because there is no guarantee that the resource name will exist in another 
cloud, or that it will work the same.

There is more to this concept than what fits in a couple of paragraphs, but 
this should give you a general idea.

Adrian
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Peter Feiner
On Fri, Jul 19, 2013 at 10:15 AM, Dan Smith  wrote:
>
> > So rather than asking "what doesn't work / might not work in the
> > future" I think the question should be "aside from them both being
> > things that could be described as a conductor - what's the
> > architectural reason for wanting to have these two separate groups of
> > functionality in the same service ?"
>
> IMHO, the architectural reason is "lack of proliferation of services and
> the added complexity that comes with it." If one expects the
> proxy workload to always overshadow the task workload, then making
> these two things a single service makes things a lot simpler.

I'd like to point out a low-level detail that makes scaling nova-conductor
at the process level extremely compelling: the database driver
blocking the eventlet thread serializes nova's database access.

Since the database connection driver is typically implemented in a
library beyond the purview of eventlet's monkeypatching (i.e., a
native python extension like _mysql.so), blocking database calls will
block all eventlet coroutines. Since most of what nova-conductor does
is access the database, a nova-conductor process's handling of
requests is effectively serial.
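
A small self-contained illustration of that effect (nothing to do with nova's
code; the libc call below merely stands in for a native database driver):

    # Illustration: a call made inside a C extension is invisible to eventlet,
    # so every greenthread in the process stalls behind it.
    import eventlet
    eventlet.monkey_patch()
    import ctypes
    import time

    def cooperative_query():
        eventlet.sleep(1)           # pure python: other greenthreads keep running

    def native_driver_query():
        ctypes.CDLL(None).sleep(1)  # stands in for _mysql.so: blocks the whole thread

    for func, label in [(cooperative_query, 'cooperative'),
                        (native_driver_query, 'blocking')]:
        pool = eventlet.GreenPool()
        start = time.time()
        for _ in range(10):
            pool.spawn(func)
        pool.waitall()
        print('%s: %.1fs' % (label, time.time() - start))  # ~1s vs ~10s (serialized)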

Nova-conductor is the gateway to the database for nova-compute
processes.  So permitting a single nova-conductor process would
effectively serialize all database queries during instance creation,
deletion, periodic instance refreshes, etc. Since these queries are
made frequently (i.e., easily 100 times during instance creation) and
while other global locks are held (e.g., in the case of nova-compute's
ResourceTracker), most of what nova-compute does becomes serialized.

In parallel performance experiments I've done, I have found that
running multiple nova-conductor processes is the best way to mitigate
the serialization of blocking database calls. Say I am booting N
instances in parallel (usually up to N=40). If I have a single
nova-conductor process, the duration of each nova-conductor RPC
increases linearly with N, which can add _minutes_ to instance
creation time (i.e., dozens of RPCs, some taking several seconds).
However, if I run N nova-conductor processes in parallel, then the
duration of the nova-conductor RPCs do not increase with N; since each
RPC is most likely handled by a different nova-conductor, serial
execution of each process is moot.

Note that there are alternative methods for preventing the eventlet
thread from blocking during database calls. However, none of these
alternatives performed as well as multiple nova-conductor processes:

Instead of using the native database driver like _mysql.so, you can
use a pure-python driver, like pymysql by setting
sql_connection=mysql+pymysql://... in the [DEFAULT] section of
/etc/nova/nova.conf, which eventlet will monkeypatch to avoid
blocking. The problem with this approach is the vastly greater CPU
demand of the pure-python driver compared to the native driver. Since
the pure-python driver is so much more CPU intensive, the eventlet
thread spends most of its time talking to the database, which
is effectively the problem we had before!

Instead of making database calls from eventlet's thread, you can
submit them to eventlet's pool of worker threads and wait for the
results. Try this by setting dbapi_use_tpool=True in the [DEFAULT]
section of /etc/nova/nova.conf. The problem I found with this approach
was the overhead of synchronizing with the worker threads. In
particular, the time elapsed between the worker thread finishing and
the waiting coroutine being resumed was typically several times
greater than the duration of the database call itself.
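
For reference, the thread-pool variant boils down to handing the blocking call
to eventlet's tpool, e.g. (sketch only, assuming a plain DBAPI cursor):

    # Sketch of the tpool approach: run the blocking call in a native worker
    # thread and let the calling greenthread yield until the result comes back.
    # The hand-off between threads is the overhead described above.
    from eventlet import tpool

    def execute_in_tpool(cursor, statement, args=()):
        return tpool.execute(cursor.execute, statement, args)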

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Keystone] Reviewers wanted: Delegated Auth a la Oauth

2013-07-19 Thread Steve Martinelli

Hi David,

I don't want to derail anything, as there are still open issues in your
note; but it's worth mentioning that the session fixation attack issue you
mentioned was in OAuth 1.0; it was addressed in OAuth 1.0a.

Thanks,

_
Steve Martinelli | A4-317 @ IBM Toronto Software Lab
Software Developer - OpenStack
Phone: (905) 413-2851
E-Mail: steve...@ca.ibm.com



From:   David Chadwick 
To: Steve Martinelli/Toronto/IBM@IBMCA,
Cc: openstack-dev@lists.openstack.org
Date:   07/19/2013 07:15 AM
Subject:Re: [openstack-dev] [Keystone] Reviewers wanted: Delegated Auth
a la Oauth





On 19/07/2013 07:10, Steve Martinelli wrote:
> Hi David,
>
> This email is long overdue.
>
> 1. I don't recall ever stating that we were going to use OAuth 1.0a over
> 2.0, or vice versa. I've checked
> _https://etherpad.openstack.org/havana-saml-oauth-scim_and
> _https://etherpad.openstack.org/havana-external-auth_and couldn't find
> anything that said definitively that we were going to use 2.0.
> OAuth provider implementations (in python) of either version of the
> protocol are few and far between.

As I recall, there was a presentation from David Waite, Ping Identity,
about OAuth2, and after this it was agreed that OAuth2 would be used as
the generic delegation protocol in OpenStack, and that OAuth1 would only
be used as an alternative to the existing trusts delegation mechanism.
If this is not explicitly recorded in the meeting notes, then we will
need to ask others who were present at the design summit what their
recollections are.


>
> 2. I think I'm going to regret asking this question... (as I don't want
> to get into a long discussion about OAuth 1.0a vs 2.0), but what are the
> security weaknesses you mention?

These are documented in RFC 5849 (http://tools.ietf.org/rfc/rfc5849.txt)
which is presumably the specification of OAuth1 that you are implementing.

There is also a session fixation attack documented here
http://oauth.net/advisories/2009-1/

>
> 3. I think you are disagreeing with the consumer requesting roles? And
> are suggesting the consumer should be requesting an authorizing user
> instead? I'm not completely against it, but I'd be interested in what
> others have to say.

If the use case is to replace trusts, and to allow delegation from a
user to one or more openstack services then yes, it should be the user
who is wanting to launch the openstack service, and not any user who
happens to have the same role as this user. Otherwise a consumer could
get one user to authorise the job of another user, if they have the same
roles.

If the use case is general delegation, then relating it to Adam's post:

http://adam.younglogic.com/2013/07/a-vision-for-keystone/

in order to future proof, the consumer should request authorising
attributes, rather than roles, since as Adam points out, the policy
based authz system is not restricted to using roles.

>
> 4. Regarding the evil consumer; we do not just use the consumer key, the
> magic oauth variables I mentioned also contain oauth_signature, which
> (if you are using a standard oauth client) is derived by hashing the
> consumer's secret together with other request values.

This solves one of my problems, because now you are effectively saying
that the consumer has to be pre-registered with Keystone and have a
un/pw allocated to it. I did not see this mentioned in the original
blueprint, and thought (wrongly) that anything could be a consumer
without any pre-registration requirement. So the pre-registration
requirement removes most probability that the consumer can be evil,
since one would hope that the Keystone administrator would check it out
before registering it.

regards

david


  On the server side, we grab the consumer
> entity from our database and recreate the oauth_signature value, then
> verify the request.
>
>   * If the evil consumer did not provide a provide a secret, (or a wrong
> secret) it would fail on the verify step on the server side
> (signatures wouldn't match).
>   * If the evil consumer used his own library, he would still need to
> sign the request correctly (done only with the correct consumer
> key/secret).
>
>
>
> Thanks,
>
> _
> Steve Martinelli | A4-317 @ IBM Toronto Software Lab
> Software Developer - OpenStack
> Phone: (905) 413-2851
> E-Mail: steve...@ca.ibm.com
>
> Inactive hide details for David Chadwick ---06/19/2013 04:38:56 PM--- Hi
> Steve On 19/06/2013 20:56, Steve Martinelli wrote:David Chadwick
> ---06/19/2013 04:38:56 PM---  Hi Steve On 19/06/2013 20:56, Steve
> Martinelli wrote:
>
> From: David Chadwick 
> To: Steve Martinelli/Toronto/IBM@IBMCA,
> Cc: openstack-dev@lists.openstack.org
> Date: 06/19/2013 04:38 PM
> Subject: Re: [openstack-dev] [Keystone] Reviewers wanted: Delegated Auth
> a la Oauth
> 
>
>
>
>Hi Steve
>
> On 19/06/2013 20:56, S

Re: [openstack-dev] [Nova] New DB column or new DB table?

2013-07-19 Thread Daniel P. Berrange
On Thu, Jul 18, 2013 at 07:05:10AM -0400, Sean Dague wrote:
> On 07/17/2013 10:54 PM, Lu, Lianhao wrote:
> >Hi fellows,
> >
> >Currently we're implementing the BP 
> >https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling. 
> >The main idea is to have an extensible plugin framework on nova-compute 
> >where every plugin can get different metrics(e.g. CPU utilization, memory 
> >cache utilization, network bandwidth, etc.) to store into the DB, and the 
> >nova-scheduler will use that data from DB for scheduling decision.
> >
> >Currently we adds a new table to store all the metric data and have 
> >nova-scheduler join loads the new table with the compute_nodes table to get 
> >all the data(https://review.openstack.org/35759). Someone is concerning 
> >about the performance penalty of the join load operation when there are many 
> >metrics data stored in the DB for every single compute node. Don suggested 
> >adding a new column in the current compute_nodes table in DB, and put all 
> >metric data into a dictionary key/value format and store the json encoded 
> >string of the dictionary into that new column in DB.
> >
> >I'm just wondering which way has less performance impact, join load with a 
> >new table with quite a lot of rows, or json encode/decode a dictionary with 
> >a lot of key/value pairs?
> >
> >Thanks,
> >-Lianhao
> 
> I'm really confused. Why are we talking about collecting host
> metrics in nova when we've got a whole project to do that in
> ceilometer? I think utilization based scheduling would be a great
> thing, but it really out to be interfacing with ceilometer to get
> that data. Storing it again in nova (or even worse collecting it a
> second time in nova) seems like the wrong direction.
> 
> I think there was an equiv patch series at the end of Grizzly that
> was pushed out for the same reasons.
> 
> If there is a reason ceilometer can't be used in this case, we
> should have that discussion here on the list. Because my initial
> reading of this blueprint and the code patches is that it partially
> duplicates ceilometer function, which we definitely don't want to
> do. Would be happy to be proved wrong on that.

IIUC, Ceilometer is currently a downstream consumer of data from Nova, but
no functionality in Nova is a consumer of data from Ceilometer. This is a good
split from a security separation point of view, since the security of Nova
is self-contained in this architecture.

If Nova scheduler becomes dependent on data from ceilometer, then the
security of Nova depends on the security of Ceilometer, expanding the attack
surface. This is not good architecture IMHO.

At the same time, I hear your concerns about the potential for duplication
of stats collection functionality between Nova & Ceilometer. I don't think
we necessarily need to remove 100% of duplication. IMHO probably the key
thing is for the virt drivers to expose a standard API for exporting the
stats, and make sure that both ceilometer & nova scheduler use the same
APIs and ideally the same data feed, so we're not invoking the same APIs
twice to get the same data.
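
For illustration, such a shared stats API might look roughly like this (the
method name and metric keys are assumptions, not the existing virt driver
contract):

    # Illustrative sketch of a common stats hook; names and keys are assumed.
    class HostStatsMixin(object):
        def get_host_metrics(self):
            """Return a flat dict of host metrics, for example:

            {'cpu.user.percent': 12.5,
             'cpu.iowait.percent': 3.1,
             'memory.used.mb': 20480}
            """
            raise NotImplementedError()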


Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] A simple way to improve nova scheduler

2013-07-19 Thread Boris Pavlovic
Hi all,



At Mirantis, Alexey Ovtchinnikov and I are working on nova scheduler
improvements.

As we see it, the scheduler currently has two major issues:


1) Scalability. Factors that contribute to bad scalability are these:

*) Each compute node, every periodic task interval (60 sec by default),
updates its resource state in the DB.

*) On every boot request the scheduler has to fetch information about all
compute nodes from the DB.

2) Flexibility. Flexibility suffers due to problems with:

*) Adding new complex resources (such as big lists of complex objects, e.g.
required by PCI Passthrough
https://review.openstack.org/#/c/34644/5/nova/db/sqlalchemy/models.py)

*) Using different sources of data in the scheduler, for example from cinder or
ceilometer.

(as required by Volume Affinity Filter
https://review.openstack.org/#/c/29343/)


We found a simple way to mitigate these issues by avoiding DB usage for
host state storage.
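
For a rough feel of the direction (hypothetical sketch only; the actual design
is in the document linked below), the scheduler would keep host state in
memory, fed by the compute nodes over RPC fanout, instead of reading
compute_nodes from the DB:

    # Hypothetical sketch only; see the linked document for the real design.
    import threading
    import time

    class HostStateCache(object):
        """In-memory host state, updated by RPC fanout from compute nodes."""

        def __init__(self, max_age=120):
            self._states = {}     # hostname -> (timestamp, state dict)
            self._lock = threading.Lock()
            self._max_age = max_age

        def update(self, host, state):
            # called by the RPC consumer each time a compute node reports in
            with self._lock:
                self._states[host] = (time.time(), state)

        def get_all(self):
            # only hand fresh states to the filtering/weighing passes
            cutoff = time.time() - self._max_age
            with self._lock:
                return dict((h, s) for h, (ts, s) in self._states.items()
                            if ts >= cutoff)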


A more detailed discussion of the problem statement and one possible
solution can be found here:

https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit#


Best regards,

Boris Pavlovic


Mirantis Inc.
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Reminder: Oslo project meeting

2013-07-19 Thread Mark McLoughlin
On Tue, 2013-07-16 at 22:11 +0100, Mark McLoughlin wrote:
> Hi
> 
> We're having an IRC meeting on Friday to sync up again on the messaging
> work going on:
> 
>   https://wiki.openstack.org/wiki/Meetings/Oslo
>   https://etherpad.openstack.org/HavanaOsloMessaging
> 
> Feel free to add other topics to the wiki
> 
> See you on #openstack-meeting at 1400 UTC

Logs are published here:

  http://eavesdrop.openstack.org/meetings/oslo/2013/oslo.2013-07-19-14.00.html

Cheers,
Mark.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling

2013-07-19 Thread Boris Pavlovic
I think that the current approach of the scheduler is not scalable and not flexible. If 
we add key/value pairs this will make our scheduler flexible, but with tons of hacks 
and less scalable. We will get a ton of problems even on a small 1k-node 
cloud. 

We found another approach (just remove the DB) that will solve all our problems. 
There is another thread with a link to a Google doc containing a longer description. 

https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit#

On 19.07.2013, at 19:30, Sean Dague  wrote:

> On 07/19/2013 10:37 AM, Murray, Paul (HP Cloud Services) wrote:
>> If we agree that "something like capabilities" should go through Nova, what 
>> do you suggest should be done with the change that sparked this debate: 
>> https://review.openstack.org/#/c/35760/
>> 
>> I would be happy to use it or a modified version.
> 
> CPU sys, user, idle, iowait time isn't capabilities though. That's a 
> dynamically changing value. I also think the current approach where this is 
> point in time sampling, because we only keep a single value, is going to 
> cause some oddly pathologic behavior if you try to use it as scheduling 
> criteria.
> 
> I'd really appreciate the views of more nova core folks on this thread, as it 
> looks like these blueprints have seen pretty minimal code review at this 
> point. H3 isn't that far away, and there is a lot of high priority things 
> ahead of this, and only so much coffee and review time in a day.
> 
>   -Sean
> 
> -- 
> Sean Dague
> http://dague.net
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Day, Phil
> -Original Message-
> From: Dan Smith [mailto:d...@danplanet.com]
> Sent: 19 July 2013 15:15
> To: OpenStack Development Mailing List
> Cc: Day, Phil
> Subject: Re: [openstack-dev] Moving task flow to conductor - concern about
> scale
> 
> > There's nothing I've seen so far that causes me alarm,  but then again
> > we're in the very early stages and haven't moved anything really
> > complex.
> 
> The migrations (live, cold, and resize) are moving there now. These are some
> of the more complex stateful operations I would expect conductor to manage in
> the near term, and maybe ever.
> 
> > I just don't buy into this line of thinking - I need more than one API
> > node for HA as well - but that doesn't mean that therefore I want to
> > put anything else that needs more than one node in there.
> >
> > I don't even think these do scale-with-compute in the same way;  DB
> > proxy scales with the number of compute hosts because each new host
> > introduces an amount of DB load though its periodic tasks.Task
> 
> > to create / modify servers - and that's not directly related to the
> > number of hosts.
> 
> Unlike API, the only incoming requests that generate load for the conductor 
> are
> things like migrations, which also generate database traffic.
> 
> > So rather than asking "what doesn't work / might not work in the
> > future" I think the question should be "aside from them both being
> > things that could be described as a conductor - what's the
> > architectural reason for wanting to have these two separate groups of
> > functionality in the same service ?"
> 
> IMHO, the architectural reason is "lack of proliferation of services and the
> added complexity that comes with it."
> 

IMO I don't think reducing the number of services is a good enough reason to 
group unrelated services (db-proxy, task_workflow).  Otherwise why aren't we 
arguing to just add all of these to the existing scheduler service ?

> If one expects the proxy workload to
> always overshadow the task workload, then making these two things a single
> service makes things a lot simpler.

Not if you have to run 40 services to cope with the proxy load, but don't want 
the risk/complexity of having 40 task workflow engines working in parallel.

> > If they were separate services and it turns out that I can/want/need
> > to run the same number of both then I can pretty easily do that  - but
> > the current approach is removing what to be seems a very important
> > degree of freedom around deployment on a large scale system.
> 
> I guess the question, then, is whether other folks agree that the scaling-
> separately problem is concerning enough to justify at least an RPC topic split
> now which would enable the services to be separated later if need be.
> 

Yep - that's the key question.   And in the interest of keeping the system 
stable at scale while we roll through this I think we should be erring on the 
side of caution/keeping deployment options open rather than waiting to see if 
there's a problem.

> I would like to point out, however, that the functions are being split into
> different interfaces currently. While that doesn't reach low enough on the 
> stack
> to allow hosting them in two different places, it does provide organization 
> such
> that if we later needed to split them, it would be a relatively simple (hah)
> matter of coordinating an RPC upgrade like anything else.
> 
> --Dan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] Chalenges with highly available service VMs - port adn security group options.

2013-07-19 Thread Samuel Bercovici
Adding the original people conversing on this subject to this mail.

Regards,
 -Sam.

On Jul 19, 2013, at 11:57 AM, "Samuel Bercovici" 
mailto:samu...@radware.com>> wrote:

Hi,

I have completely missed this discussion as it does not have quantum/Neutron in 
the subject (I have modified it now).
I think that the security group is the right place to control this.
I think that this might be only allowed to admins.

Let me explain what we need which is more than just disable spoofing.

1.   Be able to allow MACs which are not defined on the port level to 
transmit packets (for example VRRP MACs) == turn off MAC spoofing

2.   Be able to allow IPs which are not defined on the port level to 
transmit packets (for example, IP used for HA service that moves between an HA 
pair) == turn off IP spoofing

3.   Be able to allow broadcast messages on the port (for example for VRRP 
broadcast) == allow broadcast.


Regards,
-Sam.
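
For illustration, the allowed-address-pair extension referenced in the quoted
mail below might cover cases 1 and 2 with something like the following sketch
(the field names are taken from the blueprint and are not verified against the
merged patch):

    # Sketch only: field names follow the allowed-address-pair blueprint
    # linked below and are not verified against the merged patch.
    from neutronclient.v2_0 import client

    neutron = client.Client(username='admin', password='secret',
                            tenant_name='demo',
                            auth_url='http://127.0.0.1:5000/v2.0')

    # let the VRRP virtual MAC/IP transmit on this port without disabling
    # the anti-spoofing rules entirely (covers cases 1 and 2 above)
    neutron.update_port('PORT_UUID', {'port': {'allowed_address_pairs': [
        {'mac_address': '00:00:5e:00:01:01', 'ip_address': '192.168.1.250'},
    ]}})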


From: Aaron Rosen [mailto:aro...@nicira.com]
Sent: Friday, July 19, 2013 3:26 AM
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] Chalenges with highly available service VMs

Yup:
I'm definitely happy to review and give hints.
Blueprint:  
https://docs.google.com/document/d/18trYtq3wb0eJK2CapktN415FRIVasr7UkTpWn9mLq5M/edit

https://review.openstack.org/#/c/19279/  < patch that merged the feature;
Aaron

On Thu, Jul 18, 2013 at 5:15 PM, Ian Wells 
mailto:ijw.ubu...@cack.org.uk>> wrote:
On 18 July 2013 19:48, Aaron Rosen 
mailto:aro...@nicira.com>> wrote:
> Is there something this is missing that could be added to cover your use
> case? I'd be curious to hear where this doesn't work for your case.  One
> would need to implement the port_security extension if they want to
> completely allow all ips/macs to pass and they could state which ones are
> explicitly allowed with the allowed-address-pair extension (at least that is
> my current thought).
Yes - have you got docs on the port security extension?  All I've
found so far are
http://docs.openstack.org/developer/quantum/api/quantum.extensions.portsecurity.html
and the fact that it's only the Nicira plugin that implements it.  I
could implement it for something else, but not without a few hints...
--
Ian.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Joe Gordon
On Jul 19, 2013 9:57 AM, "Day, Phil"  wrote:
>
> > -Original Message-
> > From: Dan Smith [mailto:d...@danplanet.com]
> > Sent: 19 July 2013 15:15
> > To: OpenStack Development Mailing List
> > Cc: Day, Phil
> > Subject: Re: [openstack-dev] Moving task flow to conductor - concern
about
> > scale
> >
> > > There's nothing I've seen so far that causes me alarm,  but then again
> > > we're in the very early stages and haven't moved anything really
> > > complex.
> >
> > The migrations (live, cold, and resize) are moving there now. These are
some
> > of the more complex stateful operations I would expect conductor to
manage in
> > the near term, and maybe ever.
> >
> > > I just don't buy into this line of thinking - I need more than one API
> > > node for HA as well - but that doesn't mean that therefore I want to
> > > put anything else that needs more than one node in there.
> > >
> > > I don't even think these do scale-with-compute in the same way;  DB
> > > proxy scales with the number of compute hosts because each new host
> > > introduces an amount of DB load though its periodic tasks.Task
> >
> > > to create / modify servers - and that's not directly related to the
> > > number of hosts.
> >
> > Unlike API, the only incoming requests that generate load for the
conductor are
> > things like migrations, which also generate database traffic.
> >
> > > So rather than asking "what doesn't work / might not work in the
> > > future" I think the question should be "aside from them both being
> > > things that could be described as a conductor - what's the
> > > architectural reason for wanting to have these two separate groups of
> > > functionality in the same service ?"
> >
> > IMHO, the architectural reason is "lack of proliferation of services
and the
> > added complexity that comes with it."
> >
>
> IMO I don't think reducing the number of services is a good enough reason
to group unrelated services (db-proxy, task_workflow).  Otherwise why
aren't we arguing to just add all of these to the existing scheduler
service ?
>
> > If one expects the proxy workload to
> > always overshadow the task workload, then making these two things a
single
> > service makes things a lot simpler.
>
> Not if you have to run 40 services to cope with the proxy load, but don't
want the risk/complexity of havign 40 task workflow engines working in
parallel.
>
> > > If they were separate services and it turns out that I can/want/need
> > > to run the same number of both then I can pretty easily do that  - but
> > > the current approach is removing what to be seems a very important
> > > degree of freedom around deployment on a large scale system.
> >
> > I guess the question, then, is whether other folks agree that the
scaling-
> > separately problem is concerning enough to justify at least an RPC
topic split
> > now which would enable the services to be separated later if need be.
> >
>
> Yep - that's the key question.   An in the interest of keeping the system
stable at scale while we roll through this I think we should be erring on
the side of caution/keeping deployment options open rather than waiting to
see if there's a problem.

++, unless there is some downside to a RPC topic split, this seems like a
reasonable precaution.

>
> > I would like to point out, however, that the functions are being split
into
> > different interfaces currently. While that doesn't reach low enough on
the stack
> > to allow hosting them in two different places, it does provide
organization such
> > that if we later needed to split them, it would be a relatively simple
(hah)
> > matter of coordinating an RPC upgrade like anything else.
> >
> > --Dan
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Infra] Channel logging enabled

2013-07-19 Thread Elizabeth Krumbach Joseph
Hi everyone,

The #openstack-infra channel has been increasing in traffic and
attention these past several months (hooray!). It finally became clear
to us that discussions happening there were often of interest to the
wider project and that we should start logging the channel.

Today we added the eavesdrop bot to our channel to do just that,
logging for all channels eavesdrop watches are available here:
http://eavesdrop.openstack.org/irclogs/

We'd also like to welcome other channels to log via eavesdrop if they
wish. This can be done by adding your channel to the openstack
eavesdrop puppet manifest in the openstack-infra/config project (
modules/openstack_project/manifests/eavesdrop.pp). You can see the
change where we added ours as an example:
https://review.openstack.org/#/c/36773/2/
modules/openstack_project/manifests/eavesdrop.pp

-- 
Elizabeth Krumbach Joseph || Lyz || pleia2
http://www.princessleia.com

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Revert Pass instance host-id to Quantum using port bindings extension.

2013-07-19 Thread Kyle Mestery (kmestery)
On Jul 18, 2013, at 5:16 PM, Aaron Rosen  wrote:
> 
> Hi, 
> 
> I wanted to raise another design failure of why creating the port on 
> nova-compute is bad. Previously, we have encountered this bug 
> (https://bugs.launchpad.net/neutron/+bug/1160442). What was causing the issue 
> was that when nova-compute calls into quantum to create the port; quantum 
> creates the port but fails to return the port to nova and instead times out. 
> When this happens the instance is scheduled to be run on another compute node 
> where another port is created with the same device_id and when the instance 
> boots it will look like it has two ports. This is still a problem that can 
> occur today in our current implementation (!). 
> 
> I think in order to move forward with this we'll need to compromise. Here is 
> my though on how we should proceed. 
> 
> 1) Modify the quantum API so that mac addresses can now be updated via the 
> api. There is no reason why we have this limitation (especially once the 
> patch that uses dhcp_release is merged as it will allow us to update the 
> lease for the new mac immediately).  We need to do this in order to support 
> bare metal, as we need to match the mac address of the port to the compute 
> node.
> 
I don't understand how this relates to creating a port through nova-compute. 
I'm not saying this is a bad idea, I just don't see how it relates to the 
original discussion point on this thread around Yong's patch.

> 2) move the port-creation from nova-compute to nova-api. This will solve a 
> number of issues like the one i pointed out above. 
> 
This seems like a bad idea. So now a Nova API call will implicitly create a 
Neutron port? What happens on failure here? The caller isn't aware the port was 
created in Neutron if it's implicit, so who cleans things up? Or if the caller 
is aware, then all we've done is move an API call the caller would have done 
(nova-compute in this case) into nova-api, though the caller is now still aware 
of what's happening.

> 3)  For now, i'm okay with leaving logic on the compute node that calls 
> update-port if the port binding extension is loaded. This will allow the vif 
> type to be correctly set as well. 
> 
And this will also still pass in the hostname the VM was booted on?

To me, this thread seems to have diverged a bit from the original discussion 
point around Yong's patch. Yong's patch makes sense, because it's passing the 
hostname the VM is booted on during port create. It also updates the binding 
during a live migration, so that case is covered. Any change to this behavior 
should cover both those cases and not involve any sort of agent polling, IMHO.

Thanks,
Kyle

> Thoughts/Comments? 
> 
> Thanks,
> 
> Aaron
> 
> 
> On Mon, Jul 15, 2013 at 2:45 PM, Aaron Rosen  wrote:
> 
> 
> 
> On Mon, Jul 15, 2013 at 1:26 PM, Robert Kukura  wrote:
> On 07/15/2013 03:54 PM, Aaron Rosen wrote:
> >
> >
> >
> > On Sun, Jul 14, 2013 at 6:48 PM, Robert Kukura  > > wrote:
> >
> > On 07/12/2013 04:17 PM, Aaron Rosen wrote:
> > > Hi,
> > >
> > >
> > > On Fri, Jul 12, 2013 at 6:47 AM, Robert Kukura  > 
> > > >> wrote:
> > >
> > > On 07/11/2013 04:30 PM, Aaron Rosen wrote:
> > > > Hi,
> > > >
> > > > I think we should revert this patch that was added here
> > > > (https://review.openstack.org/#/c/29767/). What this patch
> > does is
> > > when
> > > > nova-compute calls into quantum to create the port it passes
> > in the
> > > > hostname on which the instance was booted on. The idea of the
> > > patch was
> > > > that providing this information would "allow hardware device
> > vendors
> > > > management stations to allow them to segment the network in
> > a more
> > > > precise manager (for example automatically trunk the vlan on the
> > > > physical switch port connected to the compute node on which
> > the vm
> > > > instance was started)."
> > > >
> > > > In my opinion I don't think this is the right approach.
> > There are
> > > > several other ways to get this information of where a
> > specific port
> > > > lives. For example, in the OVS plugin case the agent running
> > on the
> > > > nova-compute node can update the port in quantum to provide this
> > > > information. Alternatively, quantum could query nova using the
> > > > port.device_id to determine which server the instance is on.
> > > >
> > > > My motivation for removing this code is I now have the free
> > cycles to
> > > > work on
> > > >
> > >
> > https://blueprints.launchpad.net/nova/+spec/nova-api-quantum-create-port
> > > >  discussed here
> > > >
> > >
> > 
> > (http://lists.openstack.or

Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling

2013-07-19 Thread Sean Dague

On 07/19/2013 10:37 AM, Murray, Paul (HP Cloud Services) wrote:

If we agree that "something like capabilities" should go through Nova, what do 
you suggest should be done with the change that sparked this debate: 
https://review.openstack.org/#/c/35760/

I would be happy to use it or a modified version.


CPU sys, user, idle, iowait time isn't capabilities though. That's a 
dynamically changing value. I also think the current approach where this 
is point in time sampling, because we only keep a single value, is going 
to cause some oddly pathologic behavior if you try to use it as 
scheduling criteria.


I'd really appreciate the views of more nova core folks on this thread, 
as it looks like these blueprints have seen pretty minimal code review 
at this point. H3 isn't that far away, and there is a lot of high 
priority things ahead of this, and only so much coffee and review time 
in a day.


-Sean

--
Sean Dague
http://dague.net

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Dan Smith
> I had assumed that some of the task management state would exist
> in memory. Is it all going to exist in the database?

Well, our state is tracked in the database now, so.. yeah. There's a
desire, of course, to make the state transitions as
idempotent/restartable as possible, which may mean driving some
finer-grained status details into the database. That's really
independent of the move to conductor (although doing that does take
less effort if those don't have to make an RPC trip to get there).

--Dan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Peter Feiner
On Fri, Jul 19, 2013 at 11:06 AM, Dan Smith  wrote:
> FWIW, I don't think anyone is suggesting a single conductor, and
> especially not a single database proxy.

This is a critical detail that I missed. Re-reading Phil's original email,
I see you're debating the ratio of nova-conductor DB proxies to
nova-conductor task flow managers.

I had assumed that some of the task management state would exist
in memory. Is it all going to exist in the database?

>> Since these queries are made frequently (i.e., easily 100 times
>> during instance creation) and while other global locks are held
>> (e.g., in the case of nova-compute's ResourceTracker), most of what
>> nova-compute does becomes serialized.
>
> I think your numbers are a bit off. When I measured it just before
> grizzly, an instance create was something like 20-30 database calls.
> Unless that's changed (a lot) lately ... :)

Ah perhaps... at least I had the order of magnitude right :-) Even
with 20-30 calls,
when a bunch of instances are being booted in parallel and all of the
database calls
are serialized, minutes are added in instance creation time.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling

2013-07-19 Thread Murray, Paul (HP Cloud Services)
If we agree that "something like capabilities" should go through Nova, what do 
you suggest should be done with the change that sparked this debate: 
https://review.openstack.org/#/c/35760/ 

I would be happy to use it or a modified version.

Paul.

-Original Message-
From: Sean Dague [mailto:s...@dague.net] 
Sent: 19 July 2013 14:28
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics 
collector for scheduling

On 07/19/2013 08:30 AM, Andrew Laski wrote:
> On 07/19/13 at 12:08pm, Murray, Paul (HP Cloud Services) wrote:
>> Hi Sean,
>>
>> Do you think the existing static allocators should be migrated to 
>> going through ceilometer - or do you see that as different? Ignoring 
>> backward compatibility.
>
> It makes sense to keep some things in Nova, in order to handle the 
> graceful degradation needed if Ceilometer couldn't be reached.  I see 
> the line as something like capabilities should be handled by Nova, 
> memory free, vcpus available, etc... and utilization metrics handled 
> by Ceilometer.

Yes, that makes sense to me. I'd be happy with that.

-Sean

--
Sean Dague
http://dague.net

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Dan Smith
> There's nothing I've seen so far that causes me alarm,  but then
> again we're in the very early stages and haven't moved anything
> really complex.

The migrations (live, cold, and resize) are moving there now. These are
some of the more complex stateful operations I would expect conductor
to manage in the near term, and maybe ever.

> I just don't buy into this line of thinking - I need more than one
> API node for HA as well - but that doesn't mean that therefore I want
> to put anything else that needs more than one node in there.
> 
> I don't even think these do scale-with-compute in the same way;  DB
> proxy scales with the number of compute hosts because each new host
> introduces an amount of DB load though its periodic tasks.Task

> to create / modify servers - and that's not directly related to the
> number of hosts. 

Unlike API, the only incoming requests that generate load for the
conductor are things like migrations, which also generate database
traffic.

> So rather than asking "what doesn't work / might not work in the
> future" I think the question should be "aside from them both being
> things that could be described as a conductor - what's the
> architectural reason for wanting to have these two separate groups of
> functionality in the same service ?"

IMHO, the architectural reason is "lack of proliferation of services and
the added complexity that comes with it." If one expects the
proxy workload to always overshadow the task workload, then making
these two things a single service makes things a lot simpler.

> If they were separate services and it turns out that I can/want/need
> to run the same number of both then I can pretty easily do that  -
> but the current approach is removing what to be seems a very
> important degree of freedom around deployment on a large scale system.

I guess the question, then, is whether other folks agree that the
scaling-separately problem is concerning enough to justify at least an
RPC topic split now which would enable the services to be separated
later if need be.

I would like to point out, however, that the functions are being split
into different interfaces currently. While that doesn't reach low
enough on the stack to allow hosting them in two different places, it
does provide organization such that if we later needed to split them, it
would be a relatively simple (hah) matter of coordinating an RPC
upgrade like anything else.

--Dan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] KMIP client for volume encryption key management

2013-07-19 Thread Jarret Raim
I'm not sure that I agree with this direction. In our investigation, KMIP is a 
problematic protocol for several reasons:

  *   We haven't found an implementation of KMIP for Python. (Let us know if 
there is one!)
  *   Support for KMIP by HSM vendors is limited.
  *   We haven't found software implementations of KMIP suitable for use as an 
HSM replacement. (e.g. Most deployers wanting to use KMIP would have to spend a 
rather large amount of money to purchase HSMs)
  *   From our research, the KMIP spec and implementations seem to lack support 
for multi-tenancy. This makes managing keys for thousands of users difficult or 
impossible.

The goal for the Barbican system is to provide key management for OpenStack. It 
uses the standard interaction mechanisms for OpenStack, namely ReST and JSON. 
We integrate with keystone and will provide common features like usage events, 
role-based access control, fine grained control, policy support, client libs, 
Ceilometer support, Horizon support and other things expected of an OpenStack 
service. If every product is forced to implement KMIP, these features would 
most likely not be provided by whatever vendor is used for the Key Manager. 
Additionally, as mentioned in the blueprint, I have concerns that vendor 
specific data will be leaked into the rest of OpenStack for things like key 
identifiers, authentication and the like.

I would propose that rather than each product implement KMIP support, we 
implement KMIP support into Barbican. This will allow the products to speak 
ReST / JSON using our client libraries just like any other OpenStack system and 
Barbican will take care of being a good OpenStack citizen. On the backend, 
Barbican will support the use of KMIP to talk to whatever device the provider 
wishes to deploy. We will also support other interaction mechanisms including 
PKCS through OpenSSH, a development implementation and a fully free and open 
source software implementation. This also allows some advanced uses cases 
including federation. Federation will allow customers of public clouds like 
Rackspace's to maintain custody of their keys while still being able to 
delegate their use to the Cloud for specific tasks.

I've been asked about KMIP support at the Summit and by several of Rackspace's 
partners. I was planning on getting to it at some point, probably after 
Icehouse. This is mostly due to the fact that we didn't find a suitable KMIP 
implementation for Python so it looks like we'd have to write one. If there is 
interest from people to create that implementation, we'd be happy to help do 
the work to integrate it into Barbican.

We just released our M2 milestone and we are on track for our 1.0 release for 
Havana. I would encourage anyone interested to check out what we are working on 
and come help us out. We use this list for most of our discussions and we hang 
out on #openstack-cloudkeep on Freenode.


Thanks,
Jarret




From: , Bill 
mailto:bill.bec...@safenet-inc.com>>
Reply-To: OpenStack List 
mailto:openstack-dev@lists.openstack.org>>
Date: Thursday, July 18, 2013 2:11 PM
To: OpenStack List 
mailto:openstack-dev@lists.openstack.org>>
Subject: [openstack-dev] KMIP client for volume encryption key management

A blueprint and spec to add a client that implements OASIS KMIP standard was 
recently added:

https://blueprints.launchpad.net/nova/+spec/kmip-client-for-volume-encryption
https://wiki.openstack.org/wiki/KMIPclient


We’re looking for feedback to the set of questions in the spec. Any additional 
input is also appreciated.

Thanks,
Bill B.

The information contained in this electronic mail transmission
may be privileged and confidential, and therefore, protected
from disclosure. If you have received this communication in
error, please notify us immediately by replying to this
message and deleting it from your computer without copying
or disclosing it.



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling

2013-07-19 Thread Sean Dague

On 07/19/2013 08:30 AM, Andrew Laski wrote:

On 07/19/13 at 12:08pm, Murray, Paul (HP Cloud Services) wrote:

Hi Sean,

Do you think the existing static allocators should be migrated to
going through ceilometer - or do you see that as different? Ignoring
backward compatibility.


It makes sense to keep some things in Nova, in order to handle the
graceful degradation needed if Ceilometer couldn't be reached.  I see
the line as something like capabilities should be handled by Nova,
memory free, vcpus available, etc... and utilization metrics handled by
Ceilometer.


Yes, that makes sense to me. I'd be happy with that.

-Sean

--
Sean Dague
http://dague.net

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Revert Pass instance host-id to Quantum using port bindings extension.

2013-07-19 Thread Ian Wells
> I wanted to raise another design failure of why creating the port on
> nova-compute is bad. Previously, we have encountered this bug
> (https://bugs.launchpad.net/neutron/+bug/1160442). What was causing the
> issue was that when nova-compute calls into quantum to create the port;
> quantum creates the port but fails to return the port to nova and instead
> times out. When this happens the instance is scheduled to be run on another
> compute node where another port is created with the same device_id and when
> the instance boots it will look like it has two ports. This is still a
> problem that can occur today in our current implementation (!).

Well, the issue there would seem to be that we've merged the
functionality of 'create' (and 'update') and 'plug', where we make an
attachment to the network.  If we 'create'd a port without attaching
it, and then attached it from the compute driver, it would be obvious
when the second compute driver came along that we had an eventual
consistency problem that needed resolving.  We should indeed be
'create'ing the port from the API and then using a different 'plug'
call from the compute host, I think.
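
For illustration only, the split might look roughly like the sketch below; the
helper names are invented, create_port/update_port are the existing client
calls, and the 'plug' semantics plus the binding:* fields are assumptions:

    # Hypothetical sketch of a create-then-plug split; the 'plug' step is
    # imaginary and the binding:* fields are assumptions.
    def create_unattached_port(neutron, network_id, instance_uuid):
        # API node: reserve the port record, no attachment to a host yet
        return neutron.create_port({'port': {'network_id': network_id,
                                             'device_id': instance_uuid}})

    def plug_port_on_host(neutron, port_id, hostname, local_details):
        # compute node: attach with the locally-known details (e.g. which
        # PCI VF was allocated); today this would be an update_port call
        return neutron.update_port(port_id,
                                   {'port': {'binding:host_id': hostname,
                                             'binding:profile': local_details}})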

On the more general discussion:

I'm going to play my usual card  - if I want a mapped PCI device,
rather than a vswitch device, to be used, then I need to tell Neutron
that's what I'm thinking, and the obvious place to do that is from the
compute node where I'm creating the VM and know what specific PCI
device I've allocated.  Doing it from the Nova API server makes a lot
less sense - I shouldn't know the PCI bus ID on the API server for
anything more than debugging purposes, that's a compute-local choice.

[I'm also not convinced that the current model - where I choose a VF
to map in, then tell the Neutron server which tells the agent back on
the compute node which configures the PF of the card to encapsulate
the VF I've mapped, makes any sense.  I'm not entirely sure it makes
sense for OVS, either.  But that's a whole other thing.]
-- 
Ian.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling

2013-07-19 Thread Sandy Walsh


On 07/19/2013 09:47 AM, Day, Phil wrote:
>> -Original Message-
>> From: Sean Dague [mailto:s...@dague.net]
>> Sent: 19 July 2013 12:04
>> To: OpenStack Development Mailing List
>> Subject: Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics
>> collector for scheduling (was: New DB column or new DB table?)
>>
>> On 07/19/2013 06:18 AM, Day, Phil wrote:
>>> Ceilometer is a great project for taking metrics available in Nova and other
>> systems and making them available for use by Operations, Billing, Monitoring,
>> etc - and clearly we should try and avoid having multiple collectors of the 
>> same
>> data.
>>>
>>> But making the Nova scheduler dependent on Ceilometer seems to be the
>> wrong way round to me - scheduling is such a fundamental operation that I
>> want Nova to be self sufficient in this regard.   In particular I don't want 
>> the
>> availability of my core compute platform to be constrained by the 
>> availability
>> of my (still evolving) monitoring system.
>>>
>>> If Ceilometer can be fed from the data used by the Nova scheduler then 
>>> that's
>> a good plus - but not the other way round.
>>
>> I assume it would gracefully degrade to the existing static allocators if
>> something went wrong. If not, well that would be very bad.
>>
>> Ceilometer is an integrated project in Havana. Utilization based scheduling
>> would be a new feature. I'm not sure why we think that duplicating the 
>> metrics
>> collectors in new code would be less buggy than working with Ceilometer. Nova
>> depends on external projects all the time.
>>
>> If we have a concern about robustness here, we should be working as an 
>> overall
>> project to address that.
>>
>>  -Sean
>>
> Just to be clear, it's about a lot more than just robustness in the code - it's 
> the whole architectural pattern of putting Ceilometer at the centre of Nova 
> scheduling that concerns me.
> 
> As I understand it Ceilometer can collect metrics from more than one copy of 
> Nova - which is good; I want to run multiple independent copies in different 
> regions and I want to have all of my monitoring data going back to one place. 
>   However that doesn't mean that I now also want all of those independent 
> copies of Nova depending on that central monitoring infrastructure for 
> something as basic as scheduling.  (I don't want to stop anyone that does 
> either - but I don't see why I should be forced down that route).
> 
> The original change that sparked this debate came not from anything to do 
> with utilisation based scheduling, but the pretty basic and simple desire to 
> add new types of consumable resource counters into the scheduler logic in a 
> more general way that having to make a DB schema change.  This was generally 
> agreed to be a good thing, and it pains me to see that valuable work now 
> blocked on what seems to be turning into an strategic discussion around the 
> role of Ceilometer (Is it a monitoring tool or a fundamental metric bus, etc).
> 
> At the point where Ceilometer can be shown to replace the current scheduler 
> resource mgmt code in Nova, then we should be talking about switching to it - 
> but in the meantime why can't we continue to have incremental improvements in 
> the current Nova code ?

+1

> 
> Cheers
> Phil
> 
>   
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] New DB column or new DB table?

2013-07-19 Thread Sandy Walsh


On 07/19/2013 09:43 AM, Sandy Walsh wrote:
> 
> 
> On 07/18/2013 11:12 PM, Lu, Lianhao wrote:
>> Sean Dague wrote on 2013-07-18:
>>> On 07/17/2013 10:54 PM, Lu, Lianhao wrote:
 Hi fellows,

 Currently we're implementing the BP 
 https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling. 
 The main idea is to have
>>> an extensible plugin framework on nova-compute where every plugin can get 
>>> different metrics(e.g. CPU utilization, memory cache
>>> utilization, network bandwidth, etc.) to store into the DB, and the 
>>> nova-scheduler will use that data from DB for scheduling decision.

 Currently we adds a new table to store all the metric data and have 
 nova-scheduler join loads the new table with the compute_nodes
>>> table to get all the data(https://review.openstack.org/35759). Someone is 
>>> concerning about the performance penalty of the join load
>>> operation when there are many metrics data stored in the DB for every 
>>> single compute node. Don suggested adding a new column in the
>>> current compute_nodes table in DB, and put all metric data into a 
>>> dictionary key/value format and store the json encoded string of the
>>> dictionary into that new column in DB.

 I'm just wondering which way has less performance impact, join load
 with a new table with quite a lot of rows, or json encode/decode a
 dictionary with a lot of key/value pairs?

 Thanks,
 -Lianhao
>>>
>>> I'm really confused. Why are we talking about collecting host metrics in
>>> nova when we've got a whole project to do that in ceilometer? I think
>>> utilization based scheduling would be a great thing, but it really out
>>> to be interfacing with ceilometer to get that data. Storing it again in
>>> nova (or even worse collecting it a second time in nova) seems like the
>>> wrong direction.
>>>
>>> I think there was an equiv patch series at the end of Grizzly that was
>>> pushed out for the same reasons.
>>>
>>> If there is a reason ceilometer can't be used in this case, we should
>>> have that discussion here on the list. Because my initial reading of
>>> this blueprint and the code patches is that it partially duplicates
>>> ceilometer function, which we definitely don't want to do. Would be
>>> happy to be proved wrong on that.
>>>
>>> -Sean
>>>
>> Using ceilometer as the source of those metrics was discussed in the
>> nova-scheduler subgroup meeting. (see #topic extending data in host
>> state in the following link).
>> http://eavesdrop.openstack.org/meetings/scheduler/2013/scheduler.2013-04-30-15.04.log.html
>>
>> In that meeting, all agreed that ceilometer would be a great source of
>> metrics for scheduler, but many of them don't want to make the
>> ceilometer as a mandatory dependency for nova scheduler. 
> 
> This was also discussed at the Havana summit and rejected since we
> didn't want to introduce the external dependency of Ceilometer into Nova.
> 
> That said, we already have hooks at the virt layer for collecting host
> metrics and we're talking about removing the pollsters from nova compute
> nodes if the data can be collected from these existing hooks.
> 
> Whatever solution the scheduler group decides to use should utilize the
> existing (and maintained/growing) mechanisms we have in place there.
> That is, it should likely be a special notification driver that can get
> the data back to the scheduler in a timely fashion. It wouldn't have to
> use the rpc mechanism if it didn't want to, but it should be a plug-in
> at the notification layer.
> 
> Please don't add yet another way of pulling metric data out of the hosts.
> 
> -S

I should also add that if you go the notification route, it doesn't
close the door on Ceilometer integration. All you need is a means to get
the data from the notification driver to the scheduler; that part could
easily be replaced with a Ceilometer driver if an operator wanted to go
that route.

The benefits of using Ceilometer would be having access to the
downstream events/meters and generated statistics that could be produced
there. We certainly don't want to add an advanced statistical package or
event-stream manager to Nova, when Ceilometer already has aspirations of
that.

The out-of-the-box nova experience should be better scheduling when
simple host metrics are used internally but really great scheduling when
integrated with Ceilometer.
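
To make that concrete, a notification driver of this vintage is roughly a
module exposing a notify(context, message) hook; the sketch below is
illustrative only (the event-type prefix and the scheduler hand-off are
placeholders, not existing Nova code):

METRIC_EVENT_PREFIX = 'compute.metrics'


def _forward_to_scheduler(context, payload):
    # Placeholder: in practice this could be an RPC cast/fanout to the
    # scheduler, or a write to whatever store the host state reads from.
    pass


def notify(context, message):
    """Entry point called by the notifier API for every notification."""
    event_type = message.get('event_type', '')
    if event_type.startswith(METRIC_EVENT_PREFIX):
        _forward_to_scheduler(context, message.get('payload', {}))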

> 
> 
> 
> 
>> Besides, currently ceilometer doesn't have "host metrics", like the 
>> cpu/network/cache utilization data of the compute node host, which
>> will affect the scheduling decision. What ceilometer has currently
>> is the "VM metrics", like cpu/network utilization of each VM instance.
>>
>> After the nova compute node collects the "host metrics", those metrics
>> could also be fed into ceilometer framework(e.g. through a ceilometer
>> listener) for further processing, like alarming, etc.
>>
>> -Lianhao
>>
>> ___

Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling (was: New DB column or new DB table?)

2013-07-19 Thread Day, Phil
> -Original Message-
> From: Sean Dague [mailto:s...@dague.net]
> Sent: 19 July 2013 12:04
> To: OpenStack Development Mailing List
> Subject: Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics
> collector for scheduling (was: New DB column or new DB table?)
> 
> On 07/19/2013 06:18 AM, Day, Phil wrote:
> > Ceilometer is a great project for taking metrics available in Nova and other
> systems and making them available for use by Operations, Billing, Monitoring,
> etc - and clearly we should try and avoid having multiple collectors of the 
> same
> data.
> >
> > But making the Nova scheduler dependent on Ceilometer seems to be the
> wrong way round to me - scheduling is such a fundamental operation that I
> want Nova to be self sufficient in this regard.   In particular I don't want 
> the
> availability of my core compute platform to be constrained by the availability
> of my (still evolving) monitoring system.
> >
> > If Ceilometer can be fed from the data used by the Nova scheduler then 
> > that's
> a good plus - but not the other way round.
> 
> I assume it would gracefully degrade to the existing static allocators if
> something went wrong. If not, well that would be very bad.
> 
> Ceilometer is an integrated project in Havana. Utilization based scheduling
> would be a new feature. I'm not sure why we think that duplicating the metrics
> collectors in new code would be less buggy than working with Ceilometer. Nova
> depends on external projects all the time.
> 
> If we have a concern about robustness here, we should be working as an overall
> project to address that.
> 
>   -Sean
> 
Just to be clear, it's about a lot more than just robustness in the code - it's 
the whole architectural pattern of putting Ceilometer at the centre of Nova 
scheduling that concerns me.

As I understand it, Ceilometer can collect metrics from more than one copy of 
Nova - which is good; I want to run multiple independent copies in different 
regions and I want to have all of my monitoring data going back to one place.
However that doesn't mean that I now also want all of those independent copies 
of Nova depending on that central monitoring infrastructure for something as 
basic as scheduling.  (I don't want to stop anyone that does either - but I 
don't see why I should be forced down that route).

The original change that sparked this debate came not from anything to do with 
utilisation based scheduling, but from the pretty basic and simple desire to add 
new types of consumable resource counters into the scheduler logic in a more 
general way than having to make a DB schema change.  This was generally agreed 
to be a good thing, and it pains me to see that valuable work now blocked on 
what seems to be turning into a strategic discussion around the role of 
Ceilometer (is it a monitoring tool or a fundamental metric bus, etc).

At the point where Ceilometer can be shown to replace the current scheduler 
resource mgmt code in Nova, then we should be talking about switching to it - 
but in the meantime why can't we continue to have incremental improvements in 
the current Nova code?

Cheers
Phil

  



Re: [openstack-dev] [Nova] New DB column or new DB table?

2013-07-19 Thread Sandy Walsh


On 07/18/2013 11:12 PM, Lu, Lianhao wrote:
> Sean Dague wrote on 2013-07-18:
>> On 07/17/2013 10:54 PM, Lu, Lianhao wrote:
>>> Hi fellows,
>>>
>>> Currently we're implementing the BP 
>>> https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling. 
>>> The main idea is to have
>> an extensible plugin framework on nova-compute where every plugin can get 
>> different metrics(e.g. CPU utilization, memory cache
>> utilization, network bandwidth, etc.) to store into the DB, and the 
>> nova-scheduler will use that data from DB for scheduling decision.
>>>
>>> Currently we adds a new table to store all the metric data and have 
>>> nova-scheduler join loads the new table with the compute_nodes
>> table to get all the data(https://review.openstack.org/35759). Someone is 
>> concerning about the performance penalty of the join load
>> operation when there are many metrics data stored in the DB for every single 
>> compute node. Don suggested adding a new column in the
>> current compute_nodes table in DB, and put all metric data into a dictionary 
>> key/value format and store the json encoded string of the
>> dictionary into that new column in DB.
>>>
>>> I'm just wondering which way has less performance impact, join load
>>> with a new table with quite a lot of rows, or json encode/decode a
>>> dictionary with a lot of key/value pairs?
>>>
>>> Thanks,
>>> -Lianhao
>>
>> I'm really confused. Why are we talking about collecting host metrics in
>> nova when we've got a whole project to do that in ceilometer? I think
>> utilization based scheduling would be a great thing, but it really out
>> to be interfacing with ceilometer to get that data. Storing it again in
>> nova (or even worse collecting it a second time in nova) seems like the
>> wrong direction.
>>
>> I think there was an equiv patch series at the end of Grizzly that was
>> pushed out for the same reasons.
>>
>> If there is a reason ceilometer can't be used in this case, we should
>> have that discussion here on the list. Because my initial reading of
>> this blueprint and the code patches is that it partially duplicates
>> ceilometer function, which we definitely don't want to do. Would be
>> happy to be proved wrong on that.
>>
>>  -Sean
>>
> Using ceilometer as the source of those metrics was discussed in the
> nova-scheduler subgroup meeting. (see #topic extending data in host
> state in the following link).
> http://eavesdrop.openstack.org/meetings/scheduler/2013/scheduler.2013-04-30-15.04.log.html
> 
> In that meeting, all agreed that ceilometer would be a great source of
> metrics for scheduler, but many of them don't want to make the
> ceilometer as a mandatory dependency for nova scheduler. 

This was also discussed at the Havana summit and rejected since we
didn't want to introduce the external dependency of Ceilometer into Nova.

That said, we already have hooks at the virt layer for collecting host
metrics and we're talking about removing the pollsters from nova compute
nodes if the data can be collected from these existing hooks.

Whatever solution the scheduler group decides to use should utilize the
existing (and maintained/growing) mechanisms we have in place there.
That is, it should likely be a special notification driver that can get
the data back to the scheduler in a timely fashion. It wouldn't have to
use the rpc mechanism if it didn't want to, but it should be a plug-in
at the notification layer.

Please don't add yet another way of pulling metric data out of the hosts.

-S




> Besides, currently ceilometer doesn't have "host metrics", like the 
> cpu/network/cache utilization data of the compute node host, which
> will affect the scheduling decision. What ceilometer has currently
> is the "VM metrics", like cpu/network utilization of each VM instance.
> 
> After the nova compute node collects the "host metrics", those metrics
> could also be fed into ceilometer framework(e.g. through a ceilometer
> listener) for further processing, like alarming, etc.
> 
> -Lianhao
> 


Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling (was: New DB column or new DB table?)

2013-07-19 Thread Andrew Laski

On 07/19/13 at 07:04am, Sean Dague wrote:

On 07/19/2013 06:18 AM, Day, Phil wrote:

Ceilometer is a great project for taking metrics available in Nova and other 
systems and making them available for use by Operations, Billing, Monitoring, 
etc - and clearly we should try and avoid having multiple collectors of the 
same data.

But making the Nova scheduler dependent on Ceilometer seems to be the wrong way 
round to me - scheduling is such a fundamental operation that I want Nova to be 
self sufficient in this regard.   In particular I don't want the availability 
of my core compute platform to be constrained by the availability of my (still 
evolving) monitoring system.

If Ceilometer can be fed from the data used by the Nova scheduler then that's a 
good plus - but not the other way round.


I assume it would gracefully degrade to the existing static 
allocators if something went wrong. If not, well that would be very 
bad.


Ceilometer is an integrated project in Havana. Utilization based 
scheduling would be a new feature. I'm not sure why we think that 
duplicating the metrics collectors in new code would be less buggy 
than working with Ceilometer. Nova depends on external projects all 
the time.


I tend to agree here.  It's worth at least doing the due diligence to 
see if this is something that could fit into Ceilometer.  Or maybe 
there's code that can be pulled out of Ceilometer and put into oslo to 
handle this.  It doesn't make sense to duplicate effort if we can take 
advantage of already implemented code.




If we have a concern about robustness here, we should be working as 
an overall project to address that.


-Sean

--
Sean Dague
http://dague.net



Re: [openstack-dev] Opinions needed: Changing method signature in RPC callback ...

2013-07-19 Thread Sandy Walsh


On 07/18/2013 05:56 PM, Eric Windisch wrote:
> 
> > These callback methods are part of the Kombu driver (and maybe part of
> > Qpid), but are NOT part of the RPC abstraction. These are private
> > methods. They can be broken for external consumers of these methods,
> > because there shouldn't be any. It will be a good lesson to anyone
> that
> > tries to abuse private methods.
> 
> I was wondering about that, but I assumed some parts of amqp.py were
> used by other transports as well (and not just impl_kombu.py)
> 
> There are several callbacks in amqp.py that would be affected.
> 
>  
> The code in amqp.py is used by the Kombu and Qpid drivers and might
> implement the public methods expected by the abstraction, but does not
> define it. The RPC abstraction is defined in __init__.py, and does not
> define callbacks. Other drivers, granted only being the ZeroMQ driver at
> present, are not expected to define a callback method and as a private
> method -- would have no template to follow nor an expectation to have
> this method.
> 
> I'm not saying your proposed changes are bad or invalid, but there is no
> need to make concessions to the possibility that code outside of oslo
> would be using callback(). This opens up the option, besides creating a
> new method, to simply updating all the existing method calls that exist
> in amqp.py, impl_kombu.py, and impl_qpid.py.

Gotcha ... thanks Eric. Yeah, the outer API is very generic.

I did a little more research and, unfortunately, it seems the inner amqp
implementations are being used by others. So I'll have to be careful
with the callback signature. Ceilometer, for example, seems to be
leaving zeromq support as an exercise for the reader.
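
For what it's worth, one backward-compatible way to widen a callback signature
is to make the new argument a keyword with a default and have the dispatcher
check what the callback accepts; this is a minimal illustration, not the actual
amqp.py code:

import inspect


def dispatch(callback, ctxt, message, envelope=None):
    """Call an old-style (ctxt, message) or new-style callback."""
    argspec = inspect.getargspec(callback)
    if 'envelope' in argspec.args or argspec.keywords:
        # New-style callbacks opt in to the extra argument.
        callback(ctxt, message, envelope=envelope)
    else:
        # Old-style callbacks keep working untouched.
        callback(ctxt, message)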

Perhaps oslo-messaging will make this abstraction easier to enforce.

Cheers!
-S


> 
> -- 
> Regards,
> Eric Windisch
> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 



Re: [openstack-dev] headsup - transient test failures on py26 ' cannot import name OrderedDict'

2013-07-19 Thread Roman Podolyaka
Hi guys,

Both 0.0.16 and 0.0.17 seem to have a broken test counter. It shows twice as
many tests being run as I actually have.

Thanks,
Roman


On Thu, Jul 18, 2013 at 2:29 AM, David Ripton  wrote:

> On 07/17/2013 04:54 PM, Robert Collins wrote:
>
>> On 18 July 2013 08:48, Chris Jones  wrote:
>>
>>> Hi
>>>
>>> On 17 July 2013 21:27, Robert Collins  wrote:
>>>

 Surely thats fixable by having a /opt/ install of Python2.7 built for
 RHEL
 ? That would make life s much easier for all concerned, and is super

>>>
>>>
>>> Possibly not easier for those tasked with keeping OS security patches up
>>> to
>>> date, which is part of what a RHEL customer is paying Red Hat a bunch of
>>> money to do.
>>>
>>
>> I totally agree, which is why it would make sense for Red Hat to
>> supply the build of Python 2.7 :).
>>
>
> FYI,
>
> http://developerblog.redhat.com/2013/06/05/red-hat-software-collections-1-0-beta-now-available/
>
> (TL;DR : Red Hat Software Collections is a way to get newer versions of
> Python and some other software on RHEL 6.  It's still in beta though.)
>
> --
> David Ripton   Red Hat   drip...@redhat.com
>
>


Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling (was: New DB column or new DB table?)

2013-07-19 Thread Andrew Laski

On 07/19/13 at 12:08pm, Murray, Paul (HP Cloud Services) wrote:

Hi Sean,

Do you think the existing static allocators should be migrated to going through 
ceilometer - or do you see that as different? Ignoring backward compatibility.


It makes sense to keep some things in Nova, in order to handle the 
graceful degradation needed if Ceilometer couldn't be reached.  I see 
the line as something like: capabilities (memory free, vCPUs available, 
etc.) should be handled by Nova, and utilization metrics handled by 
Ceilometer.




The reason I ask is I want to extend the static allocators to include a couple 
more. These plugins are the way I would have done it. Which way do you think 
that should be done?

Paul.

-Original Message-
From: Sean Dague [mailto:s...@dague.net]
Sent: 19 July 2013 12:04
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics 
collector for scheduling (was: New DB column or new DB table?)

On 07/19/2013 06:18 AM, Day, Phil wrote:

Ceilometer is a great project for taking metrics available in Nova and other 
systems and making them available for use by Operations, Billing, Monitoring, 
etc - and clearly we should try and avoid having multiple collectors of the 
same data.

But making the Nova scheduler dependent on Ceilometer seems to be the wrong way 
round to me - scheduling is such a fundamental operation that I want Nova to be 
self sufficient in this regard.   In particular I don't want the availability 
of my core compute platform to be constrained by the availability of my (still 
evolving) monitoring system.

If Ceilometer can be fed from the data used by the Nova scheduler then that's a 
good plus - but not the other way round.


I assume it would gracefully degrade to the existing static allocators if 
something went wrong. If not, well that would be very bad.

Ceilometer is an integrated project in Havana. Utilization based scheduling 
would be a new feature. I'm not sure why we think that duplicating the metrics 
collectors in new code would be less buggy than working with Ceilometer. Nova 
depends on external projects all the time.

If we have a concern about robustness here, we should be working as an overall 
project to address that.

-Sean

--
Sean Dague
http://dague.net



Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling (was: New DB column or new DB table?)

2013-07-19 Thread Murray, Paul (HP Cloud Services)
Hi Sean,

Do you think the existing static allocators should be migrated to go through 
Ceilometer - or do you see that as different? Ignoring backward compatibility.

The reason I ask is that I want to extend the static allocators to include a 
couple more. These plugins are the way I would have done it. Which way do you 
think it should be done?

Paul.
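
For illustration, a plugin along these lines might look roughly like the sketch
below; the base class name and hook signature are hypothetical, not the
interface actually proposed in the blueprint:

class HostMetricPlugin(object):
    """Hypothetical base class for host-side metric/allocator plugins."""

    def get_metrics(self, host, resources):
        """Return a dict of metric name -> value for this host."""
        raise NotImplementedError()


class NetworkBandwidthPlugin(HostMetricPlugin):
    def get_metrics(self, host, resources):
        # A real plugin would read counters from the hypervisor or the
        # host OS here rather than returning a constant.
        return {'network.outgoing.bytes.rate': 0}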

-Original Message-
From: Sean Dague [mailto:s...@dague.net] 
Sent: 19 July 2013 12:04
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics 
collector for scheduling (was: New DB column or new DB table?)

On 07/19/2013 06:18 AM, Day, Phil wrote:
> Ceilometer is a great project for taking metrics available in Nova and other 
> systems and making them available for use by Operations, Billing, Monitoring, 
> etc - and clearly we should try and avoid having multiple collectors of the 
> same data.
>
> But making the Nova scheduler dependent on Ceilometer seems to be the wrong 
> way round to me - scheduling is such a fundamental operation that I want Nova 
> to be self sufficient in this regard.   In particular I don't want the 
> availability of my core compute platform to be constrained by the 
> availability of my (still evolving) monitoring system.
>
> If Ceilometer can be fed from the data used by the Nova scheduler then that's 
> a good plus - but not the other way round.

I assume it would gracefully degrade to the existing static allocators if 
something went wrong. If not, well that would be very bad.

Ceilometer is an integrated project in Havana. Utilization based scheduling 
would be a new feature. I'm not sure why we think that duplicating the metrics 
collectors in new code would be less buggy than working with Ceilometer. Nova 
depends on external projects all the time.

If we have a concern about robustness here, we should be working as an overall 
project to address that.

-Sean

--
Sean Dague
http://dague.net



Re: [openstack-dev] [Keystone] Reviewers wanted: Delegated Auth a la Oauth

2013-07-19 Thread David Chadwick



On 19/07/2013 07:10, Steve Martinelli wrote:

Hi David,

This email is long overdue.

1. I don't recall ever stating that we were going to use OAuth 1.0a over
2.0, or vice versa. I've checked
https://etherpad.openstack.org/havana-saml-oauth-scim and
https://etherpad.openstack.org/havana-external-auth and couldn't find
anything that said definitively that we were going to use 2.0.
OAuth provider implementations (in python) of either version of the
protocol are few and far between.


As I recall, there was a presentation from David Waite, Ping Identity, 
about OAuth2, and after this it was agreed that OAuth2 would be used as 
the generic delegation protocol in OpenStack, and that OAuth1 would only 
be used as an alternative to the existing trusts delegation mechanism. 
If this is not explicitly recorded in the meeting notes, then we will 
need to ask others who were present at the design summit what their 
recollections are.





2. I think I'm going to regret asking this question... (as I don't want
to get into a long discussion about OAuth 1.0a vs 2.0), but what are the
security weaknesses you mention?


These are documented in RFC 5849 (http://tools.ietf.org/rfc/rfc5849.txt) 
which is presumably the specification of OAuth1 that you are implementing.


There is also a session fixation attack documented here
http://oauth.net/advisories/2009-1/



3. I think you are disagreeing with the consumer requesting roles? And
are suggesting the consumer should be requesting an authorizing user
instead? I'm not completely against it, but I'd be interested in what
others have to say.


If the use case is to replace trusts, and to allow delegation from a 
user to one or more OpenStack services, then yes, it should be the user 
who wants to launch the OpenStack service, and not any user who 
happens to have the same role as this user. Otherwise a consumer could 
get one user to authorise the job of another user, if they have the same 
roles.


If the use case is general delegation, then relating it to Adam's post:

http://adam.younglogic.com/2013/07/a-vision-for-keystone/

in order to future-proof, the consumer should request authorising 
attributes rather than roles, since, as Adam points out, the policy-based 
authz system is not restricted to using roles.




4. Regarding the evil consumer; we do not just use the consumer key, the
magic oauth variables I mentioned also contain oauth_signature, which
(if you are using a standard oauth client) is the consumers secret (and
other values) and also hashed.


This solves one of my problems, because now you are effectively saying 
that the consumer has to be pre-registered with Keystone and have a 
username/password allocated to it. I did not see this mentioned in the 
original blueprint, and thought (wrongly) that anything could be a consumer 
without any pre-registration requirement. So the pre-registration 
requirement removes most of the risk that the consumer is evil, 
since one would hope that the Keystone administrator would check it out 
before registering it.


regards

david
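
For concreteness, a much-simplified sketch of the server-side RFC 5849
HMAC-SHA1 check being discussed, assuming the consumer secret is looked up from
the pre-registered entry; full parameter collection, header percent-decoding
and nonce/timestamp replay protection are left out:

import base64
import hashlib
import hmac
import urllib


def _enc(value):
    # RFC 5849 percent-encoding: only unreserved characters are safe.
    return urllib.quote(str(value), safe='~')


def signature_base_string(method, url, params):
    # 'params' holds the request parameters *excluding* oauth_signature.
    normalized = '&'.join('%s=%s' % (_enc(k), _enc(v))
                          for k, v in sorted(params.items()))
    return '&'.join([method.upper(), _enc(url), _enc(normalized)])


def verify_hmac_sha1(method, url, params, consumer_secret,
                     token_secret, provided_signature):
    key = '%s&%s' % (_enc(consumer_secret), _enc(token_secret or ''))
    expected = base64.b64encode(
        hmac.new(key, signature_base_string(method, url, params),
                 hashlib.sha1).digest())
    return expected == provided_signature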


 On the server side, we grab the consumer

entity from our database and recreate the oauth_signature value, then
verify the request.

  * If the evil consumer did not provide a provide a secret, (or a wrong
secret) it would fail on the verify step on the server side
(signatures wouldn't match).
  * If the evil consumer used his own library, he would still need to
sign the request correctly (done only with the correct consumer
key/secret).



Thanks,

_
Steve Martinelli | A4-317 @ IBM Toronto Software Lab
Software Developer - OpenStack
Phone: (905) 413-2851
E-Mail: steve...@ca.ibm.com


From: David Chadwick 
To: Steve Martinelli/Toronto/IBM@IBMCA,
Cc: openstack-dev@lists.openstack.org
Date: 06/19/2013 04:38 PM
Subject: Re: [openstack-dev] [Keystone] Reviewers wanted: Delegated Auth
a la Oauth




   Hi Steve

On 19/06/2013 20:56, Steve Martinelli wrote:
 > Hey David,
 >
 > 1. and 5. The delegate is not always known to keystone. The delegate (I
 > like to say consumer) would use an oauth client (web-based one here
> http://term.ie/oauth/example/client.php); in an oauth flow, the
 > delegate requires a key/secret pair, they don't have to be already known
 > to keystone. (Apologies if my keystoneclient example led you to believe
 > that)

I know that in Oauth the consumer is not always known to the resource
provider (Keystone) but Oauth has security weaknesses in it which OAuth2
has fixed. So I would hope we are not going to use Oauth as the general
delegation model. I thought that the last design summit agreed that
Oauthv2 should be the right wa

Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling (was: New DB column or new DB table?)

2013-07-19 Thread Sean Dague

On 07/19/2013 06:18 AM, Day, Phil wrote:

Ceilometer is a great project for taking metrics available in Nova and other 
systems and making them available for use by Operations, Billing, Monitoring, 
etc - and clearly we should try and avoid having multiple collectors of the 
same data.

But making the Nova scheduler dependent on Ceilometer seems to be the wrong way 
round to me - scheduling is such a fundamental operation that I want Nova to be 
self sufficient in this regard.   In particular I don't want the availability 
of my core compute platform to be constrained by the availability of my (still 
evolving) monitoring system.

If Ceilometer can be fed from the data used by the Nova scheduler then that's a 
good plus - but not the other way round.


I assume it would gracefully degrade to the existing static allocators 
if something went wrong. If not, well that would be very bad.


Ceilometer is an integrated project in Havana. Utilization based 
scheduling would be a new feature. I'm not sure why we think that 
duplicating the metrics collectors in new code would be less buggy than 
working with Ceilometer. Nova depends on external projects all the time.


If we have a concern about robustness here, we should be working as an 
overall project to address that.


-Sean

--
Sean Dague
http://dague.net



Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Day, Phil
Hi Josh,

My idea's really pretty simple - make "DB proxy" and "Task workflow" separate 
services, and allow people to co-locate them if they want to.

Cheers.
Phil

> -Original Message-
> From: Joshua Harlow [mailto:harlo...@yahoo-inc.com]
> Sent: 17 July 2013 14:57
> To: OpenStack Development Mailing List
> Cc: OpenStack Development Mailing List
> Subject: Re: [openstack-dev] Moving task flow to conductor - concern about
> scale
> 
> Hi Phil,
> 
> I understand and appreciate your concern and I think everyone is trying to 
> keep
> that in mind. It still appears to me to be to early in this refactoring and 
> task
> restructuring effort to tell where it may "end up". I think that's also good 
> news
> since we can get these kinds of ideas (componentized conductors if u will) to
> handle your (and mine) scaling concerns. It would be pretty neat if said
> conductors could be scaled at different rates depending on there component,
> although as u said we need to get much much better with handling said
> patterns (as u said just 2 schedulers is a pita right now). I believe we can 
> do it,
> given the right kind of design and scaling "principles" we build in from the 
> start
> (right now).
> 
> Would like to hear more of your ideas so they get incorporated earlier rather
> than later.
> 
> Sent from my really tiny device..
> 
> On Jul 16, 2013, at 9:55 AM, "Dan Smith"  wrote:
> 
> >> In the original context of using Conductor as a database proxy then
> >> the number of conductor instances is directly related to the number
> >> of compute hosts I need them to serve.
> >
> > Just a point of note, as far as I know, the plan has always been to
> > establish conductor as a thing that sits between the api and compute
> > nodes. However, we started with the immediate need, which was the
> > offloading of database traffic.
> >
> >> What I not sure is that I would also want to have the same number of
> >> conductor instances for task control flow - historically even running
> >> 2 schedulers has been a problem, so the thought of having 10's of
> >> them makes me very concerned at the moment.   However I can't see any
> >> way to specialise a conductor to only handle one type of request.
> >
> > Yeah, I don't think the way it's currently being done allows for
> > specialization.
> >
> > Since you were reviewing actual task code, can you offer any specifics
> > about the thing(s) that concern you? I think that scaling conductor
> > (and its tasks) horizontally is an important point we need to achieve,
> > so if you see something that needs tweaking, please point it out.
> >
> > Based on what is there now and proposed soon, I think it's mostly
> > fairly safe, straightforward, and really no different than what two
> > computes do when working together for something like resize or migrate.
> >
> >> So I guess my question is, given that it may have to address two
> >> independent scale drivers, is putting task work flow and DB proxy
> >> functionality into the same service really the right thing to do - or
> >> should there be some separation between them.
> >
> > I think that we're going to need more than one "task" node, and so it
> > seems appropriate to locate one scales-with-computes function with
> > another.
> >
> > Thanks!
> >
> > --Dan
> >
> > ___
> > OpenStack-dev mailing list
> > OpenStack-dev@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduler (was: New DB column or new DB table?)

2013-07-19 Thread Sean Dague

On 07/18/2013 10:12 PM, Lu, Lianhao wrote:

Using ceilometer as the source of those metrics was discussed in the
nova-scheduler subgroup meeting. (see #topic extending data in host
state in the following link).
http://eavesdrop.openstack.org/meetings/scheduler/2013/scheduler.2013-04-30-15.04.log.html

In that meeting, all agreed that ceilometer would be a great source of
metrics for scheduler, but many of them don't want to make the
ceilometer as a mandatory dependency for nova scheduler.

Besides, currently ceilometer doesn't have "host metrics", like the
cpu/network/cache utilization data of the compute node host, which
will affect the scheduling decision. What ceilometer has currently
is the "VM metrics", like cpu/network utilization of each VM instance.


How hard would that be to add, vs. duplicating an efficient collector 
framework in Nova?



After the nova compute node collects the "host metrics", those metrics
could also be fed into ceilometer framework(e.g. through a ceilometer
listener) for further processing, like alarming, etc.

-Lianhao


I think "not mandatory" for the nova scheduler means different things to 
different folks. My assumption is that it means that without Ceilometer you 
just don't have utilization metrics, and you fall back to static info.


This still seems like duplication of function in Nova that could be 
better handled in a different core project. It really feels like, as 
OpenStack, we've decided that Ceilometer is our metrics message bus, and 
we should really push metrics there whenever we can.


Ceilometer is an integrated project for Havana, so the argument that 
someone doesn't want to run it to get an enhancement to Nova doesn't 
hold a lot of weight in my mind.


-Sean

--
Sean Dague
http://dague.net



Re: [openstack-dev] Moving task flow to conductor - concern about scale

2013-07-19 Thread Day, Phil
> -Original Message-
> From: Dan Smith [mailto:d...@danplanet.com]
> Sent: 16 July 2013 14:51
> To: OpenStack Development Mailing List
> Cc: Day, Phil
> Subject: Re: [openstack-dev] Moving task flow to conductor - concern about
> scale
> 
> > In the original context of using Conductor as a database proxy then
> > the number of conductor instances is directly related to the number of
> > compute hosts I need them to serve.
> 
> Just a point of note, as far as I know, the plan has always been to establish
> conductor as a thing that sits between the api and compute nodes. However,
> we started with the immediate need, which was the offloading of database
> traffic.
>

Like I said, I see the need for both a layer between the API and compute and 
one between compute and the DB - I just don't see them as having to be part of 
the same thing.

 
> > What I not sure is that I would also want to have the same number of
> > conductor instances for task control flow - historically even running
> > 2 schedulers has been a problem, so the thought of having 10's of
> > them makes me very concerned at the moment.   However I can't see any
> > way to specialise a conductor to only handle one type of request.
> 
> Yeah, I don't think the way it's currently being done allows for 
> specialization.
> 
> Since you were reviewing actual task code, can you offer any specifics about
> the thing(s) that concern you? I think that scaling conductor (and its tasks)
> horizontally is an important point we need to achieve, so if you see something
> that needs tweaking, please point it out.
> 
> Based on what is there now and proposed soon, I think it's mostly fairly safe,
> straightforward, and really no different than what two computes do when
> working together for something like resize or migrate.
>

There's nothing I've seen so far that causes me alarm, but then again we're in 
the very early stages and haven't moved anything really complex.
However I think there's an inherent big difference between scaling something 
which is stateless like a DB proxy and scaling a stateful entity like a task 
workflow component.  I'd also suggest that so far there is no real experience 
with the latter within the current code base; compute nodes (which are the 
main scaled-out component so far) work on well defined subsets of the data.


> > So I guess my question is, given that it may have to address two
> > independent scale drivers, is putting task work flow and DB proxy
> > functionality into the same service really the right thing to do - or
> > should there be some separation between them.
> 
> I think that we're going to need more than one "task" node, and so it seems
> appropriate to locate one scales-with-computes function with another.
> 

I just don't buy into this line of thinking - I need more than one API node for 
HA as well - but that doesn't mean that therefore I want to put anything else 
that needs more than one node in there.

I don't even think these scale with compute in the same way; the DB proxy 
scales with the number of compute hosts because each new host introduces an 
amount of DB load through its periodic tasks.  Task workflow scales with the 
number of requests coming into the system to create / modify servers - and 
that's not directly related to the number of hosts.

So rather than asking "what doesn't work / might not work in the future", I 
think the question should be "aside from them both being things that could be 
described as a conductor, what's the architectural reason for wanting to have 
these two separate groups of functionality in the same service?"

If it's really just because the concept of "conductor" got used for a DB proxy 
layer before the task workflow, then we should either think of a new name for 
the latter or rename the former.

If they were separate services and it turns out that I can/want/need to run the 
same number of both, then I can pretty easily do that - but the current 
approach is removing what seems to me a very important degree of freedom around 
deployment on a large scale system.

Cheers,
Phil




Re: [openstack-dev] [Openstack] [cinder] Proposal for Ollie Leahy to join cinder-core

2013-07-19 Thread Day, Phil
+1

Just to add to the context - keep in mind that within HP (and I assume other 
large organisations such as IBM, but I can't speak directly for them) there are 
now many separate divisions working with OpenStack (Public Cloud, Private Cloud 
Products, Labs, Consulting, etc), each bringing different perspectives to the 
table.  In another context many of these would be considered separate 
companies in their own right.

Phil   

> -Original Message-
> From: Sean Dague [mailto:s...@dague.net]
> Sent: 17 July 2013 19:42
> To: OpenStack Development Mailing List
> Cc: Openstack (openst...@lists.launchpad.net)
> (openst...@lists.launchpad.net)
> Subject: Re: [openstack-dev] [Openstack] [cinder] Proposal for Ollie Leahy to
> join cinder-core
> 
> On 07/17/2013 02:35 PM, John Griffith wrote:
> 
> > Just to point out a few things here, first off there is no guideline
> > that states a company affiliation should have anything to do with the
> > decision on voting somebody as core.  I have ABSOLUTELY NO concern
> > about representation of company affiliation what so ever.
> >
> > Quite frankly I wouldn't mind if there were 20 core members from HP,
> > if they're all actively engaged and participating then that's great.
> > I don't think there has been ANY incidence of folks exerting
> > inappropriate influence based on their affiliated interest, and if
> > there ever was I think it would be easy to identify and address.
> >
> > As far as "don't need more" I don't agree with that either, if there
> > are folks contributing and doing the work then there's no reason not
> > to add them.  Cinder IMO does NOT have an excess of reviewers by a
> > very very long stretch.
> >
> > The criteria here should be review consistency and quality as well as
> > knowledge of the project, nothing more nothing less.  If there's an
> > objection to the individuals participation or contribution that's
> > fine, but company affiliation should have no bearing.
> 
> +1
> 
> The people that do great work on reviews, should really be your review team,
> regardless of affiliation.
> 
>   -Sean
> 
> --
> Sean Dague
> http://dague.net
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Re: [openstack-dev] [Quantum][LBaaS] Feedback needed: Healthmonitor workflow.

2013-07-19 Thread Oleg Bondarev
Hi,
First I want to mention that currently a health monitor is not a pure DB
object: upon creation the request is also sent to the device/driver.
Another thing is that there is no delete_health_monitor in the driver API
(delete_pool_health_monitor only deletes the association), which is weird
because potentially health monitors will remain on devices forever.

From what I saw in the thread referenced by Salvatore, the main argument
against the health-monitor-templates approach was:
> In NetScaler, F5 and similar products, health monitors are created
> beforehand just like in the API, and then they are "bound" to pools (our
> association API), so the mapping will be more natural

IMO this is a strong argument until we start to maintain several
devices/drivers/service_providers per LB service.
With several drivers we'll need to send create/update/delete_health_monitor
requests to each driver, and that's a big overhead. The same goes for
pool-monitor associations, as described in Eugene's initial mail.
I think it's not a big deal for drivers such as F5 to create/delete a monitor
object upon creating/deleting a 'private' health monitor.
So I vote for monitor templates and a 1:1 mapping of 'private' health
monitors to pools while it's not too late, and I'm ready to implement this
in Havana.
Thoughts?

Thanks,
Oleg
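
To make the proposal slightly more concrete, here is a very rough sketch of the
template plus per-pool 'private' monitor split at the DB layer; the model and
column names are illustrative only, not the actual Neutron LBaaS schema:

import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class HealthMonitorTemplate(Base):
    """Shared template, used only as a starting point at creation time."""
    __tablename__ = 'healthmonitortemplates'
    id = sa.Column(sa.String(36), primary_key=True)
    type = sa.Column(sa.String(16), nullable=False)  # PING/TCP/HTTP/...
    delay = sa.Column(sa.Integer, nullable=False)
    timeout = sa.Column(sa.Integer, nullable=False)
    max_retries = sa.Column(sa.Integer, nullable=False)


class PoolHealthMonitor(Base):
    """Private copy owned by exactly one pool and pushed to one driver."""
    __tablename__ = 'poolhealthmonitors'
    id = sa.Column(sa.String(36), primary_key=True)
    pool_id = sa.Column(sa.String(36), unique=True, nullable=False)
    type = sa.Column(sa.String(16), nullable=False)
    delay = sa.Column(sa.Integer, nullable=False)
    timeout = sa.Column(sa.Integer, nullable=False)
    max_retries = sa.Column(sa.Integer, nullable=False)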


On Thu, Jun 20, 2013 at 6:53 PM, Salvatore Orlando wrote:

> The idea of health-monitor templates was first discussed here:
> http://lists.openstack.org/pipermail/openstack-dev/2012-November/003233.html
> See follow-up on that mailing list thread to understand pro and cons of
> the idea.
>
> I will avoid moaning about backward compatibility at the moment, but that
> something else we need to discuss at some point if go ahead with changes in
> the API.
>
> Salvatore
>
>
> On 20 June 2013 14:54, Samuel Bercovici  wrote:
>
>>  Hi,
>>
>>
>> I agree with this.
>>
>> We are facing challenges when the global health pool is changed to
>> atomically modify all the groups that are linked to this health check as
>> the groups might be configured in different devices.
>>
>> So if one of the group modification fails it is very difficult to revert
>> the change back.
>>
>>
>> -Sam.
>>
>>
>> From: Eugene Nikanorov [mailto:enikano...@mirantis.com]
>> Sent: Thursday, June 20, 2013 3:10 PM
>> To: OpenStack Development Mailing List
>> Cc: Avishay Balderman; Samuel Bercovici
>> Subject: [Quantum][LBaaS] Feedback needed: Healthmonitor workflow.
>>
>>
>> Hi community,
>>
>>
>> Here's a question.
>>
>> Currently Health monitors in Loadbalancer service are made in such way
>> that health monitor itself is a global shared database object. 
>>
>> If user wants to add health monitor to a pool, it adds association
>> between pool and health monitor.
>>
>> In order to update existing health monitor (change url, for example)
>> service will need to go over existing pool-health monitor associations
>> notifying devices of this change.
>>
>> ** **
>>
>> I think it could be changed to the following workflow:
>>
>> Instead of adding pool-healthmonitor association, use health monitor
>> object as a template (probably renaming is needed) and add 'private' health
>> monitor to the pool. 
>>
>> So all further operations would result in changing health monitor on one
>> device only.
>>
>> ** **
>>
>> What do you think?
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


Re: [openstack-dev] [Nova] New DB column or new DB table?

2013-07-19 Thread Day, Phil
Ceilometer is a great project for taking metrics available in Nova and other 
systems and making them available for use by Operations, Billing, Monitoring, 
etc - and clearly we should try and avoid having multiple collectors of the 
same data.

But making the Nova scheduler dependent on Ceilometer seems to be the wrong way 
round to me - scheduling is such a fundamental operation that I want Nova to be 
self sufficient in this regard.   In particular I don't want the availability 
of my core compute platform to be constrained by the availability of my (still 
evolving) monitoring system.

If Ceilometer can be fed from the data used by the Nova scheduler then that's a 
good plus - but not the other way round.

Phil

> -Original Message-
> From: Sean Dague [mailto:s...@dague.net]
> Sent: 18 July 2013 12:05
> To: OpenStack Development Mailing List
> Subject: Re: [openstack-dev] [Nova] New DB column or new DB table?
> 
> On 07/17/2013 10:54 PM, Lu, Lianhao wrote:
> > Hi fellows,
> >
> > Currently we're implementing the BP
> https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling. The
> main idea is to have an extensible plugin framework on nova-compute where
> every plugin can get different metrics(e.g. CPU utilization, memory cache
> utilization, network bandwidth, etc.) to store into the DB, and the nova-
> scheduler will use that data from DB for scheduling decision.
> >
> > Currently we adds a new table to store all the metric data and have nova-
> scheduler join loads the new table with the compute_nodes table to get all the
> data(https://review.openstack.org/35759). Someone is concerning about the
> performance penalty of the join load operation when there are many metrics
> data stored in the DB for every single compute node. Don suggested adding a
> new column in the current compute_nodes table in DB, and put all metric data
> into a dictionary key/value format and store the json encoded string of the
> dictionary into that new column in DB.
> >
> > I'm just wondering which way has less performance impact, join load with a
> new table with quite a lot of rows, or json encode/decode a dictionary with a
> lot of key/value pairs?
> >
> > Thanks,
> > -Lianhao
> 
> I'm really confused. Why are we talking about collecting host metrics in nova
> when we've got a whole project to do that in ceilometer? I think utilization
> based scheduling would be a great thing, but it really out to be interfacing 
> with
> ceilometer to get that data. Storing it again in nova (or even worse 
> collecting it
> a second time in nova) seems like the wrong direction.
> 
> I think there was an equiv patch series at the end of Grizzly that was pushed 
> out
> for the same reasons.
> 
> If there is a reason ceilometer can't be used in this case, we should have 
> that
> discussion here on the list. Because my initial reading of this blueprint and 
> the
> code patches is that it partially duplicates ceilometer function, which we
> definitely don't want to do. Would be happy to be proved wrong on that.
> 
>   -Sean
> 
> --
> Sean Dague
> http://dague.net
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Re: [openstack-dev] [Neutron] Chalenges with highly available service VMs - port adn security group options.

2013-07-19 Thread Samuel Bercovici
Hi,

I have completely missed this discussion as it did not have quantum/Neutron in 
the subject (modified now).
I think that the security group is the right place to control this.
I think that this might only be allowed for admins.

Let me explain what we need, which is more than just disabling spoofing.

1.   Be able to allow MACs which are not defined on the port level to 
transmit packets (for example VRRP MACs) == turn off MAC anti-spoofing

2.   Be able to allow IPs which are not defined on the port level to 
transmit packets (for example, an IP used for an HA service that moves between 
an HA pair) == turn off IP anti-spoofing

3.   Be able to allow broadcast messages on the port (for example for VRRP 
broadcast) == allow broadcast.


Regards,
-Sam.
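
For reference, requirements 1 and 2 can be sketched against the
allowed-address-pairs and port-security extensions mentioned in this thread,
where a plugin implements them; attribute support varies by plugin, so this is
only illustrative:

from neutronclient.v2_0 import client

neutron = client.Client(username='admin', password='secret',
                        tenant_name='admin',
                        auth_url='http://127.0.0.1:5000/v2.0/')

port_id = 'PORT-UUID'

# Allow a VRRP virtual MAC/IP pair that is not the port's own address.
neutron.update_port(port_id, {'port': {
    'allowed_address_pairs': [{'ip_address': '10.0.0.100',
                               'mac_address': '00:00:5e:00:01:01'}]}})

# Or, where the plugin supports it, turn anti-spoofing off entirely.
neutron.update_port(port_id, {'port': {'port_security_enabled': False}})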


From: Aaron Rosen [mailto:aro...@nicira.com]
Sent: Friday, July 19, 2013 3:26 AM
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] Chalenges with highly available service VMs

Yup:
I'm definitely happy to review and give hints.
Blueprint:  
https://docs.google.com/document/d/18trYtq3wb0eJK2CapktN415FRIVasr7UkTpWn9mLq5M/edit

https://review.openstack.org/#/c/19279/  < patch that merged the feature;
Aaron

On Thu, Jul 18, 2013 at 5:15 PM, Ian Wells 
mailto:ijw.ubu...@cack.org.uk>> wrote:
On 18 July 2013 19:48, Aaron Rosen 
mailto:aro...@nicira.com>> wrote:
> Is there something this is missing that could be added to cover your use
> case? I'd be curious to hear where this doesn't work for your case.  One
> would need to implement the port_security extension if they want to
> completely allow all ips/macs to pass and they could state which ones are
> explicitly allowed with the allowed-address-pair extension (at least that is
> my current thought).
Yes - have you got docs on the port security extension?  All I've
found so far are
http://docs.openstack.org/developer/quantum/api/quantum.extensions.portsecurity.html
and the fact that it's only the Nicira plugin that implements it.  I
could implement it for something else, but not without a few hints...
--
Ian.



Re: [openstack-dev] [Horizon] Navigation UX Enhancements - Collecting Issues

2013-07-19 Thread Jaromir Coufal

Hi Jeff,

thanks for the contribution. As long as there is no more input on gathering 
issues, I'll try to wrap all the problems up in the BP's whiteboard and we 
can start designing proposals.


Best
-- Jarda

On 2013/09/07 16:09, Walls, Jeffrey Joel (HP Converged Cloud - Cloud OS) 
wrote:


One issue I have is that the panels are physically grouped and it’s 
very difficult to logically re-group them.  For example, the Project 
dashboard explicitly lists the panel groups and the order of those 
panel groups.  Within each panel group, the individual panels are also 
explicitly listed.  What I would like to do is arrange the panels more 
logically without affecting the physical structure of the files or the 
order in the panel specification.


You could think of “Deployment” as a “section” under the Project 
dashboard tab.  In this “section”, I’d want to see things related to 
the actual deployment of virtual machines (e.g., Instances, Snapshots, 
Networks, Routers, etc).  I was beginning to tackle this in our code 
base and was planning to use some sort of accordion-type widget.  My 
thinking was that there would be “a few” (probably no more than 4) 
“sections” under the Project tab.  Each “section” would have elements 
within it that logically mapped to that section.


I think this is a great discussion and I’m very interested to hear 
where others are headed with their thinking, so thank you for getting 
it started!


Jeff
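
As a rough illustration of the kind of grouping described above (the panel
slugs and group name are made up, not a proposal for the actual layout), a
Horizon dashboard can declare panel groups along these lines:

import horizon
from django.utils.translation import ugettext_lazy as _


class DeploymentPanels(horizon.PanelGroup):
    slug = "deployment"
    name = _("Deployment")
    # Panel slugs are illustrative; each must match a registered panel.
    panels = ('instances', 'images_and_snapshots', 'networks', 'routers')


class Project(horizon.Dashboard):
    name = _("Project")
    slug = "project"
    panels = (DeploymentPanels,)
    default_panel = 'instances'


horizon.register(Project)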

From: Jaromir Coufal [mailto:jcou...@redhat.com]
Sent: Tuesday, July 09, 2013 6:38 AM
To: OpenStack Development Mailing List
Subject: [openstack-dev] [Horizon] Navigation UX Enhancements - Collecting Issues


Hi everybody,

in UX community group on G+ popped out a need for enhancing user 
experience of main navigation, because there are spreading out various 
issues .


There is already created a BP for this: 
https://blueprints.launchpad.net/horizon/+spec/navigation-enhancement


Toshi had great idea to start discussion about navigation issues on 
mailing list.


So I'd like to ask all of you, if you have some issues with 
navigation, what are the issues you are dealing with? I'd like to 
gather as much feedback as possible, so we can design the best 
solution which covers most of the cases. Issues will be listed in BP 
and I will try to come out with design proposals which hopefully will 
help all of you.


Examples are following:
* Navigation is not scaling for more dashboards (Project, Admin, ...)
* Each dashboard might contain different hierarchy (number of levels)

What problems do you experience with navigation?

Thanks all for contributing
-- Jarda


