Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem

2013-10-26 Thread Joshua Harlow
An idea that others and I are having for a similar use case in cinder (or it 
appears to be similar).

If there was a well-defined state machine (or set of machines) in nova, with well-defined 
and managed transitions between states, then it seems like this state machine could 
resume on failure as well as be interrupted when a dueling or preemptable 
operation arrives (a delete while being created, for example). This way, not only 
would the set of states and transitions be very clear, but it would also be 
clear how preemption occurs (and under what cases). 

Right now in nova there is a distributed and ad-hoc state machine which, if it 
were more formalized, could inherit some of the described useful capabilities. 
It would also be much more resilient to the types of locking problems that you 
described. 

IMHO that's the only way these types of problems will fully be fixed: not by 
more queues or more periodic tasks, but by solidifying and formalizing the state 
machines that compose the work nova does.
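
A minimal sketch of what such an explicit transition table might look like (the states and the preemption rule here are illustrative assumptions, not nova's actual machine):

    # Illustrative only: a tiny explicit state machine with managed
    # transitions; the states are made up, not nova's actual set.
    ALLOWED_TRANSITIONS = {
        ("BUILDING", "ACTIVE"),
        ("BUILDING", "DELETING"),  # a delete may preempt an in-flight create
        ("ACTIVE", "DELETING"),
        ("DELETING", "DELETED"),
    }

    class InvalidTransition(Exception):
        pass

    class InstanceStateMachine(object):
        def __init__(self, state="BUILDING"):
            self.state = state

        def transition(self, new_state):
            if (self.state, new_state) not in ALLOWED_TRANSITIONS:
                raise InvalidTransition("%s -> %s" % (self.state, new_state))
            # A real implementation would persist each change so the
            # machine could be resumed after a crash or interruption.
            self.state = new_state

Because every legal transition is enumerated, delete-during-create becomes an explicit, inspectable case instead of an accidental lock collision.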

Sent from my really tiny device...

 On Oct 25, 2013, at 3:52 AM, Day, Phil philip@hp.com wrote:
 
 Hi Folks,
 
 We're very occasionally seeing problems where a thread processing a create 
 hangs (and we've seen this when talking to Cinder and Glance).  Whilst those issues 
 need to be hunted down in their own right, they do show up what seems to me 
 to be a weakness in the processing of delete requests that I'd like to get 
 some feedback on.
 
 Delete is the one operation that is allowed regardless of the instance state 
 (since it's a one-way operation, and users should always be able to free up 
 their quota).   However, when we get a create thread hung in one of these 
 states, the delete requests will also block when they hit the manager, as they 
 are synchronized on the uuid.   Because the user making the delete request 
 doesn't see anything happen, they tend to submit more delete requests.   The 
 service is still up, so these go to the compute manager as well, and 
 eventually all of the threads will be waiting for the lock, and the compute 
 manager will stop consuming new messages.
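
 A simplified illustration of the failure mode described above (plain threading.Lock for clarity; this is not nova's actual per-uuid synchronization code):

     import threading

     # One lock per instance uuid; a hung create holds its lock forever.
     _locks = {}

     def uuid_lock(uuid):
         return _locks.setdefault(uuid, threading.Lock())

     def delete_instance(uuid):
         with uuid_lock(uuid):  # every repeated delete parks a thread here
             pass  # actual teardown work would go here

     # If a create thread never releases the lock for this uuid, each
     # user retry consumes another worker thread on acquire(), until the
     # pool that consumes RPC messages is exhausted.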
 
 The problem isn't limited to deletes - although in most cases the change of 
 state in the API means that you have to keep making different calls to get 
 past the state-checker logic with an instance stuck in another state.   Users 
 also seem to be more impatient with deletes, as they are trying to free up 
 quota for other things. 
 
 So while I know that we should never get a thread into a hung state in the 
 first place, I was wondering about one of the following approaches to address 
 just the delete case:
 
 i) Change the delete call on the manager so it doesn't wait for the uuid 
 lock.  Deletes should be coded so that they work regardless of the state of 
 the VM, and other actions should be able to cope with a delete being 
 performed from under them.  There is of course no guarantee that the delete 
 itself won't block as well. 
 
 ii) Record in the API server that a delete has been started (maybe enough to 
 use the task state being set to DELETING in the API, if we're sure this 
 doesn't get cleared), and add a periodic task in the compute manager to check 
 for and delete instances that have been in a DELETING state for more than some 
 timeout. Then the API, knowing that the delete will be processed eventually, 
 can just no-op any further delete requests (a rough sketch of such a task appears below).
 
 iii) Add some hook into the ServiceGroup API so that the timer could depend 
 on getting a free thread from the compute manager pool (i.e. run some no-op 
 task) - so that if there are no free threads then the service becomes down. 
 That would (eventually) stop the scheduler from sending new requests to it, 
 and let deletes be processed in the API server, but won't of course help with 
 commands for other instances on the same host.
 
 iv) Move away from having a general topic and thread pool for all requests, 
 and start a listener on an instance-specific topic for each running instance 
 on a host (leaving the general topic and pool just for creates and other 
 non-instance calls like the hypervisor API).   Then a blocked task would only 
 affect requests for a specific instance.
 
 I'm tending towards ii) as a simple and pragmatic solution in the near term, 
 although I like both iii) and iv) as generally good enhancements - 
 but iv) in particular feels like a pretty seismic change.
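
 A rough sketch of the periodic cleanup task from option ii (the timeout value and the helper functions are illustrative assumptions, not nova code):

     import time

     DELETE_TIMEOUT = 600  # seconds; illustrative value

     def cleanup_stuck_deletes(list_instances, force_delete):
         # list_instances and force_delete are hypothetical helpers
         # standing in for the real DB query and driver teardown.
         now = time.time()
         for inst in list_instances(task_state="DELETING"):
             if now - inst["updated_at"] > DELETE_TIMEOUT:
                 # The create thread holding the uuid lock is presumed
                 # hung; reclaim the instance without waiting on it.
                 force_delete(inst["uuid"])

     # Wired into the manager's periodic task loop (e.g. every 60s),
     # this makes repeated user deletes safe to no-op at the API layer.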
 
 Thoughts please,
 
 Phil
 


[openstack-dev] consolidating .mailmap?

2013-10-26 Thread Robert Collins
So nova has a massive .mailmap that maps multiple addresses for one
person together. I'm wondering a) if it's still needed, and b) if it
is, whether we should push it into all the repositories - e.g. have a single
global copy and an automated job to push updates around.
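
For reference, a .mailmap entry maps a canonical identity onto the variants found in the git history; the names and addresses below are made-up examples:

    # Canonical identity first, then the variant seen in commits:
    Jane Doe <jane@example.com> <jdoe@oldcorp.example>
    Jane Doe <jane@example.com> Jane D <jane.d@example.com>

A single global copy would essentially concatenate every project's entries into one such file.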

-Rob

-- 
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [heat] Proposal for new heat-core member

2013-10-26 Thread Clint Byrum
Excerpts from Steven Dake's message of 2013-10-25 12:12:54 -0700:
 Hi,
 
 I would like to propose Randall Burt for Heat Core.  He has shown 
 interest in Heat by participating in IRC and providing high-quality 
 reviews.  The most important aspect in my mind of joining Heat Core is 
 the output and quality of reviews.  Randall has been involved in Heat 
 reviews for at least 6 months.  He has had 172 reviews over the last 6 
 months, staying in the pack [1] of core heat reviewers.  His 90-day 
 stats are also encouraging, with 97 reviews (compared to the top 
 reviewer, Steve Hardy, with 444 reviews).  Finally, his 30-day stats also 
 look good, beating out 3 core reviewers [2] on output with good-quality 
 reviews.
 

+1

!!

 Please have a vote +1/-1 and take into consideration: 
 https://wiki.openstack.org/wiki/Heat/CoreTeam
 
 Regards,
 -steve
 
 [1] http://russellbryant.net/openstack-stats/heat-reviewers-180.txt
 [2] http://russellbryant.net/openstack-stats/heat-reviewers-30.txt



Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem

2013-10-26 Thread Alex Glikson
+1

Regards,
Alex


Joshua Harlow harlo...@yahoo-inc.com wrote on 26/10/2013 09:29:03 AM:
 
 [full quoted message trimmed - see Joshua Harlow's message above]

Re: [openstack-dev] extending nova boot

2013-10-26 Thread Christopher Yeoh
Hi Drew,

Unfortunately there's not much up-to-date documentation on how to write an
API extension (it's on the TODO list), but as Phil mentioned, looking at
existing extensions like scheduler_hints is a good place to start.

You'll need to decide whether you want to write a V2 and V3 API version of
your extension or only V3. V3 is currently marked experimental but should
(hopefully!) become the default API with the release of Icehouse. So if you
submit a V2 extension you will have to also submit a V3 version.

As Phil mentioned, for V2 you'll need to add both a new extension file
and a modification to nova/api/openstack/compute/servers.py to look for and pass the
new parameter to compute_api.create. For V2, all parameters have to be
explicitly handled in servers.py.

For V3 (see nova/api/openstack/compute/plugins/v3/) you will only need to
add a new extension with no modifications to servers.py. access_ips.py is
probably a good example for V3 to see how parameters can be passed to
compute_api.create by an extension. In access_ips, see the create function
in AccessIPsController and server_create in AccessIPs. Note that you will
need to add some entries in setup.cfg for the V3 plugin to be detected.

Depending on how your extension works you may also need to add entries to
nova/etc/policy.json as well.
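
A rough, stripped-down skeleton of the V3 pattern described above (the class layout and hook name are illustrative guesses modeled on access_ips.py, not code copied from nova):

    # Illustrative sketch only: a V3-style extension controller that adds
    # one parameter to server create. All names here are assumptions.
    ALIAS = "my-extension"

    class MyExtensionController(object):
        def server_create(self, server_dict, create_kwargs):
            # Pull the extra attribute out of the request body and pass
            # it along via the kwargs handed to compute_api.create().
            value = server_dict.get("my_param")
            if value is not None:
                create_kwargs["my_param"] = value

    # Plus an entry point in setup.cfg so the V3 plugin loader detects it,
    # as Chris notes above.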

Regards,

Chris




On Sat, Oct 26, 2013 at 7:04 AM, Day, Phil philip@hp.com wrote:

 Hi Drew,

 Generally you need to create a new API extension and make some changes in
 the main servers.py.

 The scheduler-hints API extension does this kind of thing, so if you look
 at api/openstack/compute/contrib/scheduler_hints.py for how the extension
 is defined, and look in api/openstack/compute/servers.py for the
 scheduler_hints code (e.g. _extract_scheduler_hints()), then that should
 point you in the right direction.
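
 For the V2 side, the pattern pointed at here boils down to something like the following (simplified, and modeled loosely on _extract_scheduler_hints(); the attribute name is hypothetical):

     # Inside the V2 servers controller, alongside _extract_scheduler_hints():
     def _extract_my_param(self, server_dict):
         # Return the custom attribute from the request body, if present.
         return server_dict.get("my_param")

     # ...and then in Controller.create():
     #     my_param = self._extract_my_param(server_dict)
     #     self.compute_api.create(..., my_param=my_param)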

 Hope that helps,
 Phil

  -Original Message-
  From: Drew Fisher [mailto:drew.fis...@oracle.com]
  Sent: 25 October 2013 16:34
  To: openstack-dev@lists.openstack.org
  Subject: [openstack-dev] extending nova boot
 
  Good morning!
 
  I am looking at extending nova boot with a few new flags.  I've found enough
  examples online that I have a working extension to novaclient (I can see the
  new flags in `nova help boot`, and if I run with the --debug flag I can see
  that the curl requests to the API have the data).
 
  What I can't seem to figure out is how nova-api processes these extra
  arguments.  With stable/grizzly bits, in
  nova/api/openstack/compute/servers.py, I can see where that data is
  processed (in Controller.create()) but it doesn't appear to me that any
  leftover flags are handled.
 
  What do I need to do to get these new flags to nova boot from novaclient
  into nova-api and ultimately my compute driver?
 
  Thanks for any help!
 
  -Drew Fisher
 


Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem

2013-10-26 Thread Abhishek Lahiri
Deletes should only be allowed when the VM is in a powered-off state. This would 
allow consistent state transitions.

Thanks
Al


On Oct 26, 2013, at 8:55 AM, Joshua Harlow harlo...@yahoo-inc.com wrote:

 I think I will try to have an unconference at the HK summit about ideas the 
 cinder developers (and the taskflow developers, since it's not a concept that 
 is unique/applicable to just cinder) are having about said state machine 
 (and its potential usage).
 
 So look out for that - it would be interesting to have some nova folks involved 
 there also :-)
 
 Sent from my really tiny device...
 
 On Oct 26, 2013, at 3:14 AM, Alex Glikson glik...@il.ibm.com wrote:
 
 [remainder of quoted thread trimmed - see the messages above]

Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem

2013-10-26 Thread Joshua Harlow
Potentially,

Although I think the lack of formalization of, and visibility into, the state 
machine (and the ability to easily change its transitions) is at this point causing 
part of this pain.

If the state machine were well defined (and adjustable - to a degree...) then 
one deployer could imagine only allowing delete transitions from the powered-off 
state, while another deployer may prefer a more experimental allow-delete-at-any-time 
transition (with associated side effects).

Without the underlying state machine being in a more formalized state, it's hard 
to do either in a manageable (and recoverable and resumable...) manner IMHO.
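
One way to picture that deployer-configurable policy (purely illustrative; the policy names and states are made up):

    # Purely illustrative: which states a delete may be requested from,
    # selected by deployer configuration rather than hard-coded.
    DELETE_SOURCES = {
        "strict": {"POWERED_OFF"},
        "permissive": {"BUILDING", "ACTIVE", "POWERED_OFF", "ERROR"},
    }

    def delete_allowed(current_state, policy="strict"):
        return current_state in DELETE_SOURCES[policy]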

Sent from my really tiny device...

On Oct 26, 2013, at 9:12 AM, Abhishek Lahiri aviost...@gmail.com wrote:

[remainder of quoted thread trimmed - see the messages above]

Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem

2013-10-26 Thread Abhishek Lahiri
This is a bit off topic, but in general it seems to me that the state 
transitions, as you say, are not clearly defined for many OpenStack components. 
Is there any effort underway to define these? As the software gets bigger and 
bigger, this will help both developers and operators.

Thanks & Regards
Abhishek Lahiri

On Oct 26, 2013, at 9:52 AM, Joshua Harlow harlo...@yahoo-inc.com wrote:

 [quoted thread trimmed - see the messages above]

Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem

2013-10-26 Thread Joshua Harlow
There is at least one such effort being discussed in cinder; for other projects I 
cannot say. I am hoping to gain more traction there, as I think taskflow [1] can 
provide (or help provide) a foundation to help here. Taskflow itself has a 
well-defined state machine [2]. But in the end it's up to the projects themselves 
to see the issue and have a desire to move to a more formalized and managed 
model...

1. https://wiki.openstack.org/wiki/TaskFlow
2. https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow
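
A small taste of what declaring work under taskflow looks like (a minimal sketch against taskflow's basic linear-flow API; the task and its values are invented for illustration):

    import taskflow.engines
    from taskflow import task
    from taskflow.patterns import linear_flow

    class AllocateVolume(task.Task):
        def execute(self, size):
            # Forward work; the engine records the task's state
            # transitions (PENDING -> RUNNING -> SUCCESS, etc.).
            print("allocating %d GB" % size)

        def revert(self, size, **kwargs):
            # Called by the engine if a later task in the flow fails.
            print("undoing allocation of %d GB" % size)

    flow = linear_flow.Flow("create-volume").add(AllocateVolume())
    taskflow.engines.run(flow, store={"size": 10})

Because the engine owns the state transitions, resumption and rollback come from the library rather than from ad-hoc code in each manager.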

Sent from my really tiny device...

On Oct 26, 2013, at 12:22 PM, Abhishek Lahiri aviost...@gmail.com wrote:

[remainder of quoted thread trimmed - see the messages above]

Re: [openstack-dev] Remove vim modelines?

2013-10-26 Thread Ruslan Kiianchuk
If someone has taken the effort to learn Vim to the point where they develop code
in it, they most definitely have their preferred settings already, so why
overwrite them? If those settings conflict with the style guide - as has
been said, pep8 and the hacking checks will notify them.

I always thought that leaving hints for the text editor by adding specific
content to a file is just a dirty hack - files should be editor-agnostic,
and the editor should be smart enough to figure everything out by
itself. And yeah, I'm also a happy Vim user of 4 years.

Replacing the long paragraph with a short response:
+1, remove them :)


On Fri, Oct 25, 2013 at 4:45 PM, Vishvananda Ishaya
vishvana...@gmail.com wrote:

 Interesting Background Information:

 Why do we have modelines?

 Termie put them in all the files of the first version of nova

 Why did he put in modelines instead of configuring his editor?

 Termie does a lot of python coding and he prefers a tabstop of 2 on all
 his personal projects[1]

 I really don't see much value outside of people who prefer other tabstops

 +1 to remove them

 Vish

 [1] https://github.com/termie/git-bzr-ng/blob/master/git-bzr
 On Oct 24, 2013, at 5:38 AM, Joe Gordon joe.gord...@gmail.com wrote:

 Since the beginning of OpenStack we have had vim modelines all over the
 codebase, but after seeing this patch
 (https://review.openstack.org/#/c/50891/) I took a further look into vim
 modelines and think we should remove them.
 Before going any further, I should point out these lines don't bother me
 too much, but I figured if we could get consensus, then we could shrink our
 codebase by a little bit.

 Sidenote: This discussion is being moved to the mailing list because it 'would
 be better to have a mailing list thread about this rather than bits and
 pieces of discussion in gerrit', as this change requires multiple patches.
 https://review.openstack.org/#/c/51295/


 Why remove them?

 * Modelines aren't supported by default in Debian or Ubuntu due to
 security reasons: https://wiki.python.org/moin/Vim
 * Having modelines for vim means that, if someone wants, we should support
 modelines for emacs
 (http://www.gnu.org/software/emacs/manual/html_mono/emacs.html#Specifying-File-Variables)
 etc. as well.  And having a bunch of headers for different editors in each
 file seems like extra overhead.
 * There are other ways of making sure tabstop is set correctly for python
 files; see https://wiki.python.org/moin/Vim.  I am a Vim user myself and
 have never used modelines.
 * We have vim modelines in only 828 out of 1213 python files in nova
 (68%), so if anyone is using modelines today, it only works 68% of the
 time in nova.
 * Why have the same config 828 times for one repo alone?  This violates
 the DRY principle (Don't Repeat Yourself).
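
 For context, the boilerplate in question is a one-line comment per file, and the editor-side replacement is a single filetype rule in one's own vimrc (the modeline shown is the common OpenStack variant; the vimrc line is a standard alternative, not quoted from this thread):

     # The per-file modeline being removed (top of each .py file):
     # vim: tabstop=4 shiftwidth=4 softtabstop=4

     " The once-per-user alternative, in ~/.vimrc:
     autocmd FileType python setlocal tabstop=4 shiftwidth=4 softtabstop=4 expandtab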


 Related Patches:
 https://review.openstack.org/#/c/51295/

 https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:noboilerplate,n,z

 best,
 Joe







-- 
Sincerely, Ruslan Kiianchuk.