Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-05 Thread Julien Danjou
On Wed, Dec 04 2013, Sean Dague wrote:

 Honestly, I'd love us to be clever and figure out a not dangerous way
 through this, even if unwise (where we can yell at the user in the LOGs
 loudly, and fail them in J if lock_dir=/tmp) that lets us progress
 through this while gracefully bringing configs into line.

Correct me if I'm wrong, but I think the correct way to deal with that
security problem is to use an atomic operation using open(2) with:
  open(pathname, O_CREAT | O_EXCL)

or mkstemp(3).

That should be doable in Python too.

-- 
Julien Danjou
# Free Software hacker # independent consultant
# http://julien.danjou.info


signature.asc
Description: PGP signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-05 Thread Clint Byrum
Excerpts from Julien Danjou's message of 2013-12-05 01:22:00 -0800:
 On Wed, Dec 04 2013, Sean Dague wrote:
 
  Honestly, I'd love us to be clever and figure out a not dangerous way
  through this, even if unwise (where we can yell at the user in the LOGs
  loudly, and fail them in J if lock_dir=/tmp) that lets us progress
  through this while gracefully bringing configs into line.
 
 Correct me if I'm wrong, but I think the correct way to deal with that
 security problem is to use an atomic operation using open(2) with:
   open(pathname, O_CREAT | O_EXCL)
 

DOS by a malicious user creating it first is still trivial.

 or mkstemp(3).
 

Can't use mkstemp as the point is this needs to be something shared
between processes.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-04 Thread Sean Dague
On 12/03/2013 11:21 PM, Clint Byrum wrote:
 Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
 On 12/03/2013 06:13 PM, Ben Nemec wrote:
 On 2013-12-03 17:09, Sean Dague wrote:
 On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
 On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
 On 2013-12-03 15:56, Sean Dague wrote:
 This cinder patch - https://review.openstack.org/#/c/48935/

 Is blocked on failing upgrade because the updated oslo lockutils won't
 function until there is a specific configuration variable added to the
 cinder.conf.

 That work around is proposed here -
 https://review.openstack.org/#/c/52070/3

 However I think this is exactly the kind of forward breaks that we
 want
 to prevent with grenade, as cinder failing to function after a rolling
 upgrade because a config item wasn't added is exactly the kind of pain
 we are trying to prevent happening to ops.

 So the question is, how is this done correctly so that a default
 can be
 set in the cinder code for this value, and it not require a config
 change to work?

 You're absolutely correct, in principle - if the default value for
 lock_path worked for users before, we should absolutely continue to
 support it.

 I don't know that I have a good answer on how to handle this, but for
 context this change is the result of a nasty bug in lockutils that
 meant
 external locks were doing nothing if lock_path wasn't set.  Basically
 it's something we should never have allowed in the first place.

 As far as setting this in code, it's important that all of the
 processes
 for a service are using the same value to avoid the same bad situation
 we were in before.  For tests, we have a lockutils wrapper
 (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
 that sets an environment variable to address this, but that only
 works if all of the processes are going to be spawned from within
 the same wrapper, and I'm not sure how secure that is for production
 deployments since it puts all of the lock files in a temporary
 directory.

 Right, I don't think the previous default really worked - if you used
 the default, then external locking was broken.

 I suspect most distros do set a default - I see RDO has this in its
 default nova.conf:

   lock_path = /var/lib/nova/tmp

 So, yes - this is all terrible.

 IMHO, rather than raise an exception we should log a big fat warning
 about relying on the default and perhaps just treat the lock as an
 in-process lock in that case ... since that's essentially what it was
 before, right?

 So a default of lock_path = /tmp will work (FHS says that path has to be
 there), even if not optimal. Could we make it a default value like that
 instead of the current default which is null (and hence the problem).

 IIRC, my initial fix was something similar to that, but it got shot down
 because putting the lock files in a known world writeable location was a
 security issue.

 Although maybe if we put them in a subdirectory of /tmp and ensured that
 the permissions were such that only the user running the service could
 use that directory, it might be acceptable?  We could still log a
 warning if we wanted.

 This seems like it would have implications for people running services
 on Windows too, but we can probably find a way to make that work if we
 decide on a solution.

 How is that a security issue? Are the lock files being written with some
 sensitive data in them and have g or o permissions on? The sticky bit
 (+t) on /tmp will prevent other users from deleting the file.

 
 Right, but it won't prevent users from creating a symlink with the same
 name.
 
 ln -s /var/lib/nova/instances/x/image.raw /tmp/well.known.location
 
 Now when you do
 
 with open('/tmp/well.known.location', 'w') as lockfile:
   lockfile.write('Stuff')
 
 Nova has just truncated the image file and written 'Stuff' to it.

So that's the generic case (and the way people often write this). But
the oslo lockutils code doesn't work that way. While it does open the
file for write, it does not actually write, it's using fcntl to hold
locks. That's taking a data structure on the fd in kernel memory (IIRC),
so it correctly gives it up if the process crashes.

I'm not saying there isn't some other possible security vulnerability
here as well, but it's not jumping out at me. So I'd love to understand
that, because if we can close that exposure we can provide a working
default, plus a strong recommendation for how to do that *right*. I'd be
totally happy with printing WARNING level at startup if lock_path = /tmp
that this should be adjusted.

 The typical solution is to use a lock directory, /var/run/yourprogram,
 that has restrictive enough permissions setup for your program to have
 exclusive use of it, and is created by root at boot time. That is what
 the packages do now.
 
 It would be good if everybody agreed on a default, %(nova_home)/locks
 or something, but root still must set it up with the right 

Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-04 Thread Ben Nemec

On 2013-12-04 06:07, Sean Dague wrote:

On 12/03/2013 11:21 PM, Clint Byrum wrote:

Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:

On 12/03/2013 06:13 PM, Ben Nemec wrote:

On 2013-12-03 17:09, Sean Dague wrote:

On 12/03/2013 05:50 PM, Mark McLoughlin wrote:

On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:

On 2013-12-03 15:56, Sean Dague wrote:

This cinder patch - https://review.openstack.org/#/c/48935/

Is blocked on failing upgrade because the updated oslo lockutils 
won't
function until there is a specific configuration variable added 
to the

cinder.conf.

That work around is proposed here -
https://review.openstack.org/#/c/52070/3

However I think this is exactly the kind of forward breaks that 
we

want
to prevent with grenade, as cinder failing to function after a 
rolling
upgrade because a config item wasn't added is exactly the kind 
of pain

we are trying to prevent happening to ops.

So the question is, how is this done correctly so that a default
can be
set in the cinder code for this value, and it not require a 
config

change to work?


You're absolutely correct, in principle - if the default value for
lock_path worked for users before, we should absolutely continue 
to

support it.

I don't know that I have a good answer on how to handle this, but 
for
context this change is the result of a nasty bug in lockutils 
that

meant
external locks were doing nothing if lock_path wasn't set.  
Basically

it's something we should never have allowed in the first place.

As far as setting this in code, it's important that all of the
processes
for a service are using the same value to avoid the same bad 
situation

we were in before.  For tests, we have a lockutils wrapper
(https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
that sets an environment variable to address this, but that only
works if all of the processes are going to be spawned from within
the same wrapper, and I'm not sure how secure that is for 
production

deployments since it puts all of the lock files in a temporary
directory.


Right, I don't think the previous default really worked - if you 
used

the default, then external locking was broken.

I suspect most distros do set a default - I see RDO has this in 
its

default nova.conf:

  lock_path = /var/lib/nova/tmp

So, yes - this is all terrible.

IMHO, rather than raise an exception we should log a big fat 
warning

about relying on the default and perhaps just treat the lock as an
in-process lock in that case ... since that's essentially what it 
was

before, right?


So a default of lock_path = /tmp will work (FHS says that path has 
to be
there), even if not optimal. Could we make it a default value like 
that
instead of the current default which is null (and hence the 
problem).


IIRC, my initial fix was something similar to that, but it got shot 
down
because putting the lock files in a known world writeable location 
was a

security issue.

Although maybe if we put them in a subdirectory of /tmp and ensured 
that
the permissions were such that only the user running the service 
could

use that directory, it might be acceptable?  We could still log a
warning if we wanted.

This seems like it would have implications for people running 
services
on Windows too, but we can probably find a way to make that work if 
we

decide on a solution.


How is that a security issue? Are the lock files being written with 
some

sensitive data in them and have g or o permissions on? The sticky bit
(+t) on /tmp will prevent other users from deleting the file.



Right, but it won't prevent users from creating a symlink with the 
same

name.

ln -s /var/lib/nova/instances/x/image.raw /tmp/well.known.location

Now when you do

with open('/tmp/well.known.location', 'w') as lockfile:
  lockfile.write('Stuff')

Nova has just truncated the image file and written 'Stuff' to it.


So that's the generic case (and the way people often write this). But
the oslo lockutils code doesn't work that way. While it does open the
file for write, it does not actually write, it's using fcntl to hold
locks. That's taking a data structure on the fd in kernel memory 
(IIRC),

so it correctly gives it up if the process crashes.

I'm not saying there isn't some other possible security vulnerability
here as well, but it's not jumping out at me. So I'd love to understand
that, because if we can close that exposure we can provide a working
default, plus a strong recommendation for how to do that *right*. I'd 
be
totally happy with printing WARNING level at startup if lock_path = 
/tmp

that this should be adjusted.


Full disclosure: I don't claim to be a security expert, so take my 
thoughts on this with a grain of salt.


tldr: I still don't see a way to do this without breaking _something_.

Unfortunately, while we don't actually write to the file, just opening 
it for write access truncates it.  So there remains the issue Clint 
raised if someone 

Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-04 Thread Sean Dague
On 12/04/2013 11:56 AM, Ben Nemec wrote:
 On 2013-12-04 06:07, Sean Dague wrote:
 On 12/03/2013 11:21 PM, Clint Byrum wrote:
 Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
 On 12/03/2013 06:13 PM, Ben Nemec wrote:
 On 2013-12-03 17:09, Sean Dague wrote:
 On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
 On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
 On 2013-12-03 15:56, Sean Dague wrote:
 This cinder patch - https://review.openstack.org/#/c/48935/

 Is blocked on failing upgrade because the updated oslo
 lockutils won't
 function until there is a specific configuration variable added
 to the
 cinder.conf.

 That work around is proposed here -
 https://review.openstack.org/#/c/52070/3

 However I think this is exactly the kind of forward breaks that we
 want
 to prevent with grenade, as cinder failing to function after a
 rolling
 upgrade because a config item wasn't added is exactly the kind
 of pain
 we are trying to prevent happening to ops.

 So the question is, how is this done correctly so that a default
 can be
 set in the cinder code for this value, and it not require a config
 change to work?

 You're absolutely correct, in principle - if the default value for
 lock_path worked for users before, we should absolutely continue to
 support it.

 I don't know that I have a good answer on how to handle this,
 but for
 context this change is the result of a nasty bug in lockutils that
 meant
 external locks were doing nothing if lock_path wasn't set. 
 Basically
 it's something we should never have allowed in the first place.

 As far as setting this in code, it's important that all of the
 processes
 for a service are using the same value to avoid the same bad
 situation
 we were in before.  For tests, we have a lockutils wrapper
 (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)

 that sets an environment variable to address this, but that only
 works if all of the processes are going to be spawned from within
 the same wrapper, and I'm not sure how secure that is for
 production
 deployments since it puts all of the lock files in a temporary
 directory.

 Right, I don't think the previous default really worked - if
 you used
 the default, then external locking was broken.

 I suspect most distros do set a default - I see RDO has this in its
 default nova.conf:

   lock_path = /var/lib/nova/tmp

 So, yes - this is all terrible.

 IMHO, rather than raise an exception we should log a big fat warning
 about relying on the default and perhaps just treat the lock as an
 in-process lock in that case ... since that's essentially what it
 was
 before, right?

 So a default of lock_path = /tmp will work (FHS says that path has
 to be
 there), even if not optimal. Could we make it a default value like
 that
 instead of the current default which is null (and hence the problem).

 IIRC, my initial fix was something similar to that, but it got shot
 down
 because putting the lock files in a known world writeable location
 was a
 security issue.

 Although maybe if we put them in a subdirectory of /tmp and ensured
 that
 the permissions were such that only the user running the service could
 use that directory, it might be acceptable?  We could still log a
 warning if we wanted.

 This seems like it would have implications for people running services
 on Windows too, but we can probably find a way to make that work if we
 decide on a solution.

 How is that a security issue? Are the lock files being written with
 some
 sensitive data in them and have g or o permissions on? The sticky bit
 (+t) on /tmp will prevent other users from deleting the file.


 Right, but it won't prevent users from creating a symlink with the same
 name.

 ln -s /var/lib/nova/instances/x/image.raw /tmp/well.known.location

 Now when you do

 with open('/tmp/well.known.location', 'w') as lockfile:
   lockfile.write('Stuff')

 Nova has just truncated the image file and written 'Stuff' to it.

 So that's the generic case (and the way people often write this). But
 the oslo lockutils code doesn't work that way. While it does open the
 file for write, it does not actually write, it's using fcntl to hold
 locks. That's taking a data structure on the fd in kernel memory (IIRC),
 so it correctly gives it up if the process crashes.

 I'm not saying there isn't some other possible security vulnerability
 here as well, but it's not jumping out at me. So I'd love to understand
 that, because if we can close that exposure we can provide a working
 default, plus a strong recommendation for how to do that *right*. I'd be
 totally happy with printing WARNING level at startup if lock_path = /tmp
 that this should be adjusted.
 
 Full disclosure: I don't claim to be a security expert, so take my
 thoughts on this with a grain of salt.
 
 tldr: I still don't see a way to do this without breaking _something_.
 
 Unfortunately, while we don't actually write to the file, just opening
 it for 

Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-04 Thread Ben Nemec

On 2013-12-04 12:51, Sean Dague wrote:

On 12/04/2013 11:56 AM, Ben Nemec wrote:

On 2013-12-04 06:07, Sean Dague wrote:

On 12/03/2013 11:21 PM, Clint Byrum wrote:

Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:

On 12/03/2013 06:13 PM, Ben Nemec wrote:

On 2013-12-03 17:09, Sean Dague wrote:

On 12/03/2013 05:50 PM, Mark McLoughlin wrote:

On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:

On 2013-12-03 15:56, Sean Dague wrote:

This cinder patch - https://review.openstack.org/#/c/48935/

Is blocked on failing upgrade because the updated oslo
lockutils won't
function until there is a specific configuration variable 
added

to the
cinder.conf.

That work around is proposed here -
https://review.openstack.org/#/c/52070/3

However I think this is exactly the kind of forward breaks 
that we

want
to prevent with grenade, as cinder failing to function after a
rolling
upgrade because a config item wasn't added is exactly the kind
of pain
we are trying to prevent happening to ops.

So the question is, how is this done correctly so that a 
default

can be
set in the cinder code for this value, and it not require a 
config

change to work?


You're absolutely correct, in principle - if the default value 
for
lock_path worked for users before, we should absolutely continue 
to

support it.


I don't know that I have a good answer on how to handle this,
but for
context this change is the result of a nasty bug in lockutils 
that

meant
external locks were doing nothing if lock_path wasn't set.
Basically
it's something we should never have allowed in the first place.

As far as setting this in code, it's important that all of the
processes
for a service are using the same value to avoid the same bad
situation
we were in before.  For tests, we have a lockutils wrapper
(https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)

that sets an environment variable to address this, but that 
only
works if all of the processes are going to be spawned from 
within

the same wrapper, and I'm not sure how secure that is for
production
deployments since it puts all of the lock files in a temporary
directory.


Right, I don't think the previous default really worked - if
you used
the default, then external locking was broken.

I suspect most distros do set a default - I see RDO has this in 
its

default nova.conf:

  lock_path = /var/lib/nova/tmp

So, yes - this is all terrible.

IMHO, rather than raise an exception we should log a big fat 
warning
about relying on the default and perhaps just treat the lock as 
an
in-process lock in that case ... since that's essentially what 
it

was
before, right?


So a default of lock_path = /tmp will work (FHS says that path 
has

to be
there), even if not optimal. Could we make it a default value 
like

that
instead of the current default which is null (and hence the 
problem).


IIRC, my initial fix was something similar to that, but it got 
shot

down
because putting the lock files in a known world writeable location
was a
security issue.

Although maybe if we put them in a subdirectory of /tmp and 
ensured

that
the permissions were such that only the user running the service 
could

use that directory, it might be acceptable?  We could still log a
warning if we wanted.

This seems like it would have implications for people running 
services
on Windows too, but we can probably find a way to make that work 
if we

decide on a solution.


How is that a security issue? Are the lock files being written with
some
sensitive data in them and have g or o permissions on? The sticky 
bit

(+t) on /tmp will prevent other users from deleting the file.



Right, but it won't prevent users from creating a symlink with the 
same

name.

ln -s /var/lib/nova/instances/x/image.raw 
/tmp/well.known.location


Now when you do

with open('/tmp/well.known.location', 'w') as lockfile:
  lockfile.write('Stuff')

Nova has just truncated the image file and written 'Stuff' to it.


So that's the generic case (and the way people often write this). But
the oslo lockutils code doesn't work that way. While it does open the
file for write, it does not actually write, it's using fcntl to hold
locks. That's taking a data structure on the fd in kernel memory 
(IIRC),

so it correctly gives it up if the process crashes.

I'm not saying there isn't some other possible security vulnerability
here as well, but it's not jumping out at me. So I'd love to 
understand

that, because if we can close that exposure we can provide a working
default, plus a strong recommendation for how to do that *right*. I'd 
be
totally happy with printing WARNING level at startup if lock_path = 
/tmp

that this should be adjusted.


Full disclosure: I don't claim to be a security expert, so take my
thoughts on this with a grain of salt.

tldr: I still don't see a way to do this without breaking _something_.

Unfortunately, while we don't actually write to the file, just opening
it for 

Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-04 Thread Clint Byrum
Excerpts from Sean Dague's message of 2013-12-04 10:51:16 -0800:
 On 12/04/2013 11:56 AM, Ben Nemec wrote:
  On 2013-12-04 06:07, Sean Dague wrote:
  On 12/03/2013 11:21 PM, Clint Byrum wrote:
  Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
  On 12/03/2013 06:13 PM, Ben Nemec wrote:
  On 2013-12-03 17:09, Sean Dague wrote:
  On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
  On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
  On 2013-12-03 15:56, Sean Dague wrote:
  This cinder patch - https://review.openstack.org/#/c/48935/
 
  Is blocked on failing upgrade because the updated oslo
  lockutils won't
  function until there is a specific configuration variable added
  to the
  cinder.conf.
 
  That work around is proposed here -
  https://review.openstack.org/#/c/52070/3
 
  However I think this is exactly the kind of forward breaks that we
  want
  to prevent with grenade, as cinder failing to function after a
  rolling
  upgrade because a config item wasn't added is exactly the kind
  of pain
  we are trying to prevent happening to ops.
 
  So the question is, how is this done correctly so that a default
  can be
  set in the cinder code for this value, and it not require a config
  change to work?
 
  You're absolutely correct, in principle - if the default value for
  lock_path worked for users before, we should absolutely continue to
  support it.
 
  I don't know that I have a good answer on how to handle this,
  but for
  context this change is the result of a nasty bug in lockutils that
  meant
  external locks were doing nothing if lock_path wasn't set. 
  Basically
  it's something we should never have allowed in the first place.
 
  As far as setting this in code, it's important that all of the
  processes
  for a service are using the same value to avoid the same bad
  situation
  we were in before.  For tests, we have a lockutils wrapper
  (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
 
  that sets an environment variable to address this, but that only
  works if all of the processes are going to be spawned from within
  the same wrapper, and I'm not sure how secure that is for
  production
  deployments since it puts all of the lock files in a temporary
  directory.
 
  Right, I don't think the previous default really worked - if
  you used
  the default, then external locking was broken.
 
  I suspect most distros do set a default - I see RDO has this in its
  default nova.conf:
 
lock_path = /var/lib/nova/tmp
 
  So, yes - this is all terrible.
 
  IMHO, rather than raise an exception we should log a big fat warning
  about relying on the default and perhaps just treat the lock as an
  in-process lock in that case ... since that's essentially what it
  was
  before, right?
 
  So a default of lock_path = /tmp will work (FHS says that path has
  to be
  there), even if not optimal. Could we make it a default value like
  that
  instead of the current default which is null (and hence the problem).
 
  IIRC, my initial fix was something similar to that, but it got shot
  down
  because putting the lock files in a known world writeable location
  was a
  security issue.
 
  Although maybe if we put them in a subdirectory of /tmp and ensured
  that
  the permissions were such that only the user running the service could
  use that directory, it might be acceptable?  We could still log a
  warning if we wanted.
 
  This seems like it would have implications for people running services
  on Windows too, but we can probably find a way to make that work if we
  decide on a solution.
 
  How is that a security issue? Are the lock files being written with
  some
  sensitive data in them and have g or o permissions on? The sticky bit
  (+t) on /tmp will prevent other users from deleting the file.
 
 
  Right, but it won't prevent users from creating a symlink with the same
  name.
 
  ln -s /var/lib/nova/instances/x/image.raw /tmp/well.known.location
 
  Now when you do
 
  with open('/tmp/well.known.location', 'w') as lockfile:
lockfile.write('Stuff')
 
  Nova has just truncated the image file and written 'Stuff' to it.
 
  So that's the generic case (and the way people often write this). But
  the oslo lockutils code doesn't work that way. While it does open the
  file for write, it does not actually write, it's using fcntl to hold
  locks. That's taking a data structure on the fd in kernel memory (IIRC),
  so it correctly gives it up if the process crashes.
 
  I'm not saying there isn't some other possible security vulnerability
  here as well, but it's not jumping out at me. So I'd love to understand
  that, because if we can close that exposure we can provide a working
  default, plus a strong recommendation for how to do that *right*. I'd be
  totally happy with printing WARNING level at startup if lock_path = /tmp
  that this should be adjusted.
  
  Full disclosure: I don't claim to be a security expert, so take my
  

Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-03 Thread Davanum Srinivas
Should we update oslo-incubator first to?

default=os.environ.get(OSLO_LOCK_PATH) or tempfile.mkdtemp()

-- dims

On Tue, Dec 3, 2013 at 4:56 PM, Sean Dague s...@dague.net wrote:
 This cinder patch - https://review.openstack.org/#/c/48935/

 Is blocked on failing upgrade because the updated oslo lockutils won't
 function until there is a specific configuration variable added to the
 cinder.conf.

 That work around is proposed here -
 https://review.openstack.org/#/c/52070/3

 However I think this is exactly the kind of forward breaks that we want
 to prevent with grenade, as cinder failing to function after a rolling
 upgrade because a config item wasn't added is exactly the kind of pain
 we are trying to prevent happening to ops.

 So the question is, how is this done correctly so that a default can be
 set in the cinder code for this value, and it not require a config
 change to work?

 -Sean

 --
 Sean Dague
 http://dague.net


 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




-- 
Davanum Srinivas :: http://davanum.wordpress.com

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-03 Thread Ben Nemec

On 2013-12-03 15:56, Sean Dague wrote:

This cinder patch - https://review.openstack.org/#/c/48935/

Is blocked on failing upgrade because the updated oslo lockutils won't
function until there is a specific configuration variable added to the
cinder.conf.

That work around is proposed here -
https://review.openstack.org/#/c/52070/3

However I think this is exactly the kind of forward breaks that we want
to prevent with grenade, as cinder failing to function after a rolling
upgrade because a config item wasn't added is exactly the kind of pain
we are trying to prevent happening to ops.

So the question is, how is this done correctly so that a default can be
set in the cinder code for this value, and it not require a config
change to work?


I don't know that I have a good answer on how to handle this, but for 
context this change is the result of a nasty bug in lockutils that meant 
external locks were doing nothing if lock_path wasn't set.  Basically 
it's something we should never have allowed in the first place.


As far as setting this in code, it's important that all of the processes 
for a service are using the same value to avoid the same bad situation 
we were in before.  For tests, we have a lockutils wrapper 
(https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282) that sets an environment variable to address this, but that only works if all of the processes are going to be spawned from within the same wrapper, and I'm not sure how secure that is for production deployments since it puts all of the lock files in a temporary directory.


-Ben

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-03 Thread Ben Nemec
We can't do that.  It used to behave that way, which led to 
https://bugs.launchpad.net/cinder/+bug/1065531


-Ben

On 2013-12-03 16:22, Davanum Srinivas wrote:

Should we update oslo-incubator first to?

default=os.environ.get(OSLO_LOCK_PATH) or tempfile.mkdtemp()

-- dims

On Tue, Dec 3, 2013 at 4:56 PM, Sean Dague s...@dague.net wrote:

This cinder patch - https://review.openstack.org/#/c/48935/

Is blocked on failing upgrade because the updated oslo lockutils won't
function until there is a specific configuration variable added to the
cinder.conf.

That work around is proposed here -
https://review.openstack.org/#/c/52070/3

However I think this is exactly the kind of forward breaks that we 
want

to prevent with grenade, as cinder failing to function after a rolling
upgrade because a config item wasn't added is exactly the kind of pain
we are trying to prevent happening to ops.

So the question is, how is this done correctly so that a default can 
be

set in the cinder code for this value, and it not require a config
change to work?

-Sean

--
Sean Dague
http://dague.net


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-03 Thread Sean Dague
On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
 On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
 On 2013-12-03 15:56, Sean Dague wrote:
 This cinder patch - https://review.openstack.org/#/c/48935/

 Is blocked on failing upgrade because the updated oslo lockutils won't
 function until there is a specific configuration variable added to the
 cinder.conf.

 That work around is proposed here -
 https://review.openstack.org/#/c/52070/3

 However I think this is exactly the kind of forward breaks that we want
 to prevent with grenade, as cinder failing to function after a rolling
 upgrade because a config item wasn't added is exactly the kind of pain
 we are trying to prevent happening to ops.

 So the question is, how is this done correctly so that a default can be
 set in the cinder code for this value, and it not require a config
 change to work?
 
 You're absolutely correct, in principle - if the default value for
 lock_path worked for users before, we should absolutely continue to
 support it.
 
 I don't know that I have a good answer on how to handle this, but for 
 context this change is the result of a nasty bug in lockutils that meant 
 external locks were doing nothing if lock_path wasn't set.  Basically 
 it's something we should never have allowed in the first place.

 As far as setting this in code, it's important that all of the processes 
 for a service are using the same value to avoid the same bad situation 
 we were in before.  For tests, we have a lockutils wrapper 
 (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
  that sets an environment variable to address this, but that only works if 
 all of the processes are going to be spawned from within the same wrapper, 
 and I'm not sure how secure that is for production deployments since it puts 
 all of the lock files in a temporary directory.
 
 Right, I don't think the previous default really worked - if you used
 the default, then external locking was broken.
 
 I suspect most distros do set a default - I see RDO has this in its
 default nova.conf:
 
   lock_path = /var/lib/nova/tmp
 
 So, yes - this is all terrible.
 
 IMHO, rather than raise an exception we should log a big fat warning
 about relying on the default and perhaps just treat the lock as an
 in-process lock in that case ... since that's essentially what it was
 before, right?

So a default of lock_path = /tmp will work (FHS says that path has to be
there), even if not optimal. Could we make it a default value like that
instead of the current default which is null (and hence the problem).

-Sean

-- 
Sean Dague
http://dague.net

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-03 Thread Ben Nemec

On 2013-12-03 17:09, Sean Dague wrote:

On 12/03/2013 05:50 PM, Mark McLoughlin wrote:

On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:

On 2013-12-03 15:56, Sean Dague wrote:

This cinder patch - https://review.openstack.org/#/c/48935/

Is blocked on failing upgrade because the updated oslo lockutils 
won't
function until there is a specific configuration variable added to 
the

cinder.conf.

That work around is proposed here -
https://review.openstack.org/#/c/52070/3

However I think this is exactly the kind of forward breaks that we 
want
to prevent with grenade, as cinder failing to function after a 
rolling
upgrade because a config item wasn't added is exactly the kind of 
pain

we are trying to prevent happening to ops.

So the question is, how is this done correctly so that a default can 
be

set in the cinder code for this value, and it not require a config
change to work?


You're absolutely correct, in principle - if the default value for
lock_path worked for users before, we should absolutely continue to
support it.


I don't know that I have a good answer on how to handle this, but for
context this change is the result of a nasty bug in lockutils that 
meant

external locks were doing nothing if lock_path wasn't set.  Basically
it's something we should never have allowed in the first place.

As far as setting this in code, it's important that all of the 
processes
for a service are using the same value to avoid the same bad 
situation

we were in before.  For tests, we have a lockutils wrapper
(https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
 that sets an environment variable to address this, but that only works if all 
of the processes are going to be spawned from within the same wrapper, and I'm 
not sure how secure that is for production deployments since it puts all of the 
lock files in a temporary directory.


Right, I don't think the previous default really worked - if you 
used

the default, then external locking was broken.

I suspect most distros do set a default - I see RDO has this in its
default nova.conf:

  lock_path = /var/lib/nova/tmp

So, yes - this is all terrible.

IMHO, rather than raise an exception we should log a big fat warning
about relying on the default and perhaps just treat the lock as an
in-process lock in that case ... since that's essentially what it was
before, right?


So a default of lock_path = /tmp will work (FHS says that path has to 
be

there), even if not optimal. Could we make it a default value like that
instead of the current default which is null (and hence the problem).


IIRC, my initial fix was something similar to that, but it got shot down 
because putting the lock files in a known world writeable location was a 
security issue.


Although maybe if we put them in a subdirectory of /tmp and ensured that 
the permissions were such that only the user running the service could 
use that directory, it might be acceptable?  We could still log a 
warning if we wanted.


This seems like it would have implications for people running services 
on Windows too, but we can probably find a way to make that work if we 
decide on a solution.


-Ben

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-03 Thread Sean Dague
On 12/03/2013 06:13 PM, Ben Nemec wrote:
 On 2013-12-03 17:09, Sean Dague wrote:
 On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
 On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
 On 2013-12-03 15:56, Sean Dague wrote:
 This cinder patch - https://review.openstack.org/#/c/48935/

 Is blocked on failing upgrade because the updated oslo lockutils won't
 function until there is a specific configuration variable added to the
 cinder.conf.

 That work around is proposed here -
 https://review.openstack.org/#/c/52070/3

 However I think this is exactly the kind of forward breaks that we
 want
 to prevent with grenade, as cinder failing to function after a rolling
 upgrade because a config item wasn't added is exactly the kind of pain
 we are trying to prevent happening to ops.

 So the question is, how is this done correctly so that a default
 can be
 set in the cinder code for this value, and it not require a config
 change to work?

 You're absolutely correct, in principle - if the default value for
 lock_path worked for users before, we should absolutely continue to
 support it.

 I don't know that I have a good answer on how to handle this, but for
 context this change is the result of a nasty bug in lockutils that
 meant
 external locks were doing nothing if lock_path wasn't set.  Basically
 it's something we should never have allowed in the first place.

 As far as setting this in code, it's important that all of the
 processes
 for a service are using the same value to avoid the same bad situation
 we were in before.  For tests, we have a lockutils wrapper
 (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
 that sets an environment variable to address this, but that only
 works if all of the processes are going to be spawned from within
 the same wrapper, and I'm not sure how secure that is for production
 deployments since it puts all of the lock files in a temporary
 directory.

 Right, I don't think the previous default really worked - if you used
 the default, then external locking was broken.

 I suspect most distros do set a default - I see RDO has this in its
 default nova.conf:

   lock_path = /var/lib/nova/tmp

 So, yes - this is all terrible.

 IMHO, rather than raise an exception we should log a big fat warning
 about relying on the default and perhaps just treat the lock as an
 in-process lock in that case ... since that's essentially what it was
 before, right?

 So a default of lock_path = /tmp will work (FHS says that path has to be
 there), even if not optimal. Could we make it a default value like that
 instead of the current default which is null (and hence the problem).
 
 IIRC, my initial fix was something similar to that, but it got shot down
 because putting the lock files in a known world writeable location was a
 security issue.
 
 Although maybe if we put them in a subdirectory of /tmp and ensured that
 the permissions were such that only the user running the service could
 use that directory, it might be acceptable?  We could still log a
 warning if we wanted.
 
 This seems like it would have implications for people running services
 on Windows too, but we can probably find a way to make that work if we
 decide on a solution.

How is that a security issue? Are the lock files being written with some
sensitive data in them and have g or o permissions on? The sticky bit
(+t) on /tmp will prevent other users from deleting the file.

(from the chmod man page):

RESTRICTED DELETION FLAG OR STICKY BIT
   The restricted deletion flag or sticky bit is a single bit, whose
interpretation depends on the file type.  For directories, it prevents
unprivi‐
   leged  users  from  removing or renaming a file in the directory
unless they own the file or the directory; this is called the restricted
deletion
   flag for the directory, and is commonly found on world-writable
directories like /tmp.  For regular files on some older systems, the bit
saves the
   program's text image on the swap device so it will load more
quickly when run; this is called the sticky bit.


-Sean

-- 
Sean Dague
http://dague.net



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-03 Thread Clint Byrum
Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
 On 12/03/2013 06:13 PM, Ben Nemec wrote:
  On 2013-12-03 17:09, Sean Dague wrote:
  On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
  On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
  On 2013-12-03 15:56, Sean Dague wrote:
  This cinder patch - https://review.openstack.org/#/c/48935/
 
  Is blocked on failing upgrade because the updated oslo lockutils won't
  function until there is a specific configuration variable added to the
  cinder.conf.
 
  That work around is proposed here -
  https://review.openstack.org/#/c/52070/3
 
  However I think this is exactly the kind of forward breaks that we
  want
  to prevent with grenade, as cinder failing to function after a rolling
  upgrade because a config item wasn't added is exactly the kind of pain
  we are trying to prevent happening to ops.
 
  So the question is, how is this done correctly so that a default
  can be
  set in the cinder code for this value, and it not require a config
  change to work?
 
  You're absolutely correct, in principle - if the default value for
  lock_path worked for users before, we should absolutely continue to
  support it.
 
  I don't know that I have a good answer on how to handle this, but for
  context this change is the result of a nasty bug in lockutils that
  meant
  external locks were doing nothing if lock_path wasn't set.  Basically
  it's something we should never have allowed in the first place.
 
  As far as setting this in code, it's important that all of the
  processes
  for a service are using the same value to avoid the same bad situation
  we were in before.  For tests, we have a lockutils wrapper
  (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
  that sets an environment variable to address this, but that only
  works if all of the processes are going to be spawned from within
  the same wrapper, and I'm not sure how secure that is for production
  deployments since it puts all of the lock files in a temporary
  directory.
 
  Right, I don't think the previous default really worked - if you used
  the default, then external locking was broken.
 
  I suspect most distros do set a default - I see RDO has this in its
  default nova.conf:
 
lock_path = /var/lib/nova/tmp
 
  So, yes - this is all terrible.
 
  IMHO, rather than raise an exception we should log a big fat warning
  about relying on the default and perhaps just treat the lock as an
  in-process lock in that case ... since that's essentially what it was
  before, right?
 
  So a default of lock_path = /tmp will work (FHS says that path has to be
  there), even if not optimal. Could we make it a default value like that
  instead of the current default which is null (and hence the problem).
  
  IIRC, my initial fix was something similar to that, but it got shot down
  because putting the lock files in a known world writeable location was a
  security issue.
  
  Although maybe if we put them in a subdirectory of /tmp and ensured that
  the permissions were such that only the user running the service could
  use that directory, it might be acceptable?  We could still log a
  warning if we wanted.
  
  This seems like it would have implications for people running services
  on Windows too, but we can probably find a way to make that work if we
  decide on a solution.
 
 How is that a security issue? Are the lock files being written with some
 sensitive data in them and have g or o permissions on? The sticky bit
 (+t) on /tmp will prevent other users from deleting the file.
 

Right, but it won't prevent users from creating a symlink with the same
name.

ln -s /var/lib/nova/instances/x/image.raw /tmp/well.known.location

Now when you do

with open('/tmp/well.known.location', 'w') as lockfile:
  lockfile.write('Stuff')

Nova has just truncated the image file and written 'Stuff' to it.

The typical solution is to use a lock directory, /var/run/yourprogram,
that has restrictive enough permissions setup for your program to have
exclusive use of it, and is created by root at boot time. That is what
the packages do now.

It would be good if everybody agreed on a default, %(nova_home)/locks
or something, but root still must set it up with the right permissions
before the program can use it. IMO that is fine, your upgrade should
include a change to your init script/systemd unit/upstart job which
ensures that it exists before starting.

We could abstract this away with a nova-manage subcommand that is intended
to be run as root but can inspect the config file. That would allow for
simpler documentation and the packagers would probably love you for it.
:)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev