Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-07-25 Thread Thierry Carrez
Sean Dague wrote:
> [...]
> After we brought that up in the room, we started going through other
> options. Someone brought up "what about making rootwrap always do this
> for privsep, instead of manually doing this for every project", and I
> volunteered to look at the code to figure out how hard it would be. That
> patch is up at https://review.openstack.org/344450.

I replied (removing my -1) on the review. Just a few answers to the
specific questions:

> I think the path forward here is about the following questions:
> 
> 1) how important are seamless upgrades in our vision?

Very

> 2) are root wrap rules supposed to be config (which is manually audited
> by installers)?

They are code, but were config files in the original design, and that
default persisted over time in some (most?) distros.

> 3) is the software supposed to take into account and adapt to the rules
> not being there (or disabled by an auditor)?

Depends on what you mean by software...

> 4) does always letting rootwrap call privsep regress our near term
> security in any real way (given the flaws in existing rules)?

Only for hypothetical non-OpenStack users, and only slightly.

> 5) what will most quickly allow us to transition into a non rootwrap
> world, with a privsep architecture that will give us a better security
> model?

Probably your patch, since it makes rootwrap a deprecated transitional
library enabling privsep. That is fine as long as nobody else uses
rootwrap (or all those hypothetical users migrate to privsep).

In summary: I can live with the patch as proposed, as long as Angus is
fine with it.

-- 
Thierry Carrez (ttx)

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-07-25 Thread Sean Dague
On 07/22/2016 09:20 AM, Angus Lees wrote:
> On Thu, 21 Jul 2016 at 09:27 Sean Dague wrote:
> 
> On 07/12/2016 06:25 AM, Matt Riedemann wrote:
> 
> > We probably aren't doing anything while Sean Dague is on vacation. He's
> > back next week and we have the nova/cinder meetups, so I'm planning on
> > talking about the grenade issue in person and hopefully we'll have a
> > plan by the end of next week to move forward.
> 
> After some discussions at the Nova midcycle we threw together an
> approach where we just always allow privsep-helper from oslo.rootwrap.
> 
> https://review.openstack.org/344450
> 
> 
> Were these discussions captured anywhere?  I thought we'd discussed
> alternatives on os-dev, reached a conclusion, implemented the
> changes(*), and verified the results all a month ago - and that we were
> just awaiting nova approval.  So I'm surprised to see this sudden change
> in direction...
> 
> (*) Changes:
> https://review.openstack.org/#/c/329769/
> https://review.openstack.org/#/c/332610/
> mriedem's verification: https://review.openstack.org/#/c/331885/

What we agreed was that https://review.openstack.org/#/c/332610/ was
the option of last resort if no better option could be figured out. But
then we ran into having to do this again for os-vif. And given how
privsep is rolling out, it looks like we'll have this same exception /
manual update in another place in base IaaS for multiple cycles.

That is exactly the opposite of our upgrade vision, which is that
upgrades should be seamless, with code just rolling forward.

If we only had to do this once, maybe we mea culpa and do it. But we
know we have to do this at least twice, and that couples and coordinates
the nova and neutron releases. This gets exponentially worse.

After we brought that up in the room, we started going through other
options. Someone brought up "what about making rootwrap always do this
for privsep, instead of manually doing this for every project", and I
volunteered to look at the code to figure out how hard it would be. That
patch is up at https://review.openstack.org/344450.

I think the path forward here is about the following questions:

1) how important are seamless upgrades in our vision?
2) are root wrap rules supposed to be config (which is manually audited
by installers)?
3) is the software supposed to take into account and adapt to the rules
not being there (or disabled by an auditor)?
4) does always letting rootwrap call privsep regress our near term
security in any real way (given the flaws in existing rules)?
5) what will most quickly allow us to transition into a non rootwrap
world, with a privsep architecture that will give us a better security
model?

Making oslo.rootwrap trust privsep seems like the least worst option in
front of us, especially to actually get os-vif out there and deployed
this cycle as well.

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-07-22 Thread Matt Riedemann

On 7/22/2016 8:20 AM, Angus Lees wrote:

On Thu, 21 Jul 2016 at 09:27 Sean Dague wrote:

On 07/12/2016 06:25 AM, Matt Riedemann wrote:

> We probably aren't doing anything while Sean Dague is on vacation. He's
> back next week and we have the nova/cinder meetups, so I'm planning on
> talking about the grenade issue in person and hopefully we'll have a
> plan by the end of next week to move forward.

After some discussions at the Nova midcycle we threw together an
approach where we just always allow privsep-helper from oslo.rootwrap.

https://review.openstack.org/344450


Were these discussions captured anywhere?  I thought we'd discussed
alternatives on os-dev, reached a conclusion, implemented the
changes(*), and verified the results all a month ago - and that we were
just awaiting nova approval.  So I'm surprised to see this sudden change
in direction...

(*) Changes:
https://review.openstack.org/#/c/329769/
https://review.openstack.org/#/c/332610/
mriedem's verification: https://review.openstack.org/#/c/331885/

 - Gus

We did a sniff test of this, and it worked to roll over the upgrade
boundary, without an etc change, and work with osbrick 1.4.0 (currently
blacklisted because of the upgrade issue). While I realize it wasn't the
favorite approach by many it works. It's 3 lines of functional change.
If we land this, release, and bump the minimum, we've got the upgrade
issue solved in this cycle.

Please take a look and see if we can agree to this path forward.

-Sean

--
Sean Dague
http://dague.net




We talked about it at the nova midcycle; the etherpad is here, but the
notes on privsep/grenade are pretty sparse:

https://etherpad.openstack.org/p/nova-newton-midcycle

Long-term we want this in code, which is what privsep is for, but today
it's config because it's deployed into /etc, so we treat it as config
with the same rules for upgrades that are applied in grenade for actual
config options, i.e. new code should be able to run on old config.


I mentioned that we still break this for other new filters which we 
don't test, but the feeling was we shouldn't change how we do this for 
things we do test since operators rely on it and upgrade is their top 
pain point.


We also decided that simply hard-coding the privsep-helper in 
oslo.rootwrap itself was better than needing to script/hack the same 
thing for every project that is going to adopt privsep - and we can 
isolate it in the rootwrap library so there are no exceptional upgrade 
scripts for newton (for nova, or anyone).
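
To illustrate the idea of a built-in allowance, here is a minimal sketch.
It is not the change in review 344450 and not oslo.rootwrap's real code;
the names CONFIGURED_COMMANDS, ALWAYS_ALLOWED and is_authorized are
hypothetical, and real rootwrap filters also check arguments, which this
toy policy does not.

```python
import os

# Hypothetical stand-in for filters an operator has deployed under /etc.
# In this toy model a "filter" is just an allowed executable name.
CONFIGURED_COMMANDS = {"mount", "umount"}

# Assumption: the helper name is taken from the log line quoted in this thread.
ALWAYS_ALLOWED = {"privsep-helper"}


def is_authorized(argv, configured=CONFIGURED_COMMANDS):
    """Return True if argv may run as root under this toy policy."""
    if not argv:
        return False
    executable = os.path.basename(argv[0])
    # Built-in allowance: does not depend on what is deployed in /etc,
    # so new code keeps working against old filter files.
    if executable in ALWAYS_ALLOWED:
        return True
    # Everything else still has to be listed by the operator.
    return executable in configured
```

With this shape, a mitaka-era filter set that never mentions
privsep-helper still lets the helper start, while any other unlisted
command is rejected as before.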


So this is not great, but it's the least bad option to get us over this
issue for newton, unblock os-brick and os-vif, and allow new projects to
start adopting privsep without hitting the same upgrade issues.


mikal suggested that gus and sdague talk over a hangout or some higher 
bandwidth medium if we still need to hash things out here.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-07-22 Thread Angus Lees
On Thu, 21 Jul 2016 at 09:27 Sean Dague wrote:

> On 07/12/2016 06:25 AM, Matt Riedemann wrote:
> 
> > We probably aren't doing anything while Sean Dague is on vacation. He's
> > back next week and we have the nova/cinder meetups, so I'm planning on
> > talking about the grenade issue in person and hopefully we'll have a
> > plan by the end of next week to move forward.
>
> After some discussions at the Nova midcycle we threw together an
> approach where we just always allow privsep-helper from oslo.rootwrap.
>
> https://review.openstack.org/344450


Were these discussions captured anywhere?  I thought we'd discussed
alternatives on os-dev, reached a conclusion, implemented the changes(*),
and verified the results all a month ago - and that we were just awaiting
nova approval.  So I'm surprised to see this sudden change in direction...

(*) Changes:
https://review.openstack.org/#/c/329769/
https://review.openstack.org/#/c/332610/
mriedem's verification: https://review.openstack.org/#/c/331885/

 - Gus

> We did a sniff test of this, and it worked to roll over the upgrade
> boundary, without an etc change, and work with osbrick 1.4.0 (currently
> blacklisted because of the upgrade issue). While I realize it wasn't the
> favorite approach by many it works. It's 3 lines of functional change.
> If we land this, release, and bump the minimum, we've got the upgrade
> issue solved in this cycle.
>
> Please take a look and see if we can agree to this path forward.
>
> -Sean
>
> --
> Sean Dague
> http://dague.net
>


Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-07-22 Thread Li, Xiaoyan
Hi,

What was the outcome of the discussion on the privsep issue?
When can we release the next os-brick?

Best wishes
Lisa

From: Ivan Kolodyazhny [mailto:e...@e0ne.info]
Sent: Wednesday, July 13, 2016 9:55 PM
To: OpenStack Development Mailing List (not for usage questions) 
<openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an 
upgrade strategy?

Thanks for the update, Matt.

I will join our meeting next week.

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Tue, Jul 12, 2016 at 4:25 PM, Matt Riedemann
<mrie...@linux.vnet.ibm.com> wrote:
On 7/12/2016 6:29 AM, Ivan Kolodyazhny wrote:
Hi team,

Do we have any decision on this issue? I've found few patches but both
of them are -1'ed.

From Cinder perspective, it blocks us to release new os-brick with
features, which are needed for other projects like Cinder and
python-brick-cinderclient-ext.

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Wed, Jun 22, 2016 at 5:47 PM, Matt Riedemann
<mrie...@linux.vnet.ibm.com> wrote:

On 6/21/2016 10:12 PM, Angus Lees wrote:

On Wed, 22 Jun 2016 at 05:59 Matt Riedemann
<mrie...@linux.vnet.ibm.com> wrote:

Angus, what should we be looking at from the privsep side for debugging
this?


The line above the screen-n-cpu.txt.gz failure you linked to is:
2016-06-21 16:21:30.994
<http://logs.openstack.org/85/331885/2/check/gate-grenade-dsvm-multinode/415e1bc/logs/new/screen-n-cpu.txt.gz?level=TRACE#_2016-06-21_16_21_30_994>
1840 WARNING oslo.privsep.daemon [-] privsep log:
/usr/local/bin/nova-rootwrap: Unauthorized command: privsep-helper
--config-file /etc/nova/nova.conf --privsep_context
os_brick.privileged.default --privsep_sock_path
/tmp/tmpV5w2VC/privsep.sock (no filter matched)

 .. so nova-rootwrap is rejecting the privsep-helper command line
because no filter matched.  This indicates the nova compute.filters file
has not been updated, or is incorrect.


As was later pointed out by mtreinish, grenade is attempting to run the
newton code against mitaka configs, and this includes using mitaka
rootwrap filters.   Unfortunately, the change to add privsep to nova's
rootwrap filters wasn't approved until the newton cycle (so that all the
os-brick privsep-related changes could be approved together), and so
this doesn't Just Work.

Digging in further, it appears that there *is* a mechanism in grenade to
upgrade rootwrap filters between major releases, but this needs to be
explicitly updated for each project+release and hasn't been for
nova+mitaka->newton.  I'm not sure how this is *meant* to work, since
the grenade "theory of upgrade" doesn't mention when configs should be
updated - the only mechanism provided is an "exception ... used sparingly."


As noted in the review, my understanding of the config changes is
deprecation of options across release boundaries so that you can't
drop a config option that would break someone from release to
release without it being deprecated first. So deprecate option foo
in mitaka, people upgrading from liberty to mitaka aren't broken,
but they get warnings in mitaka so that when you drop the option in
newton it's not a surprise and consumers should have adjusted during
mitaka.

For rootwrap filters I agree this is more complicated.


Anyway, I added an upgrade step for nova mitaka->newton that updates
rootwrap filters appropriately(*).  Again, I'm not sure what this
communicates to deployers compared to cinder (which *did* have the
updated rootwrap filter merged in mitaka, but of course that update
still needs to be installed at some point).
(*) https://review.openstack.org/#/c/332610

 - Gus




Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-07-20 Thread Sean Dague

On 07/12/2016 06:25 AM, Matt Riedemann wrote:

> We probably aren't doing anything while Sean Dague is on vacation. He's
> back next week and we have the nova/cinder meetups, so I'm planning on
> talking about the grenade issue in person and hopefully we'll have a
> plan by the end of next week to move forward.

After some discussions at the Nova midcycle we threw together an
approach where we just always allow privsep-helper from oslo.rootwrap.

https://review.openstack.org/344450

We did a sniff test of this, and it worked to roll over the upgrade
boundary, without an /etc change, and it works with os-brick 1.4.0
(currently blacklisted because of the upgrade issue). While I realize it
wasn't the favorite approach of many, it works. It's 3 lines of
functional change. If we land this, release, and bump the minimum, we've
got the upgrade issue solved in this cycle.

Please take a look and see if we can agree to this path forward.

-Sean

--
Sean Dague
http://dague.net



Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-07-13 Thread Ivan Kolodyazhny
Thanks for the update, Matt.

I will join our meeting next week.

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Tue, Jul 12, 2016 at 4:25 PM, Matt Riedemann 
wrote:

> On 7/12/2016 6:29 AM, Ivan Kolodyazhny wrote:
>
>> Hi team,
>>
>> Do we have any decision on this issue? I've found few patches but both
>> of them are -1'ed.
>>
>> From Cinder perspective, it blocks us to release new os-brick with
>> features, which are needed for other projects like Cinder and
>> python-brick-cinderclient-ext.
>>
>> Regards,
>> Ivan Kolodyazhny,
>> http://blog.e0ne.info/
>>
>> On Wed, Jun 22, 2016 at 5:47 PM, Matt Riedemann wrote:
>>
>> On 6/21/2016 10:12 PM, Angus Lees wrote:
>>
>> On Wed, 22 Jun 2016 at 05:59 Matt Riedemann wrote:
>>
>> Angus, what should we be looking at from the privsep side for debugging
>> this?
>>
>>
>> The line above the screen-n-cpu.txt.gz failure you linked to is:
>> 2016-06-21 16:21:30.994
>> <http://logs.openstack.org/85/331885/2/check/gate-grenade-dsvm-multinode/415e1bc/logs/new/screen-n-cpu.txt.gz?level=TRACE#_2016-06-21_16_21_30_994>
>> 1840 WARNING oslo.privsep.daemon [-] privsep log:
>> /usr/local/bin/nova-rootwrap: Unauthorized command: privsep-helper
>> --config-file /etc/nova/nova.conf --privsep_context
>> os_brick.privileged.default --privsep_sock_path
>> /tmp/tmpV5w2VC/privsep.sock (no filter matched)
>>
>>  .. so nova-rootwrap is rejecting the privsep-helper command line
>> because no filter matched.  This indicates the nova compute.filters file
>> has not been updated, or is incorrect.
>>
>>
>> As was later pointed out by mtreinish, grenade is attempting to run the
>> newton code against mitaka configs, and this includes using mitaka
>> rootwrap filters.   Unfortunately, the change to add privsep to nova's
>> rootwrap filters wasn't approved until the newton cycle (so that all the
>> os-brick privsep-related changes could be approved together), and so
>> this doesn't Just Work.
>>
>> Digging in further, it appears that there *is* a mechanism in grenade to
>> upgrade rootwrap filters between major releases, but this needs to be
>> explicitly updated for each project+release and hasn't been for
>> nova+mitaka->newton.  I'm not sure how this is *meant* to work, since
>> the grenade "theory of upgrade" doesn't mention when configs should be
>> updated - the only mechanism provided is an "exception ... used
>> sparingly."
>>
>>
>> As noted in the review, my understanding of the config changes is
>> deprecation of options across release boundaries so that you can't
>> drop a config option that would break someone from release to
>> release without it being deprecated first. So deprecate option foo
>> in mitaka, people upgrading from liberty to mitaka aren't broken,
>> but they get warnings in mitaka so that when you drop the option in
>> newton it's not a surprise and consumers should have adjusted during
>> mitaka.
>>
>> For rootwrap filters I agree this is more complicated.
>>
>>
>> Anyway, I added an upgrade step for nova mitaka->newton that updates
>> rootwrap filters appropriately(*).  Again, I'm not sure what this
>> communicates to deployers compared to cinder (which *did* have the
>> updated rootwrap filter merged in mitaka, but of course that update
>> still needs to be installed at some point).
>> (*) https://review.openstack.org/#/c/332610
>>
>>  - Gus
>>
>>
>>
>>
>>
>> Alternatively Walter had a potential workaround to fallback to
>> rootwrap for os-brick:
>>
>> https://review.openstack.org/#/c/329586/
>>
>> So we could maybe use that for newton. But os-vif doesn't have
>> anything like that, so we'd have to see what kind of (immediately
>> deprecated) workaround could happen for os-vif in newton and then
>> drop that in ocata.
>>
>> I'm told danpb is out until tomorrow though so we'll probably need
>> to wait to talk to him about options 

Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-07-12 Thread Matt Riedemann

On 7/12/2016 6:29 AM, Ivan Kolodyazhny wrote:

Hi team,

Do we have any decision on this issue? I've found few patches but both
of them are -1'ed.

From Cinder perspective, it blocks us to release new os-brick with
features, which are needed for other projects like Cinder and
python-brick-cinderclient-ext.

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Wed, Jun 22, 2016 at 5:47 PM, Matt Riedemann wrote:

On 6/21/2016 10:12 PM, Angus Lees wrote:

On Wed, 22 Jun 2016 at 05:59 Matt Riedemann wrote:

Angus, what should we be looking at from the privsep side for debugging
this?


The line above the screen-n-cpu.txt.gz failure you linked to is:
2016-06-21 16:21:30.994 1840 WARNING oslo.privsep.daemon [-] privsep log:
/usr/local/bin/nova-rootwrap: Unauthorized command: privsep-helper
--config-file /etc/nova/nova.conf --privsep_context
os_brick.privileged.default --privsep_sock_path
/tmp/tmpV5w2VC/privsep.sock (no filter matched)

 .. so nova-rootwrap is rejecting the privsep-helper command line
because no filter matched.  This indicates the nova compute.filters file
has not been updated, or is incorrect.


As was later pointed out by mtreinish, grenade is attempting to run the
newton code against mitaka configs, and this includes using mitaka
rootwrap filters.   Unfortunately, the change to add privsep to nova's
rootwrap filters wasn't approved until the newton cycle (so that all the
os-brick privsep-related changes could be approved together), and so
this doesn't Just Work.

Digging in further, it appears that there *is* a mechanism in grenade to
upgrade rootwrap filters between major releases, but this needs to be
explicitly updated for each project+release and hasn't been for
nova+mitaka->newton.  I'm not sure how this is *meant* to work, since
the grenade "theory of upgrade" doesn't mention when configs should be
updated - the only mechanism provided is an "exception ... used sparingly."


As noted in the review, my understanding of the config changes is
deprecation of options across release boundaries so that you can't
drop a config option that would break someone from release to
release without it being deprecated first. So deprecate option foo
in mitaka, people upgrading from liberty to mitaka aren't broken,
but they get warnings in mitaka so that when you drop the option in
newton it's not a surprise and consumers should have adjusted during
mitaka.

For rootwrap filters I agree this is more complicated.


Anyway, I added an upgrade step for nova mitaka->newton that updates
rootwrap filters appropriately(*).  Again, I'm not sure what this
communicates to deployers compared to cinder (which *did* have the
updated rootwrap filter merged in mitaka, but of course that update
still needs to be installed at some point).
(*) https://review.openstack.org/#/c/332610

 - Gus





Alternatively Walter had a potential workaround to fallback to
rootwrap for os-brick:

https://review.openstack.org/#/c/329586/

So we could maybe use that for newton. But os-vif doesn't have
anything like that, so we'd have to see what kind of (immediately
deprecated) workaround could happen for os-vif in newton and then
drop that in ocata.

I'm told danpb is out until tomorrow though so we'll probably need
to wait to talk to him about options there.


--

Thanks,

Matt Riedemann



Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-07-12 Thread Ivan Kolodyazhny
Hi team,

Do we have any decision on this issue? I've found a couple of patches,
but both of them are -1'ed.

From the Cinder perspective, it blocks us from releasing a new os-brick
with features that are needed for other projects like Cinder and
python-brick-cinderclient-ext.

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Wed, Jun 22, 2016 at 5:47 PM, Matt Riedemann 
wrote:

> On 6/21/2016 10:12 PM, Angus Lees wrote:
>
>> On Wed, 22 Jun 2016 at 05:59 Matt Riedemann wrote:
>>
>> Angus, what should we be looking at from the privsep side for
>> debugging this?
>>
>>
>> The line above the screen-n-cpu.txt.gz failure you linked to is:
>> 2016-06-21 16:21:30.994
>> <http://logs.openstack.org/85/331885/2/check/gate-grenade-dsvm-multinode/415e1bc/logs/new/screen-n-cpu.txt.gz?level=TRACE#_2016-06-21_16_21_30_994>
>> 1840 WARNING oslo.privsep.daemon [-] privsep log:
>> /usr/local/bin/nova-rootwrap: Unauthorized command: privsep-helper
>> --config-file /etc/nova/nova.conf --privsep_context
>> os_brick.privileged.default --privsep_sock_path
>> /tmp/tmpV5w2VC/privsep.sock (no filter matched)
>>
>>  .. so nova-rootwrap is rejecting the privsep-helper command line
>> because no filter matched.  This indicates the nova compute.filters file
>> has not been updated, or is incorrect.
>>
>>
>> As was later pointed out by mtreinish, grenade is attempting to run the
>> newton code against mitaka configs, and this includes using mitaka
>> rootwrap filters.   Unfortunately, the change to add privsep to nova's
>> rootwrap filters wasn't approved until the newton cycle (so that all the
>> os-brick privsep-related changes could be approved together), and so
>> this doesn't Just Work.
>>
>> Digging in further, it appears that there *is* a mechanism in grenade to
>> upgrade rootwrap filters between major releases, but this needs to be
>> explicitly updated for each project+release and hasn't been for
>> nova+mitaka->newton.  I'm not sure how this is *meant* to work, since
>> the grenade "theory of upgrade" doesn't mention when configs should be
>> updated - the only mechanism provided is an "exception ... used
>> sparingly."
>>
>
> As noted in the review, my understanding of the config changes is
> deprecation of options across release boundaries so that you can't drop a
> config option that would break someone from release to release without it
> being deprecated first. So deprecate option foo in mitaka, people upgrading
> from liberty to mitaka aren't broken, but they get warnings in mitaka so
> that when you drop the option in newton it's not a surprise and consumers
> should have adjusted during mitaka.
>
> For rootwrap filters I agree this is more complicated.
>
>
>> Anyway, I added an upgrade step for nova mitaka->newton that updates
>> rootwrap filters appropriately(*).  Again, I'm not sure what this
>> communicates to deployers compared to cinder (which *did* have the
>> updated rootwrap filter merged in mitaka, but of course that update
>> still needs to be installed at some point).
>> (*) https://review.openstack.org/#/c/332610
>>
>>  - Gus
>>
>>
>>
>>
> Alternatively Walter had a potential workaround to fallback to rootwrap
> for os-brick:
>
> https://review.openstack.org/#/c/329586/
>
> So we could maybe use that for newton. But os-vif doesn't have anything
> like that, so we'd have to see what kind of (immediately deprecated)
> workaround could happen for os-vif in newton and then drop that in ocata.
>
> I'm told danpb is out until tomorrow though so we'll probably need to wait
> to talk to him about options there.
>
>
> --
>
> Thanks,
>
> Matt Riedemann
>
>


Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-22 Thread Matt Riedemann

On 6/21/2016 10:12 PM, Angus Lees wrote:

On Wed, 22 Jun 2016 at 05:59 Matt Riedemann wrote:

Angus, what should we be looking at from the privsep side for debugging
this?


The line above the screen-n-cpu.txt.gz failure you linked to is:
2016-06-21 16:21:30.994 1840 WARNING oslo.privsep.daemon [-] privsep log:
/usr/local/bin/nova-rootwrap: Unauthorized command: privsep-helper
--config-file /etc/nova/nova.conf --privsep_context
os_brick.privileged.default --privsep_sock_path
/tmp/tmpV5w2VC/privsep.sock (no filter matched)

 .. so nova-rootwrap is rejecting the privsep-helper command line
because no filter matched.  This indicates the nova compute.filters file
has not been updated, or is incorrect.


As was later pointed out by mtreinish, grenade is attempting to run the
newton code against mitaka configs, and this includes using mitaka
rootwrap filters.   Unfortunately, the change to add privsep to nova's
rootwrap filters wasn't approved until the newton cycle (so that all the
os-brick privsep-related changes could be approved together), and so
this doesn't Just Work.

Digging in further, it appears that there *is* a mechanism in grenade to
upgrade rootwrap filters between major releases, but this needs to be
explicitly updated for each project+release and hasn't been for
nova+mitaka->newton.  I'm not sure how this is *meant* to work, since
the grenade "theory of upgrade" doesn't mention when configs should be
updated - the only mechanism provided is an "exception ... used sparingly."


As noted in the review, my understanding of the config upgrade rules is
that options are deprecated across release boundaries, so that you can't
drop a config option that would break someone going from release to
release without it being deprecated first. So deprecate option foo in
mitaka: people upgrading from liberty to mitaka aren't broken, but they
get warnings in mitaka, so that when you drop the option in newton it's
not a surprise and consumers should have adjusted during mitaka.
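
As a rough illustration of that deprecation cycle, here is a minimal
sketch using only the standard library. It is not how oslo.config
implements deprecation; the MITAKA and NEWTON tables, the option names
and load_config are all hypothetical, and ignoring unknown options is an
assumption of this toy model.

```python
import warnings

# Hypothetical option tables for two consecutive releases.
# "foo" is still accepted in mitaka but marked deprecated; newton drops it.
MITAKA = {"known": {"foo", "bar"}, "deprecated": {"foo"}}
NEWTON = {"known": {"bar"}, "deprecated": set()}


def load_config(release, user_config):
    """Return the options this release understands, warning on deprecated ones."""
    accepted = {}
    for name, value in user_config.items():
        if name not in release["known"]:
            # Assumption: an option that was already removed is ignored,
            # not treated as a fatal error.
            continue
        if name in release["deprecated"]:
            warnings.warn(
                f"option {name!r} is deprecated and will be removed",
                DeprecationWarning,
            )
        accepted[name] = value
    return accepted
```

An operator carrying the same config file from liberty into mitaka keeps
a working "foo" and sees the warning for a full cycle before newton
stops reading it.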


For rootwrap filters I agree this is more complicated.



Anyway, I added an upgrade step for nova mitaka->newton that updates
rootwrap filters appropriately(*).  Again, I'm not sure what this
communicates to deployers compared to cinder (which *did* have the
updated rootwrap filter merged in mitaka, but of course that update
still needs to be installed at some point).
(*) https://review.openstack.org/#/c/332610

 - Gus





Alternatively Walter had a potential workaround to fallback to rootwrap 
for os-brick:


https://review.openstack.org/#/c/329586/

So we could maybe use that for newton. But os-vif doesn't have anything 
like that, so we'd have to see what kind of (immediately deprecated) 
workaround could happen for os-vif in newton and then drop that in ocata.


I'm told danpb is out until tomorrow though so we'll probably need to 
wait to talk to him about options there.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-21 Thread Angus Lees
On Wed, 22 Jun 2016 at 05:59 Matt Riedemann wrote:

> Angus, what should we be looking at from the privsep side for debugging
> this?
>

The line above the screen-n-cpu.txt.gz failure you linked to is:
2016-06-21 16:21:30.994 1840 WARNING oslo.privsep.daemon [-] privsep log:
/usr/local/bin/nova-rootwrap: Unauthorized command: privsep-helper
--config-file /etc/nova/nova.conf --privsep_context
os_brick.privileged.default --privsep_sock_path /tmp/tmpV5w2VC/privsep.sock
(no filter matched)

 .. so nova-rootwrap is rejecting the privsep-helper command line because
no filter matched.  This indicates the nova compute.filters file has not
been updated, or is incorrect.
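
To make the "no filter matched" failure concrete, here is a minimal sketch of
positional regexp matching in the style of a rootwrap filter. It is not
rootwrap's actual code: filter_matches, authorize and both filter tables are
hypothetical names, and the privsep-rootwrap patterns only illustrate what an
updated compute.filters entry might allow.

```python
import re


def filter_matches(patterns, argv):
    """True if argv has the same length as patterns and each arg fully matches."""
    if len(patterns) != len(argv):
        return False
    return all(re.fullmatch(p, a) for p, a in zip(patterns, argv))


def authorize(filters, argv):
    """Return the name of the first matching filter, or None (no filter matched)."""
    for name, patterns in filters.items():
        if filter_matches(patterns, argv):
            return name
    return None


# The command line that nova-rootwrap rejected in the log above.
argv = [
    "privsep-helper",
    "--config-file", "/etc/nova/nova.conf",
    "--privsep_context", "os_brick.privileged.default",
    "--privsep_sock_path", "/tmp/tmpV5w2VC/privsep.sock",
]

# Hypothetical stand-in for the old (mitaka) filters: no privsep entry at all.
mitaka_filters = {"kpartx": ["kpartx", ".*"]}

# Hypothetical stand-in for updated (newton) filters with a privsep entry.
newton_filters = dict(mitaka_filters)
newton_filters["privsep-rootwrap"] = [
    "privsep-helper",
    "--config-file", r"/etc/(?!\.\.).*",
    "--privsep_context", r"os_brick\.privileged\.default",
    "--privsep_sock_path", r"/tmp/.*",
]

print(authorize(mitaka_filters, argv))   # None
print(authorize(newton_filters, argv))   # privsep-rootwrap
```

With the old table the new code's command line has nothing to match against,
which is the situation grenade creates by running newton code on mitaka
filters.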


As was later pointed out by mtreinish, grenade is attempting to run the
newton code against mitaka configs, and this includes using mitaka rootwrap
filters.   Unfortunately, the change to add privsep to nova's rootwrap
filters wasn't approved until the newton cycle (so that all the os-brick
privsep-related changes could be approved together), and so this doesn't
Just Work.

Digging in further, it appears that there *is* a mechanism in grenade to
upgrade rootwrap filters between major releases, but this needs to be
explicitly updated for each project+release and hasn't been for
nova+mitaka->newton.  I'm not sure how this is *meant* to work, since the
grenade "theory of upgrade" doesn't mention when configs should be updated
- the only mechanism provided is an "exception ... used sparingly."

Anyway, I added an upgrade step for nova mitaka->newton that updates
rootwrap filters appropriately(*).  Again, I'm not sure what this
communicates to deployers compared to cinder (which *did* have the updated
rootwrap filter merged in mitaka, but of course that update still needs to
be installed at some point).
(*) https://review.openstack.org/#/c/332610

 - Gus


Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-21 Thread Matt Riedemann

On 6/15/2016 1:11 AM, Angus Lees wrote:

oslo.privsep change: https://review.openstack.org/#/c/329766/
And the nova change that uses it: https://review.openstack.org/#/c/329769

In particular I'm unsure if os-brick/os-vif is even loaded at this point
in nova-compute main().  Does anyone know when that actually happens or
shall I go exploring?

 - Gus



An update on this.

The oslo.privsep change is merged and released in 1.9.0.

The nova change was updated to depend on oslo.privsep>=1.9.0.

To test that fix with os-brick 1.4.0, I made this change to requirements:

https://review.openstack.org/#/c/331885/

That depends on the nova fix and enables os-brick 1.4.0 which is the 
version that uses oslo.privsep.


It also depends on a requirements change to stable/mitaka to enable 
os-brick 1.4.0. I wasn't sure if this was necessary though given 
upper-constraints for stable/mitaka already caps brick at 1.2.0.


Anyway, it still fails in the grenade multinode job even with the nova 
and privsep fixes:


http://logs.openstack.org/85/331885/2/check/gate-grenade-dsvm-multinode/415e1bc/logs/new/screen-n-cpu.txt.gz?level=TRACE#_2016-06-21_16_21_31_015

And I confirmed via pip-freeze that it's using oslo.privsep 1.9.0 and 
os-brick 1.4.0:


http://logs.openstack.org/85/331885/2/check/gate-grenade-dsvm-multinode/415e1bc/logs/pip2-freeze.txt.gz

I did verify that it's correctly pulling in the dependent nova fix:

http://logs.openstack.org/85/331885/2/check/gate-grenade-dsvm-multinode/415e1bc/logs/devstack-gate-setup-workspace-new.txt.gz#_2016-06-21_15_37_02_958

Angus, what should we be looking at from the privsep side for debugging 
this?


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-15 Thread Angus Lees
oslo.privsep change: https://review.openstack.org/#/c/329766/
And the nova change that uses it: https://review.openstack.org/#/c/329769

In particular I'm unsure if os-brick/os-vif is even loaded at this point in
nova-compute main().  Does anyone know when that actually happens or shall
I go exploring?

 - Gus

On Wed, 15 Jun 2016 at 11:43 Sean Dague  wrote:

> On 06/14/2016 06:11 PM, Angus Lees wrote:
> > Yep (3) is quite possible, and the only reason it doesn't just do this
> > already is because there's no way to find the name of the rootwrap
> > command to use (from any library, privsep or os-brick) - and I was never
> > very happy with the current need to specify a command line in
> > oslo.config purely for this lame reason.
> >
> > As Sean points out, all the others involve some sort of configuration
> > change preceding the code.  I had imagined rollouts would work by
> > pushing out the harmless conf or sudoers change first, but hadn't
> > appreciated the strict change phases imposed by grenade (and ourselves).
> >
> > If all "end-application" devs are happy calling something like (3)
> > before the first privileged operation occurs, then we should be good.  I
> > might even take the opportunity to phrase it as a general privsep.init()
> > function, and then we can use it for any other top-of-main()
> > privilege-setup steps that need to be taken in the future.
>
> That sounds promising. It would be fine to emit a warning if it only was
> using the default, asking people to make a configuration change to make
> it go away. We're totally good with things functioning with warnings
> after transitions, that ops can adjust during their timetable.
>
> -Sean
>
> --
> Sean Dague
> http://dague.net
>


Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-14 Thread Sean Dague

On 06/14/2016 06:11 PM, Angus Lees wrote:

Yep (3) is quite possible, and the only reason it doesn't just do this
already is because there's no way to find the name of the rootwrap
command to use (from any library, privsep or os-brick) - and I was never
very happy with the current need to specify a command line in
oslo.config purely for this lame reason.

As Sean points out, all the others involve some sort of configuration
change preceding the code.  I had imagined rollouts would work by
pushing out the harmless conf or sudoers change first, but hadn't
appreciated the strict change phases imposed by grenade (and ourselves).

If all "end-application" devs are happy calling something like (3)
before the first privileged operation occurs, then we should be good.  I
might even take the opportunity to phrase it as a general privsep.init()
function, and then we can use it for any other top-of-main()
privilege-setup steps that need to be taken in the future.


That sounds promising. It would be fine to emit a warning if it only was 
using the default, asking people to make a configuration change to make 
it go away. We're totally good with things functioning with warnings 
after transitions, that ops can adjust during their timetable.
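
A small sketch of that behaviour, assuming the helper command is a plain
string and that the standard warnings module is an acceptable way to surface
the message. resolve_helper and its parameters are hypothetical names, not
part of oslo.privsep.

```python
import warnings


def resolve_helper(configured_helper, builtin_default):
    """Return the helper command to use, warning when the default is used."""
    if configured_helper:
        return configured_helper
    warnings.warn(
        "No privsep helper command is configured; using the default %r. "
        "Set it in the service config to make this warning go away."
        % (builtin_default,),
        UserWarning,
        stacklevel=2,
    )
    return builtin_default
```

The service keeps working on the old config and the operator sees a warning
they can clear on their own timetable.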


-Sean

--
Sean Dague
http://dague.net



Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-14 Thread Angus Lees
On Tue, 14 Jun 2016 at 23:04 Daniel P. Berrange  wrote:

> On Tue, Jun 14, 2016 at 07:49:54AM -0400, Sean Dague wrote:
>
> [snip]
>

Urgh, thanks for the in-depth analysis :/

> The crux of the problem is that os-brick 1.4 and privsep can't be used
> > without a config file change during the upgrade. Which violates our
> > policy, because it breaks rolling upgrades.
>
> os-vif support is going to face exactly the same problem. We just followed
> os-brick's lead by adding a change to devstack to explicitly set the
> required config options in nova.conf to change privsep to use rootwrap
> instead of plain sudo.
>
> Basically every single user of privsep is likely to face the same
> problem.
>
> > So... we have a few options:
> >
> > 1) make an exception here with release notes, because it's the only way
> > to move forward.
>
> That's quite user hostile I think.
>
> > 2) have some way for os-brick to use either mode for a transition period
> > (depending on whether privsep is configured to work)
>
> I'm not sure that's viable - at least for os-vif we started from
> a clean slate to assume use of privsep, so we won't be able to have
> any optional fallback to non-privsep mode.
>
> > 3) Something else ?
>
> 3) Add an API to oslo.privsep that lets us configure the default
>    command to launch the helper. Nova would invoke this on startup
>
>   privsep.set_default_helper("sudo nova-rootwrap ")
>
> 4) Have oslo.privsep install a sudo rule that grants permission
>    to run privsep-helper, without needing rootwrap.
>
> 5) Have each user of privsep install a sudo rule that grants
>    permission to run privsep-helper with just their specific
>    entry point context, without needing rootwrap
>
> Any of 3/4/5 work out of the box, but I'm probably favouring
> option 4, then 5, then 3.
>
>
Yep (3) is quite possible, and the only reason it doesn't just do this
already is because there's no way to find the name of the rootwrap command
to use (from any library, privsep or os-brick) - and I was never very happy
with the current need to specify a command line in oslo.config purely for
this lame reason.

As Sean points out, all the others involve some sort of configuration
change preceding the code.  I had imagined rollouts would work by pushing
out the harmless conf or sudoers change first, but hadn't appreciated the
strict change phases imposed by grenade (and ourselves).

If all "end-application" devs are happy calling something like (3) before
the first privileged operation occurs, then we should be good.  I might
even take the opportunity to phrase it as a general privsep.init()
function, and then we can use it for any other top-of-main()
privilege-setup steps that need to be taken in the future.

 - Gus


Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-14 Thread Walter A. Boring IV
I just put up a WIP patch in os-brick that tests whether oslo.privsep is
configured with the helper_command.  If it's not, then os-brick falls back
to using processutils with the root_helper and run_as_root kwargs passed in.

https://review.openstack.org/#/c/329586

If you can check this out, that would be helpful.  If this is the route we
want to go, then I'll add unit tests, take it out of WIP, and try to get it
in.

So, if nova.conf and cinder.conf aren't updated with the privsep_osbrick
sections providing the helper_command, then os-brick will assume local
processutils calls with the configured root_helper passed in.

This should be backwards compatible (grenade upgrade tests).  But we should
encourage admins to add that section to their nova.conf and cinder.conf
files.  The other downside is that if we have to keep this code in place,
then we effectively still have to maintain the rootwrap filters and keep
them up to date.   *sadness*
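
The fallback decision described above can be sketched roughly as follows.
This is not the WIP patch itself: choose_execution_mode is a hypothetical
name, the config is parsed with the standard configparser rather than
oslo.config, and the root_helper key in [DEFAULT] plus both sample config
bodies are illustrative assumptions. Only the privsep_osbrick section and the
helper_command option come from the description above.

```python
import configparser


def choose_execution_mode(conf_text, section="privsep_osbrick"):
    """Pick privsep when helper_command is configured, else fall back.

    Returns a (mode, command) tuple, where mode is "privsep" or "rootwrap".
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    helper = parser.get(section, "helper_command", fallback=None)
    if helper:
        return ("privsep", helper)
    # Assumption: the old-style root helper is available as a plain option.
    root_helper = parser.get("DEFAULT", "root_helper", fallback="sudo")
    return ("rootwrap", root_helper)


# Illustrative N-1 config: no privsep_osbrick section.
old_conf = """
[DEFAULT]
root_helper = sudo nova-rootwrap /etc/nova/rootwrap.conf
"""

# Illustrative updated config: the admin has added the section.
new_conf = old_conf + """
[privsep_osbrick]
helper_command = sudo nova-rootwrap /etc/nova/rootwrap.conf privsep-helper
"""

print(choose_execution_mode(old_conf)[0])   # rootwrap
print(choose_execution_mode(new_conf)[0])   # privsep
```

The old config keeps working through the rootwrap path, which is also why the
rootwrap filters would have to stay maintained for as long as the fallback
exists.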


Walt

On 06/14/2016 04:49 AM, Sean Dague wrote:

os-brick 1.4 was released over the weekend, and was the first os-brick
to include privsep. Afterwards we got a really odd failure rate in the
grenade-multinode jobs (1/3 - 1/2), and it was far from obvious why.
Hemma looks to have figured it out (this is a summary of what I've seen
on IRC to pull it all together).

Remembering the following -
https://github.com/openstack-dev/grenade#theory-of-upgrade and
https://governance.openstack.org/reference/tags/assert_supports-upgrade.html#requirements
- New code must work with N-1 configs. So this is `master` running with
`mitaka` configuration.

privsep requires a sudo rule or rootwrap rule (to get to sudo) to allow
the privsep daemon to be spawned for volume actions.

During gate testing we have a blanket sudoer rule for the stack user
during the run of grenade.sh. It has to do system level modifications
broadly to perform the upgrade. This sudoer rule is deleted at the end
of the grenade.sh run before Tempest tests are run, so that Tempest
tests don't accidentally require root privs on their target environment.

Grenade *also* makes sure that some resources live across the upgrade
boundary. This includes a boot from volume guest, which is torn down
before testing starts. And this is where things get interesting.

This means there is a volume teardown needed before grenade ends. But
there is only one. In single node grenade this happens about 30 seconds
from the end of the script, triggers the privsep daemon start, and then
we're done. And the 50_stack_sh sudoers file is removed. In multinode,
*if* the boot from volume server is on the upgrade node, then the same
thing happens. *However*, if it instead ended up on the subnode, which
is not upgraded, then the volume teardown is on the old node. No
os-brick calls are made on the upgraded node before grenade finishes.
The 50_stack_sh sudoers file is removed, as expected.

And now all volume tests on those nodes fail.

Which is what should happen. The point is that in production no one is
going to put a blanket sudoers rule like that in place. It's just we
needed it for this activity, and the userid on the services being the
same as the shell user (which is not root) let this fallback rule be used.

The crux of the problem is that os-brick 1.4 and privsep can't be used
without a config file change during the upgrade. Which violates our
policy, because it breaks rolling upgrades.

So... we have a few options:

1) make an exception here with release notes, because it's the only way
to move forward.

2) have some way for os-brick to use either mode for a transition period
(depending on whether privsep is configured to work)

3) Something else ?

https://bugs.launchpad.net/os-brick/+bug/1592043 is the bug we've got on
this. We should probably sort out the path forward here on the ML as
there are a bunch of folks in a bunch of different time zones that have
important perspectives here.

-Sean






Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-14 Thread Matt Riedemann

On 6/14/2016 11:33 AM, Sean Dague wrote:

On 06/14/2016 09:02 AM, Daniel P. Berrange wrote:

On Tue, Jun 14, 2016 at 07:49:54AM -0400, Sean Dague wrote:

[snip]


The crux of the problem is that os-brick 1.4 and privsep can't be used
without a config file change during the upgrade. Which violates our
policy, because it breaks rolling upgrades.


os-vif support is going to face exactly the same problem. We just followed
os-brick's lead by adding a change to devstack to explicitly set the
required config options in nova.conf to change privsep to use rootwrap
instead of plain sudo.

Basically every single user of privsep is likely to face the same
problem.


So... we have a few options:

1) make an exception here with release notes, because it's the only way
to move forward.


That's quite user hostile I think.


2) have some way for os-brick to use either mode for a transition period
(depending on whether privsep is configured to work)


I'm not sure that's viable - at least for os-vif we started from
a clean slate to assume use of privsep, so we won't be able to have
any optional fallback to non-privsep mode.


3) Something else ?


3) Add an API to oslo.privsep that lets us configure the default
   command to launch the helper. Nova would invoke this on startup

  privsep.set_default_helper("sudo nova-rootwrap ")

4) Have oslo.privsep install a sudo rule that grants permission
   to run privsep-helper, without needing rootwrap.

5) Have each user of privsep install a sudo rule that grants
   permission to run privsep-helper with just their specific
   entry point context, without needing rootwrap


4 & 5 are the same as 1, because python packages don't have standardized
management of /etc in their infrastructure. The code can't roll forward
without a config change.

Option #3 is a new one, I wonder if that would get us past here better.

-Sean



Yeah #3 sounds the best to me, but would need to hear from Angus on this.

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-14 Thread Sean Dague
On 06/14/2016 09:02 AM, Daniel P. Berrange wrote:
> On Tue, Jun 14, 2016 at 07:49:54AM -0400, Sean Dague wrote:
> 
> [snip]
> 
>> The crux of the problem is that os-brick 1.4 and privsep can't be used
>> without a config file change during the upgrade. Which violates our
>> policy, because it breaks rolling upgrades.
> 
> os-vif support is going to face exactly the same problem. We just followed
> os-brick's lead by adding a change to devstack to explicitly set the
> required config options in nova.conf to change privsep to use rootwrap
> instead of plain sudo.
> 
> Basically every single user of privsep is likely to face the same
> problem.
> 
>> So... we have a few options:
>>
>> 1) make an exception here with release notes, because it's the only way
>> to move forward.
> 
> That's quite user hostile I think.
> 
>> 2) have some way for os-brick to use either mode for a transition period
>> (depending on whether privsep is configured to work)
> 
> I'm not sure that's viable - at least for os-vif we started from
> a clean slate to assume use of privsep, so we won't be able to have
> any optional fallback to non-privsep mode.
> 
>> 3) Something else ?
> 
> 3) Add an API to oslo.privsep that lets us configure the default
>    command to launch the helper. Nova would invoke this on startup
> 
>   privsep.set_default_helper("sudo nova-rootwrap ")
> 
> 4) Have oslo.privsep install a sudo rule that grants permission
>    to run privsep-helper, without needing rootwrap.
> 
> 5) Have each user of privsep install a sudo rule that grants
>    permission to run privsep-helper with just their specific
>    entry point context, without needing rootwrap

4 & 5 are the same as 1, because python packages don't have standardized
management of /etc in their infrastructure. The code can't roll forward
without a config change.

Option #3 is a new one, I wonder if that would get us past here better.

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-14 Thread Daniel P. Berrange
On Tue, Jun 14, 2016 at 07:49:54AM -0400, Sean Dague wrote:

[snip]

> The crux of the problem is that os-brick 1.4 and privsep can't be used
> without a config file change during the upgrade. Which violates our
> policy, because it breaks rolling upgrades.

os-vif support is going to face exactly the same problem. We just followed
os-brick's lead by adding a change to devstack to explicitly set the
required config options in nova.conf to change privsep to use rootwrap
instead of plain sudo.

Basically every single user of privsep is likely to face the same
problem.

> So... we have a few options:
> 
> 1) make an exception here with release notes, because it's the only way
> to move forward.

That's quite user hostile I think.

> 2) have some way for os-brick to use either mode for a transition period
> (depending on whether privsep is configured to work)

I'm not sure that's viable - at least for os-vif we started from
a clean slate to assume use of privsep, so we won't be able to have
any optional fallback to non-privsep mode.

> 3) Something else ?

3) Add an API to oslo.privsep that lets us configure the default
   command to launch the helper. Nova would invoke this on startup

  privsep.set_default_helper("sudo nova-rootwrap ")

4) Have oslo.privsep install a sudo rule that grants permission
   to run privsep-helper, without needing rootwrap.

5) Have each user of privsep install a sudo rule that grants
   permission to run privsep-helper with just their specific
   entry point context, without needing rootwrap

Any of 3/4/5 work out of the box, but I'm probably favouring
option 4, then 5, then 3.
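
A rough sketch of what option 3 could look like, assuming a module-level
default that the application sets once at startup. None of this is
oslo.privsep's real API: set_default_helper is the name proposed above,
build_helper_argv is hypothetical, and the rootwrap arguments are left
elided exactly as in the proposal.

```python
import shlex

_default_helper = None  # hypothetical module-level default, unset until startup


def set_default_helper(command):
    """Record the command prefix the application wants used to gain root."""
    global _default_helper
    _default_helper = shlex.split(command)


def build_helper_argv(configured, context):
    """Return the argv for launching privsep-helper for one context.

    Precedence: explicit config, then the application default, then plain sudo.
    """
    if configured:
        prefix = shlex.split(configured)
    elif _default_helper is not None:
        prefix = list(_default_helper)
    else:
        prefix = ["sudo"]
    return prefix + ["privsep-helper", "--privsep_context", context]


# Before startup has run, an unconfigured deployment would use plain sudo.
before = build_helper_argv(None, "os_brick.privileged.default")

# At startup; the rootwrap arguments are elided in the proposal and stay so here.
set_default_helper("sudo nova-rootwrap ")
after = build_helper_argv(None, "os_brick.privileged.default")
```

The point of the design is that the default travels with the code, so an
N-1 config with no privsep settings still ends up going through the
service's existing rootwrap command.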

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[openstack-dev] [cinder] [nova] os-brick privsep failures and an upgrade strategy?

2016-06-14 Thread Sean Dague
os-brick 1.4 was released over the weekend, and was the first os-brick
to include privsep. Afterwards we got a really odd failure rate in the
grenade-multinode jobs (1/3 - 1/2), and it was far from obvious why.
Hemma looks to have figured it out (this is a summary of what I've seen
on IRC to pull it all together).

Remembering the following -
https://github.com/openstack-dev/grenade#theory-of-upgrade and
https://governance.openstack.org/reference/tags/assert_supports-upgrade.html#requirements
- New code must work with N-1 configs. So this is `master` running with
`mitaka` configuration.

privsep requires a sudo rule or rootwrap rule (to get to sudo) to allow
the privsep daemon to be spawned for volume actions.

During gate testing we have a blanket sudoer rule for the stack user
during the run of grenade.sh. It has to do system level modifications
broadly to perform the upgrade. This sudoer rule is deleted at the end
of the grenade.sh run before Tempest tests are run, so that Tempest
tests don't accidentally require root privs on their target environment.

Grenade *also* makes sure that some resources live across the upgrade
boundary. This includes a boot from volume guest, which is torn down
before testing starts. And this is where things get interesting.

This means there is a volume teardown needed before grenade ends. But
there is only one. In single node grenade this happens about 30 seconds
from the end of the script, triggers the privsep daemon start, and then
we're done. And the 50_stack_sh sudoers file is removed. In multinode,
*if* the boot from volume server is on the upgrade node, then the same
thing happens. *However*, if it instead ended up on the subnode, which
is not upgraded, then the volume teardown is on the old node. No
os-brick calls are made on the upgraded node before grenade finishes.
The 50_stack_sh sudoers file is removed, as expected.

And now all volume tests on those nodes fail.
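
The sequence above can be modelled as a tiny state machine. This is only a
sketch of the reasoning, not grenade code: volume_tests_pass and its
variables are made-up names, and it assumes the blanket sudoers rule is the
only way to spawn the privsep daemon while running on mitaka configs.

```python
def volume_tests_pass(bfv_guest_on_upgraded_node):
    """Model whether the first post-upgrade volume test succeeds."""
    blanket_sudo = True      # 50_stack_sh is present while grenade.sh runs
    daemon_running = False   # privsep daemon state on the upgraded node

    def volume_op():
        """One os-brick call on the upgraded node; starts the daemon if it can."""
        nonlocal daemon_running
        if daemon_running:
            return True
        if blanket_sudo:     # the only path to root with N-1 configs (assumption)
            daemon_running = True
            return True
        return False

    # Teardown of the boot-from-volume guest, before grenade.sh ends.
    # It only touches the upgraded node if the guest was scheduled there.
    if bfv_guest_on_upgraded_node:
        volume_op()

    blanket_sudo = False     # sudoers file removed at the end of grenade.sh

    return volume_op()       # first Tempest volume test on the upgraded node


print(volume_tests_pass(True))    # True
print(volume_tests_pass(False))   # False
```

Which node the guest lands on is a scheduling coin flip, which fits the
observed failure rate of somewhere between a third and a half.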

Which is what should happen. The point is that in production no one is
going to put a blanket sudoers rule like that in place. It's just we
needed it for this activity, and the userid on the services being the
same as the shell user (which is not root) let this fallback rule be used.

The crux of the problem is that os-brick 1.4 and privsep can't be used
without a config file change during the upgrade. Which violates our
policy, because it breaks rolling upgrades.

So... we have a few options:

1) make an exception here with release notes, because it's the only way
to move forward.

2) have some way for os-brick to use either mode for a transition period
(depending on whether privsep is configured to work)

3) Something else ?

https://bugs.launchpad.net/os-brick/+bug/1592043 is the bug we've got on
this. We should probably sort out the path forward here on the ML as
there are a bunch of folks in a bunch of different time zones that have
important perspectives here.

-Sean

-- 
Sean Dague
http://dague.net
