Re: [openstack-dev] [nova] How to properly detect and fence a compromised host (and why I dislike TrustedFilter)

Bhandaru, Malini K Tue, 23 Jun 2015 22:17:07 -0700

Would like to add to Shane's points below.

1) The Trust filter can be treated as an API with different underlying 
implementations. Its default could even be "Not Implemented" and always return 
false.
     nova.conf could then select the OAT trust implementation. This would 
not break present-day users of the functionality.
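
A minimal sketch of point 1, assuming a pluggable backend chosen by a 
configuration value. Every name here (TrustBackend, NotImplementedBackend, 
StaticBackend, load_backend, the registry keys) is hypothetical and is not 
Nova's or OAT's actual API; StaticBackend only stands in for a real 
OAT-backed implementation.

```python
import abc


class TrustBackend(abc.ABC):
    """Hypothetical interface a trust filter could delegate to."""

    @abc.abstractmethod
    def is_trusted(self, host_name: str) -> bool:
        """Return True only if the host is attested as trusted."""


class NotImplementedBackend(TrustBackend):
    """Default backend: no attestation service, so no host is trusted."""

    def is_trusted(self, host_name: str) -> bool:
        return False


class StaticBackend(TrustBackend):
    """Stand-in for an OAT-backed implementation, fed from a fixed set."""

    def __init__(self, trusted_hosts):
        self._trusted = frozenset(trusted_hosts)

    def is_trusted(self, host_name: str) -> bool:
        return host_name in self._trusted


# Hypothetical registry keyed by a configuration value.
BACKENDS = {
    "not_implemented": NotImplementedBackend,
    "static": StaticBackend,
}


def load_backend(name="not_implemented", **kwargs):
    """Instantiate the backend selected by configuration."""
    return BACKENDS[name](**kwargs)
```

With this shape, a deployment that configures nothing gets the 
always-false default, and existing users would only need to name their 
implementation in configuration.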


2) The issue in the original bug is a VM waking up after a reboot on a host 
without anyone first determining whether the host is still trustworthy.
     This is essentially asking for a feature that checks that all constraints 
requested by a VM at launch still hold when it re-awakens, even if it is not 
going through the Nova scheduler at that point.

     This holds even for aggregates that might be specified by geo, or even by 
reservation, such as "Coke" or "Pepsi".
     What if a host was reassigned from Coke to Pepsi, even without a reboot 
and certainly before one? Then there is cross-contamination.
     Perhaps we need Nova hooks that can be registered with functions that 
check expected aggregate values.

     Better still, have libvirt functionality that makes a callback for each 
VM on a host to ensure its constraints are satisfied on start-up/boot, and 
again on restart when it comes out of pause.

     Using an aggregate for trust, with a cron job to check trust, is 
inefficient in this case: trust status gets updated only on a host reboot, 
since Intel TXT is a boot-time authentication.
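
The hook idea in point 2 could be sketched roughly as follows. This is only 
an illustration: register_check, failed_constraints and the dict shapes for 
instances and hosts are hypothetical, and nothing here is an existing Nova 
or libvirt interface. The assumption is that something (a Nova hook or a 
libvirt callback) calls failed_constraints when a VM starts or resumes.

```python
from typing import Callable, Dict, List

# A check takes (instance, host) dicts and returns True if the constraint holds.
Check = Callable[[dict, dict], bool]

_checks: Dict[str, Check] = {}


def register_check(name: str, check: Check) -> None:
    """Register the function that verifies one named constraint."""
    _checks[name] = check


def failed_constraints(instance: dict, host: dict) -> List[str]:
    """Return the launch-time constraints that do not hold on this host."""
    failed = []
    for name in instance.get("constraints", []):
        check = _checks.get(name)
        # An unregistered constraint cannot be verified, so treat it as failed.
        if check is None or not check(instance, host):
            failed.append(name)
    return failed


# Example checks: boot-time trust status and aggregate membership.
register_check("trusted", lambda inst, host: host.get("trusted") is True)
register_check(
    "aggregate",
    lambda inst, host: inst.get("aggregate") in host.get("aggregates", ()),
)

vm = {"constraints": ["trusted", "aggregate"], "aggregate": "Coke"}
host = {"trusted": True, "aggregates": ["Pepsi"]}  # host moved to Pepsi
print(failed_constraints(vm, host))
```

A VM launched under "Coke" that wakes up on a host now assigned to "Pepsi" 
would be reported with a failed "aggregate" constraint, even though it never 
went back through the scheduler.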

Regards
Malini


-----Original Message-----
From: Wang, Shane [mailto:[email protected]] 
Sent: Tuesday, June 23, 2015 9:26 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] How to properly detect and fence a 
compromised host (and why I dislike TrustedFilter)

AFAIK, TrustedFilter uses a cache for the trusted state, which is designed to 
address the performance issue mentioned here.
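
As a rough illustration of how caching the trusted state avoids repeated 
attestation calls, here is a generic time-to-live cache. It is a sketch under 
assumptions, not the filter's actual code: TrustCache, the fetch callable and 
the 60-second default are all hypothetical, and the real TrustedFilter cache 
may work differently.

```python
import time


class TrustCache:
    """Cache per-host trust results for a fixed time-to-live."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable: host -> bool
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # host -> (trusted, expires_at)

    def is_trusted(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and now < entry[1]:
            # Fresh entry: answer without contacting the attestation service.
            return entry[0]
        trusted = bool(self._fetch(host))
        self._entries[host] = (trusted, now + self._ttl)
        return trusted
```

The trade-off is staleness: a host that changes state is still reported with 
its old status until the entry expires.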

My thoughts for deprecating it are:
#1. We already have customers here in China who are using that filter. How are 
they going to upgrade in the future?
#2. A dependency should not be a reason to deprecate a module in OpenStack. 
Nova is not a stand-alone module; it depends on various technologies and 
libraries.

Intel is setting up third-party CI for TCP/OAT in Liberty, which is meant to 
address the concerns mentioned in the thread. Also, OAT is an open source 
project that is being maintained as a long-term strategy.

For the situation where a host gets compromised: OAT checks trusted or 
untrusted status at boot/reboot, so it is hard for OAT to detect whether a 
host gets compromised while it is running. How would that be detected 
without the filter?
Back to Michael's question: the verification is done automatically by software 
when a host boots or reboots, so wouldn't a separate job be extra overhead for 
the admin?

Thanks.
--
Shane

-----Original Message-----
From: Michael Still [mailto:[email protected]]
Sent: Wednesday, June 24, 2015 7:49 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] How to properly detect and fence a 
compromised host (and why I dislike TrustedFilter)

I agree. I feel like this is another example of functionality which is 
trivially implemented outside nova, and where it works much better if we don't 
do it. Couldn't an admin just have a cron job which verifies hosts, and then 
adds them to a compromised-hosts host aggregate if they're owned? I assume 
without testing it that you can migrate instances _out_ of a host aggregate you 
can't boot in?
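
One pass of such a cron job might look like the sketch below. The three 
callables are hypothetical placeholders for whatever performs the 
attestation and the aggregate update; no real Nova or OAT client calls are 
shown or implied.

```python
def quarantine_compromised(hosts, is_compromised, aggregate_members,
                           add_to_aggregate):
    """Run one verification pass and return the hosts newly quarantined.

    hosts: iterable of host names to verify.
    is_compromised: hypothetical callable, host -> bool.
    aggregate_members: hosts already in the compromised-hosts aggregate.
    add_to_aggregate: hypothetical callable that adds a host to it.
    """
    already = set(aggregate_members)
    newly = []
    for host in hosts:
        if host in already:
            # Already fenced on an earlier pass; nothing to do.
            continue
        if is_compromised(host):
            add_to_aggregate(host)
            newly.append(host)
    return newly
```

Keeping the pass idempotent means the job can run on any schedule without 
re-adding hosts that are already in the aggregate.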

Michael

On Tue, Jun 23, 2015 at 8:41 PM, Sylvain Bauza <[email protected]> wrote:
> Hi team,
>
> Some discussion occurred over IRC about a publicly opened bug related 
> to TrustedFilter [1]. I want to take the opportunity to raise my 
> concerns about that specific filter, explain why I dislike it and how 
> I think we could improve the situation (and clarify everyone's 
> thoughts).
>
> The current situation is this: Nova checks whether a host is 
> compromised only when the scheduler is called, i.e. only when 
> booting/migrating/evacuating/unshelving an instance (well, not exactly 
> all the evacuate/live-migrate cases, but let's not discuss that 
> now). When the request reaches the scheduler, all the hosts are 
> checked against all the enabled filters, and the TrustedFilter 
> makes an external HTTP(S) call to the Attestation API service (not 
> handled by Nova) for *each host* to see whether the host is valid 
> (not compromised).
>
> To be clear, that's the only in-tree scheduler filter which explicitly 
> makes an external call to a separate service that Nova is not managing.
> I can see at least 3 reasons why it's bad :
>
> #1 : that's a terrible bottleneck for performance, because we're 
> IO-blocking N times given N hosts (we're not even multiplexing the 
> HTTP requests)
> #2 : all the filters check an internal Nova state for the host 
> (called HostState) except the TrustedFilter, which means that 
> conceptually we defer the decision to a 3rd-party engine
> #3 : that Attestation API service becomes a de facto dependency for 
> Nova (since it's an in-tree filter) while it's not listed as a 
> dependency and thus not gated.
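
To illustrate reason #1 in the quoted list above: issuing the N attestation 
calls one after another costs N round-trips, whereas overlapping them costs 
roughly the slowest one. The sketch below uses a thread pool for that. 
attest_hosts and the attest callable are hypothetical; this is not what the 
filter does, only what overlapping the requests could look like.

```python
from concurrent.futures import ThreadPoolExecutor


def attest_hosts(hosts, attest, max_workers=16):
    """Query the trust state of all hosts concurrently.

    attest is a hypothetical blocking callable (host -> bool), e.g. one
    HTTP(S) request to an attestation service per host.
    """
    hosts = list(hosts)
    if not hosts:
        return {}
    workers = min(max_workers, len(hosts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with hosts.
        results = pool.map(attest, hosts)
        return dict(zip(hosts, results))
```

This only shortens the wait; it does not address reasons #2 and #3, since 
the decision still depends on an external service.
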
>
>
> All of these reasons could be acceptable if the filter covered the 
> use case given in [1] (ie. I want to make sure that if my host 
> gets compromised, my instances will not be running on that host) but 
> it just doesn't, due to the situation I mentioned above.
>
> So, given that, here are my thoughts :
> a/ if a host gets compromised, we can just disable its service to 
> prevent its election as a valid destination host. There is no need for 
> a specialised filter.
> b/ if a host is compromised, we can assume that the instances have to 
> resurrect elsewhere, ie. we can call a nova evacuate
> c/ checking whether a host is compromised is not a Nova 
> responsibility, since it's already perfectly done by [2]
>
> In other words, I consider that "security" use case analogous to 
> the HA use case [3], where we need a 3rd-party tool 
> responsible for periodically checking the state of the hosts and, if 
> one is compromised, calling the Nova API to fence the host and evacuate 
> the compromised instances.
>
> Given that, I'm proposing to deprecate TrustedFilter and explicitly 
> mention that it will be dropped from in-tree in a later cycle
> https://review.openstack.org/194592
>
> Thoughts ?
> -Sylvain
>
>
>
> [1] https://bugs.launchpad.net/nova/+bug/1456228
> [2] https://github.com/OpenAttestation/OpenAttestation
> [3] http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/
>
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--
Rackspace Australia

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
