Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
On Thu, Feb 16, 2017 at 09:04:47AM -0500, Ken Goldman wrote:

Good morning to everyone, leveraging some time between planes.

> On 2/14/2017 9:38 AM, Dr. Greg Wettstein wrote:
> >
> > I don't think there is any doubt that running cryptographic primitives
> > in userspace is going to be faster then going to hardware. Obviously
> > that also means there is no need for a TPM resource manager which has
> > been the subject of much discussion here.

> I don't understand that comment.
>
> The resource manager schedules user space access to the TPM. It also
> handles swapping of objects in and out of the limited number of
> TPM slots.
>
> Without a RM, either you'd have to permit only a single TPM connection,
> blocking all other connections, or you'd have different connections
> interfering with each other.

Yes, if multiple contexts of execution require access to the TPM, a resource manager is needed to arbitrate that access. I think, however, that we are talking past one another a bit.

We design and build systems which implement autonomous self-regulation. As such we need a hardware based confirmation that the machine is in a given behavioral state. This requires that we reference a hardware root of trust, i.e. the TPM. Depending on the assurance granularity requirements, that may mean a high rate of TPM verifications.

When I noticed you and James talking about 'cloud based' levels of transactions I was assuming you were operating at the transaction rates we build for, i.e. tens to hundreds per second. That didn't seem feasible given our hardware measurements on Skylake and Kabylake based systems.

James had cited the CoreOS/Tectonic white paper as an example of TPMs working at cloud scale. Our conversation to date seems to indicate that the accepted modality of security appears to be userspace verification of container signatures.
Given the extensive dialogue in the paper about using TPMs for security, we had mistakenly believed that container verifications were being pinned to current platform status, which didn't correlate with expected container start time latencies.

Our behavioral assessment code is namespaced so a supervisory system can make statements about the behavior of a container. We have concluded the only way that is possible is to use userspace TPM implementations which can meet the necessary latency requirements.

Our point in all this is that it doesn't seem to make any sense to implement anything in the kernel more than basic resource management. If other 'virtualization' is needed, such as session state management and the like, the community would seem to be served better by having a solid userspace simulation environment, with appropriate hardware security guarantees.

That would serve needs like re-keying support for VPNaaS applications as well as high transaction rate environments, i.e. why load the kernel with code to virtualize a resource when a 'user' can just be given its own TPM2 instance.

Just as an aside, has anyone given any thought to TPM2 resource management in things like TXT/tboot environments? The current tboot code makes a rather naive assumption that it can take a handle slot to protect its platform verification secret. Doing resource management correctly will require addressing extra-OS environments such as this which may have TPM2 state requirement issues.

Our takeaway from all this is that it doesn't seem that we need to worry about the fact that someone may have invented TPM2 hardware which is faster than what we are developing on :-)

Have a good weekend.

Greg

As always,
Dr. G.W. Wettstein, Ph.D.
Enjellic Systems Development, LLC.
Specializing in information infra-structure development.
4206 N. 19th Ave.
Fargo, ND 58102
PH: 701-281-1686
FAX: 701-281-3949
EMAIL: g...@enjellic.com
--
"If you ever teach a yodeling class, probably the hardest thing is to keep the students from just trying to yodel right off. You see, we build to that."
-- Jack Handey
   Deep Thoughts

--
Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
tpmdd-devel mailing list
tpmdd-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tpmdd-devel
Re: [tpmdd-devel] [PATCH v2 6/7] tpm: expose spaces via a device link /dev/tpms
systems will inter-operate or not.

> /Jarkko

Have a good day.

Greg

As always,
Dr. G.W. Wettstein, Ph.D.
Enjellic Systems Development, LLC.
Specializing in information infra-structure development.
4206 N. 19th Ave.
Fargo, ND 58102
PH: 701-281-1686
FAX: 701-281-3949
EMAIL: g...@enjellic.com
--
"You've got to be kidding me Nate. You've seen the shit that has come through my office in the last two hours. You think I'm even remotely worried about one SATA cable being six inches longer than the other."
-- Dr. Greg Wettstein
   Resurrection
Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
On Fri, Feb 17, 2017 at 02:37:12PM +0200, Jarkko Sakkinen wrote:

Hi, I hope the week is ending well for everyone.

> On Fri, Feb 17, 2017 at 03:56:26AM -0600, Dr. Greg Wettstein wrote:
> > On Thu, Feb 16, 2017 at 10:33:04PM +0200, Jarkko Sakkinen wrote:
> >
> > Good morning to everyone.
> >
> > > On Thu, Feb 16, 2017 at 02:06:42PM -0600, Dr. Greg Wettstein wrote:
> > > > Just as an aside, has anyone given any thought about TPM2 resource
> > > > management in things like TXT/tboot environments? The current tboot
> > > > code makes a rather naive assumption that it can take a handle slot to
> > > > protect its platform verification secret. Doing resource management
> > > > correctly will require addressing extra-OS environments such as this
> > > > which may have TPM2 state requirement issues.
> >
> > > The current implementation handles stuff created from regular
> > > /dev/tpm0 so I do not think this would be an issue. You can only
> > > access objects from a TPM space that are created within that space.
> >
> > Unless I misunderstand the number of transient objects which can be
> > managed is a characteristic of the hardware and is a limited resource,
> > hence our discussion on the notion of a resource manager to shuttle
> > context in and out of these limited slots.
> >
> > On a Kabylake system, running the following command:
> >
> > getcapability -cap 6 | grep trans
> >
> > After booting into a TXT mediated measured launch environment (MLE) yields
> > the following:
> >
> > TPM_PT 010e value 0003 TPM_PT_HR_TRANSIENT_MIN - the minimum number
> > of transient objects that can be held in TPM RAM
> >
> > TPM_PT 0207 value 0002 TPM_PT_HR_TRANSIENT_AVAIL - estimate of the
> > number of additional transient objects that could be loaded into TPM RAM
> >
> > Booting without TXT results in the getcapability call indicating that
> > three slots are available. Based on that and reading the tboot code,
> > we are assuming the occupied slot is the ephemeral primary key
> > generated by tboot which seals the verification secret.
> >
> > In an MLE it is possible to create and then flush a new ephemeral
> > primary key which results in the following getcapability output:
> >
> > TPM_PT 0207 value 0003 TPM_PT_HR_TRANSIENT_AVAIL - estimate of
> > the number of additional transient objects that could be loaded into TPM RAM
> >
> > Which is probably going to be pretty surprising to tboot in the event
> > that it tries to re-verify the system state after a suspend event.
> >
> > So based on that it would seem there would need to be some semblance
> > of cooperation between the resource manager and an extra-OS
> > utilization of TPM2 resources such as tboot.
> >
> > Thoughts?

> The driver swaps in and out all the objects for one send-receive
> cycle. So unless the driver is sending a command to a TPM the
> resource manager occupies zero slots. I do not see reason for
> forseeable future to change this pattern.
>
> I discussed about some "lazier" schemes for swapping with James an
> Ken in the early Fall but came into conclusion that it would make
> the RM really complicated. There would have to be something show
> stopper work load to even to start consider it.
>
> With the capacity of current TPMs and amount of traffic and
> workloads it is really not a worth of the trouble.
>
> I guess the way we do swapping kind of indirectly sorts out the
> issue you described, doesn't it?

I'm not sure. We've pulled down your resource manager branch so we can figure out the exact mechanics of how it works.

Based on a cursory read of the code, it appears as if it loops through all three transient handle slots and attempts to context save each transient object it finds. So if it does that for each send/receive cycle it should theoretically inter-operate with TXT/tboot.
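The swap-per-command pattern discussed above can be sketched with a toy model. This is only an illustration in Python, not the kernel driver's code: `ToyTPM`, `ToySpace` and their methods are hypothetical names, the three-slot limit is taken from the getcapability output quoted in this thread, and a key loaded directly on the toy TPM stands in for a resident object such as the one attributed to tboot.

```python
class ToyTPM:
    """Toy model of a TPM with a fixed number of transient object slots."""

    BASE_HANDLE = 0x80000000  # start of the transient handle range

    def __init__(self, slots=3):
        self.slots = slots
        self.loaded = {}  # handle -> object name

    def available(self):
        return self.slots - len(self.loaded)

    def context_load(self, name):
        if self.available() == 0:
            raise RuntimeError("no free transient slot")
        # Pick the lowest unused handle in the transient range.
        handle = next(h for h in range(self.BASE_HANDLE, self.BASE_HANDLE + self.slots)
                      if h not in self.loaded)
        self.loaded[handle] = name
        return handle

    def context_save_and_flush(self, handle):
        # Modelled as a single step: the object leaves TPM RAM.
        return self.loaded.pop(handle)


class ToySpace:
    """Per-client space: objects are held as saved contexts between commands."""

    def __init__(self, tpm):
        self.tpm = tpm
        self.saved = []

    def create(self, name):
        self.saved.append(name)

    def transmit(self, command):
        """Swap every object in, run one command, then swap everything out."""
        handles = []
        try:
            for name in self.saved:
                handles.append(self.tpm.context_load(name))
            return command(handles)
        finally:
            for handle in handles:
                self.tpm.context_save_and_flush(handle)


tpm = ToyTPM(slots=3)
space = ToySpace(tpm)
for name in ("key-a", "key-b", "key-c"):
    space.create(name)

# Three objects are loaded for the command; no slots stay occupied afterwards.
print(space.transmit(len), tpm.available())

# A resident object loaded outside the space reduces what the space can swap in.
tpm.context_load("resident-key")
try:
    space.transmit(len)
except RuntimeError as exc:
    print("transmit failed:", exc, "| free slots afterwards:", tpm.available())
```

In this model the space holds no slots between commands, which matches the description quoted above, but a slot held by something outside the space still lowers the number of objects a single command can have loaded at once. Whether real hardware and the real driver behave this way with tboot is exactly the open question in this message.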
As noted previously, with the current kernel driver, we can see that tboot has allocated a slot for the ephemeral key which is used to seal the memory verification secrets. This key gets allocated to handle 8000 as one would anticipate. However when we attempt to issue a context save against that handle we get an error. Interestingly, when we attempt to flush that handle manually we receive an error as well, but the number of available transient handles increases by one which suggests the context flush cleared the slot. It seems that we should be able to manually replicate what the resource manager is doing with the standard kernel driver or is this
Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
On Thu, Feb 16, 2017 at 10:33:04PM +0200, Jarkko Sakkinen wrote:

Good morning to everyone.

> On Thu, Feb 16, 2017 at 02:06:42PM -0600, Dr. Greg Wettstein wrote:
> > Just as an aside, has anyone given any thought about TPM2 resource
> > management in things like TXT/tboot environments? The current tboot
> > code makes a rather naive assumption that it can take a handle slot to
> > protect its platform verification secret. Doing resource management
> > correctly will require addressing extra-OS environments such as this
> > which may have TPM2 state requirement issues.

> The current implementation handles stuff created from regular
> /dev/tpm0 so I do not think this would be an issue. You can only
> access objects from a TPM space that are created within that space.

Unless I misunderstand, the number of transient objects which can be managed is a characteristic of the hardware and is a limited resource, hence our discussion on the notion of a resource manager to shuttle context in and out of these limited slots.

On a Kabylake system, running the following command:

    getcapability -cap 6 | grep trans

After booting into a TXT mediated measured launch environment (MLE) yields the following:

    TPM_PT 010e value 0003 TPM_PT_HR_TRANSIENT_MIN - the minimum number of transient objects that can be held in TPM RAM

    TPM_PT 0207 value 0002 TPM_PT_HR_TRANSIENT_AVAIL - estimate of the number of additional transient objects that could be loaded into TPM RAM

Booting without TXT results in the getcapability call indicating that three slots are available. Based on that and reading the tboot code, we are assuming the occupied slot is the ephemeral primary key generated by tboot which seals the verification secret.
In an MLE it is possible to create and then flush a new ephemeral primary key, which results in the following getcapability output:

    TPM_PT 0207 value 0003 TPM_PT_HR_TRANSIENT_AVAIL - estimate of the number of additional transient objects that could be loaded into TPM RAM

Which is probably going to be pretty surprising to tboot in the event that it tries to re-verify the system state after a suspend event.

So based on that it would seem there would need to be some semblance of cooperation between the resource manager and an extra-OS utilization of TPM2 resources such as tboot.

Thoughts?

> /Jarkko

Greg

As always,
Dr. G.W. Wettstein, Ph.D.
Enjellic Systems Development, LLC.
Specializing in information infra-structure development.
4206 N. 19th Ave.
Fargo, ND 58102
PH: 701-281-1686
FAX: 701-281-3949
EMAIL: g...@enjellic.com
--
"For a successful technology, reality must take precedence over public relations, for nature cannot be fooled."
-- Richard Feynman
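The slot accounting in the message above can be made concrete with a small Python sketch that parses the two property lines as quoted. The regular expression, the function names and the reading of the values as hexadecimal are assumptions made for illustration; they are not part of any TSS tool. The "occupied" figure is likewise only the inference drawn in the message (minimum guaranteed slots minus the estimate of available slots), not a value the TPM reports directly.

```python
import re

# Property lines as quoted in the message (TXT mediated boot).
SAMPLE = """\
TPM_PT 010e value 0003 TPM_PT_HR_TRANSIENT_MIN - the minimum number of transient objects that can be held in TPM RAM
TPM_PT 0207 value 0002 TPM_PT_HR_TRANSIENT_AVAIL - estimate of the number of additional transient objects that could be loaded into TPM RAM
"""

# Assumed line shape: "TPM_PT <tag> value <value> <NAME> - <description>".
LINE_RE = re.compile(r"TPM_PT\s+([0-9a-fA-F]+)\s+value\s+([0-9a-fA-F]+)\s+(TPM_PT_\w+)")


def parse_properties(text):
    """Return {property_name: value}; values are assumed to be hexadecimal."""
    props = {}
    for match in LINE_RE.finditer(text):
        _tag, value, name = match.groups()
        props[name] = int(value, 16)
    return props


def occupied_transient_slots(props):
    """Inferred slots in use: TRANSIENT_MIN minus TRANSIENT_AVAIL, floored at zero."""
    in_use = props["TPM_PT_HR_TRANSIENT_MIN"] - props["TPM_PT_HR_TRANSIENT_AVAIL"]
    return max(in_use, 0)


props = parse_properties(SAMPLE)
print(props)
print("slots apparently in use:", occupied_transient_slots(props))
```

Run against the TXT output this reports one slot in use, which is the slot the message attributes to the tboot ephemeral primary key. Since TPM_PT_HR_TRANSIENT_AVAIL is described as an estimate, the result should be treated as a hint rather than an exact count.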
Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
On Feb 9, 11:24am, James Bottomley wrote:
} Subject: Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global sessi

Good morning to everyone.

> On Thu, 2017-02-09 at 03:06 -0600, Dr. Greg Wettstein wrote:
> > Referring back to Ken's comments about having 20+ clients waiting to
> > get access to the hardware. Even with the focus in TPM2 on having it
> > be more of a cryptographic accelerator are we convinced that the
> > hardware is ever going to be fast enough for a model of having it
> > directly service large numbers of transactions in something like a
> > 'cloud' model?

> It's already in use as such today:
>
> https://tectonic.com/assets/pdf/TectonicTrustedComputing.pdf

We are familiar with this work. I'm not sure, however, that this work is representative of the notion of using TPM hardware to support a transactional environment, particularly at the cloud/container level.

There is not a great deal of technical detail on the CoreOS integrity architecture but it appears they are using TPM hardware to validate container integrity. I'm not sure this type of environment reflects the ability of TPM hardware to support transactional throughputs in an environment such as financial transaction processing.

Intel's Clear Container work cites the need to achieve container startup times of 150 milliseconds and they are currently claiming 45 milliseconds as their optimal time. This work was designed to demonstrate the feasibility of providing virtual machine isolation guarantees to containers and as such one of the mandates was to achieve container start times comparable to standard namespaces.

I ran some very rough timing metrics on one of our Skylake development systems with hardware TPM2 support. Here are the elapsed times for two common verification operations which I assume would be at the heart of generating any type of reasonable integrity guarantee:

    quote:            810 milliseconds
    verify signature: 635 milliseconds

This is with the verifying key loaded into the chip.
The elapsed time to load and validate a key into the chip averages 1200 milliseconds. Since we are discussing a resource manager which would be shuttling context into and out of the limited resource slots on the chip, I believe it is valid to consider this overhead as well.

This suggests that just a signature verification on the integrity of a container takes roughly 4.2 times a well accepted start time metric for container technology (635 milliseconds against 150 milliseconds). Based on that I'm assuming that if TPM based integrity guarantees are being implemented they are only on ingress of the container into the cloud environment. I'm assuming an alternate methodology must be in place to protect against time of measurement/time of use issues.

Maybe people have better TPM2 hardware than what we have. I was going to run this on a Kaby Lake reference system but it appears that TXT is causing some type of context depletion problem which we need to run down.

> We're also planning something like this in the IBM Cloud.

I assume that if there is an expectation of true transactional times, you will either have better hardware than current generation TPM2 technology or you will be using userspace simulators anchored with a hardware TPM trust root.

Ken's mention of having 21-22 competing transactions would appear to raise problematic latency issues given our measurements.

I influence engineering for a company which builds deterministically modeled Linux platforms. We've spent a lot of time considering TPM2 hardware bottlenecks since they constrain the rate at which we can validate platform behavioral measurements. We have a variation of this work which allows SGX OCALLs to validate platform behavior in order to provide a broader TCB resource spectrum to the enclave, and hardware TPM performance is problematic there as well.

> James

Have a good weekend.

Greg

}-- End of excerpt from James Bottomley

As always,
Dr. G.W. Wettstein, Ph.D.
Enjellic Systems Development, LLC.
Specializing in information infra-structure development.
4206 N. 19th Ave.
Fargo, ND 58102
PH: 701-281-1686
FAX: 701-281-3949
EMAIL: g...@enjellic.com
--
"After being a technician for 2 years, I've discovered if people took care of their health with the same reckless abandon as their computers, half would be at the kitchen table on the phone with the hospital, trying to remove their appendix with a butter knife."
-- Brian Jones
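The timing figures in the message above lend themselves to a quick back-of-the-envelope check. The Python sketch below uses only the numbers quoted there (810, 635 and 1200 milliseconds, the 150 millisecond container start target, and 21 competing clients). The function names are illustrative, and the queueing model is a deliberate simplification: it assumes the TPM services one operation at a time in strict arrival order with no other overhead.

```python
# Figures quoted in the message, in milliseconds.
QUOTE_MS = 810
VERIFY_MS = 635
KEY_LOAD_MS = 1200
CONTAINER_START_TARGET_MS = 150


def ratio(op_ms, budget_ms):
    """How many times the latency budget a single operation consumes."""
    return op_ms / budget_ms


def max_rate_per_second(op_ms):
    """Upper bound on operations per second for a strictly serialized device."""
    return 1000.0 / op_ms


def last_client_wait_ms(clients, op_ms):
    """Time until the last of `clients` queued operations completes (FIFO assumption)."""
    return clients * op_ms


print(f"verify vs start target:        {ratio(VERIFY_MS, CONTAINER_START_TARGET_MS):.1f}x")
print(f"load + verify vs start target: {ratio(KEY_LOAD_MS + VERIFY_MS, CONTAINER_START_TARGET_MS):.1f}x")
print(f"max verifications per second:  {max_rate_per_second(VERIFY_MS):.2f}")
print(f"max quotes per second:         {max_rate_per_second(QUOTE_MS):.2f}")
print(f"21 queued verifications:       {last_client_wait_ms(21, VERIFY_MS) / 1000:.1f} s")
```

Under these assumptions a single hardware verification already exceeds the start time target by a factor of about four, and the sustainable rate is on the order of one to two operations per second, which is the gap the message is pointing at when it contrasts hardware TPMs with userspace implementations.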
Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
On Jan 30, 11:58pm, Jarkko Sakkinen wrote:
} Subject: Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global sessi

Good morning, I hope the day is going well for everyone.

> I'm kind dilating to an opinion that we would leave this commit out
> from the first kernel release that will contain the resource manager
> with similar rationale as Jason gave me for whitelisting: get the
> basic stuff in and once it is used with some workloads whitelisting
> and exhaustion will take eventually the right form.
>
> How would you feel about this?

I wasn't able to locate the exact context to include, but we noted with interest Ken's comments about his need to support a model where a client needs a TPM session for transaction purposes which can last a highly variable amount of time.

That, together with concerns about command white-listing, hardware denial of service and related issues, tends to underscore our concerns about how much TPM resource management should go into the kernel. Once an API is in the kernel we live with it forever. Particularly with respect to TPM2, our field experiences suggest it is way too early to bake long term functionality into the kernel.

Referring back to Ken's comments about having 20+ clients waiting to get access to the hardware: even with the focus in TPM2 on having it be more of a cryptographic accelerator, are we convinced that the hardware is ever going to be fast enough for a model of having it directly service large numbers of transactions in something like a 'cloud' model?

The industry has very solid userspace implementations of TPM2. It seems that with respect to resource management about all we would want in the kernel is enough management to allow multiple privileged userspace processes to establish a root of trust for a TPM2 based userspace instance with subsequent relinquishment of privilege. At that point one has the freedom to implement all sorts of policy.
Given the potential lifespan of these security technologies, I think a kernel design needs to factor in the availability of trusted execution environments such as SGX as well. Politics aside, such environments do have the ability to significantly modify the guarantees which can be afforded to architectural models which focus on using the hardware TPM as a root of trust for userspace implementations of 'TPM' functionality and policy.

We can always add functionality to the kernel but we can never subtract. It is way too early to lock security architecture decisions into the kernel.

> /Jarkko

Have a good weekend.

Greg

}-- End of excerpt from Jarkko Sakkinen

As always,
Dr. G.W. Wettstein, Ph.D.
Enjellic Systems Development, LLC.
Specializing in information infra-structure development.
4206 N. 19th Ave.
Fargo, ND 58102
PH: 701-281-1686
FAX: 701-281-3949
EMAIL: g...@enjellic.com
--
"If I'd listened to customers, I'd have given them a faster horse."
-- Henry Ford
Re: [tpmdd-devel] [PATCH RFC 0/4] RFC: in-kernel resource manager
On Jan 3, 5:21pm, Ken Goldman wrote:
} Subject: Re: [tpmdd-devel] [PATCH RFC 0/4] RFC: in-kernel resource manager

Good morning, I hope this note finds the day going well for everyone.

> On 1/3/2017 4:47 PM, Jason Gunthorpe wrote:
> >
> > I think we should also consider TPM 1.2 support in all of this, it is
> > still a very popular piece of hardware and it is equally able to
> > support a RM.

> I suspect that TPM 2.0 and TPM 1.2 are so different that there may
> be little or no code in common.
>
> My immediate need is for a 2.0 resource manager, since it's a gap in
> the technology, while 1.2 does have tcsd.

In the FWIW department.

I influence architecture and engineering for a company which builds deterministically modeled and attested computing platforms for high security assurance environments. This entity actually builds systems based on TPM1.2 and TPM2 hardware. TPM2 prototypes were being developed based on the simulator which came out of Ken's lab as soon as it was first made available.

The kernel needs a resource manager.

Everyone needs to think VERY hard and VERY, VERY carefully about what gets put into the kernel.

In making a decision, put the ABSOLUTE smallest amount of code into the kernel which allows various 'TPM2 personalities' to be implemented in userspace and functionally verified and protected by the physical instance.

The emergence of commodity TEEs (SGX, et al.) should be in the back of everyone's mind as a factor in the roadmap.

Repeat incessantly to oneself: TPM1.2 and TPM2 are only similar by virtue of sharing three ASCII characters.

DO NOT rush this process. If we do not get this right we will ultimately end up trying to shove something which is conceptually worse than tss/tcsd into the kernel.

Repeat incessantly to oneself: policy does not belong in the kernel.

Pay homage to Ken, his TSS2 and TPM2 simulator work is beyond excellent...

Greg

}-- End of excerpt from Ken Goldman

As always,
Dr. G.W. Wettstein, Ph.D.
Enjellic Systems Development, LLC.
Specializing in information infra-structure development.
4206 N. 19th Ave.
Fargo, ND 58102
PH: 701-281-1686
FAX: 701-281-3949
EMAIL: g...@enjellic.com
--
"... you should really focus more on simplifying your life. I actually spend most of my time finding ways to de-clog my brain."
-- Sarah Wettstein
   At the lake