A few thoughts here:
#1: This is a fast track. If we are going to start insisting that the
project team make significant changes to the project (especially changes
to which the project team doesn't readily agree), then the project
should probably be derailed.
#2: The only interesting consumer *right now* is PKCS#11. I think
enabling PKCS#11 in the global zone is useful enough that it ought to
be allowed to proceed even if the final solution isn't quite what we
want. This is especially true since PKCS#11 by its very nature shields
applications from changes to underlying plumbing that might be necessary
later.
#3: I propose that in the meantime, the TSS API be reduced to
Consolidation Private binding, if not already at that level. Since
there are no consumers yet, this seems fairly reasonable and unlikely to
cause undue harm to projects.
#4: If the only reason that the daemon exists is to serialize access to
the TPM device (and I confess I'm not entirely convinced that this
assertion is true), then at some point in the future, It Would Be Nice
if the daemon could be eliminated, replaced with a fully zone-aware and
concurrent version of the TPM driver. I'm not sure what challenges that
would involve, but hopefully the project team can elucidate them when
the time comes.
So, in the meantime, I think we need to either move ahead with a global
zone only solution (for now) and hope the project team will follow up
with a future case that addresses the virtualization problem, or we need
to take the first option and derail.
Personally, I'm against derailing and in favor of letting this case
through with incomplete support for virtualization for now. If another
member feels differently, he can (and perhaps should!) press that derail
button.
-- Garrett
James Carlson wrote:
> Edward Pilatowicz writes:
>
>> On Mon, Dec 01, 2008 at 04:00:30PM -0500, James Carlson wrote:
>>
>>> We end up being forced to support that wart going forward.
>>>
>>>
>> eventually if the tpm device is virtualized (or another communication
>> mechanism comes along that makes the zones support seamless) then we
>> just EOL this attribute and zonecfg/zoneadm can ignore it if it's
>> present.
>>
>
> The question I'm asking is whether it's worth expending effort to put
> that one-zone-at-a-time hack in place rather than putting in a more
> lasting TSS-IPC-to-the-global-zone mechanism.
>
>
>>> Since it's clear that (for whatever reason) the device driver "can't"
>>> be fixed to behave reasonably, I think that means you need a different
>>> IPC in order to be able to use it from within a non-global zone.
>>>
>>>
>> i don't recall anyone saying that this device "can't" be virtualized.
>> (i do recall an offline discussion where the project team said they were
>> on a tight schedule and attempting to do this now would compromise
>> that schedule, but i don't recall anyone saying that this is not
>> technically possible.)
>>
>
> At least in the on-line discussion, this seems to be the case. The
> driver is intentionally not set up to handle more than one open stream
> at a time, and that's why they have that daemon. Why bother with the
> daemon to coordinate requests if the driver can do it?
>
>
>>> This won't be the first "doesn't play nicely with Zones" feature in
>>> the system. I agree that it's not good that it doesn't, but I'm much
>>> less sure that the 'assign it to one zone' approach is useful enough
>>> that it's both interesting and worth the effort compared to a real
>>> solution.
>>>
>>>
>> well, the existence of poorly integrated features is hardly a
>> justification for introducing new features with poor integration.
>>
>
> Agreed. I just don't think that it's an excuse to hack around the
> problem, either, and I think the one-zone assignment idea is a hack.
>
>
>> i guess i thought that providing this type of 'one zone' integration
>> would be pretty trivial (requiring only some zonecfg/zoneadm changes)
>> compared to virtualizing the tpm device driver.
>>
>
> I doubt it's substantially more complex than providing (say) a door or
> AF_UNIX socket in each zone that provides redirection to the
> global-zone resident daemon.
>
>
>>> Once we fix TSS so that it can be accessed normally from within a
>>> non-global zone, what does that attribute do? It doesn't disable TPM
>>> in the global zone anymore.
>>>
>>>
>> once TPM/TSS zones support is fixed, the attribute can be ignored.
>>
>
> Yuck.
>
>
>> also, the presence of the attribute wouldn't disable tpm in the global
>> zone, the admin has to do this themselves. (just like today when you
>> try to boot a zone that uses dynamic pools, if poold isn't running
>> zoneadm doesn't start it, instead it tells the admin that they need to
>> enable it.)
>>
>
> That sounds like a recipe for confusion.
>
> I would actually expect that the global zone's daemon would start
> early enough in the boot process that it would _have to_ be disabled
> administratively in order to make use of that one-zone-hot mechanism.
> Otherwise, it would take exclusive control, and the zone assigned to
> use the TPM would fail. (Either block on open() or fail out; not
> clear which.)
>
> The upgrade scenario would involve reenabling the global zone's
> service and ignoring those defunct parameters.
>
>
>> that said, we could always just forgo this kludge and wait for the full
>> feature integration. the project team is now aware of the deficiencies
>> in their currently proposed zones integration, so it seems like this is
>> something we'll see them address in a future case.
>>
>
> If we're going to specify or even mandate something from the ARC level,
> I'd rather that it be a clean solution rather than a kludge that needs
> to be removed later.
>
>