Ethan Quach wrote:
> What I was approaching with this idea is that each subdirectory under 
> /var/share would be a separate dataset.  Yes, this could mean that there are
> lots of datasets, but datasets are relatively cheap.  If a package update
> to a BE removes that directory, it removes the underlying mountpoint which
> lives in that BE.  All other BEs continue to have it, so they all mount up
> that dataset when they boot.
>
> The datasets for the directories under /var/share will be kept in a special
> location:
>
>     <pool>/SHARE/*
>
> Each dataset under <pool>/SHARE/ will have a mountpoint of
> /var/share/<something>.  If that mountpoint exists in the BE booting up,
> that dataset gets mounted; if it doesn't, then that dataset doesn't get
> mounted.
This sounds interesting, though it makes me ask: why does it need
to be mounted in /var/share/ at all?
Why not mount it where it belongs?

For example, if a ZFS filesystem <pool>/SHARE/mail is created, why can't 
the ZFS property 'mountpoint' be set to /var/spool/mail?
Why set it to /var/share/mail, and then make a softlink named 
/var/spool/mail?

Also, right now ZFS automatically creates the directory a filesystem is
to be mounted on. How will this 'mount only if mountpoint exists'
function be implemented?
Will the special behavior apply to everything in <pool>/SHARE, or will
there be some other ZFS property ("conditionmount"?) that triggers this
behavior?
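
To make the question concrete, here is a minimal sketch of what 'mount only if mountpoint exists' could mean, assuming the check is done against the root of the BE being booted. The pool name "rpool", the dataset names, and the helper names are made up for illustration; nothing here is an existing ZFS or Snap Upgrade interface, and the sketch only selects datasets, it does not mount them.

```python
import os
import tempfile


def datasets_to_mount(shared, be_root):
    """Return the shared datasets whose mountpoint directory exists in the BE.

    shared  -- dict mapping a dataset name to its mountpoint
               (hypothetical names, e.g. "rpool/SHARE/mail")
    be_root -- directory where the BE's root filesystem is visible
    """
    selected = []
    for dataset, mountpoint in sorted(shared.items()):
        # Look for the mountpoint inside the BE, not on the running system.
        path = os.path.join(be_root, mountpoint.lstrip("/"))
        if os.path.isdir(path):
            selected.append(dataset)
    return selected


def demo():
    # Hypothetical layout: two shared datasets, but this BE only
    # delivers the /var/share/mail directory.
    shared = {
        "rpool/SHARE/mail": "/var/share/mail",
        "rpool/SHARE/games": "/var/share/games",
    }
    with tempfile.TemporaryDirectory() as be_root:
        os.makedirs(os.path.join(be_root, "var/share/mail"))
        return datasets_to_mount(shared, be_root)
```

In this sketch the BE that no longer delivers /var/share/games simply skips that dataset, while any other BE that still has the directory would select it.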

If the former, what happens if someone happens to create a ZFS filesystem
named <pool>/SHARE/foo on their own?
If the latter, why force them into <pool>/SHARE/ at all?

Other questions I've been meaning to ask about the interaction between
ZFS, Snap Upgrade, and BEs:

Can there be a ZFS pool for each BE, or only one pool?
Can there be more than one BE in the same pool, or only one BE per pool?

Assuming that both of those are allowed, can a BE have filesystems in more
than one pool?

When there is more than one pool, what happens if more than one of them
has a <pool>/SHARE/...?

   -Kyle


>
> This has the aim of decoupling the actual sharing implementation from 'pkg'
> or whatever other software delivery mechanisms there might be.  'pkg'
> doesn't have to know or do anything special with things that are delivered
> in /var/share.
>   
>
>>  >
>>  >
>>  > One key point that is essential for the functionality of /var/share
>>  > with Snap Upgrade or similar schemes is that if the packaging tools
>>  > will add/remove/modify files or directories in /var/share there needs
>>  > to be provisions for not whacking them during upgrade operations.  In
>>  > particular, the running Solaris instance should not be modified when
>>  > altering the upgrade environment and fallback needs to be possible.
>>  >
>>
>>  Precisely.  Getting 'pkg' into the loop and defining special
>>  handling for /var/share could certainly curtail the limitation
>>  described above, but that's not yet been probed or questioned in
>>  this proposal.  I hadn't been going down this route because of
>>  requirement #3 in the proposal.
>>
>>  But let's discuss this a little bit more now.  What are the special
>>  provisions that would be needed to be handled by 'pkg' (or whatever
>>  software update mechanism is used) to enable the entire /var/share
>>  directory to be shareable across BEs?  A couple that I can think of
>>  are:
>>
>>  - a 'pkg remove' from any BE can't ever remove a package's contents
>>  from /var/share
>>  - a 'pkg update' to any BE can't ever remove or change a package's
>>  contents in /var/share, but can only add to /var/share (change
>>  would include even simple changes like the permissions on a file or
>>  directory)
>>  - a 'pkg install' to any BE can't ever replace or change existing
>>  content in /var/share even if it conflicts with what's being
>>  installed.
>>     
>>
>> Here's an idea to address that.  I'm sure there are plenty of corner
>> cases that I don't cover, but it is only intended to be a start at a
>> possible approach.
>>
>> Suppose that there is a special package per boot environment.  For the
>> sake of conversation, the package for a BE named "curr" is "BEcurr".
>> This package would consist of:
>>
>>  - The definition of the BE (file systems, other meta data)
>>   
>>     
>
> Right now, there is no BE meta data.
>
>   
>>  - All files that are to be synchronized between boot environments
>>
>> With SysV packages today, the same file or directory may be delivered
>> or referenced by several packages.  For example, on a Solaris 10 box I
>> see that /usr/sbin/sysidnfs4 is delivered by SUNWnfscu and SUNWadmap.
>>   
>>     
>
> This is a bug, probably due to out-of-order patching.  In S10 FCS,
> /usr/sbin/sysidnfs4 was delivered by SUNWnfscu, but as of S10U4, it moved
> over to SUNWadmap.  It shouldn't be owned by both.
>
>   
>> To say that /etc/passwd needs to be sync'd from a given boot
>> environment, a reference to /etc/passwd is added to the (initially
>> empty) BEcurr package through installf or its replacement.
>>
>> To deal with /var/share:
>>
>>  - When the next boot environment is created it has BEcurr as an
>>    installed package and BEnext is created as a copy of BEcurr.
>>  - When a package adds a file or directory to /var/share, the
>>    BE<whatever> package associated with the target BE is updated to
>>    reference that file.
>>  - When a package is removed, if removal of a file or directory causes
>>    only one reference to that file/directory to remain and that
>>    reference matches the BE currently being operated on, then that
>>    reference and file/directory should be removed.
>>  - Conflicts are handled according to the same policy that would be
>>    used to handle conflicts between packages that are installed.
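
The removal rule in the list above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: one in-memory table stands in for the package database, and the class, the method names, and the package name "SUNWmail" are hypothetical, not part of 'pkg' or the SysV tools.

```python
class ShareRefs:
    """Tracks which packages (including BE* packages) reference a path."""

    def __init__(self):
        self.refs = {}  # path -> set of package names referencing it

    def add(self, path, owner):
        # A package (or a BE package) starts referencing the path.
        self.refs.setdefault(path, set()).add(owner)

    def clone_be(self, src_be_pkg, dst_be_pkg):
        # A new BE package starts out as a copy of the current one.
        for owners in self.refs.values():
            if src_be_pkg in owners:
                owners.add(dst_be_pkg)

    def remove_package(self, path, pkg, current_be_pkg):
        """Drop pkg's reference; return True if the path should be deleted."""
        owners = self.refs.get(path)
        if owners is None:
            return False
        owners.discard(pkg)
        # Delete only when the sole remaining reference belongs to the
        # BE currently being operated on.
        if owners == {current_be_pkg}:
            del self.refs[path]
            return True
        return False
```

With references from "SUNWmail", "BEcurr", and "BEnext", removing "SUNWmail" while operating on "BEcurr" leaves the path in place, because "BEnext" still references it.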
>>
>> A mechanism to keep all the BE* packages in sync between the boot
>> environments needs to exist.  Most likely this would involve having a
>> shared transaction log between all the boot environments.  If a BE is
>> mounted or activated, the pending BE reference and unreference
>> operations shall be applied to the newly mounted or activated BE.
>>
>> When a BE is destroyed, the corresponding BE package is removed from
>> all boot environments.  For any BE that is not mounted, that would be
>> a delayed operation through the transaction log.  
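
One way to picture the shared transaction log described above is an append-only list plus a per-BE cursor, replayed when a BE is mounted or activated. The sketch below assumes that shape; the class, the operation names "reference" and "unreference" as strings, and the BE names are illustrative only.

```python
class SharedLog:
    """Append-only log of reference operations shared by all BEs."""

    def __init__(self):
        self.entries = []  # list of (op, be_pkg, path) tuples
        self.applied = {}  # BE name -> number of entries already applied

    def record(self, op, be_pkg, path):
        # op is "reference" or "unreference" in this sketch.
        self.entries.append((op, be_pkg, path))

    def pending(self, be_name):
        # Entries recorded since this BE last replayed the log.
        return self.entries[self.applied.get(be_name, 0):]

    def replay(self, be_name, state):
        """Apply pending operations to a BE's view (path -> set of BE pkgs)."""
        for op, be_pkg, path in self.pending(be_name):
            if op == "reference":
                state.setdefault(path, set()).add(be_pkg)
            elif op == "unreference":
                state.get(path, set()).discard(be_pkg)
        self.applied[be_name] = len(self.entries)
        return state
```

A BE that is not mounted simply accumulates pending entries; calling replay() at mount or activation time brings its view up to date. Every BE still has to replay the log eventually, so the cost grows with the number of BEs.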
>>     
>
> We're trying to avoid needing BE meta data altogether for the very reason
> of having to keep it synced between every BE.  If we were to require meta
> data at all though, it'd be best to keep all of it in a shared dataset.
>
> As opposed to LU where a system typically has no more than a few BEs, BEs
> with Snap Upgrade will be plentiful, so syncing meta data across them all
> is quite unattractive.
>
>
> -ethan
>
> _______________________________________________
> caiman-discuss mailing list
> caiman-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
>   

