I am well aware that Ali and Co have managed to beat incremental volume dump/restore into submission, but let's step back and look at the history here.
First of all, the only reason they have struggled with it is that I built VMS around incremental vos dump/restore as the distribution mechanism, and changing that means essentially rewriting VMS. IOW, they had no choice: I boxed them in. I am NOT a fan of that mechanism at all, even if the various edge conditions have been addressed, so I am strongly biased against it. For years I've regretted that choice, and I have to say it's very high on the list of things I want to change.

Its use would also be at odds with other things I want to change in the design. For example, we designed VMS when the rule of thumb for volume sizes was a few hundred MB. MEGAbytes!? Hello, 1980s.... We made each and every "install" into a separate volume, and I think that proved to be overkill 10 years later. VMS drove the number of volumes far higher than it needed to be. I would like to (as documented) use mount points at the metaproj/project/release directories, and nowhere else.

I would also like to be able to limit which platforms (i.e. installs) are shipped to which cells. There are two ways we need to filter what gets shipped around: we need to be able to control which metaprojs/projects are sent to which region/campus/location, and we also need to be able to define the list of platforms supported by each cell. These are two key VMS features that are missing from EFS (and we desperately need the first one in EFS 2). If you use incremental vos dump/restore, AND you use release-level volumes instead of install-level volumes, then filtering at the platform level won't be possible. That's not a long-term edge condition, either. We already have support for 4 major Linux/BSD distributions, and I hope to see that number grow.

Note that regardless of how we end up distributing the install contents (rsync, vos dump/restore, or something else), I will NOT use vos dump/restore for the "container" volumes like we did in VMS. That proved to be a serious scalability and functionality issue in VMS. Here I'm referring to the volumes for the metaproj/project directories. Those will be managed by creating the volumes locally in each cell (using a globally consistent name) and managing the contents directly (see the first sketch below). In EFS there are NO defaults, so those containers are pretty boring. The metaproj volume has NOTHING but mount points for the projects. The project volumes have NOTHING but mount points for the releases. The releases contain the real data, in directories, files, etc.

The issue is going to come down to finding a directory replication mechanism that has full support for OpenAFS ACLs. That was one of the reasons we used vos dump/restore in VMS, but I would rather spend a week extending rsync to support OpenAFS than live with the limitations of vos for this particular functionality (the second sketch below shows roughly what I mean).

Finally... Note that an EFS domain is not where you decide to use OpenAFS; you make that decision when you define a cell. That is, one EFS domain can support cells of multiple fstypes. This is ESSENTIAL functionality if you want to be able to help sites migrate from NFS to OpenAFS (and of course, we do). That really kills vos for distribution, because you really want a common mechanism that can work for inter-cell distribution when you have a hybrid environment of NFS and AFS. I expect that to be a moot point for early adopters of OpenAFS, since they will already have OpenAFS cells, and we'll be figuring out how to co-exist with what they already have.
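To make the container-volume idea concrete, here is a rough sketch of the layout I have in mind. The cell, server, partition, volume names and paths are purely illustrative (they are NOT final EFS naming or path conventions), it assumes the /afs/.example.org/efs/dist tree already exists, and replication/vos release is omitted for brevity:

    # Container volume for a metaproj, created locally in each cell under a
    # globally consistent name; its only contents are project mount points.
    vos create afs1.example.org /vicepa mp.tools
    fs mkmount -dir /afs/.example.org/efs/dist/tools -vol mp.tools

    # Container volume for a project; its only contents are release mount points.
    vos create afs1.example.org /vicepa mp.tools.gcc
    fs mkmount -dir /afs/.example.org/efs/dist/tools/gcc -vol mp.tools.gcc

    # Release volume: the only level where real directories and files live,
    # and the only level where bulk data moves between cells.
    vos create afs1.example.org /vicepa mp.tools.gcc.4.5.1
    fs mkmount -dir /afs/.example.org/efs/dist/tools/gcc/4.5.1 -vol mp.tools.gcc.4.5.1

That's it for the containers: no defaults, no data, just mount points, so they can be created and maintained directly in every cell without any dump/restore.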
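And to be clear about what I mean by "extending rsync to support OpenAFS": the real work would happen inside rsync itself, but even a crude wrapper shows the shape of it -- copy the data with rsync, then walk the source directories and re-apply their AFS ACLs with fs listacl/fs setacl. A rough, purely illustrative sketch (it ignores negative rights, assumes both trees live in AFS, and is definitely not the eventual implementation):

    #!/bin/sh
    # Usage: acl-rsync SRC DST   (both directories must be in AFS)
    SRC=$1
    DST=$2

    # 1. Copy the file data; AFS ignores most of the mode bits anyway.
    rsync -a --delete "$SRC"/ "$DST"/ || exit 1

    # 2. Mirror each source directory's "Normal rights" entries onto the
    #    corresponding destination directory. Note this only adds/updates
    #    entries; it does not clear rights that exist only on the destination.
    cd "$SRC" || exit 1
    find . -type d | while read -r dir; do
        fs listacl "$SRC/$dir" |
            sed -n '/^Normal rights:/,/^Negative rights:/p' |
            grep '^  ' |
            while read -r who rights; do
                fs setacl -dir "$DST/$dir" -acl "$who" "$rights"
            done
    done

Doing that inside rsync proper, so the ACLs ride along with the delta transfer and deletions are handled sanely, is the "week of work" I was referring to -- but the point is that nothing about it requires vos.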
I would hope that sites that adopt EFS and build out NFSv3 or NFSv4 infrastructure would eventually be able to upgrade to OpenAFS. Now, if I may fantasize for a bit -- my dream for 2-3 years from now is to have a complete set of management tools in EFS for managing OpenAFS, going even as far as automating the KDC management (look at what EFS 3 does for SSL and ssh keys to get an idea of the total world-domination approach that is possible -- EFS owns the entire SSL infrastructure it depends on). As the product grows and evolves, that's the ultimate end-game vision that drives me.

On Sat, Sep 25, 2010 at 12:11 AM, Steven Jenkins <[email protected]> wrote:
> On Fri, Sep 24, 2010 at 1:28 PM, Phillip Moore
> <[email protected]> wrote:
> > I spent the morning working on this brain dump:
> > http://docs.openefs.org/efs-core-docs/DevGuide/Future/OpenAFS.html
> > Right now, there's a race on between the two sites that are trying to be
> > the first to bring up an EFS 3 domain, outside of my team (actually --
> > the first other than me, personally, I think).
> > One of those sites wants to use EFS for OpenAFS environments, and there
> > is nothing I personally want more than to get to work with OpenAFS
> > again, so I'm rooting for them :-P
> > Anyway, check out the doc if you have a few minutes. It's FAR from
> > complete, and really just a brain dump, but it touches on the bulk of
> > the issues we're going to have to figure out in order to make this work.
> > And make it work is precisely what I intend to do....
>
> I've put together some thoughts. I'd be happy to provide a git diff
> of your doc if you prefer. Otherwise, read on...
>
> Disclaimer: this was a brain-dump. There is clearly more work/thinking
> that needs to occur.
>
> The 3 inter-related issues of EFS domains, Kerberos realms and uid/gid
> mappings across domains and realms are really about stitching those
> together. There are several projects in OpenAFS (existing or underway)
> that will probably help address this:
>
> - Existing cross-(Kerberos-)realm work in OpenAFS. Currently, there is
> some limited cross-realm support (documentation is in the krb.conf and
> krb.excl man pages for OpenAFS). By using that support, names from
> foreign realms can be treated as local. Assuming all AFS ids are
> in sync across each cell, one can then configure each cell to trust the
> other cells (assuming that a cell maps to a Kerberos realm).
>
> That's a pretty unrealistic scenario. It might be useful in some
> cases, but it won't be useful in the assumed more general case where
> the cells have not been centrally managed and thus AFS ids are not
> in sync across cells.
>
> - PTS extensions: cf.
> https://datatracker.ietf.org/doc/draft-brashear-afs3-pts-extended-names/
> to provide a mechanism to map among AFS cells and Kerberos realms,
> as well as help with inconsistent uids in those cells & realms.
>
> Based on that, we should be able to view N realms as a logical whole
> (e.g., by first defining a 'canonical' mapping of uids, then building
> a mapping database. Note that with the krb.excl file, we can exclude
> ids in certain realms, so a migration could conceivably happen in
> a controlled fashion).
>
> At a rough glance, these will get us very, very close. Someone should
> touch base with Derrick and do some proof-of-concept mappings to verify
> this. I don't know the status of that work -- Derrick mentioned today
> that he has some code, but it's not ready for anyone else to start
> playing with.
> - Various people have played with using AD (or LDAP) as the backing
> store for PTS. There are also other ways to solve this problem that
> have been discussed but aren't necessarily in the 'here's some code'
> stage. These projects may well play a part in a solution set to the
> cross-realm, cross-cell, inconsistent uid namespace problem.
>
> Misc notes:
>
> 1- We shouldn't require uid/gid consolidation to occur as a prerequisite
> to adopting OpenEFS -- the organizational maturity bar for that is
> simply too high.
>
> 2- It's worth writing up a sample document describing how we want
> migration from your current multi-cell, multi-realm environment to EFS
> to occur.
>
> Creating mount points: more recent versions of OpenAFS (i.e., virtually
> anything that will be found in production) support dynamic mounting of
> volumes via a magic syntax (the exact syntax escapes me at the moment --
> I've not used it but only seen it mentioned a few times and was
> unable to locate the actual syntax)
>
> e.g.,
> /afs/example.org:user.wpmoore
>
> would be a path that would automagically mount the volume user.wpmoore.
>
> So the necessary fs mkmount could be done as follows:
>
> fs mkmount /afs/example.org:user.wpmoore/some/path some.volume
>
> without requiring any special pre- and post-mount hacking.
>
> rsync: my understanding is that incremental vos dump/restore
> is quite a bit better now (if I recall correctly, in 2008, Ali Ferguson
> from Morgan said at the OpenAFS workshop that Morgan was using it and
> that he was confident that the bugs had been shaken out of it, but that
> was after numerous failed attempts over the years).
>
> It would be useful to track the pros and cons of volumes being per
> domain or per cell. I don't think most people can make an argument
> either way (i.e., the number of people who can seriously discuss the
> tradeoffs is, well, tiny). I think we need to translate this into more
> 'non-insider language', so that the various users can weigh in on how
> they would need this to work. Off the cuff, I don't see why anyone
> would really want per-domain over per-cell, but there may be some
> failure scenarios where it would be useful.
>
>
> Steven
