From: Bryan Green <[EMAIL PROTECTED]>
    Date: Thu, 07 Dec 2006 19:56:46 -0800
[...]    
                   By comparison, 'mount -t lustre' pretty much characterizes the
    simplicity of 1.6.
    
Agreed.

    > Are you worrying about the kernel patching and other software installation
    > issues, or about how to set up the fs itself once you've got the software
    > together? 
    
    Kernel patching.  For software installation, the lustre ebuild that was put on
    this list recently seemed to do the trick for me, and setup was pretty easy.

Yeah, I think that ebuild came from us.
    
    I was able to patch the kernel, but the server was somewhat unstable.  

Do you remember how it was unstable?  That's the kind of thing I'd very much
like to understand, as we're proposing to depend heavily on it.  If there are
issues, whether specifically tied to our patches or not, I'd love to know
about them.

                                                                           
    Actually, my memory is hazy.  I used the 'lustre-sources' ebuild, which
    effectively packaged up the patches.  It was a 2.6.15 kernel.  I also tried to
    make a custom kernel for lustre 1.4, but ultimately hit too many roadblocks.
    I did learn a bit about how to use 'quilt' though.
    
Hmmm.  Maybe not.  Our stuff ditches quilt.

    > 
    > Very briefly, the kernel-patching issue is an ongoing headache.  Lustre
    > patches vfs in non-trivial ways.  Unfortunately, everybody else does too.  It
    > becomes a fairly ugly patch-merging problem.  If you want, I can detail the
    > process I've settled on for coming up with a kernel patchset, but you won't
    > like it.  There are similar issues around ldiskfs and other bits, but they're
    > simpler, at least by comparison.
    
    I'd be interested in some of the details - off-list if that is more appropriate,
    though it might be of interest to others on the list as well.  Once you download a
    1.6 beta, how do you produce a kernel for Gentoo?  Do you patch a gentoo-sources
    kernel, a vanilla-sources kernel, or something else?  The ideal would perhaps be
    to have a 'lustre-sources' ebuild in the gentoo-science overlay.  :)
    
We can start here and if people get sick of hearing about it, take it
someplace else.

The approach taken by most of the patches in lustre/kernel_patches/patches is,
for any particular base kernel, to go through and add the datastructures and
logic to implement the lustre-specific functionality, which involves changes
to core vfs datastructures, sometimes changes in locking strategy, changes to
arglists etc.  They generally start with RHEL or SLES kernels.  There are a
couple of problems with that for the rest of us: (a) the RHEL and SLES kernels
tend to be a bit antiquated, and (b) the vendors also tend to make quite free
with the patches to core datastructures.  Some of the latter is actually due
to the former: because they're using antique kernels but want some bits
of the latest and greatest fixes, they selectively import more modern stuff
as their own vendor patches.

The result of all this is that the layer of patches to implement lustre
functionality, when viewed from the point of view of an unpatched kernel,
makes no sense at all.  If you try to install such a patchset on a vanilla-ish
kernel, even if you get the right base version, you'll get tons of rejects,
and when you look at them, it's obvious that they depend on stuff that's not
there.

The way I settled on getting to a patchset which doesn't depend on all kinds
of RHEL or SLES patches was to essentially build a RHEL (fc5, if I recall)
kernel, then "subtract out" the RHEL-ness, then take the resultant kernel and
diff it against a virgin one.  That description covers a multitude of sins.
Subtracting out the RHEL-ness (by essentially doing patch -R, then cleaning up
the mess) has the inverse of many of the same problems that you get trying to
patch lustre on top of a vanilla kernel: arglists don't match etc.  The only
piece of good news is that at that point you have three datapoints to work
with (vanilla, RHEL, and RHEL+lustre), so it's rather easier (though not
exactly easy) to divine what the intention of the lustre patches is, and work
out how to do the analogous thing without RHEL.  Even at that, I had to be
wary of some bits of code which disappeared in the RHEL transition but came
back when I backed out RHEL; those needed to be given the same treatment as
other analogous bits of code which were still there.
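
To illustrate the three-datapoint idea, here is a minimal Python sketch.  It is
hypothetical, not the process actually used above: it assumes each tree
(vanilla, vendor, vendor+lustre) has been loaded as a mapping from file path to
contents, and the function name and category labels are invented for the
example.  It only sorts files by which layer touched them; the actual merging
still has to be done by hand.

```python
def classify(vanilla, vendor, vendor_lustre):
    """Sort files by which patch layer touched them.

    Each argument maps a relative file path to that file's contents in one
    tree.  A missing path is treated as content None, so files added or
    removed by a layer count as changed by that layer.
    """
    result = {"collision": [], "lustre_only": [], "vendor_only": [], "untouched": []}
    for path in sorted(set(vanilla) | set(vendor) | set(vendor_lustre)):
        base = vanilla.get(path)
        mid = vendor.get(path)
        top = vendor_lustre.get(path)
        vendor_changed = base != mid   # vendor patches touched this file
        lustre_changed = mid != top    # lustre patches touched this file
        if vendor_changed and lustre_changed:
            key = "collision"          # lustre change sits on top of vendor change
        elif lustre_changed:
            key = "lustre_only"        # lustre delta likely applies to vanilla as-is
        elif vendor_changed:
            key = "vendor_only"        # can be dropped when backing out the vendor
        else:
            key = "untouched"
        result[key].append(path)
    return result
```

Files that land in "collision" are the ones where the lustre patch was written
against vendor-modified code, so those are where the rejects and the hand work
would be expected to concentrate.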

The bottom line is that you have to understand enough about what the lustre
patches are accomplishing that you can come up with analogous patches for the
kernel of your choice, which happens not to be one of the ones cfs ships
patchsets for.

The first time I did all that stuff it took something like 3-4 weeks, with
numerous false starts.  The most recent time I did it, it was something like a
couple of days, though that's misleading, because it was very close to the
previous version I was upgrading from.  If I had to do it today, starting from
scratch, I'd estimate 3-5 days.

You'll note that nowhere in that set of stuff did I utter the word "gentoo".
The kernel we're using is not really a gentoo kernel.  We're mips-based, so
we're starting from something that's perilously close to the mainline
linux-mips kernel, then building it up from there.  Thankfully, the linux-mips
guys don't go in for heavy-duty patching of non-platform-related stuff, so
from the point of view of adding lustre to it, it's virtually identical to a
vanilla kernel.org kernel.  I believe we may have pulled in a small number of
the gentoo kernel patches, but I'm not the kernel wizard, so don't know off
the top of my head.  From my point of view, it looks vanilla.

It's not clear to me how you'd go about making lustre installation work in a
more gentoo-ish kind of way, at least not without a very large amount of work.
I guess I think the most likely path forward would be to work with cfs to try
to get them to support more vanilla kernels, then try to work on the rest of
the gentoo kernel patches to make them fit better.  Unfortunately, I suspect
that that still isn't going to be easy, as you've got the classic
patch-collision problem happening all over the place.  I suspect that
following that approach would end up with two parallel streams of patches, one
for lustre kernels and one for non-lustre kernels.  Unless you can get the
gentoo community to roll lustre in as a standard part of the gentoo patchset.
That probably requires that somebody do a lustre patchset for every kernel
version.  Unlikely.

You could, of course, invert the problem and layer lustre on top, but until
such time as gentoo is much more prevalent, I doubt you'll get cfs to do that,
which means that somebody in the gentoo community gets signed up for the task
of re-doing the process I outlined above, for every gentoo kernel which comes
down the pike.  I'm not holding my breath for that one either.

A longer term solution is to do some combination of remodularizing vfs and
recasting the lustre stuff so as to depend less on getting its fingers into
the guts.  I once spent some time looking into that, and I do believe it's
possible, but it would take some work, and would really need to be done in
concert with the rest of the core kernel guys, and I ran out of time to pursue
it.  In the meantime, the more the gentoo community can resist the temptation
to patch the kernel (at least the vfs parts of it), the easier it will be to
add lustre.

Separate from the core kernel patching issues (Hah!  you thought I was done,
didn't you?) there's stuff around ldiskfs.  The strategy used by lustre is to
grab a copy of ext3, cart it off to the side, change all the names, insert a
few other strategically placed patches, and call it ldiskfs.  That then
becomes the basic facility by which actual bits are stored on block devices.
The issue there is roughly similar to the core kernel one, but not as severe:
any given patchset depends heavily on which specific version of ext3 you
started from.  Update the kernel, and if it contained fixes to ext3, you've
got a problem.
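
The "change all the names" step can be pictured with a small Python sketch.
This is an illustration only: the function name is made up, and whether the
real lustre build does the renaming this way is an assumption.  It rewrites
every occurrence of ext3 in a piece of C source text, keeping all-uppercase
occurrences uppercase so that macros stay macros.

```python
import re


def rename_ext3(source, new_name="ldiskfs"):
    """Return source with ext3 identifiers renamed, preserving upper case."""
    def swap(match):
        word = match.group(0)
        # EXT3 (macros, constants) becomes LDISKFS; anything else becomes ldiskfs
        return new_name.upper() if word.isupper() else new_name

    return re.sub(r"ext3", swap, source, flags=re.IGNORECASE)
```

For example, a declaration of ext3_get_block would come out as
ldiskfs_get_block, and EXT3_MAGIC as LDISKFS_MAGIC.  The strategically placed
patches then go on top of the renamed copy, which is why they are tied to the
exact ext3 version that was copied.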

In practice, this issue tends to be swamped by the core kernel one: getting
lustre going on a specific kernel binds you so tightly to that kernel that you
don't have to worry too much about changing ext3.  But at such time as the
kernel integration issue becomes easier to deal with, this one will have to be
addressed as well.  My preferred solution would be to simply snag a copy of
ext3 that works, do the foozling once, then make that code be a permanent part
of the lustre distribution, rather than relying on constructing it on the
fly.  But that's up to cfs.

So anyhow, the short answer is that there's no real rocket science involved in
getting lustre to work on a gentoo system, but it does take some work, and if
you do it the way I did it, you end up with a system which is more constrained
than a normal gentoo system, because you're no longer free to update the
kernel using the stock tools.  For us it's not a huge deal, but I suspect that
some of the gentoo community will balk at that.
[...]
    
    Are you considering getting support from CFS at some point?  

We are working with cfs.  That doesn't mean they're doing all our work for us
:-} 

Honestly, a big part of it is just plain old market sensitivity.  Cfs is
paying attention to where their bread and butter is.  So far, that's not
gentoo.  Perhaps if sicortex is wildly successful we'll be able to change that
equation :-}
                                                                 
    Sorry, you don't have to answer if that is a sensitive question.  But part of
    this thread has been the topic of encouraging CFS to support Gentoo.
    Interestingly, my colleague, who is in charge of installing Lustre (1.4) on our
    test system, is talking to CFS about supporting a vanilla kernel configuration.
    The reason?  We can't make the system stable with a SLES kernel.  It was stable
    for a long time with Gentoo.

I have not observed stability problems; it pretty much just works.  If you can
say any more about what issues you ran across, I'd love to hear it.

    Now they seem to have gotten it stable with SLES plus a vanilla 2.6.19 kernel
    (which of course does not have the Lustre patches).  So they want Suse to provide
    a newer SLES kernel with the Lustre patches, and CFS to support that configuration.
    
Well, ok, I dunno what to tell you about working with the vendors on that
one. 

We actually did consider running RHEL or SLES kernels, but remember we're
mips, and looking at the state of the mips support in those kernels, it was
not a pretty picture.  We also didn't really want to be in the game of having
that much of a frankenstein system.  So our approach has boiled down to 

1.  Stick close to vanilla
2.  Make mips work
3.  Do whatever we need to do to make lustre layer on top of that

Based on what you've said, I wouldn't fool around with SLES, I'd just figure
out what close-to-vanilla kernel you want to start from (picking one you think
you can live with for a while) and do some part of what I described above.
You might have a somewhat easier time of it if you started with 2.6.18, as I
believe there's a cfs-supplied patchset for that one.  If you want to start
from a gentoo 2.6.18 one, I suspect your task will be to start with vanilla,
make that work, then work out how to re-apply the gentoo patches.  Re getting
cfs to help, my bet would be that you'll have an easier time getting the
gentoo community to create patches that are amenable to going on top of a
lustre-ized vanilla kernel (and relying on cfs to support vanilla kernels)
than you will getting cfs to generate patches to go on top of gentoo.  If you
watch the lustre lists, you'll see more people asking for vanilla than are
asking for gentoo.

Under no circumstances would I advocate getting a kernel working at some
level, then trying to use the kernel.org patches, or anybody else's, to move
it forward.  I tried that a few times, and while I actually did find a couple
of combinations that worked, most of the ones I tried blew up in my face.
It's the same problem; there's all kinds of activity going on in vfs.  I hope
that situation doesn't continue indefinitely, but that's the way it seems to
be right now.

I've gone on long enough for now.  Feel free to dig deeper if you dare :-}
-- 
[email protected] mailing list
