> I'm counting on kmalloc to return a cache aligned buffer.  I found
> some reason to think it does, but I don't remember offhand what that

It's defined to

> reason was, or if it's configurable per-architecture.  The buffer has
> to be both physically and virtually contiguous, I was tempted to just
> allocate a page and waste some space but we've got 64K pages, so I'm a
> bit more sensitive about that.

OK, I was expecting a different approach. If you mark the field with the
magic ____cacheline_aligned tag after it (i.e. int foo ____cacheline_aligned;),
the compiler should align it all for you, which is probably cleaner if
it works.
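
Roughly what I mean, as a userspace sketch rather than real driver code.
The demo_ names, the struct layout and the 64-byte line size are all made
up for illustration; in the kernel the tag comes from <linux/cache.h> and
the line size is chosen per architecture.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. The kernel picks the real value per
 * architecture; this constant is only a stand-in for the demo. */
#define DEMO_CACHE_BYTES 64

/* Userspace stand-in for the kernel's ____cacheline_aligned tag, built
 * on the GCC/Clang aligned attribute. */
#define demo_cacheline_aligned __attribute__((__aligned__(DEMO_CACHE_BYTES)))

/* Hypothetical structure: the tag goes after the field name. */
struct demo_dev {
	int flags;
	unsigned char buf[100] demo_cacheline_aligned;
	int tail;
};

/* The compiler pads in front of buf so it starts on a line boundary... */
_Static_assert(offsetof(struct demo_dev, buf) % DEMO_CACHE_BYTES == 0,
	       "buf must start on a cache-line boundary");

/* ...and raises the alignment of the whole structure to match. */
_Static_assert(_Alignof(struct demo_dev) == DEMO_CACHE_BYTES,
	       "struct must inherit the field's alignment");

/* Returns 1 if p sits on a cache-line boundary, 0 otherwise. */
static int demo_is_cache_aligned(const void *p)
{
	return ((uintptr_t)p % DEMO_CACHE_BYTES) == 0;
}
```

Note the tag only aligns the start of the field. Here tail is placed right
after buf, so it can share buf's last cache line unless you pad or align
the following member as well.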
