On Tue, 16 Dec 2008, Peter Wemm wrote:


let me Cc: virtualization and current@ for this reply (to have the
explicit HEADS UP) for what Peter pointed out.

The first to reply may want to trim the Cc: list again; possibly to
only virtualization.

On Sat, Dec 13, 2008 at 1:59 PM, Bjoern A. Zeeb <b...@freebsd.org> wrote:
 De-virtualize the MD5 context for TCP initial seq number generation
 and make it a function local variable like we do almost everywhere
 inside the kernel.
--- head/sys/netinet/vinet.h    Sat Dec 13 21:17:46 2008        (r186056)
+++ head/sys/netinet/vinet.h    Sat Dec 13 21:59:18 2008        (r186057)
@@ -142,7 +142,6 @@ struct vnet_inet {
       int     _isn_last_reseed;
       u_int32_t _isn_offset;
       u_int32_t _isn_offset_old;
-       MD5_CTX _isn_ctx;

       struct  inpcbhead _udb;
       struct  inpcbinfo _udbinfo;

I'm bitterly unhappy with this.  Every time these structs are touched,
either directly or indirectly, there is a guaranteed ABI breakage with
kernel modules.

There needs to be a __FreeBSD_version bump (or something similar)
every time any of these structures change, and any kernel modules
*must* be prevented from loading.  It can't be a >= some version, it
has to be an exact match.

With the global variable method the linker calculates the offsets at
load time.  With this abomination, the knowledge of the structure
layout is compiled into the generated code with no chance of a fixup.
There are no sanity checks.  If a module that referenced _isn_ctx is
loaded the old way, there would be a link error.  The new way will
have it silently trash _udb instead.

There is a whole world of hurt being unleashed here.  I suspect that
we might even be possible to run out of digits in our
__FreeBSD_version numbering scheme.

I know we've talked about ABI-stable alternatives after the
infrastructure is done, but it needs to be absolutely clear that
touching this structure in the current form is a guaranteed ABI break,
and is currently undetected.  If you boot kernel.old, you're hosed.
(Again, with -current, this might be ok for a while, but this scheme
won't wash with ports or other 3rd party modules)
In the mean time, I'd like to see some compile-time asserts in there
to make sure there are no accidental size changes of this structure.

Ok, there are multiple things here:

Size changes:

* I think catching pure size changes alone are not enough; any
  changes to the structs matter.
  Think of removing an int and adding an int (even at the same
  place). Something working on this will trash it.

* The size changes of course would guard about non-obvious, indirect
  struct changes like (just an example) a change to struct inpcbinfo
  that make me worry even more (like they seem to worry you).

More on structure changes:

* While this is on HEAD (refering to "Again, with -current, this might
  be ok for a while ..") I expected the following (major) changes
  coming with the continuing integration and testing:

  1) Final passthrough on the set of virtualized variables. That might
     happen after Step #3 when people can actually test of SVN.

  2) During Benchmarking - this would be about the same time - there
     might be shuffling.

  3) Before we turn to a STABLE branch. *fear*

* Again it's the indirect changes as said above (which are very

ABI problem:

* I agree that there are unaddressed challenges here.

* Ideally we want version checking - at least at load time for modules -
  per container struct, as people do not want to change and recompile
  everything if say something in gif or ipsec changes which might not
  affect them. netgraph has done something like this but my feeling is
  that that would be the wrong way to go, especially wrt to

* I am not sure if padding as we had it before will work for stable
  branches. We need to think of this problem as well. To my
  understanding MAC has another really large structure with sufficient
  padding but it's a subsystem more or less living on its own on the
  side, not as heavily changes between branches and MFCed as the
  netstack and it's (function) pointers there and less direct members.

* If you have suggestions or solutions, please share them.

* ..


* I was aware of the problem and failed on two fronts here:
  1) Doing my commit after Marko and forgetting about it going from
     the p4 branches to SVN.
  2) Forgetting to mention this in the HEADs UP:(

* The http://wiki.freebsd.org/Image/BeginnersGuideFAQ had a (somewhat
  hidden) "A single change would require complete recompiliation."
  I factored it out but will need to be more explicit there or refer
  to this thread.

* Thanks a lot for sending this out to comitters and making them aware
  of the problem. I Cc:ed current@ and virtualization@ in the reply
  and change the subject.

* I am happy there is another pair of eyes here now:) After losing
  the initial three other reviewers ...

* I think the current mode (is?/)was "getting the infrastructre in"
  and get more hands (again) for the remaining parts when inevitable
  confronted with it and the huge pile of changes is gone and one
  might see more clear again.

* We may want to always keep in mind that, not for 8 but maybe for 9,
  people may want to further virtualize more subsystems so whatever
  we do should possibly be general enough to work for other parts
  of the kernel as well.

* As a very last resort at the moment (which would be a pitty) we
  can still do the rollback, keep the globals (as default) and
  only have virtualization optional. I would expect there to be
  some "keeping in sync" problems but it would be a lot easier to
  have both working and have people to adhere to the ``V_rules''
  than on the side. It would be kind of hard for 3rd party vendors
  to supply (binary) modules with that then though.
  I really hope we won't need to go there.


Bjoern A. Zeeb                      The greatest risk is not taking one.
freebsd-virtualization@freebsd.org mailing list
To unsubscribe, send any mail to 

Reply via email to