On Mon, Jan 16, 2012 at 3:46 PM, Sage Weil <[email protected]> wrote:
> When we can't handle a change with a compatible encoding change, we can
> introduce a feature bits and conditionally encode old formats for old
> peers. This is just more work and eats into a more limited feature bit
> space.
I'd like to hear a lot more about this -- it's the real problem. I
think we've already got one example in the codebase of encoding for
backwards-compatibility, and it's a huge mess.
Could we construct reasonable macros/functions that take in compat
bits from the daemon we're targeting and then select the appropriate
encoder based on those, for instance? It certainly shouldn't be hacked
together on a case-by-case basis because we're going to start running
into a lot more of these things as we start testing for
forwards-and-backwards compatibility.
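Something in this direction is roughly what I have in mind -- every name
below (featureset_t, FEATURE_NEW_FOO, encode_for_peer) is made up for
illustration, not something in your branch:

  typedef uint64_t featureset_t;            // needs <stdint.h>
  const featureset_t FEATURE_NEW_FOO = 1ULL << 5;  // hypothetical bit

  struct foo_t {
    void encode_v1(bufferlist& bl) const;   // old wire format
    void encode_v2(bufferlist& bl) const;   // current wire format

    // One entry point: callers hand over the peer's feature bits and
    // the right encoder gets picked here, not at every call site.
    void encode_for_peer(bufferlist& bl, featureset_t peer) const {
      if (peer & FEATURE_NEW_FOO)
        encode_v2(bl);
      else
        encode_v1(bl);
    }
  };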
> There are two things going on in this branch. The first part is a
> 'standard' way of constructing the encode/decode functions to facilitate
> backward and forward compatibility and incompatibility detection. The
> basic scheme is:
>
> 1 byte - version of this encoding
> 1 byte - incompat version... the oldest code version we expect to be able
> to decode this
> 4 bytes - length of payload
> ... data ...
>
> In general, when we decode, we verify that the incompat version is <= our
> (code) version. If not, we throw an exception. Then we decode the
> payload, using the version for any conditionals we need (e.g., to skip
> newly added fields). Any data left at the end is skipped.
>
> When we revise an encoding, we should add new fields at the end, and in
> general leave old fields in place, ideally with values that won't confuse
> old code.
This sounds good to me!
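To check that I've got the envelope rules straight, here's a toy,
standalone version of the decode side over raw bytes (entirely mine,
not from the branch):

  #include <stdint.h>
  #include <string.h>
  #include <stdexcept>

  const uint8_t CODE_VERSION = 2;    // what this build understands

  struct cursor { const uint8_t *p; };

  void decode_enveloped(cursor& c) {
    uint8_t struct_v = *c.p++;       // version of this encoding
    uint8_t compat_v = *c.p++;       // oldest code that can decode it
    uint32_t len;
    memcpy(&len, c.p, 4);            // payload length (assuming LE host)
    c.p += 4;
    if (compat_v > CODE_VERSION)     // encoding is too new for us
      throw std::runtime_error("incompatible encoding");
    const uint8_t *payload_end = c.p + len;
    // ... decode the fields we know, consulting struct_v for anything
    // added after v1 ...
    c.p = payload_end;               // skip trailing fields that newer
                                     // code appended
  }

If that matches your intent, the "leave old fields in place, append new
ones" rule falls out naturally.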
> To make this painless, there are a few new macros to do the encode/decode
> boilerplate. If the encode/decode functions were originally
>
> void pool_snap_info_t::encode(bufferlist& bl) const
> {
>   __u8 struct_v = 1;
>   ::encode(struct_v, bl);
>   ::encode(snapid, bl);
>   ::encode(stamp, bl);
>   ::encode(name, bl);
> }
>
> void pool_snap_info_t::decode(bufferlist::iterator& bl)
> {
>   __u8 struct_v;
>   ::decode(struct_v, bl);
>   ::decode(snapid, bl);
>   ::decode(stamp, bl);
>   ::decode(name, bl);
> }
>
> then we would revise them to be
>
> void pool_snap_info_t::encode(bufferlist& bl) const
> {
>   ENCODE_START(2, 2, bl);
>
>   // New version is 2. v1 code can't decode this, so the second
>   // argument (incompat) is also 2.
>
>   ::encode(snapid, bl);
>   ::encode(stamp, bl);
>   ::encode(name, bl);
>   ::encode(new_thing, bl);
>   ENCODE_FINISH(bl);
> }
>
> void pool_snap_info_t::decode(bufferlist::iterator& bl)
> {
>   DECODE_START_LEGACY(2, bl, 2);
>
>   // v1 lacks the new length and incompat fields, so use the _LEGACY
>   // macro; the second 2 means the new scheme starts at v2.
>
>   ::decode(snapid, bl);
>   ::decode(stamp, bl);
>   ::decode(name, bl);
>   if (struct_v >= 2)
>     ::decode(new_thing, bl);
>   DECODE_FINISH(bl);
> }
Is there a DECODE_START (non-legacy) for new structs?
> This requires an initial incompat change to add the length and incompat
> fields, but then we can generally add things without breakage.
Hmm... if we come up with something that talks feature bits, we could do
this nicely with a compat bit. :)
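i.e., something like this, where FEATURE_ENVELOPE and featureset_t are
a bit/type I just invented for the example:

  void pool_snap_info_t::encode(bufferlist& bl, featureset_t features) const
  {
    if (!(features & FEATURE_ENVELOPE)) {
      // old peer: emit the bare v1 format, no incompat/length header
      __u8 struct_v = 1;
      ::encode(struct_v, bl);
      ::encode(snapid, bl);
      ::encode(stamp, bl);
      ::encode(name, bl);
      return;
    }
    ENCODE_START(2, 2, bl);
    ::encode(snapid, bl);
    ::encode(stamp, bl);
    ::encode(name, bl);
    ::encode(new_thing, bl);
    ENCODE_FINISH(bl);
  }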
> The second question is how to test compatibility between different
> versions of code. There are a few parts to this.
>
> First, a ceph-dencoder tool is compiled for each version of the code; it
> is able to encode, decode, and dump (as JSON) whatever structures we
> support. It works something like this:
>
> ceph-dencoder object_info_t -i inputfile decode dump_json
>
> to read in encoded data, decode it, and dump it as JSON. We can do a
> trivial identity check (that decode of encode matches) with
>
> ceph-dencoder object_info_t -i inputfile decode dump_json > /tmp/a
> ceph-dencoder object_info_t -i inputfile decode encode decode dump_json >
> /tmp/b
> cmp /tmp/a /tmp/b
>
> Obviously that should always pass. For testing cross-version encoding, we
> need a ceph-dencoder and a corpus of objects encoded for each version.
> Assuming you have that, you can (a) make sure we can decode anything from
> other versions without crashing, and (b) compare the dumps between versions
> and whitelist expected changes (e.g., when fields are added or removed).
> You can also specify feature bits to test encoding for older versions:
> take, say, everything from the v0.42 corpus, encode it with the v0.40
> feature bits, and verify that the v0.40 ceph-dencoder can handle it, again
> verifying and whitelisting the diffs.
Awesome!
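For my own mental model while reviewing: I'm picturing the tool's core
as something like the following per supported type, with every name
below being my guess rather than the actual code:

  #include <list>
  #include <string>

  template <typename T>
  void run_ops(const std::list<std::string>& ops, bufferlist& bl)
  {
    T obj;   // persists across ops, so "decode encode decode" works
    for (std::list<std::string>::const_iterator op = ops.begin();
         op != ops.end(); ++op) {
      if (*op == "decode") {
        bufferlist::iterator p = bl.begin();
        obj.decode(p);     // throws if the incompat version is too new
      } else if (*op == "encode") {
        bl.clear();
        obj.encode(bl);    // re-encoded bytes become the working buffer
      } else if (*op == "dump_json") {
        // print obj with whatever JSON formatter the tool grows
      }
    }
  }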
> How to build the per-version corpus? We can write unit tests that
> explicitly generate interesting object instances. That's tedious and time
> consuming, but probably best, since the developer knows what corner cases
> are interesting.
If we've got a single tool that can handle all the structs, we can
presumably encode each struct from a given version automatically?
Obviously constructing actual state would require more work, but just
checking the default-constructed version of each struct would go a
long, loooong way toward automating encode/decode checks.
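The lazy version of corpus generation could then be as small as this
per type (write_file() is a helper I just made up):

  template <typename T>
  void emit_default_corpus_entry(const std::string& corpus_dir,
                                 const char *type_name)
  {
    T obj;                 // default-constructed; no interesting state
    bufferlist bl;
    obj.encode(bl);
    write_file(corpus_dir + "/" + type_name, bl);
  }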
-off to review the actual code,
Greg