Hi Loic,
I saw (if I am not mistaken) that you actually test only encoding. So your 
idea is to guarantee that encoding always produces the same output, while the 
encoding/decoding functionality itself is validated by the unit tests in each 
new version? 

In principle this prevents the encoding from ever changing the alignment in 
the future, which might not be optimal. New CPUs might bring larger registers, 
so the preferred alignment might change, or they might handle 1-byte 
alignments perfectly well. I suggest making sure that a new version can decode 
the old format, without requiring that it encodes in exactly the same way. 
This is slightly more complicated, but I would feel more comfortable if this 
infrastructure did a brute-force decoding check for all plug-ins and left the 
flexibility to change the encoding format in the future.
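
A minimal sketch of what I mean by a brute-force decoding check, using a toy 
single-parity XOR code as a stand-in for a real plug-in. All function names 
here are hypothetical and none of this is the jerasure or Ceph API. The point 
is only that the archived chunks are decoded for every possible set of 
erasures and compared against the original payload, without ever re-encoding:

```python
from itertools import combinations


def encode(data: bytes, k: int) -> list[bytes]:
    """Toy stand-in for an old plug-in: k data chunks plus one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(size)
    for chunk in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return chunks + [parity]


def decode(chunks: dict[int, bytes], k: int, length: int) -> bytes:
    """Toy stand-in for a new plug-in: rebuild the payload from surviving chunks."""
    missing = [i for i in range(k + 1) if i not in chunks]
    if len(missing) > 1:
        raise ValueError("the toy code tolerates a single erasure only")
    if missing and missing[0] < k:
        # A data chunk is lost: XOR of all survivors (data + parity) restores it.
        size = len(next(iter(chunks.values())))
        rebuilt = bytes(size)
        for chunk in chunks.values():
            rebuilt = bytes(a ^ b for a, b in zip(rebuilt, chunk))
        chunks = {**chunks, missing[0]: rebuilt}
    return b"".join(chunks[i] for i in range(k))[:length]


def brute_force_decode_check(archived, k, m, payload):
    """Return every erasure set for which decoding does not reproduce payload."""
    failures = []
    for erased_count in range(m + 1):
        for erased in combinations(range(k + m), erased_count):
            survivors = {i: c for i, c in enumerate(archived) if i not in erased}
            if decode(survivors, k, len(payload)) != payload:
                failures.append(erased)
    return failures


payload = b"content written by an old release"
archived = encode(payload, k=6)  # what the corpus would hold on disk
failures = brute_force_decode_check(archived, k=6, m=1, payload=payload)
```

An empty failures list means the new decoder still reads the old content for 
all erasure patterns, while leaving the encoder free to change its output.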

Cheers Andreas.

________________________________________
From: [email protected] [[email protected]] on 
behalf of Loic Dachary [[email protected]]
Sent: 13 September 2014 13:50
To: Ceph Development
Subject: Tools and archive to check for non regression of erasure coded content

Hi Ceph,

An erasure coded object stored in Firefly, the release that first introduced 
erasure coding, must be decodable by all versions after Firefly. The encoding 
is done by erasure code plugins[1] and they evolve over time[2]. There needs 
to be a tool to check that all content encoded by a given version of a plugin 
can also be decoded by all subsequent versions of the same plugin.

The general idea is to archive objects created with a given Ceph version and 
check them with all subsequent versions, via a teuthology workunit run on all 
supported distributions and architectures.

The ceph_erasure_code_non_regression command[3] either creates an object and 
stores it on disk, or reads an object from disk and checks that it can be 
read. It is used to create objects for all relevant variations of the 
parameters of a given erasure code plugin. For instance:

ceph_erasure_code_non_regression --stripe-width 4651 \
    --parameter packetsize=32 --plugin jerasure \
    --parameter technique=blaum_roth --parameter k=6 --parameter m=2 \
    --create --base ../../ceph-erasure-code-corpus/v0.85-764-gf3a1532
ceph_erasure_code_non_regression --stripe-width 4651 \
    --parameter packetsize=32 --plugin jerasure \
    --parameter technique=liber8tion --parameter k=6 --parameter m=2 \
    --create --base ../../ceph-erasure-code-corpus/v0.85-764-gf3a1532
etc.

The script[4] and the objects are archived in a repository [5] that can be 
checked by later Ceph versions. The same script is used for checking and 
creating the objects so that there is no risk of confusion. These scripts are 
stored per version because a given script is developed for a given version of 
the plugins.

The encode-decode-non-regression.sh[6] workunit uses these scripts when run 
from teuthology[7] and will run all of them, up to and including the currently 
running ceph version. This ultimately ensures that all archived objects can be 
read on all supported distributions and architectures.
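
As a rough illustration of the "up to and including the currently running 
version" selection, here is a sketch in Python. It assumes the corpus 
directories are named after git-describe style versions such as 
v0.85-764-gf3a1532; the helper names are hypothetical and this is not how the 
workunit itself is written:

```python
import re

# vX.Y[.Z], optionally followed by -<commits>-g<sha>
_DESCRIBE = re.compile(r"^v(\d+(?:\.\d+)*)(?:-(\d+)-g[0-9a-f]+)?$")


def version_key(name: str) -> tuple:
    """Turn a git-describe style name into a sortable tuple."""
    match = _DESCRIBE.match(name)
    if match is None:
        raise ValueError(f"not a git-describe style version: {name!r}")
    release = [int(part) for part in match.group(1).split(".")]
    release += [0] * (3 - len(release))  # v0.85 sorts as 0.85.0
    commits = int(match.group(2) or 0)   # commits since the tag
    return (*release, commits)


def corpus_to_check(archived: list[str], running: str) -> list[str]:
    """Archived versions not newer than the running one, oldest first."""
    limit = version_key(running)
    selected = [v for v in archived if version_key(v) <= limit]
    return sorted(selected, key=version_key)


archived = ["v0.85-764-gf3a1532", "v0.80.5-226-g71d2562", "v0.86"]
selected = corpus_to_check(archived, running="v0.85-764-gf3a1532")
```

Each selected directory would then have its own archived script run against 
the objects it contains.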

See also http://tracker.ceph.com/issues/9420 which is the ticket associated 
with this work.

Although this all sounds sensible to me right now, I would be very interested 
to hear about ideas to make this easier or better :-)

Cheers

[1] firefly erasure code plugins 
https://github.com/ceph/ceph/tree/firefly/src/erasure-code
[2] giant erasure code plugins 
https://github.com/ceph/ceph/tree/v0.85/src/erasure-code
[3] ceph_erasure_code_non_regression 
https://github.com/dachary/ceph/commit/497a82b2b3113dae724a43d8d4c7e430acf44120
[4] non-regression.sh 
https://github.com/dachary/ceph-erasure-code-corpus/blob/master/v0.80.5-226-g71d2562/non-regression.sh
[5] ceph-erasure-code-corpus https://github.com/dachary/ceph-erasure-code-corpus
[6] encode-decode-non-regression.sh 
https://github.com/dachary/ceph/commit/4f7a9f83fb1f037662861e8782c9b90566dcaa31
[7] non regression workload https://github.com/ceph/ceph-qa-suite/pull/136
--
Loïc Dachary, Artisan Logiciel Libre