Hi Andreas,

On 15/09/2014 09:27, Andreas Joachim Peters wrote:
> Hi Loic,
> I saw (if I am not mistaken) that you actually test only encoding ... so your 
> idea is to guarantee that the encoding results in the same output and the 
> encoding/decoding functionality is validated by the unit tests in each new 
> version? 

You are correct: it only partially checks the encoding. It should also try 
decoding with various combinations of erasures and check that the content can 
actually be reconstructed. I did not go after that because the existing unit 
tests already perform this verification. But it should also be done in this 
context because the code may have changed in a way that makes backward 
compatibility slightly different when it comes to decoding with erasures. Since 
the unit tests focus on the version currently developed, there is a chance that 
a subtle difference is missed.
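
To make the idea concrete, here is a minimal sketch of such a brute-force check. It is not code from ceph_erasure_code_non_regression: the function names are hypothetical and a single XOR parity chunk stands in for a real plugin, which is an assumption made only to keep the example self-contained.

```python
from itertools import combinations


def erasure_patterns(k, m):
    """Yield every set of lost chunk indexes: 1 to m chunks out of k + m."""
    for lost in range(1, m + 1):
        yield from combinations(range(k + m), lost)


def xor_encode(data_chunks):
    """Toy stand-in for a plugin: k data chunks plus one XOR parity chunk."""
    parity = bytes(len(data_chunks[0]))
    for chunk in data_chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return list(data_chunks) + [parity]


def xor_decode(chunks, size):
    """Rebuild the single missing chunk (marked None) by XOR-ing the others."""
    missing = [i for i, chunk in enumerate(chunks) if chunk is None]
    if len(missing) > 1:
        raise ValueError("a single parity chunk can repair only one erasure")
    rebuilt = list(chunks)
    if missing:
        acc = bytes(size)
        for chunk in chunks:
            if chunk is not None:
                acc = bytes(a ^ b for a, b in zip(acc, chunk))
        rebuilt[missing[0]] = acc
    return rebuilt


def check_all_erasures(archived_chunks, k, m, decode):
    """Return the erasure patterns for which decode() fails to restore the
    archived chunks. An empty list means every pattern was reconstructed."""
    size = len(archived_chunks[0])
    failures = []
    for lost in erasure_patterns(k, m):
        damaged = [None if i in lost else chunk
                   for i, chunk in enumerate(archived_chunks)]
        if decode(damaged, size) != archived_chunks:
            failures.append(lost)
    return failures
```

For k=6 and m=2, as in the examples quoted below, erasure_patterns yields 36 patterns (8 single erasures and 28 double erasures), so trying all of them against each archived object stays cheap.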

> In principle this restricts the encoding to never change the alignment in the 
> future, which might not be optimal. We might get larger registers in the 
> future on new CPUs and the alignment might change or they might deal 
> perfectly with 1-byte alignments. I suggest making sure that the new 
> version can decode the old format, but it does not need to imply that it 
> encodes it in exactly the same way ... this is slightly more complicated, 
> however I would feel more comfortable if you would do the brute-force 
> decoding check in this infrastructure for all plug-ins and leave the 
> flexibility to change the encoding format in the future.

This is a very good point. I'm not sure what the correct answer is but I would 
also be inclined to leave it until we face a format change.
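
For the record, the difference between the two checks can be sketched as follows. This is a toy model and not the jerasure plugin: the encoder, the decoder and the alignment values are all hypothetical, and erasures are left out to keep it short.

```python
def encode(payload, k, alignment):
    """Toy encoder: k zero-padded data chunks plus one XOR parity chunk.
    The chunk size is rounded up to a multiple of `alignment`."""
    chunk_size = -(-len(payload) // k)                    # ceiling division
    chunk_size = -(-chunk_size // alignment) * alignment  # round up
    padded = payload.ljust(k * chunk_size, b"\0")
    chunks = [padded[i * chunk_size:(i + 1) * chunk_size] for i in range(k)]
    parity = bytes(chunk_size)
    for chunk in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return chunks + [parity]


def decode(chunks, k, payload_len):
    """Toy decoder: it takes the chunk size from the chunks themselves, so it
    does not depend on the alignment the encoder used."""
    return b"".join(chunks[:k])[:payload_len]


def same_encoding(archived, payload, k, alignment):
    """Strict check: the new encoder must reproduce the archived chunks."""
    return encode(payload, k, alignment) == archived


def still_decodes(archived, payload, k):
    """Relaxed check: the new decoder must recover the archived payload."""
    return decode(archived, k, len(payload)) == payload
```

With chunks archived using a 4-byte alignment, a later version that aligns on 16 bytes fails same_encoding but passes still_decodes, which is the flexibility you are asking for.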

Cheers

> 
> Cheers Andreas.
> 
> ________________________________________
> From: [email protected] [[email protected]] on 
> behalf of Loic Dachary [[email protected]]
> Sent: 13 September 2014 13:50
> To: Ceph Development
> Subject: Tools and archive to check for non regression of erasure coded 
> content
> 
> Hi Ceph,
> 
> An erasure coded object stored in Firefly when it was first introduced must 
> be decoded by all versions after Firefly. The encoding is done by erasure 
> code plugins[1] and they evolve over time[2]. There needs to be a tool to 
> check that all content encoded by a given version of the plugin can also be 
> decoded by all subsequent versions of the same plugin.
> 
> The general idea is to archive objects created with a given Ceph version and 
> check them with all subsequent versions, via a teuthology workunit run on all 
> supported distributions and architectures.
> 
> The ceph_erasure_code_non_regression command[3] either creates an object and 
> stores it on disk, or reads an object from disk and checks that it can be 
> read. It is used to create objects for all relevant variations of the 
> parameters of a given erasure code plugin. For instance:
> 
> ceph_erasure_code_non_regression --stripe-width 4651 --parameter 
> packetsize=32 --plugin jerasure --parameter technique=blaum_roth --parameter 
> k=6 --parameter m=2 --create --base 
> ../../ceph-erasure-code-corpus/v0.85-764-gf3a1532
> ceph_erasure_code_non_regression --stripe-width 4651 --parameter 
> packetsize=32 --plugin jerasure --parameter technique=liber8tion --parameter 
> k=6 --parameter m=2 --create --base 
> ../../ceph-erasure-code-corpus/v0.85-764-gf3a1532
> etc.
> 
> The script[4] and the objects are archived in a repository [5] that can be 
> checked by later Ceph versions. The same script is used for checking and 
> creating the objects so that there is no risk of confusion. These scripts are 
> stored per version because a given script is developed for a given version of 
> the plugins.
> 
> The encode-decode-non-regression.sh[6] workunit uses these scripts when run 
> from teuthology[7] and will run all of them, up to and including the 
> currently running ceph version. This ultimately ensures that all archived 
> objects can be read on all supported distributions and architectures.
> 
> See also http://tracker.ceph.com/issues/9420 which is the ticket associated 
> with this work.
> 
> Although this all sounds sensible to me right now, I would be very interested 
> to hear about ideas to make this easier or better :-)
> 
> Cheers
> 
> [1] firefly erasure code plugins 
> https://github.com/ceph/ceph/tree/firefly/src/erasure-code
> [2] giant erasure code plugins 
> https://github.com/ceph/ceph/tree/v0.85/src/erasure-code
> [3] ceph_erasure_code_non_regression 
> https://github.com/dachary/ceph/commit/497a82b2b3113dae724a43d8d4c7e430acf44120
> [4] non-regression.sh 
> https://github.com/dachary/ceph-erasure-code-corpus/blob/master/v0.80.5-226-g71d2562/non-regression.sh
> [5] ceph-erasure-code-corpus 
> https://github.com/dachary/ceph-erasure-code-corpus
> [6] encode-decode-non-regression.sh 
> https://github.com/dachary/ceph/commit/4f7a9f83fb1f037662861e8782c9b90566dcaa31
> [7] non regression workload https://github.com/ceph/ceph-qa-suite/pull/136
> --
> Loïc Dachary, Artisan Logiciel Libre
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
