Hi Andreas,
On 15/09/2014 09:27, Andreas Joachim Peters wrote:
> Hi Loic,
>
> I saw (if i am not mistaken) that you actually test only encoding ... so your
> idea is to guarantee that the encoding results in the same output and the
> encoding/decoding functionality is validated by the unit tests in each new
> version?

You are correct: it only partially checks the encoding. It should also try
decoding with various combinations of erasures and check that the content can
actually be reconstructed. I did not go after that because the existing unit
tests already perform this verification. But it should also be done in this
context because the code may have changed in a way that makes backward
compatibility slightly different when it comes to decoding with erasures.
Since the unit tests focus on the version currently developed, there is a
chance that a subtle difference is missed.

> In principle this restricts the encoding to never change the alignment in the
> future, which might be not optimal. We might get larger registers in the
> future on new CPUs and the alignment might change or they might deal
> perfectly with 1-byte alignments. I suggest to make sure that the new
> version can decode the old format, but it does not need to imply that it
> encodes it in exactly the same way ... this is slightly more complicated
> however I would feel more comfortable if you would do the brute-force
> decoding check in this infrastructure for all plug-ins and leave the
> flexibility to change the encoding format in the future.

This is a very good point. I'm not sure what the correct answer is but I would
also be inclined to leave it until we face a format change.

Cheers

>
> Cheers Andreas.
>
> ________________________________________
> From: [email protected] [[email protected]] on
> behalf of Loic Dachary [[email protected]]
> Sent: 13 September 2014 13:50
> To: Ceph Development
> Subject: Tools and archive to check for non regression of erasure coded
> content
>
> Hi Ceph,
>
> An erasure coded object stored in Firefly when it was first introduced must
> be decoded by all versions after Firefly. The encoding is done by erasure
> code plugins[1] and they evolve over time[2]. There needs to be a tool to
> check that all content encoded by a given version of the plugin can also be
> encoded by all subsequent versions of the same plugin.
>
> The general idea is to archive objects created with a given Ceph version and
> check them with all subsequent versions, via a teuthology workunit run on all
> supported distributions and architectures.
>
> The ceph_erasure_code_non_regression command[3] creates an object and stores
> it on disk, or reads an object from disk and checks that it can be read. It
> is used to create objects for all relevant variations of parameters of a
> given erasure code plugin. For instance:
>
> ceph_erasure_code_non_regression --stripe-width 4651 --parameter
> packetsize=32 --plugin jerasure --parameter technique=blaum_roth --parameter
> k=6 --parameter m=2 --create --base
> ../../ceph-erasure-code-corpus/v0.85-764-gf3a1532
> ceph_erasure_code_non_regression --stripe-width 4651 --parameter
> packetsize=32 --plugin jerasure --parameter technique=liber8tion --parameter
> k=6 --parameter m=2 --create --base
> ../../ceph-erasure-code-corpus/v0.85-764-gf3a1532
> etc.
>
> The script[4] and the objects are archived in a repository [5] that can be
> checked by later Ceph versions. The same script is used for checking and
> creating the objects so that there is no risk of confusion. These scripts are
> stored per version because a given script is developed for a given version of
> the plugins.
>
> The encode-decode-non-regression.sh[6] workunit uses these scripts when run
> from teuthology[7] and will run all of them, up to and including the
> currently running ceph version. This ultimately ensures that all archived
> objects can be read on all supported distributions and architectures.
>
> See also http://tracker.ceph.com/issues/9420 which is the ticket associated
> to this work.
>
> Although this all sounds sensible to me right now, I would be very interested
> to hear about ideas to make this easier or better :-)
>
> Cheers
>
> [1] firefly erasure code plugins
> https://github.com/ceph/ceph/tree/firefly/src/erasure-code
> [2] giant erasure code plugins
> https://github.com/ceph/ceph/tree/v0.85/src/erasure-code
> [3] ceph_erasure_code_non_regression
> https://github.com/dachary/ceph/commit/497a82b2b3113dae724a43d8d4c7e430acf44120
> [4] non-regression.sh
> https://github.com/dachary/ceph-erasure-code-corpus/blob/master/v0.80.5-226-g71d2562/non-regression.sh
> [5] ceph-erasure-code-corpus
> https://github.com/dachary/ceph-erasure-code-corpus
> [6] encode-decode-non-regression.sh
> https://github.com/dachary/ceph/commit/4f7a9f83fb1f037662861e8782c9b90566dcaa31
> [7] non regression workload https://github.com/ceph/ceph-qa-suite/pull/136
> --
> Loïc Dachary, Artisan Logiciel Libre
>

--
Loïc Dachary, Artisan Logiciel Libre
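
For illustration only, here is a rough sketch of what the brute-force decoding check discussed in this thread could look like. It is not the plugin API or the ceph_erasure_code_non_regression code: it uses a toy single-parity XOR code (k data chunks, m=1) as a stand-in for a real plugin, and encode, decode and check_all_erasures are hypothetical names. The point is only the loop that tries every combination of erasures and compares the result with the archived chunks.

```python
from functools import reduce
from itertools import combinations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Toy stand-in for a plugin: k zero-padded data chunks plus one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]


def decode(available: dict[int, bytes], k: int) -> list[bytes]:
    """Rebuild all k + 1 chunks from the ones still available (at most one missing)."""
    missing = [i for i in range(k + 1) if i not in available]
    if len(missing) > 1:
        raise ValueError("too many erasures for a single-parity code")
    rebuilt = dict(available)
    for i in missing:
        # With XOR parity, any missing chunk is the XOR of all the others.
        rebuilt[i] = reduce(xor_bytes, available.values())
    return [rebuilt[i] for i in range(k + 1)]


def check_all_erasures(chunks: list[bytes], k: int, m: int) -> int:
    """Erase every combination of 1..m chunks and verify the decoder restores them.

    In the corpus setting, `chunks` would be read from the archive written by an
    older version instead of being encoded by the version under test.
    """
    checked = 0
    for count in range(1, m + 1):
        for erased in combinations(range(k + m), count):
            available = {i: c for i, c in enumerate(chunks) if i not in erased}
            if decode(available, k) != chunks:
                raise AssertionError(f"cannot recover from erasures {erased}")
            checked += 1
    return checked


if __name__ == "__main__":
    archived = encode(b"hello erasure coding", k=4)
    print(check_all_erasures(archived, k=4, m=1), "erasure patterns verified")
```

Comparing against the archived chunks rather than against a fresh encoding means only decoding has to stay compatible, which leaves the encoding format free to change later.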
