Yves Martin <ymartin1...@gmail.com> writes:

> I would like to better understand how logical addressing has impact on
> packed shard. My objective is to provide a "unpack" feature with logical
> addressing in my version of "fsfs-reshard.py" script:
> https://github.com/ymartin59/svn-fsfs-reshard
>
> Do you have any hints to help me in that job or do you already know it
> is irrelevant and should be considered as a pure waste of time ?

Logical addressing refers to items in the revision file by an index
number rather than an offset. The revision file also contains an index
map that allows index numbers to be converted to offsets and offsets to
be converted to index numbers. The index map also records the length of
each item and its revision number; for an unpacked revision file the
revision number is trivial, since every item belongs to the same
revision. A pack file has a similar index map, but in that case the
revision number varies from item to item.

The 1.9 tool svnfsfs can dump and load the index maps of revision and
pack files. An example (shard size 4):

  $ svnfsfs dump-index repo 1
     Start  Length Type Revision Item Checksum
         0      2a chgs        3    1 5f5b9c31
        2a      2a chgs        2    1 efee8d5b
        54      2a chgs        1    1 eee1b382
        7e       1 chgs        0    1 f28a4f1d
        7f      79 node        3    2 7e6fca28
        f8      72 drep        3    5 21933af7
       16a      55 drep        2    5 6f371fa3
       1bf      39 drep        1    5 8da855e0
       1f8      11 drep        0    3 60232b75
       209      9d node        1    4 d684e01d
       2a6      1b frep        1    3 1823e0a0
       2c1      9d node        2    4 3bd76335
       35e      1b frep        2    3 5b6fd650
       379      9d node        3    4 70fb00b0
       416      1b frep        3    3 1f9eb8e6
       431      78 node        2    2 7c048873
       4a9      78 node        1    2 cde8ee37
       521      59 node        0    2 403dbe48

Note that the items that make up a revision are not consecutive in the
pack file.

In principle the unpack is not hard. Read the index map from the pack
file. Then construct the revision files by extracting items from the
pack file and appending them to the revision files, keeping track of the
new offsets. Do one revision file at a time, or multiple revision files
in parallel. Once all the items are present in a revision file,
construct the new index map for that revision file.

It might be tricky to implement this in Python simply because you need
code to dump and load the index maps. You would have to write that code
from scratch, or run the svnfsfs tool, or write a Python binding to the
C code.

An alternative would be to implement an unpack operation for svnadmin in
C and use the existing C code to handle the index maps.

-- 
Philip Martin
WANdisco
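
As a rough illustration of the unpack steps described above, here is a
minimal Python sketch that takes the "run the svnfsfs tool" route: it
parses the text printed by "svnfsfs dump-index", groups the items by
revision, and assigns each item a new offset in its own revision file.
The function and class names are hypothetical and not part of any
Subversion API. It assumes that Start and Length are hexadecimal and
that Revision and Item are decimal, as the example output suggests, and
that items may be laid out in pack-file order within each revision
file. It only plans the layout and copies the item bytes; writing the
new index maps (for instance with svnfsfs) and anything else a revision
file needs is left out.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    """One row of 'svnfsfs dump-index' output (hypothetical helper type)."""
    start: int      # offset of the item in the pack file
    length: int     # item length in bytes
    type: str       # item type, e.g. "chgs", "node", "drep", "frep"
    revision: int   # revision the item belongs to
    item: int       # item index within that revision
    checksum: str   # checksum column, kept as text


def parse_dump_index(text):
    """Parse dump-index text; assumes hex Start/Length, decimal Revision/Item."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] == "Start":
            continue  # skip the header line and anything that is not a row
        start, length, type_, rev, item, checksum = fields
        entries.append(Entry(int(start, 16), int(length, 16), type_,
                             int(rev), int(item), checksum))
    return entries


def plan_unpack(entries):
    """Map each revision to a list of (new_offset, entry) pairs.

    Items keep their relative pack-file order and are placed back to back,
    starting at offset 0 of each revision file.
    """
    plan = defaultdict(list)
    next_offset = defaultdict(int)
    for e in sorted(entries, key=lambda e: e.start):
        new_start = next_offset[e.revision]
        plan[e.revision].append((new_start, e))
        next_offset[e.revision] = new_start + e.length
    return dict(plan)


def extract_revisions(pack_data, plan):
    """Concatenate each revision's item bytes taken from the pack data."""
    return {
        rev: b"".join(pack_data[e.start:e.start + e.length] for _, e in items)
        for rev, items in plan.items()
    }
```

Feeding the example output above through parse_dump_index() and
plan_unpack() would put the revision 3 items at new offsets 0, 2a, a3,
115 and 1b2 (hex). The (new_offset, entry) pairs carry what a new
per-revision index map needs: offset, length, type, revision, and item
number.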