Philip Martin wrote on Thu, 13 Jul 2017 21:36 +0100:
> Branko Čibej <br...@apache.org> writes:
>
> > Whether this actually forces a format bump or not is a different
> > question which I don't know the answer to.
>
> I think we would have to bump. The old code could either read the
> pre-delta or the post-delta files, depending on how we decided to name
> things, but not both. Either way the old code would not be able to read
> all the revision files and the repository would look broken.

If we invent a "second form of revision file distinguished by name or
path", then yes, we would require a format bump, to ensure all readers
know to cope with the situation where the revision file has been
unlinked from the currently-well-known name. It would also require us
to figure out how to update all codepaths that open a revision file so
that they do the correct triple lookup (old name, new name, packed
name).

When I said a format bump wouldn't be required, I envisioned that the
rev file that contains a PLAIN rep could be replaced by a rev file that
contains a DELTA rep, *if the DELTA rep is shorter*. A replacement rev
file could be prepared (and atomically renamed into place) that
replaces the PLAIN rep by the shorter DELTA rep, and updates the
unexpanded-len member of the node-rev header. That would result in
some never-read padding bytes, but FSFS f7's packing operation could
regain them. (If the number of digits of unexpanded-len changed, the
replacement rev file would need to add some padding to ensure that the
number of bytes in the node-rev header, and hence the offsets to the
remainder of the file, doesn't change.)

Existing readers don't care whether a rep is a DELTA rep or a PLAIN
rep; they just care that it starts at the given byte offset, that it
has "ENDREP\n" after the given length, and that the resulting file
checksums to the given value.

Now that I write this down I realize that rep-sharing complicates
matters. The replacement would only be sound if the rep is not the
target of rep-sharing from another revision; that is easily handled by
only adding the rep to rep-cache.db after replacing the PLAIN rep by
the equivalent, shorter DELTA rep. The remaining problem is what to do
if the rep is shared between two noderevs inside a single revision, but
<handwave>that's solvable</handwave>.

Regarding the recompress-at-pack alternative, I note that we (= the 1.9
release notes) recommend packing FSFS f7 repositories regularly.

Cheers,

Daniel
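For illustration only, the byte-level invariant behind the in-place
replacement idea above (a shorter payload plus padding, so that no later
offset moves, with the new file atomically renamed over the old one) can
be sketched as follows. This is a generic sketch and not FSFS code: the
function name replace_span_same_size, its parameters, and the choice of
a space as the padding byte are all assumptions made for the example,
and it knows nothing about the real rev file or node-rev header layout.

```python
import os
import tempfile


def replace_span_same_size(path, offset, old_len, new_bytes, pad=b" "):
    """Replace old_len bytes at offset with new_bytes, padded to old_len.

    Hypothetical helper: the file keeps its exact size, so every byte
    offset after the span stays valid for readers that seek by offset.
    """
    if len(pad) != 1:
        raise ValueError("pad must be exactly one byte")
    if len(new_bytes) > old_len:
        raise ValueError("replacement is longer than the span it replaces")

    with open(path, "rb") as f:
        data = f.read()
    if offset < 0 or offset + old_len > len(data):
        raise ValueError("span is outside the file")

    # Shorter payload followed by padding up to the original span length.
    padded = new_bytes + pad * (old_len - len(new_bytes))
    out = data[:offset] + padded + data[offset + old_len:]

    # Write the new content next to the target, so that the final rename
    # stays on one filesystem, then rename it over the old file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(out)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # readers see the old file or the new one
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A reader that opened the file before the rename keeps reading the old
content; one that opens it afterwards finds the same offsets and the
same total length, just with a shorter payload and some padding.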