Some discussion on IRC concerning editing manifest records in-place, rather than via move-into-place of a tempfile, boiled down to "manifest records should not cross OS page boundaries", and therefore "manifest record length (i.e., the number of bytes for one revision's manifest entry) should be a power of 2".
This came up in the context of a proposal involving in-place editing of manifests, which hopefully one of us will bring to this list in a separate email if it proves fruitful.

[[[
00:54:59 @danielsh | when you say "I/O reordering", do you mean
00:55:11 @danielsh | "write()s by any one process are (not) done in chronological order"?
00:55:23 @stefan2 | exactly
00:55:32 @danielsh | ok
00:55:36 @danielsh | so, yes, that's one point I had in mind
00:55:41 @danielsh | today we only rely on move-into-place
00:55:52 @danielsh | any [pg]-esque suggestion means we require something new
00:56:05 @stefan2 | but they should be come visible in chronological order between processes - at least within the same address page
00:56:06 @danielsh | in this case, correct handling of overwriting of bytes in a file
00:56:34 @danielsh | re what you just said
00:56:40 @danielsh | why? also between threads of the same process
00:56:53 @danielsh | [ and I don't follow how/why address spaces factor in ]
00:57:24 @stefan2 | file caches use memory?
00:57:56 @danielsh | ah.
00:58:01 @stefan2 | the OS might reorder stuff for different pages but not for the same
00:58:15 @stefan2 | (no idea how relevant / likely that actually is)
00:58:18 @danielsh | so you said should == 'will probably be' rather than 'should == need to be'
00:58:38 @stefan2 | yes
00:59:02 @danielsh | so. if the manifest record crosses a page boundary we might have a VERY edge case bug?
00:59:42 @danielsh | have to admit I wouldn't have considered that... I would have stopped at the file level not at the page-that-file-is-cached-in level
01:01:16 @stefan2 | I have no idea how an OS makes shared file content visible to the respective processes. the edge case might happen only if a "record" crosses the page boundary
01:02:16 @danielsh | ack
01:02:49 @danielsh | you're saying the OS might not be ACID'ing the view of the file. (since it presents a view that never was present on disk)
01:02:59 @danielsh | who knows, that may be a risk we'll take
01:03:25 @stefan2 | there is copy-on-write semantics for (some forms of) memory mapped files under windows. Depending on when the respective page is mapped it may see the old or new content (the old page got copied while the new one didn't need to back then)
01:03:45 @peterS | danielsh: if the manifest record crosses a page boundary, the reason this might matter is in case of an OS crash. one block is written, the next not
01:04:29 @stefan2 | danielsh: yes. but I have no actual facts / pointers to support that suspicion.
01:04:34 @peterS | but if we're using 16-byte records and aligning to 16 byte boundaries, it would take a very strange disk block size to make this possible
01:05:34 @peterS | if we think it's unreasonable to assume 1000 revprop blobs will always average less than 256 GB per blob, though, then we'll need more than 16 bytes. or a binary representation as I've argued for before
01:05:34 @danielsh | stefan2, peterS: so between the two of you you're arguing that records should not cross page boundaries
01:05:56 @danielsh | fine, and I'll raise that point on dev@ for posterity
01:06:18 @peterS | I think that's reasonable. and we don't even know the disk or filesystem block size, i.e., we can't _really_ assume it's greater than, say, 512 bytes
01:06:31 @peterS | so we really want to align on a power-of-two
01:06:33 @stefan2 | peterS: any 2^N size should work
01:06:42 @peterS | ...stefan2 agreed (:
01:06:48 @stefan2 | indeed ;)
01:07:09 @danielsh | +1
]]]
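
To make the power-of-two argument concrete, here is a minimal sketch that checks whether fixed-length records ever straddle a block boundary. It is only an illustration, not code from any proposal: the function names are hypothetical, and it assumes records are packed back to back starting at offset 0 of the file (no unaligned header) and that the block or page size is itself a power of two no smaller than the record.

```python
def crosses_boundary(index: int, record_size: int, block_size: int) -> bool:
    """Return True if record number `index` straddles a block boundary.

    Assumption: records are packed back to back from offset 0, so record
    `index` occupies bytes [index * record_size, (index + 1) * record_size).
    """
    first_byte = index * record_size
    last_byte = first_byte + record_size - 1
    # The record crosses a boundary when its first and last bytes
    # fall into different blocks.
    return first_byte // block_size != last_byte // block_size


def any_record_crosses(record_size: int, block_size: int,
                       count: int = 10_000) -> bool:
    """Check the first `count` records for a boundary crossing."""
    return any(crosses_boundary(i, record_size, block_size)
               for i in range(count))


if __name__ == "__main__":
    # Block sizes are illustrative guesses, not measured values.
    for block_size in (512, 4096):
        for record_size in (16, 24, 32, 64):
            crosses = any_record_crosses(record_size, block_size)
            print(f"block={block_size} record={record_size} crosses={crosses}")
```

Under those assumptions the 16-, 32- and 64-byte records never cross a 512- or 4096-byte boundary, while a 24-byte record does (for example, record 21 covers bytes 504 through 527). This says nothing about whether the OS actually reorders or tears writes at page granularity, which the log above leaves open.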