Pádraig Brady wrote:
...
> I thought a little about this today.

Nice description of the issues.
It's probably worth putting something like this somewhere in version
control, even if only as a long commit message on whatever change you make.

> fallocate() is a feature to quickly allocate space in a file system.
> It's useful for 3 things as far as I can see:
>
> 1. Improved file layout for subsequent access
> 2. Immediate indication of ENOSPC
> 3. Efficient writing of NUL portions
>
> Note 1. is somewhat moot with newer file systems that do "delayed allocation".
> So what do we need to consider when using fallocate on the destination file?
> Considering just cp for the moment, its inputs impacting this are the options:
> ...
> Copying sparse files
>
> It's worth noting again, the caveat mentioned above that we
> might not recognise some sparse files due to tail allocation.

Yes, this is worth repeating ;-)
It is surprising, at least in part because significant tail allocation
is not common.

> Given that we use fiemap (with sync) for sparse files at present,
> we can augment the fiemap copying code to use fallocate where appropriate.
> So dependent on the options the operations would be:
>   --sparse=auto => 'Empty' -> 'Empty'
>   --sparse=always => 'Empty' -> 'Hole' && discard tail allocation
>   --sparse=never => 'Hole' -> 'Empty'
> Perhaps the first case could be simplified to initially doing:
>   fallocate(dest, blocks*blocksize))
>
> Copying normal files
>
> Note using SEEK_HOLE for this case, would only help
> to avoid reading 'Hole' and more likely 'Empty' portions,
> and should not impact on the use of fallocate(dest).
>
> So assuming we initially did:
>
>   if ! --sparse=always
>     fallocate(dest, st_size)
>
> That would throw away any tail allocation in the source,
> which is probably OK as noted above. In fact we might always
> discard tail allocation for consistency, unless we can use fiemap
> for all cases.

All sounds reasonable.

> I'll cook something up on this soon.

Thanks.
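
For illustration only, the quoted "if ! --sparse=always, fallocate(dest, st_size)" step might look roughly like the sketch below. It assumes the Linux fallocate() call with mode 0 and treats an unsupported file system as a reason to fall back to plain writes. The helper name preallocate_dest and its parameters are hypothetical; this is not taken from cp's source.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper, not taken from cp's source.
   Returns 0 if space was reserved or preallocation was skipped,
   -1 with errno set on a real failure such as ENOSPC.  */
static int
preallocate_dest (int dest_fd, off_t src_size, bool sparse_always)
{
  /* With --sparse=always the destination should keep its holes, and
     fallocate() rejects a zero-length request with EINVAL, so skip
     preallocation in both cases.  */
  if (sparse_always || src_size <= 0)
    return 0;

  /* Mode 0 reserves blocks for [0, src_size) and extends the file
     size to src_size if it was smaller.  */
  if (fallocate (dest_fd, 0, 0, src_size) == 0)
    return 0;

  /* Assumption: lack of support is not an error for the copy;
     the caller simply writes the data as usual.  */
  if (errno == EOPNOTSUPP || errno == ENOSYS)
    return 0;

  return -1;
}
```

A caller would presumably invoke this once after opening the destination and report ENOSPC before copying any data, which is the "immediate indication of ENOSPC" benefit listed in the quoted text.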
