On Thu, Jul 15, 2010 at 12:51:36AM +0100, Pádraig Brady wrote:
> On 14/07/10 18:45, Paul Eggert wrote:

First and foremost, I re-concur with the broad strokes of the
--sparse={always,never,auto} conversation.  I think you all knew that,
though ;-)

> > It's not just fiemap.  It's also the Solaris interface with SEEK_HOLE
> > and SEEK_DATA.  The change should involve a module that isolates these
> > low-level details from copy.c.  copy.c should ask the new module for the
> > locations of the holes (or the non-holes: that could be more convenient).
> > On traditional hosts without fiemap or SEEK_DATA, the module should report
> > that it doesn't know where the holes are; this can let copy.c resort to
> > the existing heuristic of looking at the size and the disk usage and
> > using the --sparse=always approach if the file "smells" like it's sparse.

While I think the final result wants to support both fiemap and
SEEK_HOLE, I think baby steps are in order.  If we just implement fiemap
right now, we can later turn that into init_extent_detection() and
get_next_extent().

> >> 2. Performance optimization, invoke fallocate(2) if an extent flag is
> >> UNWRITTEN
> >
> > This doesn't sound right.  A FIEMAP_EXTENT_UNWRITTEN extent is all zeros,
> > and so it should act as if it were a hole.  The goal is not to copy the
> > exact fiemap structure of the source (that's impossible): the goal is to
> > use as little time and space as possible.

What he said.  If you find an FIEMAP_EXTENT_UNWRITTEN extent, you just
skip it.  It is a hole for the purposes of copying.  If someone really
wants to clone the extent layout, they can use reflink(8).

> > It's not clear to me that the fiemap stuff can be cleanly separated
> > from the fallocate stuff.  To some extent they're the same issue.
> > If they can easily be separated, that's better of course.
>
> I see fiemap as optimizing reads,
> posix_fallocate() as optimizing writing zeros
> and fallocate() as optimizing allocation.
>
> So not having thought much about implementation details,
> it seems like they could be logically separated.

I think they should absolutely be separated.  The fiemap patch doesn't
have to do anything with fallocate()/posix_fallocate() on the write
side.  Let's get a happy fiemap patch.  Then a happy [posix]_fallocate()
patch.  Then a happy SEEK_HOLE patch.

Joel

-- 

"For every complex problem there exists a solution that is brief,
 concise, and totally wrong."
        -Unknown

Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
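
To make the shape of the interface discussed above a little more concrete,
here is a rough sketch in C.  Only the names init_extent_detection() and
get_next_extent() come from the message; their signatures, struct extent,
struct extent_scan and the EXTENT_UNWRITTEN flag are all assumptions made
for illustration.  The extent array stands in for whatever a fiemap (or,
later, SEEK_HOLE) backend would report, so this shows only the
"skip unwritten extents" and "report that we don't know" behaviour, not
the ioctl itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag standing in for FIEMAP_EXTENT_UNWRITTEN. */
#define EXTENT_UNWRITTEN 0x1u

/* Hypothetical backend-neutral description of one extent. */
struct extent {
    uint64_t logical;   /* byte offset within the file */
    uint64_t length;    /* length in bytes */
    unsigned flags;     /* EXTENT_* bits */
};

/* Hypothetical iterator state kept by the extent module. */
struct extent_scan {
    const struct extent *extents;  /* map supplied by a backend */
    size_t count;                  /* number of entries in the map */
    size_t next;                   /* index of the next entry to examine */
    bool supported;                /* false: hole locations are unknown */
};

/* Start a scan over a backend-supplied extent map.  Returns false when
   no map is available (extents == NULL), in which case the caller is
   expected to fall back to its existing sparse-file heuristic. */
bool init_extent_detection(struct extent_scan *scan,
                           const struct extent *extents, size_t count)
{
    assert(scan != NULL);
    scan->extents = extents;
    scan->count = extents ? count : 0;
    scan->next = 0;
    scan->supported = (extents != NULL);
    return scan->supported;
}

/* Store the next extent that actually holds data in *out and return
   true; return false once the map is exhausted.  Unwritten extents read
   back as zeros, so they are skipped exactly as a hole would be. */
bool get_next_extent(struct extent_scan *scan, struct extent *out)
{
    assert(scan != NULL && out != NULL);
    while (scan->next < scan->count) {
        const struct extent *e = &scan->extents[scan->next++];
        if (e->flags & EXTENT_UNWRITTEN)
            continue;
        *out = *e;
        return true;
    }
    return false;
}
```

A caller in the spirit of copy.c would loop on get_next_extent(), copy
only the ranges it returns, and leave everything in between as holes in
the destination.  If init_extent_detection() returns false, it would use
the old size-versus-disk-usage heuristic instead.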