On 15/06/10 23:19, Paul Eggert wrote: > On 06/15/2010 02:43 PM, Jim Meyering wrote: > >> I think that copying physical holes via FIEMAP should be the default, when >> possible. One problem is that the current code on the fiemap-copy branch >> does not honor --sparse=WHEN when in fiemap-copying mode. The solution >> would seem to be to change the regular-file-copying loop in the fiemap_copy >> function to use the same hole-preserving code that is used in copy_reg. > > I assume that the solution would be used only with cp --sparse=always? > (Otherwise, it would amount to copying logical holes by default.) > > If so, under this proposal, --sparse=always would copy logical holes, > --sparse=never would never copy holes, and --sparse=auto (the default) > would copy physical holes if supported, else it would copy logical > holes if the file seems to have enough physical holes, and otherwise > it would copy no holes. (Whew!) > > But this raises another issue. Under this proposal, the default behavior > copies physical _holes_, but it doesn't copy physical _extents_. > For example, suppose the input file has a single 128 MiB extent. The > proposal could create an output file with two 64-MiB extents > that are logically contiguous. This is because the code does not advise the > output file about the extents that the input file had, and the operating > system might assign a smaller output extent at first, discovering only too > late > that more space was needed. > > If the intent is to copy the physical extents, then the proposed code > needs to be fixed so that it uses fallocate() to attempt to create the > same extents in the destination that existed in the source. Obviously > this could fail for any number of reasons, for example, the destination > file system might be different from the source; but the goal would be > to preserve extents if the underlying OS is "nice" about implementing > fallocate(). > > Which raises another question: should cp attempt to create _better_ > extents in the destination than in the source? For example, if > the source contains two logically-adjacent 64-MiB extents, should the > destination contain just one 128-MiB extent? I can see where that > might be worthwile. > > This all is getting a bit complicated, I'm afraid. Perhaps it'd be better > to try this stuff out with a new option, "cp --sparse=extents" say, and > keep the default as-is for a while?
Well I previously had a patch to call fallocate() to alloc the largest extents possible, which I think is sensible to do by default. I didn't apply it as I was waiting for a change or even a comment on the weird fallocate() interface (which we can forget about this stage). http://lists.gnu.org/archive/html/bug-coreutils/2009-05/msg00260.html cheers, Pádraig.