On 26 March 2012 18:53, Phong Vo <[email protected]> wrote:
>
> As Dave observed, Sfio already handles writing files with holes compactly
> up to disk block constraints.

Well, it does something, but IMO not the correct thing, where 'correct'
is defined as how FreeBSD, Solaris, GNU coreutils/Linux etc. now do it.
The cp, mv and pax commands should preserve the exact location of holes
in a sparse file if src and dest are on the same filesystem, which is
what the majority of implementations of these commands now do.
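
To make "preserve the exact location of holes" concrete, here is a
rough sketch of a hole-preserving copy built on lseek() with SEEK_DATA
and SEEK_HOLE. It is only an illustration, not how AST cp or any of the
implementations named above actually do it; the names data_extents()
and sparse_copy() are made up for this example, and it assumes a
platform where Python exposes os.SEEK_DATA and os.SEEK_HOLE.

```python
import errno
import os


def data_extents(fd):
    """Return [(start, end), ...] for the data regions lseek() reports for fd."""
    size = os.fstat(fd).st_size
    extents = []
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:  # nothing but a hole up to EOF
                break
            raise
        end = os.lseek(fd, start, os.SEEK_HOLE)  # EOF counts as a hole
        extents.append((start, end))
        offset = end
    return extents


def sparse_copy(src_path, dst_path, chunk=1 << 20):
    """Copy src to dst, writing only the regions reported as data."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for start, end in data_extents(src):
            pos = start
            while pos < end:
                buf = os.pread(src, min(chunk, end - pos), pos)
                if not buf:  # source shrank while we were copying
                    break
                view = memoryview(buf)
                while len(view):  # pwrite() may write less than requested
                    n = os.pwrite(dst, view, pos)
                    view = view[n:]
                    pos += n
        # Extend dst to the full source size so a trailing hole survives.
        os.ftruncate(dst, os.fstat(src).st_size)
    finally:
        os.close(src)
        os.close(dst)
```

Unlike scanning for blocks of 0 bytes, this never reads the holes at
all and never turns a run of zeros that was really written into a hole.
On a filesystem without real support the kernel may report the whole
file as one data extent, in which case this degrades to a plain copy.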

> What surprised me is that SEEK_HOLE and SEEK_DATA were made parts
> of the lseek() interface.

This has been criticised before. So far no one has come up with a
better interface. ext3/ext4 has an interface to access the allocation
bitmaps of the filesystem, but this interface is cumbersome and does not
scale well for large files or large numbers of holes as we see them in
database files, e.g. 2^24 holes in a 10PB file can be considered
'normal' in today's datacenters.
Other, private APIs in Solaris ufs, SGI/Linux xfs and ext2/ext3/ext4 are worse.

> These features carry semantics beyond
> the traditional Unix "typeless" file system. They can be expensive
> to implement and may not even have consistent semantics!

They have consistent semantics if src and dest are on the same
filesystem. If src and dest are on different filesystems, the
position/size of the holes can be expected to end up aligned to the
filesystem's block size, but that alignment is not required. At least
ZFS, btrfs and hammerfs try to implement byte granularity where
applicable and fall back to block size granularity only as the *worst*
case scenario.
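
As a small worked example of that worst case: if a hole can only be
kept on whole blocks, the part of a source hole that survives the copy
is the largest block-aligned range inside it. The helper below is a
hypothetical model of that arithmetic only, not code from any
filesystem, and it ignores special handling a filesystem may apply to
a hole that runs up to EOF.

```python
def aligned_hole(start, end, block_size):
    """Largest block-aligned sub-range of the hole [start, end), or None."""
    first = -(-start // block_size) * block_size  # round start up
    last = (end // block_size) * block_size       # round end down
    return (first, last) if first < last else None


print(aligned_hole(100, 10000, 4096))  # (4096, 8192)
print(aligned_hole(100, 5000, 4096))   # None: no whole block fits
print(aligned_hole(100, 10000, 1))     # (100, 10000): byte granularity
```

So with 4096-byte blocks a hole covering bytes 100..9999 shrinks to
4096..8191 and the rest is stored as zeros, while a filesystem with
byte granularity keeps it unchanged.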

> For example, can a hole start at any byte boundary or does it have
> to be on a disk block boundary? If it's the former, that can be
> expensive to implement as it requires examining every byte in
> the file on a seek operation. If it's the latter, would copy a file
> from one disk to another cause holes to change their characteristics?

I explained that above in detail. Performance for modern filesystems
appears to be acceptable.

> Aside from the cp, mv, pax-like operations, what can an application use
> these features for? They seem a bit too hastily considered.
>
> Phong
>
>> From [email protected] Mon Mar 26 11:08:42 2012
>> To: [email protected], [email protected]
>> Subject: Re: [ast-users] Implementing SEEK_HOLE, SEEK_DATA in AST cp, mv, pax
>> Cc: [email protected]
>
>> > are there plans to implement support for SEEK_HOLE (let lseek() seek
>> > to the next hole in a sparse file) and SEEK_DATA (let lseek() seek to
>> > the next place with real data, usually after a hole) in AST cp, mv and
>> > pax in the next 2-3 months? This has become VERY important to
>> > enterprise customers now that Linux+btrfs, GNU coreutils, Solaris,
>> > FreeBSD and others support this feature and that it is going to be
>> > included in the next iteration of the POSIX standard
>> > (http://man7.org/linux/man-pages/man2/lseek.2.html)
>> >
>> >
>
>> The AST tools do not use read/write/lseek directly but use SFIO for all
>> input and output.  Currently when writing a file and there are more than
>> a block full of 0 bytes, SFIO seeks to the next non-zero byte so that
>> the file can be created with holes.  The result is that cp foo bar will
>> create the file bar with the same or less space than foo.  I don't
>> understand how mv is affected by SEEK_HOLE and SEEK_DATA since this is
>> just a rename operation.  pax will create holes whenever the input
>> file contains large numbers of 0 bytes.
>
>> In order to take advantage of SEEK_HOLE and SEEK_DATA, we will have to
>> allow these to be passed into sfseek().  I don't know what gain will
>> be achieved over the current method except possibly on read.
>
>> David Korn
>> [email protected]
>
> _______________________________________________
> ast-users mailing list
> [email protected]
> https://mailman.research.att.com/mailman/listinfo/ast-users

