As Dave observed, Sfio already handles writing files with holes compactly
up to disk block constraints.
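
To make that concrete, here is a rough sketch of the zero-block skipping Dave describes in the quoted message below. It is not Sfio's code. The names sparse_write and all_zero and the fixed 4096-byte BLK are my own assumptions for illustration; a real tool would ask the file system for its block size.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLK 4096  /* assumed block size, hypothetical; not taken from Sfio */

/* Return 1 if the n bytes at p are all zero, else 0. */
static int all_zero(const char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/*
 * Write len bytes from buf to fd at the current offset, seeking over
 * full blocks of zeros instead of writing them so that the file system
 * can leave holes there. Returns 0 on success, -1 on error.
 */
int sparse_write(int fd, const char *buf, size_t len)
{
    size_t off = 0;
    int skipped = 0;  /* was the most recent block skipped? */

    while (off < len) {
        size_t n = len - off < BLK ? len - off : BLK;

        if (n == BLK && all_zero(buf + off, n)) {
            if (lseek(fd, (off_t)n, SEEK_CUR) == (off_t)-1)
                return -1;
            skipped = 1;
        } else {
            size_t done = 0;
            while (done < n) {
                ssize_t w = write(fd, buf + off + done, n - done);
                if (w <= 0)
                    return -1;
                done += (size_t)w;
            }
            skipped = 0;
        }
        off += n;
    }

    /* A trailing seek does not extend the file, so set the length. */
    if (skipped) {
        off_t end = lseek(fd, 0, SEEK_CUR);
        if (end == (off_t)-1 || ftruncate(fd, end) == -1)
            return -1;
    }
    return 0;
}
```

Note that this only ever skips whole blocks, which is the "up to disk block constraints" part: a shorter run of zeros is written out as ordinary data.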

What surprised me is that SEEK_HOLE and SEEK_DATA were made part
of the lseek() interface. These features carry semantics beyond
the traditional Unix "typeless" file system. They can be expensive
to implement and may not even have consistent semantics!

For example, can a hole start at any byte boundary or does it have
to be on a disk block boundary? If it's the former, that can be
expensive to implement as it requires examining every byte in
the file on a seek operation. If it's the latter, would copying a file
from one disk to another cause holes to change their characteristics?
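
For reference, this is roughly how an application would walk a file with the two new whence values. It is only a sketch: list_data_extents is a name I made up, and the _GNU_SOURCE define is what glibc needs to expose the constants (other systems may differ). The offsets it reports are whatever the file system chooses to return, which is exactly why the boundary question matters.

```c
#define _GNU_SOURCE  /* glibc exposes SEEK_DATA/SEEK_HOLE under this macro */
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Walk fd from offset 0 and report each data extent as
 * "data [start, end)" on out (out may be NULL to suppress printing).
 * Returns the number of extents found, or -1 on error.
 * list_data_extents is a hypothetical helper, not part of any library.
 */
long list_data_extents(int fd, FILE *out)
{
    off_t pos = 0;
    long count = 0;

    for (;;) {
        /* Find the next offset at or after pos that holds data. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data == (off_t)-1) {
            if (errno == ENXIO)
                break;          /* no data at or after pos */
            return -1;
        }

        /* Find where that data ends; end of file counts as a hole. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole == (off_t)-1)
            return -1;
        if (hole <= data)
            break;              /* defensive: avoid looping forever */

        if (out)
            fprintf(out, "data [%lld, %lld)\n",
                    (long long)data, (long long)hole);
        count++;
        pos = hole;
    }
    return count;
}
```

On a file system that does not track holes, I would expect this to report the whole file as a single data extent, so the same file can yield different extent lists depending on where it lives.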

Aside from the cp, mv, pax-like operations, what can an application use
these features for? They seem a bit too hastily considered.

Phong

> From [email protected] Mon Mar 26 11:08:42 2012
> To: [email protected], [email protected]
> Subject:  Re: [ast-users] Implementing SEEK_HOLE, SEEK_DATA in AST cp, mv, pax
> Cc: [email protected]

> > are there plans to implement support for SEEK_HOLE (let lseek() seek
> > to the next hole in a sparse file) and SEEK_DATA (let lseek() seek to
> > the next place with real data, usually after a hole) in AST cp, mv and
> > pax in the next 2-3 months? This has become VERY important to
> > enterprise customers now that Linux+btrfs, GNU coreutils, Solaris,
> > FreeBSD and others support this feature and that it is going to be
> > included in the next iteration of the POSIX standard
> > (http://man7.org/linux/man-pages/man2/lseek.2.html)
> > 
> > 

> The AST tools do not use read/write/lseek directly but use SFIO for all
> input and output.  Currently when writing a file and there are more than
> a block full of 0 bytes, SFIO seeks to the next non-zero byte so that
> the file can be created with holes.  The result is that cp foo bar will
> create the file bar with the same or less space than foo.  I don't
> understand how mv is affected by SEEK_HOLE and SEEK_DATA since this is
> just a rename operation.  pax will create holes whenever the input
> file contains large numbers of 0 bytes.

> In order to take advantage of SEEK_HOLE and SEEK_DATA, we will have to
> allow these to be passed into sfseek().  I don't know what gain will
> be achieved over the current method except possibly on read.

> David Korn
> [email protected]

_______________________________________________
ast-users mailing list
[email protected]
https://mailman.research.att.com/mailman/listinfo/ast-users