Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-23 Thread Joerg Schilling
Sorry for resending this mail, but it turned out that I accidentally typed "r" instead of "R", and I believe this may be of interest to more than just you, Paul. Paul Eggert wrote: > On 01/22/2018 09:47 AM, Joerg Schilling wrote: > > we are talking about files that do not change while

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-23 Thread Joerg Schilling
Andreas Dilger wrote: > Maybe you wrote a filesystem 30 years ago when everything was BSD FFS, but > things have moved on from that time. I'm one of the maintainers for ext4, Things changed since then because people followed my filesystem design from 30 years ago. Jörg

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-22 Thread Andreas Dilger
On Jan 22, 2018, at 10:47 AM, Joerg Schilling wrote: > >> On Jan 22, 2018, at 3:28 AM, Joerg Schilling >> wrote: >>> If you still don't understand this, I recommend you to try to write an in >>> kernel filesystem

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-22 Thread Paul Eggert
On 01/22/2018 09:47 AM, Joerg Schilling wrote: we are talking about files that do not change while something like TAR is reading them. It's reasonable for a file system to reorganize itself while 'tar' is reading files. Even if a file's contents do not change, its space utilization might

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-22 Thread Joerg Schilling
Paul Eggert wrote: > The implementation that you suggest requires the file system to remember > how much reserved space that it initially allocated to the file, even if > that number changes as a result of file system reorganization. This can > place an undue burden on

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-22 Thread Paul Eggert
On 01/22/2018 02:28 AM, Joerg Schilling wrote: Since you need to reserve space on the background storage before you can even write to the cached data for a file, you need to make stat() return the related state that includes the reserved space. The implementation that you suggest requires the

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-22 Thread Joerg Schilling
Andreas Dilger wrote: > So, what you're saying is that filesystem resizing is forbidden by POSIX, > background data compression and data deduplication is forbidden by POSIX, > migration across storage tiers is forbidden by POSIX? All modifications > to the filesystem need to

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-18 Thread Andreas Dilger
On Jan 17, 2018, at 9:49 PM, Tim Kientzle wrote: >> On Jan 17, 2018, at 1:09 PM, Andreas Dilger wrote: >> >>> So is there some other way to quickly identify sparse files so we can avoid >>> the SEEK_HOLE scan for non-sparse files? >> >> Given that calling

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-18 Thread Paul Eggert
On 01/18/2018 02:33 AM, Joerg Schilling wrote: Returning a value for st_blocks, that changes with the phases of the moon while the content of that file is not changed is another unexpected behavior. Not at all. It's completely expected nowadays, for the same reason that invoking statvfs twice

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-18 Thread Joerg Schilling
Andreas Dilger wrote: > > POSIX does not require you to call fsync() before you are able to get the > > expected result from stat() > > > > If POSIX did make such assumptions, it would document them. The fact that > > there is no related text in POSIX is sufficient to prove

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-18 Thread Joerg Schilling
Andreas Dilger wrote: > Given that calling SEEK_HOLE is also going to have some cost, my suggestion > would be to ignore st_blocks completely for small files (size < 64KB) and > just read the file in this case, since the overhead of reading will probably > be about the same as
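The size-based rule suggested above can be sketched roughly as follows. This is only an illustration, not code from GNU tar: the names `SMALL_FILE_LIMIT` and `choose_scan` are hypothetical, and the choice of a SEEK_HOLE probe for larger files is an assumption, since the quoted message is cut off before it says what should happen to them.

```python
import os

# 64 KB cutoff taken from the suggestion above; treat it as a tunable.
SMALL_FILE_LIMIT = 64 * 1024


def choose_scan(st_size: int, limit: int = SMALL_FILE_LIMIT) -> str:
    """Pick how to look for holes without consulting st_blocks.

    Returns "read" for small files (just read the data) and
    "seek_hole" for larger ones (assumed: probe with SEEK_HOLE).
    """
    return "read" if st_size < limit else "seek_hole"


def choose_scan_for_path(path: str) -> str:
    """Convenience wrapper that takes the size from stat()."""
    return choose_scan(os.stat(path).st_size)
```

The point of the rule is that st_blocks never enters the decision for small files, so a stale or zero block count cannot cause them to be archived as entirely sparse.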

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-17 Thread Tim Kientzle
> On Jan 17, 2018, at 1:09 PM, Andreas Dilger wrote: > >> So is there some other way to quickly identify sparse files so we can avoid >> the SEEK_HOLE scan for non-sparse files? > > Given that calling SEEK_HOLE is also going to have some cost, my suggestion > would be to

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-17 Thread Andreas Dilger
> On Jan 8, 2018, at 11:50 AM, Joerg Schilling > wrote: > > Paul Eggert wrote: > >> On 01/08/2018 09:41 AM, Joerg Schilling wrote: >>> POSIX explains that st_blocks counts in units of DEV_BSIZE. >> >> That's not required by the

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-17 Thread Andreas Dilger
On Jan 9, 2018, at 10:02 AM, Tim Kientzle wrote: >> Paul Eggert wrote: >> >>> POSIX does not require that st_blocks remain constant across any system >>> call. It doesn't even require that it remain constant if you merely call >>> stat twice on the same

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-17 Thread Ralph Corderoy
Hi, Paul wrote: > Nothing in the POSIX standard clearly disallows the behavior in > question, and the arguments you're making about what POSIX requires > are based on a long chain of indirect inferences that are > unconvincing. I'll make the point again. Raise the issue with the POSIX authors.

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-17 Thread Paul Eggert
Joerg Schilling wrote: I'm afraid we'll just have to agree to disagree here. Even if you expect a particular behavior, it's not the behavior that I expect nor is it the behavior that we actually observe. You can take it up with the POSIX committee if you like; please reference this discussion so

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-10 Thread Joerg Schilling
Paul Eggert wrote: > On 01/09/2018 01:38 AM, Joerg Schilling wrote: > > If POSIX would allow such unexpected behavior, this would have been > > documented. > > I'm afraid we'll just have to agree to disagree here. Even if you expect > a particular behavior, it's not the

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-10 Thread Joerg Schilling
Tim Kientzle wrote: > What is the most efficient (preferably portable) way for an archiving program > (such as tar) to determine whether it should archive a particular file as > sparse or non-sparse? IIRC, an lseek() call takes approx. 2 microseconds. I did some research in 2005

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-09 Thread Mark H Weaver
Mark H Weaver writes: > I don't expect any of this to convince you, but it is most likely the > last message I will write in this "debate" between you and the rest of > the world. Instead, I will focus on fixing the bug. I apologize for losing my patience here. This last

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-09 Thread Paul Eggert
On 01/09/2018 01:38 AM, Joerg Schilling wrote: If POSIX would allow such unexpected behavior, this would have been documented. I'm afraid we'll just have to agree to disagree here. Even if you expect a particular behavior, it's not the behavior that I expect nor is it the behavior that we

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-09 Thread Paul Eggert
On 01/09/2018 09:02 AM, Tim Kientzle wrote: So is there some other way to quickly identify sparse files so we can avoid the SEEK_HOLE scan for non-sparse files? Nothing that's at all portable, no.
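For reference, the SEEK_HOLE scan mentioned above can be sketched like this. It is a minimal illustration under stated assumptions, not tar's implementation: the function name `has_hole` is hypothetical, and the sketch assumes a platform where Python exposes `os.SEEK_HOLE` (it returns `None`, meaning "unknown", where the probe is unavailable or fails).

```python
import os


def has_hole(path):
    """Return True if SEEK_HOLE reports a hole before end of file.

    Returns False for empty or fully dense files, and None when the
    platform or filesystem cannot answer the question.
    """
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_hole is None:
        return None  # not supported by this platform's os module
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            return False  # nothing to scan; lseek would fail with ENXIO
        try:
            # Offset of the first hole at or after 0. Every file has an
            # implicit hole at EOF, so a dense file yields `size`.
            first_hole = os.lseek(fd, 0, seek_hole)
        except OSError:
            return None  # filesystem rejected the request
        return first_hole < size
    finally:
        os.close(fd)
```

A single probe like this costs one open, one fstat and one lseek per file, which is the overhead the question above is trying to avoid for files that are not sparse.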

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-09 Thread Joerg Schilling
Paul Eggert wrote: > POSIX does not require that st_blocks remain constant across any system > call. It doesn't even require that it remain constant if you merely call > stat twice on the same file, without doing anything else in between. So > I agree with you that it's

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Joerg Schilling
Paul Eggert wrote: > On 01/08/2018 09:41 AM, Joerg Schilling wrote: > > POSIX explains that st_blocks counts in units of DEV_BSIZE. > > That's not required by the standard. It's merely a comment in the > rationale "Traditionally, some implementations defined the >

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Joerg Schilling
Paul Eggert wrote: > On 01/08/2018 08:54 AM, Joerg Schilling wrote: > > The most important fact however is that allocating space happens before you > > copy data into that space. > > Certainly users need the ability to make sure there's enough room before > starting to copy,

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Paul Eggert
On 01/08/2018 08:54 AM, Joerg Schilling wrote: The most important fact however is that allocating space happens before you copy data into that space. Certainly users need the ability to make sure there's enough room before starting to copy, and POSIX allows for that with posix_fallocate.

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Mark H Weaver
Hi Joerg, Joerg Schilling writes: > Paul Eggert wrote: > >> On 01/08/2018 08:06 AM, Joerg Schilling wrote: >> > blkcnt_t st_blocks Number of blocks allocated for this object. >> > >> > I hope I do not need to explain the term

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Ralph Corderoy
Hi, I wonder if http://austingroupbugs.net/main_page.php would be a better place to discuss this because it could involve more people that have worked on POSIX's wording than just Jörg, and thus may either have a different opinion, or be able to phrase it in a more persuasive manner. It doesn't

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Joerg Schilling
Paul Eggert wrote: > On 01/08/2018 08:06 AM, Joerg Schilling wrote: > > blkcnt_t st_blocks Number of blocks allocated for this object. > > > > I hope I do not need to explain the term "allocated". > > I'm afraid that you do need to explain "allocated". Suppose, for >

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Joerg Schilling
Adam Borowski wrote: > A file that doesn't have a single block allocated for it may thus return > st_blocks of 0, no matter if it's empty or not. _before_ you may add data to a file, you need to allocate space for it. This is what POSIX requires to return with a stat()

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Paul Eggert
On 01/08/2018 08:06 AM, Joerg Schilling wrote: blkcnt_t st_blocks Number of blocks allocated for this object. I hope I do not need to explain the term "allocated". I'm afraid that you do need to explain "allocated". Suppose, for example, two files are clones: they have different inode

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Adam Borowski
On Mon, Jan 08, 2018 at 04:28:36PM +0100, Joerg Schilling wrote: > Tim Kientzle wrote: > > > I'm not entirely sure I understand the above. > > > > It sounds like someone is claiming that: > > > > * Archiving programs should know about the *timing* of filesystem > >

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Mark H Weaver
Hi Joerg, Joerg Schilling writes: > Mark H Weaver wrote: > >> I just got bitten by the same problem reported back in July 2016: >> >> https://lists.gnu.org/archive/html/bug-tar/2016-07/msg0.html >> >> At the time, Joerg Schilling

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Joerg Schilling
Mark H Weaver wrote: > >> https://lists.gnu.org/archive/html/bug-tar/2016-07/msg0.html > >> > >> At the time, Joerg Schilling unilaterally refused to fix the bug, > >> claiming that Btrfs was broken and violated POSIX, although when asked > >> for a reference to back that

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Joerg Schilling
Tim Kientzle wrote: > I'm not entirely sure I understand the above. > > It sounds like someone is claiming that: > > * Archiving programs should know about the *timing* of filesystem > implementations (60s here for btrfs, something else for XYZ?) > > * And specifically

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Tim Kientzle
Quoted from the earlier discussion: > > One option is if st_blocks == 0 then tar should also check if st_mtime is > > less than 60s in the past, and if yes then it should call fsync() on the > > file to flush any unwritten data to disk, or assume the file is not sparse > > and read the whole
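The check described in the quoted option can be written down as a small predicate. This is a sketch of that proposal only, not of any fix adopted in tar: the name `st_blocks_may_be_stale` and the `window` parameter are hypothetical, and the 60-second value is simply the figure from the quote.

```python
import time


def st_blocks_may_be_stale(st_blocks, st_mtime, now=None, window=60.0):
    """Return True when a zero st_blocks should not be trusted yet.

    Mirrors the quoted option: st_blocks == 0 and the file was modified
    less than `window` seconds ago, so data may still be unwritten.
    """
    if now is None:
        now = time.time()
    return st_blocks == 0 and (now - st_mtime) < window
```

When the predicate is true, the quoted option has the caller either call fsync() on the file and stat it again, or skip sparse detection and read the file as ordinary data.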

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-08 Thread Joerg Schilling
Mark H Weaver wrote: > I just got bitten by the same problem reported back in July 2016: > > https://lists.gnu.org/archive/html/bug-tar/2016-07/msg0.html > > At the time, Joerg Schilling unilaterally refused to fix the bug, > claiming that Btrfs was broken and violated

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-07 Thread Mark H Weaver
Hi Paul, Paul Eggert writes: > Mark H Weaver wrote: >> I propose that we revisit this bug and fix it. > > Sounds good to me, if someone has the time to write a proper fix. Thank you for the quick response! I'd be glad to work on it. I expect that I can provide a proper

Re: [Bug-tar] Detection of sparse files is broken on btrfs

2018-01-07 Thread Paul Eggert
Mark H Weaver wrote: I propose that we revisit this bug and fix it. Sounds good to me, if someone has the time to write a proper fix.

[Bug-tar] Detection of sparse files is broken on btrfs

2018-01-07 Thread Mark H Weaver
Hi, I just got bitten by the same problem reported back in July 2016: https://lists.gnu.org/archive/html/bug-tar/2016-07/msg0.html At the time, Joerg Schilling unilaterally refused to fix the bug, claiming that Btrfs was broken and violated POSIX, although when asked for a reference to