Re: btrfs crash consistency bug : Blocks allocated beyond eof are lost

2018-02-23 Thread Filipe Manana
On Fri, Feb 23, 2018 at 4:35 PM, Jayashree Mohan wrote:
> Hi,
>
> [Fsync issue in btrfs]
> In addition to the above, I would like to bring to your attention that
> after a fallocate or fallocate zero_range with the keep_size option,
> a subsequent fsync() has no effect at all. If we crash after the
> fsync, on recovery the blocks allocated by the fallocate call are
> lost. This is similar to the punch_hole issue addressed by the
> patch at [1]. A simple test scenario that reproduces this bug is:
>
> 1. write (0-16K) to a file foo
> 2. sync()
> 3. fallocate keep_size (16K - 20K)
> 4. fsync(foo)
> 5. Crash
>
> On recovery, all blocks allocated in step 3 are lost (this is true
> even when fallocate is replaced by the zero_range operation supported
> in kernel 4.16).
> Could you explain why an fsync() of the file still does not persist
> the metadata (the block count in this case) across power failures?

The very short explanation: on log recovery, btrfs thinks that a
shrinking truncate happened before the file was fsync'ed, so it drops
the extents allocated by fallocate() right after replaying them from
the log.
I saw this a year or two ago but never got around to fixing it because
of other, more important things. I'll try to fix it soon.
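
To make the sequence above concrete, here is a toy model in Python. It
is only a sketch of the behaviour described in this mail, not the btrfs
replay code: the function replay_log(), its parameters and the extent
tuples are all made up for illustration, and the assumption is that the
"shrinking truncate" amounts to dropping everything past the on-disk
i_size once the logged extents have been replayed.

```python
KIB = 1024


def replay_log(disk_i_size, logged_extents, truncate_after_replay):
    """Toy log replay (hypothetical, not btrfs code).

    Extents are (start, end) byte ranges, end exclusive.
    """
    # Replay: every extent found in the log is added back to the file.
    extents = sorted(logged_extents)
    if truncate_after_replay:
        # Assumed misstep: behave as if a shrinking truncate to i_size
        # had happened, which drops anything allocated past i_size.
        extents = [(start, min(end, disk_i_size))
                   for start, end in extents if start < disk_i_size]
    return extents


# File of 16K, plus a 4K keep_size prealloc extent logged by fsync.
logged = [(0, 16 * KIB), (16 * KIB, 20 * KIB)]
with_truncate = replay_log(16 * KIB, logged, truncate_after_replay=True)
without_truncate = replay_log(16 * KIB, logged, truncate_after_replay=False)
```

In this model the extent at 16K-20K survives only when the truncate
step is skipped, which matches the symptom reported in the quoted mail.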

>
> [1] https://patchwork.kernel.org/patch/5830801/
>
>
> Thanks,
> Jayashree Mohan
>
> On Wed, Feb 21, 2018 at 8:23 PM, Jayashree Mohan wrote:
>> Hi,
>>
>> On btrfs (as of kernel 4.15), say we fallocate a file with the
>> keep_size option, followed by fdatasync() or fsync(). If we now crash,
>> on recovery we see a wrong block count, and all the blocks allocated
>> beyond the eof are lost. This bug was reported (xfstest generic/468)
>> and patched on ext4[1], and a variant of it, in which the correct
>> file size was not recovered, was patched in f2fs[2]. I am wondering why
>> this is still not fixed in btrfs. You can reproduce this bug on btrfs
>> using a tool called CrashMonkey that we are building at UT Austin,
>> which is a test harness for filesystem crash consistency checks[3].
>>
>> To reproduce the bug, simply run:
>>  ./c_harness -f /dev/sda -d /dev/cow_ram0 -t btrfs -e 102400 -v tests/generic_468.so
>>
>> Is there a reason why this is not yet patched in btrfs? I don't see
>> why, even after an fsync(), losing the blocks allocated beyond the eof
>> is acceptable.
>>
>> [1] https://patchwork.kernel.org/patch/10120293/
>> [2] https://sourceforge.net/p/linux-f2fs/mailman/message/36104201/
>> [3] https://github.com/utsaslab/crashmonkey
>>
>> Thanks,
>>
>> Jayashree Mohan
>> 2nd Year PhD in Computer Science
>> University of Texas at Austin.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”


Re: btrfs crash consistency bug : Blocks allocated beyond eof are lost

2018-02-23 Thread Jayashree Mohan
Hi,

[Fsync issue in btrfs]
In addition to the above, I would like to bring to your attention that
after a fallocate or fallocate zero_range with the keep_size option,
a subsequent fsync() has no effect at all. If we crash after the
fsync, on recovery the blocks allocated by the fallocate call are
lost. This is similar to the punch_hole issue addressed by the
patch at [1]. A simple test scenario that reproduces this bug is:

1. write (0-16K) to a file foo
2. sync()
3. fallocate keep_size (16K - 20K)
4. fsync(foo)
5. Crash

On recovery, all blocks allocated in step 3 are lost (this is true
even when fallocate is replaced by the zero_range operation supported
in kernel 4.16).
Could you explain why an fsync() of the file still does not persist
the metadata (the block count in this case) across power failures?
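
For anyone who wants to drive steps 1-4 by hand, a minimal sketch in
Python follows. It is not part of CrashMonkey or xfstests. The helper
name run_workload() is made up, the script assumes 64-bit Linux (so
that off_t fits in a C long), and it calls fallocate(2) through ctypes
because os.posix_fallocate() has no keep_size mode. The crash itself
(step 5) is not simulated; that still needs a power cut or a harness
such as CrashMonkey.

```python
import ctypes
import os

FALLOC_FL_KEEP_SIZE = 0x01  # value from <linux/falloc.h>

# Bind fallocate(2) from the C library already loaded into the process.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_long, ctypes.c_long]
_libc.fallocate.restype = ctypes.c_int


def run_workload(path):
    """Run steps 1-4; return (st_size, st_blocks) as seen before a crash."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o644)
    try:
        # 1. write (0-16K)
        written = os.pwrite(fd, b"a" * 16384, 0)
        if written != 16384:
            raise OSError("short write: %d bytes" % written)
        # 2. sync()
        os.sync()
        # 3. fallocate keep_size (16K - 20K)
        if _libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, 16384, 4096) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        # 4. fsync(foo)
        os.fsync(fd)
        st = os.fstat(fd)
        return st.st_size, st.st_blocks
    finally:
        os.close(fd)
```

Call run_workload() on a path inside the mounted btrfs filesystem and
note the returned values. After the crash and remount, os.stat() on the
same path should report the same st_blocks if the preallocated range
survived. st_size is expected to stay at 16384 either way because of
keep_size.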

[1] https://patchwork.kernel.org/patch/5830801/


Thanks,
Jayashree Mohan





On Wed, Feb 21, 2018 at 8:23 PM, Jayashree Mohan wrote:
> Hi,
>
> On btrfs (as of kernel 4.15), say we fallocate a file with the
> keep_size option, followed by fdatasync() or fsync(). If we now crash,
> on recovery we see a wrong block count, and all the blocks allocated
> beyond the eof are lost. This bug was reported (xfstest generic/468)
> and patched on ext4[1], and a variant of it, in which the correct
> file size was not recovered, was patched in f2fs[2]. I am wondering why
> this is still not fixed in btrfs. You can reproduce this bug on btrfs
> using a tool called CrashMonkey that we are building at UT Austin,
> which is a test harness for filesystem crash consistency checks[3].
>
> To reproduce the bug, simply run:
>  ./c_harness -f /dev/sda -d /dev/cow_ram0 -t btrfs -e 102400 -v tests/generic_468.so
>
> Is there a reason why this is not yet patched in btrfs? I don't see
> why, even after an fsync(), losing the blocks allocated beyond the eof
> is acceptable.
>
> [1] https://patchwork.kernel.org/patch/10120293/
> [2] https://sourceforge.net/p/linux-f2fs/mailman/message/36104201/
> [3] https://github.com/utsaslab/crashmonkey
>
> Thanks,
>
> Jayashree Mohan
> 2nd Year PhD in Computer Science
> University of Texas at Austin.