Re: fallocate does not prevent ENOSPC on write

Qu Wenruo Wed, 24 Apr 2019 02:50:56 -0700


On 2019/4/24 5:28 PM, Filipe Manana wrote:
[snip]
>>> So what's wrong with it? And how does it cause the ENOSPC?
>>
>> E.g.
>>
>> We have a 128Mb preallocated file extent.
>> And assume the fs only have 128M free data space, meaning 0 remaining
>> space at all.
> 
> That's a contradicting sentence...
> 
>>
>> Then we try to buffer write, which means buffered will just fail as it
>> will need data space.
>>
>> The idea is always here for fallocate/pwrite, just the timing where the
>> ENOSPC happens.
> 
> Can't make sense of that sentence as well.


My bad, that change is already in buffered_write(), so that sentence
makes no sense.
> 
> So I suppose what you are trying to say is that a write into an
> unwritten extent causes space allocation,
> and that can prevent some other write (which is not into an unwritten
> extent) from being able to allocate space and therefore fail.

That's one case.

> 
> That's a valid problem that should be temporary.

I just tried a basic script:
---
#!/bin/bash

dev=/dev/test/test
mnt=/mnt/btrfs

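# Create a 512M filesystem and mount it
mkfs.btrfs -f "$dev" -b 512M

mount "$dev" "$mnt"

# Preallocate 384M for file1, then do a buffered write to file2
fallocate -l 384M "$mnt/file1"
echo "fallocate success"
dd if=/dev/zero bs=512K conv=notrunc count=768 of="$mnt/file2"

umount "$mnt"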
---

This fails just like the error report.
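
For reference, a small arithmetic sketch of the sizes involved. It is not
part of the reproducer, and the variable names are made up for illustration.
Note that dd writes to file2 while the preallocation was done on file1, so
the two requests add up:

```shell
#!/bin/bash
# Sizes in MiB, taken from the script above; variable names are illustrative.
fs_size=512                         # mkfs.btrfs -b 512M
prealloc=384                        # fallocate -l 384M (file1)
dd_total=$(( 768 * 512 / 1024 ))    # count=768 blocks of bs=512K (file2)

echo "dd writes ${dd_total} MiB"
echo "prealloc + dd = $(( prealloc + dd_total )) MiB on a ${fs_size} MiB device"
```

This only covers the data sizes requested; it says nothing about metadata
or about which reservation path actually returns the error.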


At least in its current form, if we're writing into the preallocated space,
it does skip the data space reservation, so in theory it shouldn't cause a
problem at that buffered write.


However, we have other locations which can reserve data space:
- btrfs_page_mkwrite()
- btrfs_truncate_block()
- btrfs_direct_IO()

I haven't looked into why the above script fails, but it should have
something to do with one of these data space reservations.

Thanks,
Qu
> 
> However when allocating space for a write into an unwritten extent (or
> any nodatacow write) we increment the data space info's bytes_may_use
> counter,
> but then, when writeback starts, if we don't need to fall back to
> CoW, we end up never decrementing the bytes_may_use counter (even
> after writeback completes), leaking it.
> Not sure if this is the problem you were mentioning or just causing
> other writes to temporarily fail.
> 
> thanks
> 
> 
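
The leak described above can be pictured with a toy accounting model. This is
only a sketch under simplifying assumptions: it is not kernel code, the sizes
are arbitrary, and the names (reserve, release, scenario) are hypothetical;
may_use merely stands in for the bytes_may_use counter.

```shell
#!/bin/bash
# Toy accounting model, sizes in MiB. All names are illustrative, not kernel code.
total=512      # data space in the filesystem
used=384       # space already taken by the preallocated extent
may_use=0      # stand-in for the space_info bytes_may_use counter

# Reserve n MiB; fail (like ENOSPC) if used + may_use + n exceeds total.
reserve() {
    local n=$1
    (( used + may_use + n <= total )) || return 1
    may_use=$(( may_use + n ))
}

# Give a reservation back once writeback finished without needing CoW.
release() {
    may_use=$(( may_use - $1 ))
}

# Write 64 MiB into the preallocated extent, then try an unrelated
# 128 MiB write. $1 selects whether the first reservation is released.
scenario() {
    may_use=0
    reserve 64                        # write into the unwritten extent
    [ "$1" = "release" ] && release 64
    if reserve 128; then echo "ok"; else echo "ENOSPC"; fi
}

leaky_result=$(scenario leak)
fixed_result=$(scenario release)
echo "without release: $leaky_result"
echo "with release:    $fixed_result"
```

In this model the second write only fits when the first reservation is given
back, which is the effect a leaked counter would have on later writes.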
>>
>>
>> We have had btrfs/153 failing for a long time for the same reason; although
>> there it comes from quota, the cause is exactly the same.
>>
>> Thanks,
>> Qu
>>
>>>
>>> Trying the reproducer, at least on a 5.0 kernel, never fails on a
>>> pwrite for me, but always on fallocate:
>>>
>>> $ mkfs.btrfs -f -b $((4 * 1024 * 1024 * 1024)) /dev/sdi
>>> $ mount /dev/sdi /mnt/sdi
>>> $ cd /mnt/sdi
>>> $ /path/to/reproducer
>>> reading from /dev/urandom
>>> writing to ./blob.IIa6tH
>>> writing blocks of 132096 bytes each
>>> total    125 MiB,  65.52 MiB/s
>>> total    251 MiB,  44.59 MiB/s
>>> total    377 MiB,  55.23 MiB/s
>>> total    503 MiB,  66.21 MiB/s
>>> total    629 MiB,  59.97 MiB/s
>>> total    755 MiB,   3.70 MiB/s
>>> total    881 MiB,  50.24 MiB/s
>>> total   1007 MiB,  64.51 MiB/s
>>> total   1133 MiB,  50.70 MiB/s
>>> total   1259 MiB,  49.29 MiB/s
>>> total   1385 MiB,  47.93 MiB/s
>>> total   1511 MiB,   4.00 MiB/s
>>> total   1637 MiB,  49.85 MiB/s
>>> total   1763 MiB,  48.11 MiB/s
>>> total   1889 MiB,  66.62 MiB/s
>>> total   2015 MiB,   5.60 MiB/s
>>> total   2141 MiB,  19.58 MiB/s
>>> total   2267 MiB,  64.80 MiB/s
>>> total   2393 MiB,  13.23 MiB/s
>>> total   2519 MiB,  14.95 MiB/s
>>> fallocate failed: No space left on device
>>>
>>> So either that was tested on a rather old kernel or:
>>>
>>> 1) we had snapshotting happening between a fallocate and a pwrite (or
>>> at the same time as the pwrite)
>>> 2) before the pwrite (or during) the unwritten/prealloc extent was
>>> reflinked (cp --reflink, clone or dedupe ioctls)
>>>
>>> What did I miss here?
>>>
>>> Thanks.
>>>
>>>>
>>>> E.g. reserved space underflow.
>>>>
>>>> I'll find the old thread and retry again.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>> This seems to break the semantics of fallocate, so performance should
>>>>> not be the main concern here.
>>>>>
>>>>
>>>
>>>
>>
> 
>
