Re: Scrub: no spae left on device

2015-12-08 Thread Duncan

Marc MERLIN posted on Tue, 08 Dec 2015 08:06:15 -0800 as excerpted:

> On Tue, Dec 08, 2015 at 04:46:32PM +0100, Lionel Bouton wrote:
>> On 08/12/2015 16:37, Holger Hoffstätte wrote:
>> > On 12/08/15 16:06, Marc MERLIN wrote:
>> >>
>> >> Why would scrub need space and why would it cancel if there isn't
>> >> enough of it? (kernel 4.3)
>> >>
>> >> btrfs scrub start -Bd /dev/mapper/pool1
>> >> ERROR: scrubbing /dev/mapper/pool1 failed for device id 1
>> >> (No space left on device)
>> >> scrub device /dev/mapper/pool1 (id 1) canceled
>> > Scrub rewrites metadata (apparently even in -r aka readonly mode),
>> > and that can lead to temporary metadata expansion (stuff gets COWed
>> > around); it's a bit surprising but makes sense if you think about it.

Are you sure about that?

My / is mounted ro by default, and if I try to scrub it in normal mode, 
it'll error out due to read-only.  But I can run a read-only scrub just 
fine, and if I find errors, I simply mount it writable and redo the scrub 
without the -r.  (My / is only 8 GiB, under half used including metadata 
on a fast SSD, so scrubs complete in under 30 seconds, and doing a read-
only scrub followed by a mount-writable and a second fixing scrub if 
necessary, is trivial.)

>> Sorry I'm not sure why metadata is rewritten if no error is detected.

But scrub will of course do copy-on-write if there's an error, and it's 
possible that on initialization it checks for enough space to do a few 
COWs if necessary, before it actually checks for the -r read-only flag.  
I try to leave at least enough unallocated space to do a balance, which 
(except with -dusage=0 or -musage=0) has to allocate a new chunk to 
rewrite existing chunks into.  So I'm unlikely ever to get close enough 
to out of space to trigger that possible initialization-time space check, 
and thus wouldn't know whether there is one, or whether it comes before 
the -r check.

> And this is what I got:
> legolas:~# btrfs balance start -musage=10 -v /mnt/btrfs_pool1/
> Dumping filters: flags 0x6, state 0x0, force is off
>   METADATA (flags 0x2): balancing, usage=10
>   SYSTEM (flags 0x2): balancing, usage=10
> ERROR: error during balancing '/mnt/btrfs_pool1/' - No space left on device
> There may be more info in syslog - try dmesg | tail
> 
> Ok, that sucks.
> 
> legolas:~# btrfs balance start -musage=0 -v /mnt/btrfs_pool1/
> Dumping filters: flags 0x6, state 0x0, force is off
>   METADATA (flags 0x2): balancing, usage=0
>   SYSTEM (flags 0x2): balancing, usage=0
> Done, had to relocate 0 out of 618 chunks
> 
> This worked. Mmmh, I thought this wouldn't be necessary anymore in 4.3
> kernels?

Well, it said it had to relocate zero chunks, so it _appears_ that it 
didn't do anything, which would be expected on reasonably current kernels, 
as they already clean up zero-usage chunks automatically.  *BUT*...

> legolas:~# btrfs balance start -musage=10 -v /mnt/btrfs_pool1
> Dumping filters: flags 0x6, state 0x0, force is off
>   METADATA (flags 0x2): balancing, usage=10
>   SYSTEM (flags 0x2):  balancing, usage=10
> Done, had to relocate 1 out of 618 chunks

... if it did nothing in the -musage=0 case above, why did the -musage=10 
case fail before, but succeed after?

That's a very good question I don't have an answer to.  Good question for 
the devs and others that actually read code.

Meanwhile, note that if it relocates only a single chunk (of non-zero 
usage), under normal circumstances, it'll take exactly the same amount of 
space as before, because it'd allocate a new chunk of exactly the same 
size as the one it was rewriting.

However, once remaining unallocated space gets tight enough, it starts 
allocating smaller than normal chunks, which may be what happened this 
time.  Presumably that chunk was originally allocated when the filesystem 
still had much more unallocated space, so it was a standard-size chunk.  
When it was rewritten, unallocated space was much tighter, so a smaller 
chunk would likely be written, which would then be rather fuller than it 
was previously, as it would hold the same amount of metadata in a smaller 
chunk.
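
To put numbers on that effect, here is a minimal sketch. The sizes are hypothetical, chosen only for illustration and not taken from this filesystem, and chunk_usage_pct is a made-up helper name. It shows how the same amount of metadata gives a much higher usage percentage once it sits in a smaller chunk.

```shell
# Percentage of a chunk that is in use, rounded down to a whole number.
# Arguments: used space and chunk size, both in MiB.
chunk_usage_pct() {
    used_mib=$1
    chunk_mib=$2
    echo $(( used_mib * 100 / chunk_mib ))
}

# Hypothetical sizes (assumptions, not from the thread):
chunk_usage_pct 100 1024   # 100 MiB of metadata in a 1 GiB chunk
chunk_usage_pct 100 256    # the same metadata in a 256 MiB chunk
```

Since the usage=N balance filter only picks chunks whose usage is under N percent, a chunk like the first one would be selected by -musage=10, while the rewritten, smaller one would not.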

And, perhaps partially answering my own question above, the balance with 
-musage=0 somehow triggered a space reevaluation, thus allowing the 
-musage=10 balance to run afterward when it wouldn't before, even tho the 
-musage=0 run didn't actually relocate any chunks (which for empty chunks 
means relocating to /dev/null, IOW, deleting them).

But... it still shouldn't happen: if -musage=0 didn't relocate anything, 
it shouldn't trigger a space reevaluation that -musage=10 wouldn't trigger 
on its own.  So while this might partially answer what happened, it does 
nothing to explain /why/ it happened.  I'd call it a bug in the balance 
code, as the result of the -musage=10 should be exactly the same before 
and after, because the -musage=0 didn't actually relocate/delete anything.

> And now I'm back in business...
> 
> Still, this is a bit disappointing and at the very least very
> unexpected in 4.3.

Re: Scrub: no spae left on device

2015-12-08 Thread Marc MERLIN

On Tue, Dec 08, 2015 at 05:24:16PM +0100, Holger Hoffstätte wrote:
> On 12/08/15 17:06, Marc MERLIN wrote:
> > Label: 'btrfs_pool1'  uuid: 5ee24229-2431-448a-868e-2c325d10bfa7
> > Total devices 1 FS bytes used 524.26GiB
> > devid    1 size 615.01GiB used 614.94GiB path /dev/mapper/pool1
> 
> This is what I was alluding to. You could have started a -dusage balance
> *before* the scrub so that one or several data chunks get freed.
> Balancing metadata when you're out of space accomplishes nothing and only
> will very likely fail, just as you saw. You have ~90GB usable space, but
> that space is spread over chunks with low utilisation.

Yes, my partition got a bit full, I freed up space, and unfortunately we
still don't have a background rebalance to fix this, so I did run a manual
one.
But my filesystem was usable, I was writing to it just fine. I was just very
surprised that scrub needed to rewrite blocks on a single disk device.

You could make the case that scrub and a usage=0 balance should be run together.
In the meantime, I upgraded my script:
http://marc.merlins.org/perso/btrfs/2014-03.html#Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair
http://marc.merlins.org/linux/scripts/btrfs-scrub

I figured there is no good reason not to run a usage=20 balance on metadata
and data every night.
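
As a rough illustration of what such a nightly job boils down to, here is a minimal sketch. It is not the btrfs-scrub script linked above; the helper name balance_cmd and the default of 20 are assumptions made for this example. It only prints the command, so it can be checked before anything is wired into cron.

```shell
# Build the balance command for one mountpoint and print it (dry run).
# $1: mountpoint, $2: usage threshold in percent (assumed default: 20).
balance_cmd() {
    mountpoint=$1
    usage=${2:-20}
    echo "btrfs balance start -dusage=${usage} -musage=${usage} ${mountpoint}"
}

# Show what would be run for the pool discussed in this thread.
balance_cmd /mnt/btrfs_pool1
```

To actually run the balance, the printed line would be executed as root; printing first keeps the sketch harmless on a machine without btrfs.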

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Scrub: no spae left on device

2015-12-08 Thread Holger Hoffstätte

On 12/08/15 17:06, Marc MERLIN wrote:
> Label: 'btrfs_pool1'  uuid: 5ee24229-2431-448a-868e-2c325d10bfa7
>   Total devices 1 FS bytes used 524.26GiB
>   devid    1 size 615.01GiB used 614.94GiB path /dev/mapper/pool1

This is what I was alluding to. You could have started a -dusage balance
*before* the scrub so that one or several data chunks get freed.
Balancing metadata when you're out of space accomplishes nothing and only
will very likely fail, just as you saw. You have ~90GB usable space, but
that space is spread over chunks with low utilisation.
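
The arithmetic behind that can be sketched from the figures in the quoted output. This is a rough calculation only: gib_diff is a made-up helper, and the "allocated but unused" figure ignores that DUP metadata is counted twice in the device's allocation.

```shell
# Subtract two GiB figures and print the result with two decimals.
gib_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a - b }'
}

# Figures from the "btrfs fi show" output quoted above.
dev_size=615.01    # devid 1 size
dev_alloc=614.94   # devid 1 used, i.e. already allocated to chunks
fs_used=524.26     # FS bytes used

unallocated=$(gib_diff "$dev_size" "$dev_alloc")
slack=$(gib_diff "$dev_alloc" "$fs_used")

echo "unallocated: ${unallocated} GiB"
echo "allocated but unused: ${slack} GiB"
```

So almost nothing is left to allocate a new chunk from, even though roughly 90 GiB is free inside existing chunks.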

-h


Re: Scrub: no spae left on device

2015-12-08 Thread Marc MERLIN

On Tue, Dec 08, 2015 at 04:46:32PM +0100, Lionel Bouton wrote:
> On 08/12/2015 16:37, Holger Hoffstätte wrote:
> > On 12/08/15 16:06, Marc MERLIN wrote:
> >> Howdy,
> >>
> >> Why would scrub need space and why would it cancel if there isn't enough of
> >> it?
> >> (kernel 4.3)
> >>
> >> /etc/cron.daily/btrfs-scrub:
> >> btrfs scrub start -Bd /dev/mapper/cryptroot
> >> scrub device /dev/mapper/cryptroot (id 1) done
> > >>   scrub started at Mon Dec  7 01:35:08 2015 and finished after 258 seconds
> > >>   total bytes scrubbed: 130.84GiB with 0 errors
> >> btrfs scrub start -Bd /dev/mapper/pool1
> >> ERROR: scrubbing /dev/mapper/pool1 failed for device id 1 (No space left 
> >> on device)
> >> scrub device /dev/mapper/pool1 (id 1) canceled
> > Scrub rewrites metadata (apparently even in -r aka readonly mode), and that
> > can lead to temporary metadata expansion (stuff gets COWed around); it's
> > a bit surprising but makes sense if you think about it.
> 
> How long must I think about it until it makes sense? :-)
> 
> Sorry I'm not sure why metadata is rewritten if no error is detected.
> I've several theories but lack information: is the fact that no error
> has been detected stored somewhere? is scrub using some kind of internal
> temporary snapshot(s) to avoid interfering with other operations? other
> reason I didn't think about?

Yeah, I was also wondering why metadata should be rewritten on a single
device scrub.
Does not make sense to me.

And this is what I got:
legolas:~# btrfs balance start -musage=10 -v /mnt/btrfs_pool1/ 
Dumping filters: flags 0x6, state 0x0, force is off
  METADATA (flags 0x2): balancing, usage=10
  SYSTEM (flags 0x2): balancing, usage=10
ERROR: error during balancing '/mnt/btrfs_pool1/' - No space left on device
There may be more info in syslog - try dmesg | tail

Ok, that sucks.

legolas:~# btrfs balance start -musage=0 -v /mnt/btrfs_pool1/
Dumping filters: flags 0x6, state 0x0, force is off
  METADATA (flags 0x2): balancing, usage=0
  SYSTEM (flags 0x2): balancing, usage=0
Done, had to relocate 0 out of 618 chunks

This worked. Mmmh, I thought this wouldn't be necessary anymore in 4.3 kernels?

legolas:~# btrfs balance start -musage=10 -v /mnt/btrfs_pool1
Dumping filters: flags 0x6, state 0x0, force is off
  METADATA (flags 0x2): balancing, usage=10
  SYSTEM (flags 0x2): balancing, usage=10
Done, had to relocate 1 out of 618 chunks

And now I'm back in business...

Still, this is a bit disappointing and at the very least very unexpected in 4.3.

legolas:~# btrfs fi df /mnt/btrfs_pool1
Data, single: total=604.88GiB, used=520.09GiB
System, DUP: total=32.00MiB, used=96.00KiB
Metadata, DUP: total=5.00GiB, used=4.17GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
legolas:~# btrfs fi show /mnt/btrfs_pool1
Label: 'btrfs_pool1'  uuid: 5ee24229-2431-448a-868e-2c325d10bfa7
        Total devices 1 FS bytes used 524.26GiB
        devid    1 size 615.01GiB used 614.94GiB path /dev/mapper/pool1


Marc

Re: Scrub: no spae left on device

2015-12-08 Thread Holger Hoffstätte

On 12/08/15 16:46, Lionel Bouton wrote:
> On 08/12/2015 16:37, Holger Hoffstätte wrote:
>> On 12/08/15 16:06, Marc MERLIN wrote:
>>> Howdy,
>>>
>>> Why would scrub need space and why would it cancel if there isn't enough of
>>> it?
>>> (kernel 4.3)
>>>
>>> /etc/cron.daily/btrfs-scrub:
>>> btrfs scrub start -Bd /dev/mapper/cryptroot
>>> scrub device /dev/mapper/cryptroot (id 1) done
>>> scrub started at Mon Dec  7 01:35:08 2015 and finished after 258 seconds
>>> total bytes scrubbed: 130.84GiB with 0 errors
>>> btrfs scrub start -Bd /dev/mapper/pool1
>>> ERROR: scrubbing /dev/mapper/pool1 failed for device id 1 (No space left on 
>>> device)
>>> scrub device /dev/mapper/pool1 (id 1) canceled
>> Scrub rewrites metadata (apparently even in -r aka readonly mode), and that
>> can lead to temporary metadata expansion (stuff gets COWed around); it's
>> a bit surprising but makes sense if you think about it.
> 
> How long must I think about it until it makes sense? :-)
> 
> Sorry I'm not sure why metadata is rewritten if no error is detected.
> I've several theories but lack information: is the fact that no error
> has been detected stored somewhere? is scrub using some kind of internal
> temporary snapshot(s) to avoid interfering with other operations? other
> reason I didn't think about?

Well... I have no idea what the historical motivation for this behaviour was,
though I can make up at least two: rewriting known-good checksums
generally (since you know they are good this very moment), and in case of
error avoiding the area where the block error occurred (read errors on rust
are often clustered and affect entire tracks).

That's really all I know. I agree it's surprising, especially since it
happens by default and also in -r mode, which might be considered a bug.

-h


Re: Scrub: no spae left on device

2015-12-08 Thread Austin S Hemmelgarn


On 2015-12-08 10:06, Marc MERLIN wrote:
> Howdy,
>
> Why would scrub need space and why would it cancel if there isn't enough of
> it?
> (kernel 4.3)

Wild guess here, but maybe scrub unconditionally updates the error 
counters, regardless of whether any errors were found or not?






Re: Scrub: no spae left on device

2015-12-08 Thread Lionel Bouton

On 08/12/2015 16:06, Marc MERLIN wrote:
> Howdy,
>
> Why would scrub need space and why would it cancel if there isn't enough of
> it?
> (kernel 4.3)
>
> /etc/cron.daily/btrfs-scrub:
> btrfs scrub start -Bd /dev/mapper/cryptroot
> scrub device /dev/mapper/cryptroot (id 1) done
>   scrub started at Mon Dec  7 01:35:08 2015 and finished after 258 seconds
>   total bytes scrubbed: 130.84GiB with 0 errors
> btrfs scrub start -Bd /dev/mapper/pool1
> ERROR: scrubbing /dev/mapper/pool1 failed for device id 1 (No space left on 
> device)
> scrub device /dev/mapper/pool1 (id 1) canceled

I can't be sure (not-a-dev), but one possibility that comes to mind is
that if an error is detected, writes must be done on the device. The
repair might not be done in place but with CoW, and even if the error is
not repaired for lack of redundancy, IIRC each device tracks the number of
errors detected, so I assume this is written somewhere (system or
metadata chunks most probably).

Best regards,

Lionel

Re: Scrub: no spae left on device

2015-12-08 Thread Lionel Bouton

On 08/12/2015 16:37, Holger Hoffstätte wrote:
> On 12/08/15 16:06, Marc MERLIN wrote:
>> Howdy,
>>
>> Why would scrub need space and why would it cancel if there isn't enough of
>> it?
>> (kernel 4.3)
>>
>> /etc/cron.daily/btrfs-scrub:
>> btrfs scrub start -Bd /dev/mapper/cryptroot
>> scrub device /dev/mapper/cryptroot (id 1) done
>>  scrub started at Mon Dec  7 01:35:08 2015 and finished after 258 seconds
>>  total bytes scrubbed: 130.84GiB with 0 errors
>> btrfs scrub start -Bd /dev/mapper/pool1
>> ERROR: scrubbing /dev/mapper/pool1 failed for device id 1 (No space left on 
>> device)
>> scrub device /dev/mapper/pool1 (id 1) canceled
> Scrub rewrites metadata (apparently even in -r aka readonly mode), and that
> can lead to temporary metadata expansion (stuff gets COWed around); it's
> a bit surprising but makes sense if you think about it.

How long must I think about it until it makes sense? :-)

Sorry I'm not sure why metadata is rewritten if no error is detected.
I've several theories but lack information: is the fact that no error
has been detected stored somewhere? is scrub using some kind of internal
temporary snapshot(s) to avoid interfering with other operations? other
reason I didn't think about?

Lionel

Re: Scrub: no spae left on device

2015-12-08 Thread Holger Hoffstätte

On 12/08/15 16:06, Marc MERLIN wrote:
> Howdy,
> 
> Why would scrub need space and why would it cancel if there isn't enough of
> it?
> (kernel 4.3)
> 
> /etc/cron.daily/btrfs-scrub:
> btrfs scrub start -Bd /dev/mapper/cryptroot
> scrub device /dev/mapper/cryptroot (id 1) done
>   scrub started at Mon Dec  7 01:35:08 2015 and finished after 258 seconds
>   total bytes scrubbed: 130.84GiB with 0 errors
> btrfs scrub start -Bd /dev/mapper/pool1
> ERROR: scrubbing /dev/mapper/pool1 failed for device id 1 (No space left on 
> device)
> scrub device /dev/mapper/pool1 (id 1) canceled

Scrub rewrites metadata (apparently even in -r aka readonly mode), and that
can lead to temporary metadata expansion (stuff gets COWed around); it's
a bit surprising but makes sense if you think about it. The fact that you
ENOSPCed means that the fs was probably already fully allocated.

If it bothers you, a subsequent balance with -musage=10 should vacuum things
up. Alternatively just keep using the filesystem; eventually the empty metadata
chunks should be collected, on the next remount at the latest.

tl;dr: Never allocate all the chunks. Yes, this needs more graceful handling.

-h

