Re: Recovery from full metadata with all device space consumed?

2018-04-20 Thread Duncan
Timofey Titovets posted on Fri, 20 Apr 2018 01:32:42 +0300 as excerpted:

> 2018-04-20 1:08 GMT+03:00 Drew Bloechl :
>> I've got a btrfs filesystem that I can't seem to get back to a useful
>> state. The symptom I started with is that rename() operations started
>> dying with ENOSPC, and it looks like the metadata allocation on the
>> filesystem is full:
>>
>> # btrfs fi df /broken
>> Data, RAID0: total=3.63TiB, used=67.00GiB
>> System, RAID1: total=8.00MiB, used=224.00KiB
>> Metadata, RAID1: total=3.00GiB, used=2.50GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> All of the consumable space on the backing devices also seems to be in
>> use:
>>
>> # btrfs fi show /broken
>> Label: 'mon_data'  uuid: 85e52555-7d6d-4346-8b37-8278447eb590
>> Total devices 4 FS bytes used 69.50GiB
>> devid1 size 931.51GiB used 931.51GiB path /dev/sda1
>> devid2 size 931.51GiB used 931.51GiB path /dev/sdb1
>> devid3 size 931.51GiB used 931.51GiB path /dev/sdc1
>> devid4 size 931.51GiB used 931.51GiB path /dev/sdd1
>>
>> Even the smallest balance operation I can start fails (this doesn't
>> change even with an extra temporary device added to the filesystem):
>>
>> # btrfs balance start -v -dusage=1 /broken
>> Dumping filters: flags 0x1, state 0x0, force is off
>>   DATA (flags 0x2): balancing, usage=1
>> ERROR: error during balancing '/broken': No space left on device
>> There may be more info in syslog - try dmesg | tail
>> # dmesg | tail -1
>> [11554.296805] BTRFS info (device sdc1): 757 enospc errors during
>> balance
>>
>> The current kernel is 4.15.0 from Debian's stretch-backports
>> (specifically linux-image-4.15.0-0.bpo.2-amd64), but it was Debian's
>> 4.9.30 when the filesystem got into this state. I upgraded it in the
>> hopes that a newer kernel would be smarter, but no dice.
>>
>> btrfs-progs is currently at v4.7.3.
>>
>> Most of what this filesystem stores is Prometheus 1.8's TSDB for its
>> metrics, which are constantly written at around 50MB/second. The
>> filesystem never really gets full as far as data goes, but there's a
>> lot of never-ending churn for what data is there.
>>
>> Question 1: Are there other steps that can be tried to rescue a
>> filesystem in this state? I still have it mounted in the same state,
>> and I'm willing to try other things or extract debugging info.
>>
>> Question 2: Is there something I could have done to prevent this from
>> happening in the first place?
> 
> I'm not sure why this is happening, but if you're stuck in that state:
>   - Reboot to make sure there are no other lingering problems.
>   - Temporarily add another device to the FS, for example a zram device.
> After you've freed a small part of the FS, delete the temporary device
> from the FS and continue balancing chunks.

He did try adding a temporary device.  Requoting from above:

>> Even the smallest balance operation I can start fails (this doesn't
>> change even with an extra temporary device added to the filesystem):

Nevertheless, that's the right idea in general, but I believe the 
following additional suggestions, now addressed to the original poster, 
will help.

1) Try with -dusage=0.

With any luck there will be some totally empty data chunks, which this 
should free, hopefully getting you at least enough space for the -dusage=1 
to work and free additional space.
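
That would be the same form as the -dusage=1 command you already tried, 
just with the threshold at zero:

# btrfs balance start -v -dusage=0 /broken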

The reason this can work is that unlike with actual usage, entirely empty 
chunks don't require writing a fresh block to copy the used extents 
into... because there aren't any.  But of course it does require that 
there's some totally empty chunks available to free, which with your 
numbers is somewhat likely, but not a given, especially since newer 
kernels (well, since some time now, but...) normally free entirely empty 
chunks automatically.
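
If you want to watch how much of that allocation is actually sitting 
unused before and after, btrfs fi usage should show it (it's been in 
btrfs-progs long enough that your 4.7.3 should have it):

# btrfs fi usage /broken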

FWIW, 0-usage balances are near instant, since all they have to do is 
eliminate the empty chunks from the chunk list.  1% usage balances, once 
you can do them, will go very fast too, and in your state may get you 
back some decent unallocated space, though they probably won't do much 
for people in less extreme unbalance conditions.  10% will do more and 
take a bit longer, but it will still be fast, as it's only writing 1/10th 
of a chunk's worth of data per chunk it frees, and as long as there are 
enough chunks at that level, it will still be returning 10 chunks for 
every full one it rewrites.

At 50% it will take much longer but will still be returning 2 chunks for 
every one it writes.  Above that, the payback drops off rather fast: you 
only get back 1 for every 2 written at 67%, and 1 for every 9 written at 
90%.  As such, on spinning rust it's rarely worth trying above 70% or so, 
and often not worth trying above 50%, unless of course the filesystem 
really is almost full and you're trying to reclaim every last bit of 
unused chunk space back to unallocated, regardless of the time it takes.  
FWIW, I'm on ssd and partition up so my filesystems are normally under 
100 GiB, so even a full balance normally only takes a few minutes.
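
To spell out the payback arithmetic above (assuming the usual ~1 GiB 
data chunks): at usage=67, three chunks hold roughly two chunks' worth 
of data, so relocating them writes ~2 GiB into two fresh chunks and 
frees three, a net gain of one chunk of unallocated space for two 
chunks' worth of writing.  At usage=90, ten chunks hold nine chunks' 
worth, so you write ~9 GiB to net a single chunk back.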

Re: Recovery from full metadata with all device space consumed?

2018-04-19 Thread Hugo Mills
On Thu, Apr 19, 2018 at 04:12:39PM -0700, Drew Bloechl wrote:
> On Thu, Apr 19, 2018 at 10:43:57PM +, Hugo Mills wrote:
> >Given that both data and metadata levels here require paired
> > chunks, try adding _two_ temporary devices so that it can allocate a
> > new block group.
> 
> Thank you very much, that seems to have done the trick:
> 
> # fallocate -l 4GiB /var/tmp/btrfs-temp-1
> # fallocate -l 4GiB /var/tmp/btrfs-temp-2
> # losetup -f /var/tmp/btrfs-temp-1
> # losetup -f /var/tmp/btrfs-temp-2
> # btrfs device add /dev/loop0 /broken
> Performing full device TRIM (4.00GiB) ...
> # btrfs device add /dev/loop1 /broken
> Performing full device TRIM (4.00GiB) ...
> # btrfs balance start -v -dusage=1 /broken
> Dumping filters: flags 0x1, state 0x0, force is off
>   DATA (flags 0x2): balancing, usage=1

   Excellent. Don't forget to "btrfs dev delete" the devices after
you've finished the balance. You could damage the FS (possibly
irreparably) if you destroy the devices without doing so.
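
   Roughly, once the balance is done (untested, and adjust the loop
device names to whatever losetup actually assigned):

# btrfs device delete /dev/loop0 /dev/loop1 /broken
# losetup -d /dev/loop0
# losetup -d /dev/loop1
# rm /var/tmp/btrfs-temp-1 /var/tmp/btrfs-temp-2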

> I'm guessing that'll take a while to complete, but meanwhile, in another
> terminal:
> 
> # btrfs fi show /broken
> Label: 'mon_data'  uuid: 85e52555-7d6d-4346-8b37-8278447eb590
>   Total devices 6 FS bytes used 69.53GiB
>   devid1 size 931.51GiB used 731.02GiB path /dev/sda1
>   devid2 size 931.51GiB used 731.02GiB path /dev/sdb1
>   devid3 size 931.51GiB used 730.03GiB path /dev/sdc1
>   devid4 size 931.51GiB used 730.03GiB path /dev/sdd1
>   devid5 size 4.00GiB used 1.00GiB path /dev/loop0
>   devid6 size 4.00GiB used 1.00GiB path /dev/loop1
> 
> # btrfs fi df /broken
> Data, RAID0: total=2.77TiB, used=67.00GiB
> System, RAID1: total=8.00MiB, used=192.00KiB
> Metadata, RAID1: total=4.00GiB, used=2.49GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> Do I understand correctly that this could require up to 3 extra devices,
> if for instance you arrived in this situation with a RAID6 data profile?
> Or is the number even higher for profiles like RAID10?

   The minimum number of devices for each RAID level is:

single, DUP: 1
RAID-0, -1, -5:  2
RAID-6:  3
RAID-10: 4

   Hugo.

-- 
Hugo Mills | Gentlemen! You can't fight here! This is the War
hugo@... carfax.org.uk | Room!
http://carfax.org.uk/  |
PGP: E2AB1DE4  |Dr Strangelove



Re: Recovery from full metadata with all device space consumed?

2018-04-19 Thread Drew Bloechl
On Thu, Apr 19, 2018 at 10:43:57PM +, Hugo Mills wrote:
>Given that both data and metadata levels here require paired
> chunks, try adding _two_ temporary devices so that it can allocate a
> new block group.

Thank you very much, that seems to have done the trick:

# fallocate -l 4GiB /var/tmp/btrfs-temp-1
# fallocate -l 4GiB /var/tmp/btrfs-temp-2
# losetup -f /var/tmp/btrfs-temp-1
# losetup -f /var/tmp/btrfs-temp-2
# btrfs device add /dev/loop0 /broken
Performing full device TRIM (4.00GiB) ...
# btrfs device add /dev/loop1 /broken
Performing full device TRIM (4.00GiB) ...
# btrfs balance start -v -dusage=1 /broken
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=1

I'm guessing that'll take a while to complete, but meanwhile, in another
terminal:

# btrfs fi show /broken
Label: 'mon_data'  uuid: 85e52555-7d6d-4346-8b37-8278447eb590
Total devices 6 FS bytes used 69.53GiB
devid1 size 931.51GiB used 731.02GiB path /dev/sda1
devid2 size 931.51GiB used 731.02GiB path /dev/sdb1
devid3 size 931.51GiB used 730.03GiB path /dev/sdc1
devid4 size 931.51GiB used 730.03GiB path /dev/sdd1
devid5 size 4.00GiB used 1.00GiB path /dev/loop0
devid6 size 4.00GiB used 1.00GiB path /dev/loop1

# btrfs fi df /broken
Data, RAID0: total=2.77TiB, used=67.00GiB
System, RAID1: total=8.00MiB, used=192.00KiB
Metadata, RAID1: total=4.00GiB, used=2.49GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Do I understand correctly that this could require up to 3 extra devices,
if for instance you arrived in this situation with a RAID6 data profile?
Or is the number even higher for profiles like RAID10?


Re: Recovery from full metadata with all device space consumed?

2018-04-19 Thread Hugo Mills
On Thu, Apr 19, 2018 at 03:08:48PM -0700, Drew Bloechl wrote:
> I've got a btrfs filesystem that I can't seem to get back to a useful
> state. The symptom I started with is that rename() operations started
> dying with ENOSPC, and it looks like the metadata allocation on the
> filesystem is full:
> 
> # btrfs fi df /broken
> Data, RAID0: total=3.63TiB, used=67.00GiB
> System, RAID1: total=8.00MiB, used=224.00KiB
> Metadata, RAID1: total=3.00GiB, used=2.50GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> All of the consumable space on the backing devices also seems to be in
> use:
> 
> # btrfs fi show /broken
> Label: 'mon_data'  uuid: 85e52555-7d6d-4346-8b37-8278447eb590
>   Total devices 4 FS bytes used 69.50GiB
>   devid1 size 931.51GiB used 931.51GiB path /dev/sda1
>   devid2 size 931.51GiB used 931.51GiB path /dev/sdb1
>   devid3 size 931.51GiB used 931.51GiB path /dev/sdc1
>   devid4 size 931.51GiB used 931.51GiB path /dev/sdd1
> 
> Even the smallest balance operation I can start fails (this doesn't
> change even with an extra temporary device added to the filesystem):

   Given that both data and metadata levels here require paired
chunks, try adding _two_ temporary devices so that it can allocate a
new block group.

   Hugo.

> # btrfs balance start -v -dusage=1 /broken
> Dumping filters: flags 0x1, state 0x0, force is off
>   DATA (flags 0x2): balancing, usage=1
> ERROR: error during balancing '/broken': No space left on device
> There may be more info in syslog - try dmesg | tail
> # dmesg | tail -1
> [11554.296805] BTRFS info (device sdc1): 757 enospc errors during
> balance
> 
> The current kernel is 4.15.0 from Debian's stretch-backports
> (specifically linux-image-4.15.0-0.bpo.2-amd64), but it was Debian's
> 4.9.30 when the filesystem got into this state. I upgraded it in the
> hopes that a newer kernel would be smarter, but no dice.
> 
> btrfs-progs is currently at v4.7.3.
> 
> Most of what this filesystem stores is Prometheus 1.8's TSDB for its
> metrics, which are constantly written at around 50MB/second. The
> filesystem never really gets full as far as data goes, but there's a lot
> of never-ending churn for what data is there.
> 
> Question 1: Are there other steps that can be tried to rescue a
> filesystem in this state? I still have it mounted in the same state, and
> I'm willing to try other things or extract debugging info.
> 
> Question 2: Is there something I could have done to prevent this from
> happening in the first place?
> 
> Thanks!

-- 
Hugo Mills | Always be sincere, whether you mean it or not.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |  Flanders & Swann
PGP: E2AB1DE4  |The Reluctant Cannibal



Re: Recovery from full metadata with all device space consumed?

2018-04-19 Thread Timofey Titovets
2018-04-20 1:08 GMT+03:00 Drew Bloechl :
> I've got a btrfs filesystem that I can't seem to get back to a useful
> state. The symptom I started with is that rename() operations started
> dying with ENOSPC, and it looks like the metadata allocation on the
> filesystem is full:
>
> # btrfs fi df /broken
> Data, RAID0: total=3.63TiB, used=67.00GiB
> System, RAID1: total=8.00MiB, used=224.00KiB
> Metadata, RAID1: total=3.00GiB, used=2.50GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> All of the consumable space on the backing devices also seems to be in
> use:
>
> # btrfs fi show /broken
> Label: 'mon_data'  uuid: 85e52555-7d6d-4346-8b37-8278447eb590
> Total devices 4 FS bytes used 69.50GiB
> devid1 size 931.51GiB used 931.51GiB path /dev/sda1
> devid2 size 931.51GiB used 931.51GiB path /dev/sdb1
> devid3 size 931.51GiB used 931.51GiB path /dev/sdc1
> devid4 size 931.51GiB used 931.51GiB path /dev/sdd1
>
> Even the smallest balance operation I can start fails (this doesn't
> change even with an extra temporary device added to the filesystem):
>
> # btrfs balance start -v -dusage=1 /broken
> Dumping filters: flags 0x1, state 0x0, force is off
>   DATA (flags 0x2): balancing, usage=1
> ERROR: error during balancing '/broken': No space left on device
> There may be more info in syslog - try dmesg | tail
> # dmesg | tail -1
> [11554.296805] BTRFS info (device sdc1): 757 enospc errors during
> balance
>
> The current kernel is 4.15.0 from Debian's stretch-backports
> (specifically linux-image-4.15.0-0.bpo.2-amd64), but it was Debian's
> 4.9.30 when the filesystem got into this state. I upgraded it in the
> hopes that a newer kernel would be smarter, but no dice.
>
> btrfs-progs is currently at v4.7.3.
>
> Most of what this filesystem stores is Prometheus 1.8's TSDB for its
> metrics, which are constantly written at around 50MB/second. The
> filesystem never really gets full as far as data goes, but there's a lot
> of never-ending churn for what data is there.
>
> Question 1: Are there other steps that can be tried to rescue a
> filesystem in this state? I still have it mounted in the same state, and
> I'm willing to try other things or extract debugging info.
>
> Question 2: Is there something I could have done to prevent this from
> happening in the first place?
>
> Thanks!

I'm not sure why this is happening, but if you're stuck in that state:
  - Reboot to make sure there are no other lingering problems.
  - Temporarily add another device to the FS, for example a zram device.
After you've freed a small part of the FS, delete the temporary device
from the FS and continue balancing chunks.
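
A rough sketch of the zram variant (untested; it assumes util-linux's 
zramctl is available, and zram is volatile, so remove it from the FS 
again before any reboot):

# modprobe zram
# zramctl --find --size 4G     (prints the device it sets up, e.g. /dev/zram0)
# btrfs device add /dev/zram0 /broken
  ... run the balance ...
# btrfs device remove /dev/zram0 /broken
# zramctl --reset /dev/zram0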

Thanks.

-- 
Have a nice day,
Timofey.


Recovery from full metadata with all device space consumed?

2018-04-19 Thread Drew Bloechl
I've got a btrfs filesystem that I can't seem to get back to a useful
state. The symptom I started with is that rename() operations started
dying with ENOSPC, and it looks like the metadata allocation on the
filesystem is full:

# btrfs fi df /broken
Data, RAID0: total=3.63TiB, used=67.00GiB
System, RAID1: total=8.00MiB, used=224.00KiB
Metadata, RAID1: total=3.00GiB, used=2.50GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

All of the consumable space on the backing devices also seems to be in
use:

# btrfs fi show /broken
Label: 'mon_data'  uuid: 85e52555-7d6d-4346-8b37-8278447eb590
Total devices 4 FS bytes used 69.50GiB
devid1 size 931.51GiB used 931.51GiB path /dev/sda1
devid2 size 931.51GiB used 931.51GiB path /dev/sdb1
devid3 size 931.51GiB used 931.51GiB path /dev/sdc1
devid4 size 931.51GiB used 931.51GiB path /dev/sdd1

Even the smallest balance operation I can start fails (this doesn't
change even with an extra temporary device added to the filesystem):

# btrfs balance start -v -dusage=1 /broken
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=1
ERROR: error during balancing '/broken': No space left on device
There may be more info in syslog - try dmesg | tail
# dmesg | tail -1
[11554.296805] BTRFS info (device sdc1): 757 enospc errors during
balance

The current kernel is 4.15.0 from Debian's stretch-backports
(specifically linux-image-4.15.0-0.bpo.2-amd64), but it was Debian's
4.9.30 when the filesystem got into this state. I upgraded it in the
hopes that a newer kernel would be smarter, but no dice.

btrfs-progs is currently at v4.7.3.

Most of what this filesystem stores is Prometheus 1.8's TSDB for its
metrics, which are constantly written at around 50MB/second. The
filesystem never really gets full as far as data goes, but there's a lot
of never-ending churn for what data is there.

Question 1: Are there other steps that can be tried to rescue a
filesystem in this state? I still have it mounted in the same state, and
I'm willing to try other things or extract debugging info.

Question 2: Is there something I could have done to prevent this from
happening in the first place?

Thanks!