Re: [PATCH v2] Btrfs: return failure if btrfs_dev_replace_finishing() failed
comments below..

On 10/13/14 12:42, Eryu Guan wrote:
> device replace could fail due to another running scrub process or any
> other errors btrfs_scrub_dev() may hit, but this failure doesn't get
> returned to userspace. The following steps could reproduce this issue:
>
>   mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
>   mount /dev/sdb1 /mnt/btrfs
>   while true; do btrfs scrub start -B /mnt/btrfs >/dev/null 2>&1; done &
>   btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
>   # if this replace succeeded, do the following and repeat until
>   # you see this log in dmesg
>   # BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
>   # btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs
>   # once you see the error log in dmesg, check the return value of
>   # the replace
>   echo $?
>
> Introduce a new dev replace result
> BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS to catch -EINPROGRESS
> explicitly and return other errors directly to userspace.
>
> Signed-off-by: Eryu Guan <guane...@gmail.com>
> ---
> v2:
> - set result to SCRUB_INPROGRESS if btrfs_scrub_dev returned
>   -EINPROGRESS and return 0, as Miao Xie suggested
>
>  fs/btrfs/dev-replace.c     | 12 +++++++++---
>  include/uapi/linux/btrfs.h |  1 +
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
> index eea26e1..a141f8b 100644
> --- a/fs/btrfs/dev-replace.c
> +++ b/fs/btrfs/dev-replace.c
> @@ -418,9 +418,15 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
>  			      &dev_replace->scrub_progress, 0, 1);
>
>  	ret = btrfs_dev_replace_finishing(root->fs_info, ret);
> -	WARN_ON(ret);
> +	/* don't warn if EINPROGRESS, someone else might be running scrub */
> +	if (ret == -EINPROGRESS) {
> +		args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
> +		ret = 0;
> +	} else {
> +		WARN_ON(ret);
> +	}

looks like we are trying to manage EINPROGRESS returned by
btrfs_dev_replace_finishing(). In btrfs_dev_replace_finishing(), which
specific func call is returning EINPROGRESS? I didn't go deep enough.

And how do we handle it if the replace is intervened by balance instead
of scrub?
sorry if I missed something.

Anand

> -	return 0;
> +	return ret;
>
>  leave:
>  	dev_replace->srcdev = NULL;
> @@ -538,7 +544,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
>  		btrfs_destroy_dev_replace_tgtdev(fs_info, tgt_device);
>  		mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
>
> -		return 0;
> +		return scrub_ret;
>  	}
>
>  	printk_in_rcu(KERN_INFO
>
> diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
> index 2f47824..611e1c5 100644
> --- a/include/uapi/linux/btrfs.h
> +++ b/include/uapi/linux/btrfs.h
> @@ -157,6 +157,7 @@ struct btrfs_ioctl_dev_replace_status_params {
>  #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR		0
>  #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NOT_STARTED	1
>  #define BTRFS_IOCTL_DEV_REPLACE_RESULT_ALREADY_STARTED	2
> +#define BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS	3
>
>  struct btrfs_ioctl_dev_replace_args {
>  	__u64 cmd;			/* in */
>  	__u64 result;			/* out */

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Btrfs: return failure if btrfs_dev_replace_finishing() failed
On Mon, Oct 13, 2014 at 02:23:57PM +0800, Anand Jain wrote:
>  comments below..
>
> On 10/13/14 12:42, Eryu Guan wrote:
>> device replace could fail due to another running scrub process or any
>> other errors btrfs_scrub_dev() may hit, but this failure doesn't get
>> returned to userspace. The following steps could reproduce this issue:
>>
>>   mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
>>   mount /dev/sdb1 /mnt/btrfs
>>   while true; do btrfs scrub start -B /mnt/btrfs >/dev/null 2>&1; done &
>>   btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
>>   # if this replace succeeded, do the following and repeat until
>>   # you see this log in dmesg
>>   # BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
>>   # btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs
>>   # once you see the error log in dmesg, check the return value of
>>   # the replace
>>   echo $?
>>
>> Introduce a new dev replace result
>> BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS to catch -EINPROGRESS
>> explicitly and return other errors directly to userspace.
>>
>> Signed-off-by: Eryu Guan <guane...@gmail.com>
>> ---
>> v2:
>> - set result to SCRUB_INPROGRESS if btrfs_scrub_dev returned
>>   -EINPROGRESS and return 0, as Miao Xie suggested
>>
>>  fs/btrfs/dev-replace.c     | 12 +++++++++---
>>  include/uapi/linux/btrfs.h |  1 +
>>  2 files changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
>> index eea26e1..a141f8b 100644
>> --- a/fs/btrfs/dev-replace.c
>> +++ b/fs/btrfs/dev-replace.c
>> @@ -418,9 +418,15 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
>>  			      &dev_replace->scrub_progress, 0, 1);
>>
>>  	ret = btrfs_dev_replace_finishing(root->fs_info, ret);
>> -	WARN_ON(ret);
>> +	/* don't warn if EINPROGRESS, someone else might be running scrub */
>> +	if (ret == -EINPROGRESS) {
>> +		args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
>> +		ret = 0;
>> +	} else {
>> +		WARN_ON(ret);
>> +	}
>
>  looks like we are trying to manage EINPROGRESS returned by

Yes, that's right.

>  btrfs_dev_replace_finishing(). In btrfs_dev_replace_finishing(), which
>  specific func call is returning EINPROGRESS? I didn't go deep enough.
btrfs_dev_replace_finishing() will check scrub_ret (the last argument),
and return scrub_ret if it is non-zero. It was returning 0
unconditionally before this patch.

btrfs_dev_replace_start@fs/btrfs/dev-replace.c
416         ret = btrfs_scrub_dev(fs_info, src_device->devid, 0,
417                               src_device->total_bytes,
418                               &dev_replace->scrub_progress, 0, 1);
419
420         ret = btrfs_dev_replace_finishing(root->fs_info, ret);

and btrfs_dev_replace_finishing@fs/btrfs/dev-replace.c
529         if (!scrub_ret) {
530                 btrfs_dev_replace_update_device_in_mapping_tree(fs_info,
531                                                                 src_device,
532                                                                 tgt_device);
533         } else {
...
547                 return scrub_ret;
548         }

>  And how do we handle it if the replace is intervened by balance
>  instead of scrub?

Based on my test, the replace ioctl would return -ENOENT if balance is
running:

  ERROR: ioctl(DEV_REPLACE_START) failed on /mnt/testarea/scratch: No
  such file or directory, no error

(I haven't gone through this codepath yet and don't know where -ENOENT
comes from, but I don't think it's a proper errno;
/mnt/testarea/scratch is definitely there)

>  sorry if I missed something.
>
> Anand

Thanks for the review!
Eryu

>> -	return 0;
>> +	return ret;
>>
>>  leave:
>>  	dev_replace->srcdev = NULL;
>> @@ -538,7 +544,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
>>  		btrfs_destroy_dev_replace_tgtdev(fs_info, tgt_device);
>>  		mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
>>
>> -		return 0;
>> +		return scrub_ret;
>>  	}
>>
>>  	printk_in_rcu(KERN_INFO
>>
>> diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
>> index 2f47824..611e1c5 100644
>> --- a/include/uapi/linux/btrfs.h
>> +++ b/include/uapi/linux/btrfs.h
>> @@ -157,6 +157,7 @@ struct btrfs_ioctl_dev_replace_status_params {
>>  #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR		0
>>  #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NOT_STARTED	1
>>  #define BTRFS_IOCTL_DEV_REPLACE_RESULT_ALREADY_STARTED	2
>> +#define BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS	3
>>
>>  struct btrfs_ioctl_dev_replace_args {
>>  	__u64 cmd;			/* in */
>>  	__u64 result;			/* out */
Re: [PATCH v2] Btrfs: return failure if btrfs_dev_replace_finishing() failed
On 10/13/14 14:59, Eryu Guan wrote:
> On Mon, Oct 13, 2014 at 02:23:57PM +0800, Anand Jain wrote:
>>  comments below..
>>
>> On 10/13/14 12:42, Eryu Guan wrote:
>>> device replace could fail due to another running scrub process or any
>>> other errors btrfs_scrub_dev() may hit, but this failure doesn't get
>>> returned to userspace. The following steps could reproduce this issue:
>>>
>>>   mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
>>>   mount /dev/sdb1 /mnt/btrfs
>>>   while true; do btrfs scrub start -B /mnt/btrfs >/dev/null 2>&1; done &
>>>   btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
>>>   # if this replace succeeded, do the following and repeat until
>>>   # you see this log in dmesg
>>>   # BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
>>>   # btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs
>>>   # once you see the error log in dmesg, check the return value of
>>>   # the replace
>>>   echo $?
>>>
>>> Introduce a new dev replace result
>>> BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS to catch -EINPROGRESS
>>> explicitly and return other errors directly to userspace.
>>>
>>> Signed-off-by: Eryu Guan <guane...@gmail.com>
>>> ---
>>> v2:
>>> - set result to SCRUB_INPROGRESS if btrfs_scrub_dev returned
>>>   -EINPROGRESS and return 0, as Miao Xie suggested
>>>
>>>  fs/btrfs/dev-replace.c     | 12 +++++++++---
>>>  include/uapi/linux/btrfs.h |  1 +
>>>  2 files changed, 10 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
>>> index eea26e1..a141f8b 100644
>>> --- a/fs/btrfs/dev-replace.c
>>> +++ b/fs/btrfs/dev-replace.c
>>> @@ -418,9 +418,15 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
>>>  			      &dev_replace->scrub_progress, 0, 1);
>>>
>>>  	ret = btrfs_dev_replace_finishing(root->fs_info, ret);
>>> -	WARN_ON(ret);
>>> +	/* don't warn if EINPROGRESS, someone else might be running scrub */
>>> +	if (ret == -EINPROGRESS) {
>>> +		args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
>>> +		ret = 0;
>>> +	} else {
>>> +		WARN_ON(ret);
>>> +	}

I am a bit concerned: why aren't these racing threads here excluding
each other using mutually_exclusive_operation_running, as most of the
other device operation threads do?
Thanks, Anand

>>  looks like we are trying to manage EINPROGRESS returned by
>
> Yes, that's right.
>
>>  btrfs_dev_replace_finishing(). In btrfs_dev_replace_finishing(), which
>>  specific func call is returning EINPROGRESS? I didn't go deep enough.
>
> btrfs_dev_replace_finishing() will check scrub_ret (the last argument),
> and return scrub_ret if it is non-zero. It was returning 0
> unconditionally before this patch.
>
> btrfs_dev_replace_start@fs/btrfs/dev-replace.c
> 416         ret = btrfs_scrub_dev(fs_info, src_device->devid, 0,
> 417                               src_device->total_bytes,
> 418                               &dev_replace->scrub_progress, 0, 1);
> 419
> 420         ret = btrfs_dev_replace_finishing(root->fs_info, ret);
>
> and btrfs_dev_replace_finishing@fs/btrfs/dev-replace.c
> 529         if (!scrub_ret) {
> 530                 btrfs_dev_replace_update_device_in_mapping_tree(fs_info,
> 531                                                                 src_device,
> 532                                                                 tgt_device);
> 533         } else {
> ...
> 547                 return scrub_ret;
> 548         }
>
>>  And how do we handle it if the replace is intervened by balance
>>  instead of scrub?
>
> Based on my test, the replace ioctl would return -ENOENT if balance is
> running:
>
>   ERROR: ioctl(DEV_REPLACE_START) failed on /mnt/testarea/scratch: No
>   such file or directory, no error
>
> (I haven't gone through this codepath yet and don't know where -ENOENT
> comes from, but I don't think it's a proper errno;
> /mnt/testarea/scratch is definitely there)
>
>>  sorry if I missed something.
>>
>> Anand
>
> Thanks for the review!
> Eryu
>
>>> -	return 0;
>>> +	return ret;
>>>
>>>  leave:
>>>  	dev_replace->srcdev = NULL;
>>> @@ -538,7 +544,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
>>>  		btrfs_destroy_dev_replace_tgtdev(fs_info, tgt_device);
>>>  		mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
>>>
>>> -		return 0;
>>> +		return scrub_ret;
>>>  	}
>>>
>>>  	printk_in_rcu(KERN_INFO
>>>
>>> diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
>>> index 2f47824..611e1c5 100644
>>> --- a/include/uapi/linux/btrfs.h
>>> +++ b/include/uapi/linux/btrfs.h
>>> @@ -157,6 +157,7 @@ struct btrfs_ioctl_dev_replace_status_params {
>>>  #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR		0
>>>  #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NOT_STARTED	1
>>>  #define BTRFS_IOCTL_DEV_REPLACE_RESULT_ALREADY_STARTED	2
>>> +#define BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS	3
>>>
>>>  struct btrfs_ioctl_dev_replace_args {
>>>  	__u64 cmd;			/* in */
>>>  	__u64 result;			/* out */
[PATCH 1/3] Btrfs: deal with convert_extent_bit errors to avoid fs corruption
When committing a transaction or a log, we look for btree extents that
need to be durably persisted by searching for ranges in an io tree that
have some bits set (EXTENT_DIRTY or EXTENT_NEW). We then attempt to
clear those bits and set the EXTENT_NEED_WAIT bit, with calls to the
function convert_extent_bit, and then start writeback for the extents.

That function however can return an error (at the moment only -ENOMEM
is possible, especially when it does GFP_ATOMIC allocation requests
through alloc_extent_state_atomic). That means the ranges didn't get
the EXTENT_NEED_WAIT bit set (or at least not for the whole range),
which in turn means a call to btrfs_wait_marked_extents() won't find
those ranges for which we started writeback, causing a transaction
commit or a log commit to persist a new superblock without waiting for
the writeback of extents in that range to finish first.

Therefore if a crash happens after persisting the new superblock and
before writeback finishes, we have a superblock pointing to roots that
weren't fully persisted, or roots that point to nodes or leafs that
weren't fully persisted, causing all sorts of unexpected/bad behaviour
as we end up reading garbage from disk or the content of some node/leaf
from a past generation that got cowed or deleted and is no longer valid
(for this latter case we end up getting error messages like "parent
transid verify failed on X wanted Y found Z" when reading btree
nodes/leafs from disk).
Signed-off-by: Filipe Manana <fdman...@suse.com>
---
 fs/btrfs/transaction.c | 92 ++++++++++++++++++++++++++++++++++----------------
 fs/btrfs/transaction.h |  2 --
 2 files changed, 76 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 8f1a408..cb673d4 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -76,6 +76,32 @@ void btrfs_put_transaction(struct btrfs_transaction *transaction)
 	}
 }
 
+static void clear_btree_io_tree(struct extent_io_tree *tree)
+{
+	spin_lock(&tree->lock);
+	while (!RB_EMPTY_ROOT(&tree->state)) {
+		struct rb_node *node;
+		struct extent_state *state;
+
+		node = rb_first(&tree->state);
+		state = rb_entry(node, struct extent_state, rb_node);
+		rb_erase(&state->rb_node, &tree->state);
+		RB_CLEAR_NODE(&state->rb_node);
+		/*
+		 * btree io trees aren't supposed to have tasks waiting for
+		 * changes in the flags of extent states ever.
+		 */
+		ASSERT(!waitqueue_active(&state->wq));
+		free_extent_state(state);
+		if (need_resched()) {
+			spin_unlock(&tree->lock);
+			cond_resched();
+			spin_lock(&tree->lock);
+		}
+	}
+	spin_unlock(&tree->lock);
+}
+
 static noinline void switch_commit_roots(struct btrfs_transaction *trans,
 					 struct btrfs_fs_info *fs_info)
 {
@@ -89,6 +115,7 @@ static noinline void switch_commit_roots(struct btrfs_transaction *trans,
 		root->commit_root = btrfs_root_node(root);
 		if (is_fstree(root->objectid))
 			btrfs_unpin_free_ino(root);
+		clear_btree_io_tree(&root->dirty_log_pages);
 	}
 	up_write(&fs_info->commit_root_sem);
 }
@@ -827,17 +854,38 @@ int btrfs_write_marked_extents(struct btrfs_root *root,
 
 	while (!find_first_extent_bit(dirty_pages, start, &start, &end,
 				      mark, &cached_state)) {
-		convert_extent_bit(dirty_pages, start, end, EXTENT_NEED_WAIT,
-				   mark, &cached_state, GFP_NOFS);
-		cached_state = NULL;
-		err = filemap_fdatawrite_range(mapping, start, end);
+		bool wait_writeback = false;
+
+		err = convert_extent_bit(dirty_pages, start, end,
+					 EXTENT_NEED_WAIT,
+					 mark, &cached_state, GFP_NOFS);
+		/*
+		 * convert_extent_bit can return -ENOMEM, which is most of the
+		 * time a temporary error. So when it happens, ignore the error
+		 * and wait for writeback of this range to finish - because we
+		 * failed to set the bit EXTENT_NEED_WAIT for the range, a call
+		 * to btrfs_wait_marked_extents() would not know that writeback
+		 * for this range started and therefore wouldn't wait for it to
+		 * finish - we don't want to commit a superblock that points to
+		 * btree nodes/leafs for which writeback hasn't finished yet
+		 * (and without errors).
+		 * We cleanup any entries left in the io tree when committing
+		 * the transaction (through clear_btree_io_tree()).
+		 */
+		if (err == -ENOMEM) {
[PATCH 3/3] Btrfs: avoid returning -ENOMEM in convert_extent_bit() too early
We try to allocate an extent state before acquiring the tree's spinlock
just in case we end up needing to split an existing extent state into
two. If that allocation failed, we would return -ENOMEM.

However, our single caller (the transaction/log commit code) passes in
an extent state that was cached from a call to find_first_extent_bit()
and that has a very high chance to match exactly the input range
(always true for a transaction commit and very often, but not always,
true for a log commit) - in this case we end up not needing at all that
initial extent state used for an eventual split. Therefore just don't
return -ENOMEM if we can't allocate the temporary extent state, since
we might not need it at all, and if we end up needing one, we'll do it
later anyway.

Signed-off-by: Filipe Manana <fdman...@suse.com>
---
 fs/btrfs/extent_io.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0d931b1..654ed3d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1066,13 +1066,21 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 	int err = 0;
 	u64 last_start;
 	u64 last_end;
+	bool first_iteration = true;
 
 	btrfs_debug_check_extent_io_range(tree, start, end);
 
 again:
 	if (!prealloc && (mask & __GFP_WAIT)) {
+		/*
+		 * Best effort, don't worry if extent state allocation fails
+		 * here for the first iteration. We might have a cached state
+		 * that matches exactly the target range, in which case no
+		 * extent state allocations are needed. We'll only know this
+		 * after locking the tree.
+		 */
 		prealloc = alloc_extent_state(mask);
-		if (!prealloc)
+		if (!prealloc && !first_iteration)
 			return -ENOMEM;
 	}
 
@@ -1242,6 +1250,7 @@ search_again:
 	spin_unlock(&tree->lock);
 	if (mask & __GFP_WAIT)
 		cond_resched();
+	first_iteration = false;
 	goto again;
 }
-- 
1.9.1
[PATCH 2/3] Btrfs: make find_first_extent_bit be able to cache any state
Right now the only caller of find_first_extent_bit() that is interested
in caching extent states (transaction or log commit) never gets an
extent state cached. This is because find_first_extent_bit() only
caches states that have at least one of the flags EXTENT_IOBITS or
EXTENT_BOUNDARY, and the transaction/log commit caller always passes a
tree that doesn't ever have extent states with any of those flags (they
can only have one of the following flags: EXTENT_DIRTY, EXTENT_NEW or
EXTENT_NEED_WAIT).

This change, together with the following one in the patch series
(titled "Btrfs: avoid returning -ENOMEM in convert_extent_bit() too
early"), will help significantly reduce the chances of calls to
convert_extent_bit() failing with -ENOMEM when called from the
transaction/log commit code.

Signed-off-by: Filipe Manana <fdman...@suse.com>
---
 fs/btrfs/extent_io.c   | 16 ++++++++++++----
 fs/btrfs/transaction.c |  3 +++
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 420fe26..0d931b1 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -796,17 +796,25 @@ static void set_state_bits(struct extent_io_tree *tree,
 	state->state |= bits_to_set;
 }
 
-static void cache_state(struct extent_state *state,
-			struct extent_state **cached_ptr)
+static void cache_state_if_flags(struct extent_state *state,
+				 struct extent_state **cached_ptr,
+				 const u64 flags)
 {
 	if (cached_ptr && !(*cached_ptr)) {
-		if (state->state & (EXTENT_IOBITS | EXTENT_BOUNDARY)) {
+		if (!flags || (state->state & flags)) {
 			*cached_ptr = state;
 			atomic_inc(&state->refs);
 		}
 	}
 }
 
+static void cache_state(struct extent_state *state,
+			struct extent_state **cached_ptr)
+{
+	return cache_state_if_flags(state, cached_ptr,
+				    EXTENT_IOBITS | EXTENT_BOUNDARY);
+}
+
 /*
  * set some bits on a range in the tree.  This may require allocations or
  * sleeping, so the gfp mask is used to indicate what is allowed.
@@ -1482,7 +1490,7 @@ int find_first_extent_bit(struct extent_io_tree *tree, u64 start,
 		state = find_first_extent_bit_state(tree, start, bits);
 got_it:
 	if (state) {
-		cache_state(state, cached_state);
+		cache_state_if_flags(state, cached_state, 0);
 		*start_ret = state->start;
 		*end_ret = state->end;
 		ret = 0;
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index cb673d4..396ae8b 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -882,6 +882,7 @@ int btrfs_write_marked_extents(struct btrfs_root *root,
 			werr = err;
 		else if (wait_writeback)
 			werr = filemap_fdatawait_range(mapping, start, end);
+		free_extent_state(cached_state);
 		cached_state = NULL;
 		cond_resched();
 		start = end + 1;
@@ -926,6 +927,8 @@ int btrfs_wait_marked_extents(struct btrfs_root *root,
 		err = filemap_fdatawait_range(mapping, start, end);
 		if (err)
 			werr = err;
+		free_extent_state(cached_state);
+		cached_state = NULL;
 		cond_resched();
 		start = end + 1;
 	}
-- 
1.9.1
Re: What is the vision for btrfs fs repair?
On 2014-10-10 18:05, Eric Sandeen wrote:
> On 10/10/14 2:35 PM, Austin S Hemmelgarn wrote:
>> On 2014-10-10 13:43, Bob Marley wrote:
>>> On 10/10/2014 16:37, Chris Murphy wrote:
>>>> The fail safe behavior is to treat the known good tree root as the
>>>> default tree root, and bypass the bad tree root if it cannot be
>>>> repaired, so that the volume can be mounted with default mount
>>>> options (i.e. the ones in fstab). Otherwise it's a filesystem that
>>>> isn't well suited for general purpose use as rootfs, let alone for
>>>> boot.
>>>
>>> A filesystem which is suited for general purpose use is a filesystem
>>> which honors fsync, and doesn't *ever* auto-roll-back without user
>>> intervention. Anything different is not suited for database
>>> transactions at all. Any paid service which has the users' database
>>> on btrfs is going to be at risk of losing payments, and probably
>>> without the company even knowing.
>>>
>>> If btrfs goes this way, I hope a big warning is written on the wiki
>>> and on the manpages telling that this filesystem is totally
>>> unsuitable for hosting databases performing transactions.
>>
>> If they need reliability, they should have some form of redundancy in
>> place and/or run the database directly on the block device; because
>> even ext4, XFS, and pretty much every other filesystem can lose data
>> sometimes,
>
> Not if i.e. fsync returns. If the data is gone later, it's a hardware
> problem, or occasionally a bug - bugs that are usually found & fixed
> pretty quickly.

Yes, barring bugs and hardware problems they won't lose data.

>> the difference being that those tend to give worse results when
>> hardware is misbehaving than BTRFS does, because BTRFS usually has an
>> old copy of whatever data structure gets corrupted to fall back on.
>
> I'm curious, is that based on conjecture or real-world testing?

I wouldn't really call it testing, but based on personal experience I
know that ext4 can lose whole directory sub-trees if it gets a single
corrupt sector in the wrong place.

I've also had that happen on FAT32 and (somewhat interestingly) HFS+
with failing/misbehaving hardware; and I've actually had individual
files disappear on HFS+ without any discernible hardware issues. I
don't have as much experience with XFS, but would assume based on what
I do know of it that it could have similar issues.

As for BTRFS, I've only ever had any issues with it 3 times: one was
due to the kernel panicking during resume from S1, and the other two
were due to hardware problems that would have caused issues on most
other filesystems as well. In both cases of hardware issues, while the
filesystem was initially unmountable, it was relatively simple to fix
once I knew how. I tried to fix an ext4 fs that had become unmountable
due to dropped writes once, and that was anything but simple, even with
the much greater amount of documentation.
Re: What is the vision for btrfs fs repair?
On 2014-10-12 06:14, Martin Steigerwald wrote:
> On Friday, 10 October 2014, 10:37:44, Chris Murphy wrote:
>> On Oct 10, 2014, at 6:53 AM, Bob Marley <bobmar...@shiftmail.org>
>> wrote:
>>> On 10/10/2014 03:58, Chris Murphy wrote:
>>>> * mount -o recovery
>>>>   Enable autorecovery attempts if a bad tree root is found at
>>>>   mount time.
>>>>
>>>> I'm confused why it's not the default yet. Maybe it's continuing to
>>>> evolve at a pace that suggests something could sneak in that makes
>>>> things worse? It is almost an oxymoron in that I'm manually
>>>> enabling an autorecovery. If true, maybe the closest indication
>>>> we'd get of btrfs stability is the default enabling of
>>>> autorecovery.
>>>
>>> No way! I wouldn't want a default like that.
>>>
>>> If you think at distributed transactions: suppose a sync was issued
>>> on both sides of a distributed transaction, then power was lost on
>>> one side, then btrfs had corruption. When I remount it, definitely
>>> the worst thing that can happen is that it auto-rolls-back to a
>>> previous known-good state.
>>
>> For a general purpose file system, losing 30 seconds (or less) of
>> questionably committed data, likely corrupt, is preferable to a file
>> system that won't mount without user intervention, which requires a
>> secret decoder ring to get it to mount at all. And may require the
>> use of specialized tools to retrieve that data in any case.
>>
>> The fail safe behavior is to treat the known good tree root as the
>> default tree root, and bypass the bad tree root if it cannot be
>> repaired, so that the volume can be mounted with default mount
>> options (i.e. the ones in fstab). Otherwise it's a filesystem that
>> isn't well suited for general purpose use as rootfs, let alone for
>> boot.
>
> To understand this a bit better: What can be the reasons a recent tree
> gets corrupted?

Well, so far I have had the following cause corrupted trees:
1. Kernel panic during resume from ACPI S1 (suspend to RAM), which just
   happened to be in the middle of a tree commit.
2. Generic power loss during a tree commit.
3. A device not properly honoring write barriers (the operations
   immediately adjacent to the write barrier weren't being ordered
   correctly all the time).

Based on what I know about BTRFS, the following could also cause
problems:
1. A single-event upset somewhere in the write path.
2. The kernel issuing a write to the wrong device (I haven't had this
   happen to me, but know people who have).

In general, any of these will cause problems for pretty much any
filesystem, not just BTRFS.

> I always thought with a controller and device and driver combination
> that honors fsync, with BTRFS it would either be the new state or the
> last known good state *anyway*. So where does the need to rollback
> arise from?

I think that in this case the term rollback is a bit ambiguous; here it
means from the point of view of userspace, which sees the FS as having
'rolled back' from the most recent state to the last known good state.

> That said, all journalling filesystems have some sort of rollback as
> far as I understand: if the last journal entry is incomplete they
> discard it on journal replay. So even there you lose the last seconds
> of write activity.
>
> But in case fsync() returns, the data needs to be safe on disk. I
> always thought BTRFS honors this under *any* circumstance. If some
> proposed autorollback breaks this guarantee, I think something is
> broken elsewhere.
>
> And fsync is an fsync is an fsync. Its semantics are clear as crystal.
> There is nothing, absolutely nothing to discuss about it. An fsync
> completes if the device itself reported "Yeah, I have the data on
> disk, all safe and cool to go."
>
> Anything else is a bug IMO.

Or a hardware issue; most filesystems need disks to properly honor
write barriers to provide guaranteed semantics on an fsync, and many
consumer disk drives still don't honor them consistently.
Re: What is the vision for btrfs fs repair?
On Sun, Oct 12, 2014 at 6:14 AM, Martin Steigerwald
<mar...@lichtvoll.de> wrote:
> On Friday, 10 October 2014, 10:37:44, Chris Murphy wrote:
>> On Oct 10, 2014, at 6:53 AM, Bob Marley <bobmar...@shiftmail.org>
>> wrote:
>>> On 10/10/2014 03:58, Chris Murphy wrote:
>>>> * mount -o recovery
>>>>   Enable autorecovery attempts if a bad tree root is found at
>>>>   mount time.
>>>>
>>>> I'm confused why it's not the default yet. Maybe it's continuing to
>>>> evolve at a pace that suggests something could sneak in that makes
>>>> things worse? It is almost an oxymoron in that I'm manually
>>>> enabling an autorecovery. If true, maybe the closest indication
>>>> we'd get of btrfs stability is the default enabling of
>>>> autorecovery.
>>>
>>> No way! I wouldn't want a default like that.
>>>
>>> If you think at distributed transactions: suppose a sync was issued
>>> on both sides of a distributed transaction, then power was lost on
>>> one side, then btrfs had corruption. When I remount it, definitely
>>> the worst thing that can happen is that it auto-rolls-back to a
>>> previous known-good state.
>>
>> For a general purpose file system, losing 30 seconds (or less) of
>> questionably committed data, likely corrupt, is preferable to a file
>> system that won't mount without user intervention, which requires a
>> secret decoder ring to get it to mount at all. And may require the
>> use of specialized tools to retrieve that data in any case.
>>
>> The fail safe behavior is to treat the known good tree root as the
>> default tree root, and bypass the bad tree root if it cannot be
>> repaired, so that the volume can be mounted with default mount
>> options (i.e. the ones in fstab). Otherwise it's a filesystem that
>> isn't well suited for general purpose use as rootfs, let alone for
>> boot.
>
> To understand this a bit better: What can be the reasons a recent tree
> gets corrupted?
>
> I always thought with a controller and device and driver combination
> that honors fsync, with BTRFS it would either be the new state or the
> last known good state *anyway*. So where does the need to rollback
> arise from?
In theory the recovery option should never be necessary. Btrfs makes
all the guarantees everybody wants it to - when the data is fsynced,
then it will never be lost.

The question is what should happen when a corrupted tree root, which
should never happen, happens anyway. The options are to refuse to mount
the filesystem by default, or to mount it by default, discarding about
30-60s worth of writes. And yes, when this situation happens (whether
it mounts by default or not) btrfs has broken its promise of data being
written after a successful fsync return.

As has been pointed out, braindead drive firmware is the most likely
cause of this sort of issue. However, there are a number of other
hardware and software errors that could cause it, including errors in
linux outside of btrfs, and of course bugs in btrfs as well. In an
ideal world no filesystem would need any kind of recovery/repair tools;
needing them can often mean that the fsync promise was broken. The real
question is, once that has happened, how do you move on?

I think the best default is to auto-recover, but to have better
facilities for reporting errors to the user. Right now btrfs is very
quiet about failures - maybe a cryptic message in dmesg, and nobody
reads all of that unless they're looking for something. If btrfs could
report significant issues, that might mitigate the impact of an
auto-recovery.

Also, another thing to consider during recovery is whether the damaged
data could optionally be stored in a snapshot of some kind - maybe in
the way that ext3/4 rollback data after conversion gets stored in a
snapshot. My knowledge of the underlying structures is weak, but I'd
think that a corrupted tree root practically is a snapshot already, and
turning it into one might even be easier than cleaning it up. Of
course, we would need to ensure the snapshot could be deleted without
further error. Doing anything with the snapshot might require special
tools, but if people want to do disk scraping they could.
--
Rich
Re: btrfs send and kernel 3.17
Actually it seems strange that a send operation could corrupt the
source subvolume or fs. Why would the send modify the source subvolume
in any significant way?

The only way I can find to reconcile your observations with mine is
that maybe the snapshots get corrupted not by the send operation by
itself, but when they are generated with -r (readonly, as it is needed
to send them).

Are the corrupted snapshots you have on machine 2 (the one on which
send was never used) readonly?
Re: btrfs balance segfault, kernel BUG at fs/btrfs/extent-tree.c:7727
On Thu, Oct 9, 2014 at 10:19 AM, Petr Janecek jane...@ucw.cz wrote: I have trouble finishing btrfs balance on five disk raid10 fs. I added a disk to 4x3TB raid10 fs and run btrfs balance start /mnt/b3, which segfaulted after few hours, probably because of the BUG below. btrfs check does not find any errors, both before the balance and after reboot (the fs becomes un-umountable). [22744.238559] WARNING: CPU: 0 PID: 4211 at fs/btrfs/extent-tree.c:876 btrfs_lookup_extent_info+0x292/0x30a [btrfs]() [22744.532378] kernel BUG at fs/btrfs/extent-tree.c:7727! I am running into something similar. I just added a 3TB drive to my raid1 btrfs and started a balance. The balance segfaulted, and I find this in dmesg: [453046.291762] BTRFS info (device sde2): relocating block group 10367073779712 flags 17 [453062.494151] BTRFS info (device sde2): found 13 extents [453069.283368] [ cut here ] [453069.283468] kernel BUG at /data/src/linux-3.17.0-gentoo/fs/btrfs/relocation.c:931! [453069.283590] invalid opcode: [#1] SMP [453069.283666] Modules linked in: vhost_net vhost macvtap macvlan tun ipt_MASQUERADE xt_conntrack veth nfsd auth_rpcgss oid_registry lockd iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables it87 hwmon_vid hid_logitech_dj nxt200x cx88_dvb videobuf_dvb dvb_core cx88_vp3054_i2c tuner_simple tuner_types tuner mousedev hid_generic usbhid cx88_alsa radeon cx8800 cx8802 cx88xx snd_hda_codec_realtek btcx_risc snd_hda_codec_generic videobuf_dma_sg videobuf_core kvm_amd tveeprom kvm rc_core v4l2_common cfbfillrect fbcon videodev cfbimgblt snd_hda_intel bitblit snd_hda_controller cfbcopyarea softcursor font tileblit i2c_algo_bit k10temp snd_hda_codec backlight drm_kms_helper snd_hwdep i2c_piix4 ttm snd_pcm snd_timer drm snd soundcore 8250 evdev [453069.285043] serial_core ext4 crc16 jbd2 mbcache zram lz4_compress zsmalloc ata_generic pata_acpi btrfs xor zlib_deflate atkbd raid6_pq ohci_pci firewire_ohci 
firewire_core crc_itu_t pata_atiixp ehci_pci ohci_hcd ehci_hcd usbcore usb_common r8169 mii sunrpc dm_mirror dm_region_hash dm_log dm_mod [453069.285552] CPU: 1 PID: 17270 Comm: btrfs Not tainted 3.17.0-gentoo #1 [453069.285657] Hardware name: Gigabyte Technology Co., Ltd. GA-880GM-UD2H/GA-880GM-UD2H, BIOS F8 10/11/2010 [453069.285806] task: 88040ec556e0 ti: 88010cf94000 task.ti: 88010cf94000 [453069.285925] RIP: 0010:[a02ddd62] [a02ddd62] build_backref_tree+0x1152/0x11b0 [btrfs] [453069.286137] RSP: 0018:88010cf97848 EFLAGS: 00010206 [453069.286223] RAX: 8800ae67c800 RBX: 880122e94000 RCX: 880122e949c0 [453069.286336] RDX: 09270788d000 RSI: 880054c3fbc0 RDI: 8800ae67c800 [453069.286449] RBP: 88010cf97958 R08: 000159a0 R09: 880122e94000 [453069.286561] R10: 0003 R11: R12: 8802da313000 [453069.286674] R13: 8802da313c60 R14: 880122e94780 R15: 88040c277000 [453069.286787] FS: 7f175ac51880() GS:880427c4() knlGS:f7333b40 [453069.286913] CS: 0010 DS: ES: CR0: 8005003b [453069.287005] CR2: 7f208de58000 CR3: 0003b0a9c000 CR4: 07e0 [453069.287116] Stack: [453069.287151] 88010cf97868 880122e94000 01ff880122e94300 880342156060 [453069.287282] 880122e94780 8802da313c60 880122e94600 8800ae67c800 [453069.287412] 880122e947c0 8802da313000 88040c277120 88010005 [453069.287542] Call Trace: [453069.287640] [a02ddfa3] relocate_tree_blocks+0x1e3/0x630 [btrfs] [453069.287796] [a02e0550] relocate_block_group+0x3d0/0x650 [btrfs] [453069.287951] [a02e0958] btrfs_relocate_block_group+0x188/0x2a0 [btrfs] [453069.288113] [a02b48f0] btrfs_relocate_chunk.isra.61+0x70/0x780 [btrfs] [453069.288276] [a02c7fd0] ? btrfs_set_lock_blocking_rw+0x70/0xc0 [btrfs] [453069.288438] [a02b0e79] ? free_extent_buffer+0x59/0xb0 [btrfs] [453069.288590] [a02b8e99] btrfs_balance+0x829/0xf40 [btrfs] [453069.288738] [a02bf80f] btrfs_ioctl_balance+0x1af/0x510 [btrfs] [453069.288890] [a02c59e4] btrfs_ioctl+0xa54/0x2950 [btrfs] [453069.288995] [8111d016] ? 
lru_cache_add_active_or_unevictable+0x26/0x90 [453069.289119] [8113a061] ? handle_mm_fault+0xbe1/0xdb0 [453069.289219] [811ffdce] ? cred_has_capability+0x5e/0x100 [453069.289323] [8104065c] ? __do_page_fault+0x1fc/0x4f0 [453069.289422] [8117d80e] do_vfs_ioctl+0x7e/0x4f0 [453069.289513] [811ff64f] ? file_has_perm+0x8f/0xa0 [453069.289606] [8117dd09] SyS_ioctl+0x89/0xa0 [453069.289692] [81040a1c] ? do_page_fault+0xc/0x10 [453069.289785] [814f5752] system_call_fastpath+0x16/0x1b [453069.289881] Code: ff ff 48 8b 9d 20 ff ff ff e9 11 ff ff ff 0f 0b be ec 03 00 00 48 c7 c7 d0 f0 30 a0 e8 28 00 d7 e0 e9 06 f3 ff ff e8 c4 42
Re: 3.17.0-rc7: kernel BUG at fs/btrfs/relocation.c:931!
On Thu, Oct 2, 2014 at 3:27 AM, Tomasz Chmielewski t...@virtall.com wrote: Got this when running balance with 3.17.0-rc7: [173475.410717] kernel BUG at fs/btrfs/relocation.c:931! I just started a post on another thread with this exact same issue on 3.17.0. I started a balance after adding a new drive. [453046.291762] BTRFS info (device sde2): relocating block group 10367073779712 flags 17 [453062.494151] BTRFS info (device sde2): found 13 extents [453069.283368] [ cut here ] [453069.283468] kernel BUG at /data/src/linux-3.17.0-gentoo/fs/btrfs/relocation.c:931! [453069.283590] invalid opcode: [#1] SMP [453069.283666] Modules linked in: vhost_net vhost macvtap macvlan tun ipt_MASQUERADE xt_conntrack veth nfsd auth_rpcgss oid_registry lockd iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables it87 hwmon_vid hid_logitech_dj nxt200x cx88_dvb videobuf_dvb dvb_core cx88_vp3054_i2c tuner_simple tuner_types tuner mousedev hid_generic usbhid cx88_alsa radeon cx8800 cx8802 cx88xx snd_hda_codec_realtek btcx_risc snd_hda_codec_generic videobuf_dma_sg videobuf_core kvm_amd tveeprom kvm rc_core v4l2_common cfbfillrect fbcon videodev cfbimgblt snd_hda_intel bitblit snd_hda_controller cfbcopyarea softcursor font tileblit i2c_algo_bit k10temp snd_hda_codec backlight drm_kms_helper snd_hwdep i2c_piix4 ttm snd_pcm snd_timer drm snd soundcore 8250 evdev [453069.285043] serial_core ext4 crc16 jbd2 mbcache zram lz4_compress zsmalloc ata_generic pata_acpi btrfs xor zlib_deflate atkbd raid6_pq ohci_pci firewire_ohci firewire_core crc_itu_t pata_atiixp ehci_pci ohci_hcd ehci_hcd usbcore usb_common r8169 mii sunrpc dm_mirror dm_region_hash dm_log dm_mod [453069.285552] CPU: 1 PID: 17270 Comm: btrfs Not tainted 3.17.0-gentoo #1 [453069.285657] Hardware name: Gigabyte Technology Co., Ltd. 
GA-880GM-UD2H/GA-880GM-UD2H, BIOS F8 10/11/2010 [453069.285806] task: 88040ec556e0 ti: 88010cf94000 task.ti: 88010cf94000 [453069.285925] RIP: 0010:[a02ddd62] [a02ddd62] build_backref_tree+0x1152/0x11b0 [btrfs] [453069.286137] RSP: 0018:88010cf97848 EFLAGS: 00010206 [453069.286223] RAX: 8800ae67c800 RBX: 880122e94000 RCX: 880122e949c0 [453069.286336] RDX: 09270788d000 RSI: 880054c3fbc0 RDI: 8800ae67c800 [453069.286449] RBP: 88010cf97958 R08: 000159a0 R09: 880122e94000 [453069.286561] R10: 0003 R11: R12: 8802da313000 [453069.286674] R13: 8802da313c60 R14: 880122e94780 R15: 88040c277000 [453069.286787] FS: 7f175ac51880() GS:880427c4() knlGS:f7333b40 [453069.286913] CS: 0010 DS: ES: CR0: 8005003b [453069.287005] CR2: 7f208de58000 CR3: 0003b0a9c000 CR4: 07e0 [453069.287116] Stack: [453069.287151] 88010cf97868 880122e94000 01ff880122e94300 880342156060 [453069.287282] 880122e94780 8802da313c60 880122e94600 8800ae67c800 [453069.287412] 880122e947c0 8802da313000 88040c277120 88010005 [453069.287542] Call Trace: [453069.287640] [a02ddfa3] relocate_tree_blocks+0x1e3/0x630 [btrfs] [453069.287796] [a02e0550] relocate_block_group+0x3d0/0x650 [btrfs] [453069.287951] [a02e0958] btrfs_relocate_block_group+0x188/0x2a0 [btrfs] [453069.288113] [a02b48f0] btrfs_relocate_chunk.isra.61+0x70/0x780 [btrfs] [453069.288276] [a02c7fd0] ? btrfs_set_lock_blocking_rw+0x70/0xc0 [btrfs] [453069.288438] [a02b0e79] ? free_extent_buffer+0x59/0xb0 [btrfs] [453069.288590] [a02b8e99] btrfs_balance+0x829/0xf40 [btrfs] [453069.288738] [a02bf80f] btrfs_ioctl_balance+0x1af/0x510 [btrfs] [453069.288890] [a02c59e4] btrfs_ioctl+0xa54/0x2950 [btrfs] [453069.288995] [8111d016] ? lru_cache_add_active_or_unevictable+0x26/0x90 [453069.289119] [8113a061] ? handle_mm_fault+0xbe1/0xdb0 [453069.289219] [811ffdce] ? cred_has_capability+0x5e/0x100 [453069.289323] [8104065c] ? __do_page_fault+0x1fc/0x4f0 [453069.289422] [8117d80e] do_vfs_ioctl+0x7e/0x4f0 [453069.289513] [811ff64f] ? 
file_has_perm+0x8f/0xa0 [453069.289606] [8117dd09] SyS_ioctl+0x89/0xa0 [453069.289692] [81040a1c] ? do_page_fault+0xc/0x10 [453069.289785] [814f5752] system_call_fastpath+0x16/0x1b [453069.289881] Code: ff ff 48 8b 9d 20 ff ff ff e9 11 ff ff ff 0f 0b be ec 03 00 00 48 c7 c7 d0 f0 30 a0 e8 28 00 d7 e0 e9 06 f3 ff ff e8 c4 42 02 00 0f 0b 3c b0 0f 84 72 f1 ff ff be 22 03 00 00 48 c7 c7 d0 f0 30 [453069.290429] RIP [a02ddd62] build_backref_tree+0x1152/0x11b0 [btrfs] [453069.290591] RSP 88010cf97848 [453069.316194] ---[ end trace 5fdc0af4cc62bf41 ]---
Re: btrfs send and kernel 3.17
On 10/13/2014 02:40 PM, john terragon wrote: Actually it seems strange that a send operation could corrupt the source subvolume or fs. Why would the send modify the source subvolume in any significant way? The only way I can find to reconcile your observations with mine is that maybe the snapshots get corrupted not by the send operation by itself but when they are generated with -r (readonly, as it is needed to send them). Are the corrupted snapshots you have in machine 2 (the one in which send was never used) readonly? Yes, on both machines there are only readonly snapshots.
Re: btrfs send and kernel 3.17
On Sun, Oct 12, 2014 at 7:11 AM, David Arendt ad...@prnet.org wrote: This weekend I finally had time to try btrfs send again on the newly created fs. Now I am running into another problem: btrfs send returns: ERROR: send ioctl failed with -12: Cannot allocate memory In dmesg I see only the following output: parent transid verify failed on 21325004800 wanted 2620 found 8325 I'm not using send at all, but I've been running into parent transid verify failed messages where the wanted is way smaller than the found when trying to balance a raid1 after adding a new drive. Originally I had gotten a BUG, and after reboot the drive finished balancing (interestingly enough without moving any chunks to the new drive - just consolidating everything on the old drives), and then when I try to do another balance I get: [ 4426.987177] BTRFS info (device sdc2): relocating block group 10367073779712 flags 17 [ 4446.287998] BTRFS info (device sdc2): found 13 extents [ 4451.330887] parent transid verify failed on 10063286579200 wanted 987432 found 993678 [ 4451.350663] parent transid verify failed on 10063286579200 wanted 987432 found 993678 The btrfs program itself outputs: btrfs balance start -v /data Dumping filters: flags 0x7, state 0x0, force is off DATA (flags 0x0): balancing METADATA (flags 0x0): balancing SYSTEM (flags 0x0): balancing ERROR: error during balancing '/data' - Cannot allocate memory There may be more info in syslog - try dmesg | tail This is also on 3.17. This may be completely unrelated, but it seemed similar enough to be worth mentioning. 
The filesystem otherwise seems to work fine, other than the new drive not having any data on it: Label: 'datafs' uuid: cd074207-9bc3-402d-bee8-6a8c77d56959 Total devices 6 FS bytes used 2.16TiB devid1 size 2.73TiB used 2.40TiB path /dev/sdc2 devid2 size 931.32GiB used 695.03GiB path /dev/sda2 devid3 size 931.32GiB used 700.00GiB path /dev/sdb2 devid4 size 931.32GiB used 700.00GiB path /dev/sdd2 devid5 size 931.32GiB used 699.00GiB path /dev/sde2 devid6 size 2.73TiB used 0.00 path /dev/sdf2 This is btrfs-progs-3.16.2. -- Rich
Re: what is the best way to monitor raid1 drive failures?
I had progs 3.12 and updated to the latest from git (3.16). With this update, btrfs fi show reports there is a missing device immediately after I pull it out. Thanks! I am using virtualbox to test this. So, I am detaching the drive like so: vboxmanage storageattach vm --storagectl controller --port port --device device --medium none Next I am going to try and test a more realistic scenario where a harddrive is not pulled out, but is damaged. Can/does btrfs mark a filesystem (say, a 2-drive raid1) degraded or unhealthy automatically when one drive is damaged badly enough that it cannot be written to or read from reliably? Suman On Sun, Oct 12, 2014 at 7:21 PM, Anand Jain anand.j...@oracle.com wrote: Suman, To simulate the failure, I detached one of the drives from the system. After that, I see no sign of a problem except for these errors: Are you physically pulling out the device ? I wonder if lsblk or blkid shows the error ? The device-missing reporting logic is in the progs (so have the latest) and it works provided user scripts such as blkid/lsblk also report the problem. OR for soft-detach tests you could use devmgt at http://github.com/anajain/devmgt Also, I am trying to get a device management framework into btrfs with better device management and reporting. Thanks, Anand On 10/13/14 07:50, Suman C wrote: Hi, I am testing some disk failure scenarios in a 2 drive raid1 mirror. They are 4GB each, virtual SATA drives inside virtualbox. To simulate the failure, I detached one of the drives from the system. After that, I see no sign of a problem except for these errors: Oct 12 15:37:14 rock-dev kernel: btrfs: bdev /dev/sdb errs: wr 0, rd 0, flush 1, corrupt 0, gen 0 Oct 12 15:37:14 rock-dev kernel: lost page write due to I/O error on /dev/sdb /dev/sdb is gone from the system, but btrfs fi show still lists it.
Label: raid1pool uuid: 4e5d8b43-1d34-4672-8057-99c51649b7c6 Total devices 2 FS bytes used 1.46GiB devid1 size 4.00GiB used 2.45GiB path /dev/sdb devid2 size 4.00GiB used 2.43GiB path /dev/sdc I am able to read and write just fine, but do see the above errors in dmesg. What is the best way to find out that one of the drives has gone bad? Suman
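The per-device error counters in the dmesg line quoted above ("errs: wr 0, rd 0, flush 1, ...") are also queryable from userspace, which may be the simplest monitoring hook. A minimal sketch, assuming a btrfs-progs version that provides the btrfs device stats subcommand; the flag_btrfs_errors name and the canned sample lines are mine, for illustration only:

```shell
# Flag any non-zero btrfs per-device error counter on stdin.
# "btrfs device stats <mnt>" prints one counter per line, e.g.:
#   [/dev/sdb].write_io_errs   0
flag_btrfs_errors() {
  awk '$2 != 0 { print "WARNING:", $0 }'
}

# Demo with canned input; a real invocation would be:
#   btrfs device stats /mnt | flag_btrfs_errors
printf '%s\n' \
  '[/dev/sdb].write_io_errs 0' \
  '[/dev/sdb].flush_io_errs 1' | flag_btrfs_errors
```

Run from cron, a non-empty output (or a mail on non-empty output) would have caught the flush error above even though the filesystem still appeared to read and write fine.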
Re: btrfs random filesystem corruption in kernel 3.17
From my own experience and based on what other people are saying, I think there is a random btrfs filesystem corruption problem in kernel 3.17 at least related to snapshots, therefore I decided to post using another subject to draw attention from people not concerned about btrfs send to it. More information can be found in the btrfs send posts. Did the filesystem you tried to balance contain snapshots? Read-only ones? On 10/13/2014 07:22 PM, Rich Freeman wrote: On Sun, Oct 12, 2014 at 7:11 AM, David Arendt ad...@prnet.org wrote: This weekend I finally had time to try btrfs send again on the newly created fs. Now I am running into another problem: btrfs send returns: ERROR: send ioctl failed with -12: Cannot allocate memory In dmesg I see only the following output: parent transid verify failed on 21325004800 wanted 2620 found 8325 I'm not using send at all, but I've been running into parent transid verify failed messages where the wanted is way smaller than the found when trying to balance a raid1 after adding a new drive. Originally I had gotten a BUG, and after reboot the drive finished balancing (interestingly enough without moving any chunks to the new drive - just consolidating everything on the old drives), and then when I try to do another balance I get: [ 4426.987177] BTRFS info (device sdc2): relocating block group 10367073779712 flags 17 [ 4446.287998] BTRFS info (device sdc2): found 13 extents [ 4451.330887] parent transid verify failed on 10063286579200 wanted 987432 found 993678 [ 4451.350663] parent transid verify failed on 10063286579200 wanted 987432 found 993678 The btrfs program itself outputs: btrfs balance start -v /data Dumping filters: flags 0x7, state 0x0, force is off DATA (flags 0x0): balancing METADATA (flags 0x0): balancing SYSTEM (flags 0x0): balancing ERROR: error during balancing '/data' - Cannot allocate memory There may be more info in syslog - try dmesg | tail This is also on 3.17. This may be completely unrelated, but it seemed similar enough to be worth mentioning.
This may be completely unrelated, but it seemed similar enough to be worth mentioning. The filesystem otherwise seems to work fine, other than the new drive not having any data on it: Label: 'datafs' uuid: cd074207-9bc3-402d-bee8-6a8c77d56959 Total devices 6 FS bytes used 2.16TiB devid1 size 2.73TiB used 2.40TiB path /dev/sdc2 devid2 size 931.32GiB used 695.03GiB path /dev/sda2 devid3 size 931.32GiB used 700.00GiB path /dev/sdb2 devid4 size 931.32GiB used 700.00GiB path /dev/sdd2 devid5 size 931.32GiB used 699.00GiB path /dev/sde2 devid6 size 2.73TiB used 0.00 path /dev/sdf2 This is btrfs-progs-3.16.2. -- Rich
Re: btrfs random filesystem corruption in kernel 3.17
On Mon, Oct 13, 2014 at 4:27 PM, David Arendt ad...@prnet.org wrote: From my own experience and based on what other people are saying, I think there is a random btrfs filesystem corruption problem in kernel 3.17 at least related to snapshots, therefore I decided to post using another subject to draw attention from people not concerned about btrfs send to it. More information can be found in the btrfs send posts. Did the filesystem you tried to balance contain snapshots? Read-only ones? The filesystem contains numerous subvolumes and snapshots, many of which are read-only. I'm managing many with snapper. The similarity of the transid verify errors made me think this issue is related, and the root cause may have nothing to do with btrfs send. As far as I can tell these errors aren't having any effect on my data - hopefully the system is catching the problems before there are actual disk writes/etc. -- Rich
Re: btrfs random filesystem corruption in kernel 3.17
I think I just found a consistent simple way to trigger the problem (at least on my system). And, as I guessed before, it seems to be related just to readonly snapshots: 1) I create a readonly snapshot 2) I do some changes on the source subvolume for the snapshot (I'm not sure changes are strictly needed) 3) reboot (or probably just unmount and remount. I reboot because the fs I've problems with contains my root subvolume) After the rebooting (or the remount) I consistently have the corruption with the usual multitude of these in dmesg parent transid verify failed on 902316032 wanted 2484 found 4101 and the characteristic ls -la output drwxr-xr-x 1 root root 250 Oct 10 15:37 root d? ? ?? ?? root-b2 drwxr-xr-x 1 root root 250 Oct 10 15:37 root-b3 d? ? ?? ?? root-backup root-backup and root-b2 are both readonly whereas root-b3 is rw (and it didn't get corrupted). David, maybe you can try the same steps on one of your machines? John
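For anyone retesting the three steps above against other kernels, they can be sketched as a script. This is a dry run that only prints the commands (the /mnt/scratch paths and /dev/sdX are placeholders of mine); pipe its output to sh as root, and only against a scratch filesystem you can afford to lose:

```shell
# Dry-run sketch of the reproduction steps above (hypothetical paths,
# /dev/sdX is a placeholder). Pipe the output to "sh" as root on a
# *scratch* btrfs filesystem to actually run it.
REPRO='btrfs subvolume snapshot -r /mnt/scratch/subvol /mnt/scratch/snap-ro  # 1) readonly snapshot
touch /mnt/scratch/subvol/some-change                                 # 2) change the source
umount /mnt/scratch && mount /dev/sdX /mnt/scratch                    # 3) remount (or reboot)
ls -la /mnt/scratch   # a corrupted snapshot shows up as "d? ? ? ..."'
printf '%s\n' "$REPRO"
```

Whether step 2 is strictly needed, as John notes, is still an open question; dropping the touch line would test that.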
Re: btrfs random filesystem corruption in kernel 3.17
On Mon, Oct 13, 2014 at 4:48 PM, john terragon jterra...@gmail.com wrote: I think I just found a consistent simple way to trigger the problem (at least on my system). And, as I guessed before, it seems to be related just to readonly snapshots: 1) I create a readonly snapshot 2) I do some changes on the source subvolume for the snapshot (I'm not sure changes are strictly needed) 3) reboot (or probably just unmount and remount. I reboot because the fs I've problems with contains my root subvolume) After the rebooting (or the remount) I consistently have the corruption with the usual multitude of these in dmesg parent transid verify failed on 902316032 wanted 2484 found 4101 and the characteristic ls -la output drwxr-xr-x 1 root root 250 Oct 10 15:37 root d? ? ?? ?? root-b2 drwxr-xr-x 1 root root 250 Oct 10 15:37 root-b3 d? ? ?? ?? root-backup root-backup and root-b2 are both readonly whereas root-b3 is rw (and it didn't get corrupted). David, maybe you can try the same steps on one of your machines? Look at that. I didn't realize it, but indeed I have a corrupted snapshot: /data/.snapshots/5338/: ls: cannot access /data/.snapshots/5338/snapshot: Cannot allocate memory total 4 drwxr-xr-x 1 root root 32 Oct 11 06:09 . drwxr-x--- 1 root root 32 Oct 11 07:42 .. -rw--- 1 root root 135 Oct 11 06:09 info.xml d? ? ?? ?? snapshot Several older snapshots are fine, and those predate my 3.17 upgrade. I noticed that this corrupted snapshot isn't even listed in my snapper lists. btrfs su delete /data/.snapshots/5338/snapshot Transaction commit: none (default) ERROR: error accessing '/data/.snapshots/5338/snapshot' Removing them appears to be problematic as well. I might just disable compress=lzo and go back to 3.16 to see how that goes. -- Rich
Re: btrfs random filesystem corruption in kernel 3.17
On Mon, Oct 13, 2014 at 4:55 PM, Rich Freeman r-bt...@thefreemanclan.net wrote: On Mon, Oct 13, 2014 at 4:48 PM, john terragon jterra...@gmail.com wrote: After the rebooting (or the remount) I consistently have the corruption with the usual multitude of these in dmesg parent transid verify failed on 902316032 wanted 2484 found 4101 and the characteristic ls -la output Sorry to double-reply, but I left this out. I have a long string of these early in boot as well that I never noticed before. -- Rich
Re: What is the vision for btrfs fs repair?
On 10/08/2014 03:11 PM, Eric Sandeen wrote: I was looking at Marc's post: http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html and it feels like there isn't exactly a cohesive, overarching vision for repair of a corrupted btrfs filesystem. In other words - I'm an admin cruising along, when the kernel throws some fs corruption error, or for whatever reason btrfs fails to mount. What should I do? Marc lays out several steps, but to me this highlights that there seem to be a lot of disjoint mechanisms out there to deal with these problems; mostly from Marc's blog, with some bits of my own:

* btrfs scrub: errors are corrected along the way if possible (what *is* possible?)
* mount -o recovery: enable autorecovery attempts if a bad tree root is found at mount time.
* mount -o degraded: allow mounts to continue with missing devices. (This isn't really a way to recover from corruption, right?)
* btrfs-zero-log: remove the log tree if the log tree is corrupt.
* btrfs rescue (chunk-recover, super-recover): recover a damaged btrfs filesystem. How does this relate to btrfs check?
* btrfs check (--repair, --init-csum-tree, --init-extent-tree): repair a btrfs filesystem. How does this relate to btrfs rescue?
* btrfs restore: try to salvage files from a damaged filesystem (not really repair, it's disk-scraping).

What's the vision for, say, scrub vs. check vs. rescue? Should they repair the same errors, only online vs. offline? If not, what class of errors does one fix vs. the other? How would an admin know? Can btrfs check recover a bad tree root in the same way that mount -o recovery does? How would I know if I should use --init-*-tree, or chunk-recover, and what are the ramifications of using these options?
It feels like recovery tools have been badly splintered, and if there's an overarching design or vision for btrfs fs repair, I can't tell what it is. Can anyone help me? We probably should just consolidate under 3 commands: one for online checking, one for offline repair, and one for pulling stuff off of the disk when things go to hell. A lot of these tools were born out of the fact that we didn't have a fsck tool for a long time, so there were these stop gaps put into place, and now it's time to go back and clean it up. I'll try and do this after I finish my cleanup/sync between kernel and progs work, and fill out the documentation a little better so it's clear when to use what. Thanks, Josef
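Until that consolidation lands, a rough escalation order distilled from the tools Eric lists might look like the checklist below. To be clear, this is one plausible ordering, not an official procedure (device names are placeholders): image the device first, and stop at the first step that gets the data back.

```shell
# Checklist sketch, least to most invasive. /dev/sdX is a placeholder.
print_triage() {
  cat <<'EOF'
1. mount -o ro,recovery /dev/sdX /mnt   # try fallback tree roots, read-only
2. btrfs rescue super-recover /dev/sdX  # restore superblock from its copies
3. btrfs-zero-log /dev/sdX              # only if a corrupt log tree is implicated
4. btrfs check --repair /dev/sdX        # offline repair, last resort
5. btrfs restore /dev/sdX /recovery     # scrape files out; repairs nothing
EOF
}
print_triage
```

Scrub and mount -o degraded are deliberately absent: scrub is routine online maintenance rather than crash recovery, and degraded addresses missing devices, not corruption.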
Re: btrfs random filesystem corruption in kernel 3.17
I'm using compress=no so compression doesn't seem to be related, at least in my case. Just read-only snapshots on 3.17 (although I haven't tried 3.16). John
Re: btrfs random filesystem corruption in kernel 3.17
As these two machines are running as servers for different purposes (yes, I know that btrfs is unstable and any corruption or data loss is at my own risk, therefore I have good backups), I don't want to reboot them more than necessary. However I tried to bring my reboot times in relation with corruptions: machine 1: d? ? ? ? ?? root.20141009.000503.backup reboot system boot 3.17.0 Thu Oct 9 23:20 still running reboot system boot 3.17.0 Tue Oct 7 21:25 - 23:18 (2+01:53) reboot system boot 3.17.0 Mon Oct 6 22:47 - 23:18 (3+00:31) For this machine, corruption seems to have occurred for a snapshot created after a reboot. machine 2: d? ? ?? ?? root.20141006.003239.backup d? ? ?? ?? root.20141007.001616.backup d? ? ?? ?? root.20141008.000501.backup d? ? ?? ?? root.20141009.052436.backup reboot system boot 3.17.0 Thu Oct 9 21:31 still running reboot system boot 3.17.0 Tue Oct 7 21:27 - 21:30 (2+00:03) reboot system boot 3.17.0 Tue Oct 7 17:51 - 21:26 (03:34) reboot system boot 3.17.0 Sun Oct 5 23:50 - 17:50 (1+17:59) reboot system boot 3.17.0 Sun Oct 5 23:47 - 23:49 (00:01) During the next days, I will set up a virtual machine to do more tests. On 10/13/2014 10:48 PM, john terragon wrote: I think I just found a consistent simple way to trigger the problem (at least on my system). And, as I guessed before, it seems to be related just to readonly snapshots: 1) I create a readonly snapshot 2) I do some changes on the source subvolume for the snapshot (I'm not sure changes are strictly needed) 3) reboot (or probably just unmount and remount. I reboot because the fs I've problems with contains my root subvolume) After the rebooting (or the remount) I consistently have the corruption with the usual multitude of these in dmesg parent transid verify failed on 902316032 wanted 2484 found 4101 and the characteristic ls -la output drwxr-xr-x 1 root root 250 Oct 10 15:37 root d? ? ?? ?? root-b2 drwxr-xr-x 1 root root 250 Oct 10 15:37 root-b3 d? ? ?? ??
root-backup root-backup and root-b2 are both readonly whereas root-b3 is rw (and it didn't get corrupted). David, maybe you can try the same steps on one of your machines? John
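The "d? ? ??" entries David and John are grepping ls output for are directory entries whose inodes can no longer be stat()ed, so they can be found mechanically instead of by eye. A small sketch (the scan_dir name and the example path are mine; it works on any directory, so point it at wherever the snapshots live):

```shell
# Print entries in a directory whose metadata cannot be read --
# the symptom behind the "d? ? ?? ??" lines in the ls -la output above.
scan_dir() {
  for entry in "$1"/*; do
    # An empty directory leaves the glob unexpanded; skip that case.
    [ "$entry" = "$1/*" ] && continue
    if ! stat "$entry" >/dev/null 2>&1; then
      echo "possibly corrupted: $entry"
    fi
  done
}

scan_dir "${1:-.}"   # e.g. scan_dir /data/.snapshots
```

Run nightly, this would have flagged the backup snapshots on both machines without waiting for a manual ls after each reboot.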
Re: btrfs random filesystem corruption in kernel 3.17
I'm also using no compression. On 10/13/2014 11:22 PM, john terragon wrote: I'm using compress=no so compression doesn't seem to be related, at least in my case. Just read-only snapshots on 3.17 (although I haven't tried 3.16). John
Re: btrfs random filesystem corruption in kernel 3.17
David Arendt posted on Mon, 13 Oct 2014 23:25:23 +0200 as excerpted: I'm also using no compression. On 10/13/2014 11:22 PM, john terragon wrote: I'm using compress=no so compression doesn't seem to be related, at least in my case. Just read-only snapshots on 3.17 (although I haven't tried 3.16). While I'm not a mind-reader and thus don't know for sure, Rich's reference to 3.16 and compression might not be related to this bug at all. In 3.15 and early 3.16, there was a different bug related to compression, tho IIRC it was patched in 3.16.2 and 3.17-rc2 (or maybe .3 and rc3, it's patched in the latest 3.16.x anyway, and in 3.17). So how I read his comment was that he was considering going back to 3.16 and disabling compression to deal with that bug (he may not know the patch was marked for stable and is in current 3.16.x), rather than stay on 3.17, since this bug hasn't even been traced yet, let alone patched. Meanwhile, this bug makes me glad my use-case doesn't involve snapshots, and I've seen nothing of it. =:^) -- Duncan - List replies preferred. No HTML msgs. Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman
Re: btrfs random filesystem corruption in kernel 3.17
Rich Freeman posted on Mon, 13 Oct 2014 16:42:14 -0400 as excerpted: On Mon, Oct 13, 2014 at 4:27 PM, David Arendt ad...@prnet.org wrote: From my own experience and based on what other people are saying, I think there is a random btrfs filesystem corruption problem in kernel 3.17 at least related to snapshots, therefore I decided to post using another subject to draw attention from people not concerned about btrfs send to it. More information can be found in the brtfs send posts. Did the filesystem you tried to balance contain snapshots ? Read only ones ? The filesystem contains numerous subvolumes and snapshots, many of which are read-only. I'm managing many with snapper. The similarity of the transid verify errors made me think this issue is related, and the root cause may have nothing to do with btrfs send. As far as I can tell these errors aren't having any affect on my data - hopefully the system is catching the problems before there are actual disk writes/etc. Summarizing what I've seen on the threads... 1) The bug seems to be read-only snapshot related. The connection to send is that send creates read-only snapshots, but people creating read- only snapshots for other purposes are now reporting the same problem, so it's not send, it's the read-only snapshots. 2) Writable snapshots haven't been implicated yet, and the working set from which the snapshots are taken doesn't seem to be affected, either. So in that sense it's not affecting ordinary usage, only the read-only snapshots themselves. 3) More problematic, however, is the fact that these apparently corrupted read-only snapshots often are not listed properly and can't be deleted, tho I'm not sure if that's /all/ the corrupted snapshots or only part of them. 
So while it may not affect ordinary operation in the short term, over time until there's a fix, people routinely doing read-only snapshots are going to be getting more and more of these undeletable snapshots, and depending on whether the eventual patch only prevents more or can actually fix the bad ones (possibly via btrfs check or the like), affected filesystems may ultimately have to be blown away and recreated with a fresh mkfs, in order to kill the currently undeletable snapshots.

So the first thing to do would be to shut off whatever's making read-only snapshots, so you don't make the problem worse while it's being investigated. For those who can do that without too big an interruption to their normal routine (who don't depend on send/receive, for instance), just keep it off for the time being. For those who depend on read-only snapshots (send/receive for backup and the data is too valuable to not do the backups for a few days), consider switching back to 3.16-stable -- from 3.16.3 at least, the patch for the compress bug is there, so that shouldn't be a problem.

And if you're affected, be aware that until we have a fix, we don't know if it'll be possible to remove the affected and currently undeletable snapshots. If it's not, at some point you'll need to do a fresh mkfs.btrfs, to get rid of the damage. Since the bug doesn't appear to affect writable snapshots or the head from which snapshots are made, it's not urgent, and a full fix is likely to include a patch to detect and fix the problem as well, but until we know what the problem is we can't be sure of that, so be prepared to do that mkfs at some point, as at this point it's possible that's the only way you'll be able to kill the corrupted snapshots.
4) Total speculation on my part, but given the wanted transid (aka generation, in different contexts) is significantly lower than the found transid, and the fact that the problem appears to be limited to /read-only/ snapshots, my first suspicion is that something's getting updated that would normally apply to all snapshots, but the read-only nature of the snapshots is preventing the full update there. The transid of the block is updated, but the snapshot being read-only is preventing update of the pointer in that snapshot accordingly. What I do /not/ know is whether the bug is that something's getting updated that should NOT be, and it's simply the read-only snapshots letting us know about it since the writable snapshots are fully updated, even if that breaks the snapshot (breaking writable snapshots in a different and currently undetected way), or if instead, it's a legitimate update, like a balance simply moving the snapshot around but not affecting it otherwise, and the bug is that the read-only snapshots aren't allowing the legitimate update. Either way, this more or less developed over the weekend, and it's Monday now, so the devs should be on it. If it's anything like the 3.15/3.16 compression bug, it'll take some time for them to properly trace it, and then to figure out an appropriate fix, but they will. Chances are we'll have at least some decent progress on a trace by Friday, and maybe
Re: btrfs random filesystem corruption in kernel 3.17
On Mon, Oct 13, 2014 at 5:22 PM, john terragon jterra...@gmail.com wrote: I'm using compress=no so compression doesn't seem to be related, at least in my case. Just read-only snapshots on 3.17 (although I haven't tried 3.16). I was using lzo compression, and hence my comment about turning it off before going back to 3.16 (not realizing that 3.16 has subsequently been fixed). Ironically enough, I discovered this as I was about to migrate my ext4 backup drive into my btrfs raid1. Maybe I'll go ahead and wait on that and have an rsync backup of the filesystem handy (minus snapshots) just in case. :) I'd switch to 3.16, but it sounds like there is no way to remove the snapshots at the moment, and I can live for a while without the ability to create new ones. Interestingly enough, it doesn't look like ALL snapshots are affected. I checked and some of the snapshots I made last weekend while doing system updates look accessible. They are significantly smaller, and the subvolumes they were made from are also fairly new - though I have no idea if that is related. The subvolumes do show up in btrfs su list. They cannot be examined using btrfs su show. It would be VERY nice to have a way of cleaning this up without blowing away the entire filesystem... -- Rich
Re: btrfs random filesystem corruption in kernel 3.17
And another worrying thing I didn't notice before. Two snapshots have dates that do not make sense. root-b3 and root-b4 have been created Oct 14th (and btw root's modification time was also on Oct the 14th). So why do they show Oct 10th? And root-prov has actually been created on Oct 10 15:37, as it correctly shows, so it's like btrfs sub snap picks up old stale data from who knows where or when or for what reason. Moreover, root-b4 was created with 3.16.5. Not good.

drwxrwsr-x 1 root staff  30 Sep 11 16:15 home
d? ? ?? ?? home-backup
drwxr-xr-x 1 root root  250 Oct 14 03:02 root
d? ? ?? ?? root-b2
drwxr-xr-x 1 root root  250 Oct 10 15:37 root-b3
drwxr-xr-x 1 root root  250 Oct 10 15:37 root-b4
drwxr-xr-x 1 root root  250 Oct 14 03:02 root-b5
drwxr-xr-x 1 root root  250 Oct 14 03:02 root-b6
d? ? ?? ?? root-backup
drwxr-xr-x 1 root root  250 Oct 10 15:37 root-prov
drwxr-xr-x 1 root root   88 Sep 15 16:02 vms

On Tue, Oct 14, 2014 at 1:18 AM, Rich Freeman r-bt...@thefreemanclan.net wrote: On Mon, Oct 13, 2014 at 5:22 PM, john terragon jterra...@gmail.com wrote: I'm using compress=no so compression doesn't seem to be related, at least in my case. Just read-only snapshots on 3.17 (although I haven't tried 3.16). I was using lzo compression, and hence my comment about turning it off before going back to 3.16 (not realizing that 3.16 has subsequently been fixed). Ironically enough I discovered this as I was about to migrate my ext4 backup drive into my btrfs raid1. Maybe I'll go ahead and wait on that and have an rsync backup of the filesystem handy (minus snapshots) just in case. :) I'd switch to 3.16, but it sounds like there is no way to remove the snapshots at the moment, and I can live for a while without the ability to create new ones. interestingly enough it doesn't look like ALL snapshots are affected. I checked and some of the snapshots I made last weekend while doing system updates look accessible.
They are significantly smaller, and the subvolumes they were made from are also fairly new - though I have no idea if that is related. The subvolumes do show up in btrfs su list. They cannot be examined using btrfs su show. It would be VERY nice to have a way of cleaning this up without blowing away the entire filesystem... -- Rich
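The symptom Rich describes above (snapshots that still appear in `btrfs su list` but fail `btrfs su show`) can be checked for in bulk. Below is a hedged sketch: the `audit_snapshots` helper and the `BTRFS_CMD` override are hypothetical, added only so the example can be exercised without a real btrfs filesystem; real use would simply call the btrfs binary.

```shell
# Hypothetical audit helper: list subvolumes under a mountpoint and
# report any that "subvolume show" can no longer examine.
# BTRFS_CMD defaults to the real btrfs binary; it is overridable so the
# sketch can be tested without a filesystem.
BTRFS_CMD=${BTRFS_CMD:-btrfs}

audit_snapshots() {
    mnt=$1
    # "btrfs subvolume list" prints "... path <name>"; take the last field.
    "$BTRFS_CMD" subvolume list "$mnt" | awk '{print $NF}' |
    while read -r path; do
        "$BTRFS_CMD" subvolume show "$mnt/$path" >/dev/null 2>&1 ||
            echo "suspect: $path"
    done
}

# Example usage (paths are illustrative):
#   audit_snapshots /mnt
```

Snapshots flagged as "suspect" would be the ones worth examining in dmesg for transid verify errors before deciding whether a fresh mkfs is unavoidable.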
Re: what is the best way to monitor raid1 drive failures?
On 10/14/14 03:50, Suman C wrote: I had progs 3.12 and updated to the latest from git (3.16). With this update, btrfs fi show reports there is a missing device immediately after I pull it out. Thanks! I am using virtualbox to test this. So, I am detaching the drive like so: vboxmanage storageattach vm --storagectl controller --port port --device device --medium none Next I am going to try and test a more realistic scenario where a harddrive is not pulled out, but is damaged. Can/does btrfs mark a filesystem (say, a 2-drive raid1) degraded or unhealthy automatically when one drive is damaged badly enough that it cannot be written to or read from reliably? There are some gaps compared to an enterprise volume manager, which are being fixed, but please do report what you find. Thanks, Anand Suman On Sun, Oct 12, 2014 at 7:21 PM, Anand Jain anand.j...@oracle.com wrote: Suman, To simulate the failure, I detached one of the drives from the system. After that, I see no sign of a problem except for these errors: Are you physically pulling out the device ? I wonder if lsblk or blkid shows the error ? The device-missing reporting logic is in the progs (so have the latest), and it works provided user scripts such as blkid/lsblk also report the problem. OR for soft-detach tests you could use devmgt at http://github.com/anajain/devmgt Also I am trying to get a device management framework into btrfs, with better device management and reporting. Thanks, Anand On 10/13/14 07:50, Suman C wrote: Hi, I am testing some disk failure scenarios in a 2 drive raid1 mirror. They are 4GB each, virtual SATA drives inside virtualbox. To simulate the failure, I detached one of the drives from the system.
After that, I see no sign of a problem except for these errors:

Oct 12 15:37:14 rock-dev kernel: btrfs: bdev /dev/sdb errs: wr 0, rd 0, flush 1, corrupt 0, gen 0
Oct 12 15:37:14 rock-dev kernel: lost page write due to I/O error on /dev/sdb

/dev/sdb is gone from the system, but btrfs fi show still lists it.

Label: raid1pool  uuid: 4e5d8b43-1d34-4672-8057-99c51649b7c6
	Total devices 2 FS bytes used 1.46GiB
	devid 1 size 4.00GiB used 2.45GiB path /dev/sdb
	devid 2 size 4.00GiB used 2.43GiB path /dev/sdc

I am able to read and write just fine, but do see the above errors in dmesg. What is the best way to find out that one of the drives has gone bad? Suman
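On the question of how to find out that a drive has gone bad: besides watching dmesg, btrfs-progs exposes per-device error counters via `btrfs device stats`. A small sketch that flags any nonzero counter follows; the `flag_bad_counters` helper is hypothetical, and the sample lines are made-up data that merely mimic the stats output format.

```shell
# Hypothetical helper: read "btrfs device stats"-style output on stdin
# and print every error counter that is greater than zero.
flag_bad_counters() {
    awk '$2 > 0 {
        split($1, a, /\]\./)   # "[/dev/sdb].flush_io_errs" -> dev, counter
        sub(/^\[/, "", a[1])   # strip the leading "["
        printf "%s: %s=%s\n", a[1], a[2], $2
    }'
}

# Made-up sample input; in real use:
#   btrfs device stats /mnt/raid1pool | flag_bad_counters
printf '%s\n' \
    '[/dev/sdb].write_io_errs   0' \
    '[/dev/sdb].flush_io_errs   1' \
    '[/dev/sdc].write_io_errs   0' | flag_bad_counters
```

A cron job running something like this would have caught the `flush 1` counter from the dmesg line above without anyone reading logs.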
Re: [PATCH v2] Btrfs: return failure if btrfs_dev_replace_finishing() failed
On Mon, Oct 13, 2014 at 06:18:04PM +0800, Anand Jain wrote: On 10/13/14 14:59, Eryu Guan wrote: On Mon, Oct 13, 2014 at 02:23:57PM +0800, Anand Jain wrote: comments below.. On 10/13/14 12:42, Eryu Guan wrote: device replace could fail due to another running scrub process or any other errors btrfs_scrub_dev() may hit, but this failure doesn't get returned to userspace. The following steps could reproduce this issue:

mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
mount /dev/sdb1 /mnt/btrfs
while true; do btrfs scrub start -B /mnt/btrfs >/dev/null 2>&1; done
btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
# if this replace succeeded, do the following and repeat until
# you see this log in dmesg
# BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs
# once you see the error log in dmesg, check return value of
# replace
echo $?

Introduce a new dev replace result BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS to catch -EINPROGRESS explicitly and return other errors directly to userspace.

Signed-off-by: Eryu Guan guane...@gmail.com
---
v2:
- set result to SCRUB_INPROGRESS if btrfs_scrub_dev returned -EINPROGRESS and return 0 as Miao Xie suggested

 fs/btrfs/dev-replace.c | 12 +---
 include/uapi/linux/btrfs.h | 1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index eea26e1..a141f8b 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -418,9 +418,15 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 			      dev_replace->scrub_progress, 0, 1);
 	ret = btrfs_dev_replace_finishing(root->fs_info, ret);
-	WARN_ON(ret);
+	/* don't warn if EINPROGRESS, someone else might be running scrub */
+	if (ret == -EINPROGRESS) {
+		args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
+		ret = 0;
+	} else {
+		WARN_ON(ret);
+	}

I am a bit concerned why these racing threads here aren't excluding each other using mutually_exclusive_operation_running
as most of the other device operation threads do. Thanks, Anand btrfs_ioctl_scrub() doesn't use mutually_exclusive_operation_running as other device operations do; I'm not sure if it should (it seems to me scrub should do it too). But I think that's a different problem from the one I'm trying to fix here. The main purpose is to return an error to userspace when btrfs_scrub_dev() hits some error. Dealing with -EINPROGRESS is to match the current behavior (replace and scrub can run at the same time). Thanks, Eryu looks like we are trying to manage EINPROGRESS returned by Yes, that's right. btrfs_dev_replace_finishing(). In btrfs_dev_replace_finishing() which specific func call is returning EINPROGRESS ? I didn't go deep enough. btrfs_dev_replace_finishing() will check scrub_ret (the last argument), and return scrub_ret if it is non-zero. It was returning 0 unconditionally before this patch.

btrfs_dev_replace_start@fs/btrfs/dev-replace.c
416	ret = btrfs_scrub_dev(fs_info, src_device->devid, 0,
417			      src_device->total_bytes,
418			      dev_replace->scrub_progress, 0, 1);
419
420	ret = btrfs_dev_replace_finishing(root->fs_info, ret);

and btrfs_dev_replace_finishing@fs/btrfs/dev-replace.c
529	if (!scrub_ret) {
530		btrfs_dev_replace_update_device_in_mapping_tree(fs_info,
531								src_device,
532								tgt_device);
533	} else {
..
547		return scrub_ret;
548	}

And how do we handle if replace is intervened by balance instead of scrub ? Based on my test, the replace ioctl would return -ENOENT if balance is running:

ERROR: ioctl(DEV_REPLACE_START) failed on /mnt/testarea/scratch: No such file or directory, no error

(I haven't gone through this codepath yet and don't know where -ENOENT comes from, but I don't think it's a proper errno; /mnt/testarea/scratch is definitely there) sorry if I missed something. Anand Thanks for the review!
Eryu

-	return 0;
+	return ret;

 leave:
 	dev_replace->srcdev = NULL;
@@ -538,7 +544,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	btrfs_destroy_dev_replace_tgtdev(fs_info, tgt_device);
 	mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
-	return 0;
+	return scrub_ret;
 }

 	printk_in_rcu(KERN_INFO
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 2f47824..611e1c5 100644
--- a/include/uapi/linux/btrfs.h
+++
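The user-visible point of the patch discussed above is that a failed replace should now surface through the ioctl and hence through the btrfs command's exit status, instead of only as a dmesg line. A hedged sketch of how a script might consume that — `check_replace` is a hypothetical wrapper, and the device paths are the ones from the reproducer, not a recommendation:

```shell
# Hypothetical wrapper: run the given command (intended to be a
# "btrfs replace start -B ..." invocation), report failure on stderr,
# and propagate the exit status unchanged.
check_replace() {
    "$@"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "replace failed (rc=$rc)" >&2
    fi
    return "$rc"
}

# Example usage (paths from the reproducer above):
#   check_replace btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
```

With the old behavior the wrapper would never fire, since the ioctl reported success even when btrfs_scrub_dev() had failed with -115 (-EHOSTDOWN is not implied; per the thread, the -115 here is -EINPROGRESS).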