Re: btrfs device delete missing - Input/output error

> The first way to try to recover the current volume would be to
> overwrite LBA 2262535088 which you should only do with the
> filesystem unmounted. That's the sector causing the read error.
> If this is a 512 byte drive:
>
>   dd if=/dev/zero of=/dev/sdb count=1 seek=2262535088

It is a 512 byte drive.

Unfortunately overwriting that fails - which is quite weird, given that there are 0 reallocated sectors, according to SMART. Is there a way to determine what's stored there (i.e. if it's a file?).

  # dd if=/dev/zero of=/dev/sdb count=1 seek=2262535088
  dd: writing to `/dev/sdb': Input/output error
  1+0 records in
  0+0 records out
  0 bytes (0 B) copied, 2.88439 s, 0.0 kB/s

  # dmesg -c
  [ 8177.713212] ata4.00: exception Emask 0x0 SAct 0x400 SErr 0x0 action 0x0
  [ 8177.713285] ata4.00: irq_stat 0x4008
  [ 8177.713349] ata4.00: failed command: READ FPDMA QUEUED
  [ 8177.713419] ata4.00: cmd 60/08:50:b0:8b:db/00:00:86:00:00/40 tag 10 ncq 4096 in
  [ 8177.713419]          res 41/40:08:b0:8b:db/00:00:86:00:00/00 Emask 0x409 (media error) F
  [ 8177.713662] ata4.00: status: { DRDY ERR }
  [ 8177.713725] ata4.00: error: { UNC }
  [ 8177.755099] ata4.00: configured for UDMA/133
  [ 8177.755175] sd 3:0:0:0: [sdb]
  [ 8177.755252] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
  [ 8177.755318] sd 3:0:0:0: [sdb]
  [ 8177.755380] Sense Key : Medium Error [current] [descriptor]
  [ 8177.755449] Descriptor sense data with sense descriptors (in hex):
  [ 8177.755516] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
  [ 8177.755600] 86 db 8b b0
  [ 8177.755667] sd 3:0:0:0: [sdb]
  [ 8177.755729] Add. Sense: Unrecovered read error - auto reallocate failed
  [ 8177.757316] sd 3:0:0:0: [sdb] CDB:
  [ 8177.757377] Read(16): 88 00 00 00 00 00 86 db 8b b0 00 00 00 08 00 00
  [ 8177.757463] end_request: I/O error, dev sdb, sector 2262535088
  [ 8177.757542] ata4: EH complete

--
Tomasz Chmielewski
http://www.sslrack.com

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: Recent issues with btrfs fi df

On 2014-12-05 02:42, Satoru Takeuchi wrote:
> Hi Austin,
>
> (2014/12/04 23:31), Austin S Hemmelgarn wrote:
>> I've recently noticed on some of my systems, that btrfs fi df doesn't
>> consistently show all of the chunk types. I'll occasionally not see
>> the GlobalReserve, or even anything but System,
>
> For a given Btrfs file system, does the inconsistency stay the same
> over time? In other words:
> a) once GlobalReserve stops being shown, it stays that way afterwards, or
> b) one day GlobalReserve disappears, but it appears again the next day or so.

In general, once it changes the first time, things don't seem to change again afterwards.

>> although the behavior seems to be consistent for a given filesystem.
>
> Did you confirm the following things for your Btrfs file system?
> a) btrfs scrub finishes without any problem, and
> b) dmesg doesn't show any suspicious message.

Scrub and dmesg both look fine. I've also run btrfsck in no-op mode and that doesn't report any errors either.

>> I'm using btrfs-progs 3.17.1 and kernel 3.17.4 with grsecurity patches
>> (although with much of the grsec stuff disabled) on all such systems.
>> I'd be happy to provide kernel .config or other information for
>> debugging on request.
>
> Could you tell me the following information, if possible?
> - mkfs options and mount options

In both cases that I currently have access to, I created the fs with:

  mkfs.btrfs -O extref,skinny-metadata,no-holes -F device-name

and the mount option strings for the devices in question are:

  noatime,space_cache,ssd,autodefrag

for / and:

  noatime,sync,nosuid,nodev,noexec,compress=zlib,ssd,space_cache,autodefrag

for /boot

> - The output of btrfs fi df

For /, I get:

  Data, single: total=43.00GiB, used=40.76GiB
  System, DUP: total=32.00MiB, used=16.00KiB
  Metadata, DUP: total=1.50GiB, used=1.05GiB

For /boot, I get:

  System, single: total=32.00MiB, used=4.00KiB

> - .config

I've attached a gzipped copy.

> - Any possible trigger to cause this problem

There aren't any that I know of.

> - Btrfs specific operations, for example weekly btrfs scrub

I run scrub weekly, and balance and fstrim as needed.

> - Do you have any system which works fine and uses a kernel without
>   grsecurity patches?

Yes, although said system has exclusively multi-device filesystems, while the affected one is all single device filesystems.

> In addition, running one of your systems with
> - upstream kernel without grsecurity, and
> - btrfs file system with which btrfs fi df works correctly,

I've got a recovery environment built using buildroot that is based on the same kernel version without grsec patches. I'll reboot into that and see what it says.

> can help to distinguish whether the problem comes from the upstream
> kernel (of course it includes btrfs) or grsecurity. I'm not sure about
> grsecurity; however, I have encountered many problems caused by
> security modules.

I do have one other local kernel patch that I use. I've attached that as well, although it should have no effect whatsoever on the fs code.

config.gz
Description: application/gzip

diff -rU3 linux-3.16.3-gentoo/include/uapi/linux/vt.h /usr/src/linux-3.16.3-gentoo/include/uapi/linux/vt.h
--- linux-3.16.3-gentoo/include/uapi/linux/vt.h 2014-10-01 14:20:45.855383160 -0400
+++ /usr/src/linux-3.16.3-gentoo/include/uapi/linux/vt.h 2014-08-03 18:25:02.0 -0400
@@ -7,8 +7,8 @@
  * resizing).
  */
 #define MIN_NR_CONSOLES 1 /* must be at least 1 */
-#define MAX_NR_CONSOLES 63 /* serial lines start at 64 */
-#define MAX_NR_USER_CONSOLES 63 /* must be root to allocate above this */
+#define MAX_NR_CONSOLES 15 /* serial lines start at 64 */
+#define MAX_NR_USER_CONSOLES 15 /* must be root to allocate above this */
 /* Note: the ioctl VT_GETSTATE does not work for
    consoles 16 and higher (since it returns a short) */

Re: Recent issues with btrfs fi df

On 2014-12-05 07:19, Austin S Hemmelgarn wrote:
> On 2014-12-05 02:42, Satoru Takeuchi wrote:
[...]
>> - Do you have any system which works fine and uses a kernel without
>>   grsecurity patches?
> Yes, although said system has exclusively multi-device filesystems,
> while the affected one is all single device filesystems.
>> In addition, running one of your system with
>> - upstream kernel without grsecurity, and
>> - btrfs file system with which btrfs fi df works correctly,
> I've got a recovery environment built using buildroot that is based on
> the same kernel version without grsec patches, I'll reboot into that
> and see what it says.

OK, so it definitely appears to be a kernel issue, as btrfs fi df reports everything correctly when used from the recovery environment, both with the copy of btrfs-progs in the recovery environment, and the copy from the root filesystem of the affected system. I'm going to try to bisect down to what option in my kernel config is actually causing this, although it may be next week before I can actually do so.

>> can help to distinguish whether the problem comes from upstream kernel
>> (of course it includes btrfs) or grsecurity. I'm not sure about
>> grsecurity. however, I have encountered many problems caused by
>> security modules.
> I do have one other local kernel patch that I use, I've attached that
> as well, although it should have no effect whatsoever on the fs code.

smime.p7s
Description: S/MIME Cryptographic Signature

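For anyone wanting to check several filesystems for the symptom described in this thread, here is a minimal sketch in Python (not from the thread). It parses `btrfs fi df` output in the format quoted above and reports which chunk types are absent. The `EXPECTED` tuple is an assumption about what a healthy filesystem on this kernel should list, and the function names are hypothetical.

```python
import re

# Chunk types we assume a healthy filesystem should report (assumption).
EXPECTED = ("Data", "System", "Metadata", "GlobalReserve")

# Matches lines such as "Data, single: total=43.00GiB, used=40.76GiB".
LINE_RE = re.compile(r"^(\w+), (\w+): total=([^,\s]+), used=(\S+)$")


def parse_fi_df(text):
    """Return {chunk_type: [(profile, total, used), ...]} from fi df output."""
    chunks = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            kind, profile, total, used = match.groups()
            # A type can appear more than once (e.g. RAID10 and RAID1).
            chunks.setdefault(kind, []).append((profile, total, used))
    return chunks


def missing_chunk_types(text, expected=EXPECTED):
    """List the expected chunk types that do not appear in the output."""
    found = parse_fi_df(text)
    return [kind for kind in expected if kind not in found]


# Output quoted in the message above.
ROOT = """Data, single: total=43.00GiB, used=40.76GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=1.50GiB, used=1.05GiB"""
BOOT = "System, single: total=32.00MiB, used=4.00KiB"

print(missing_chunk_types(ROOT))
print(missing_chunk_types(BOOT))
```

With the two outputs from the message, this flags GlobalReserve as missing on / and everything except System as missing on /boot, which matches the report.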
More and more kernel oopses in 3.17.4

Fedora Core21 with 3.17.4-301.fc21.x86_64

Now BTRFS sux bigtime :-/

And my syslog is full of :

déc. 05 14:28:11 vajra kernel: ------------[ cut here ]------------
déc. 05 14:28:11 vajra kernel: WARNING: CPU: 3 PID: 613 at fs/btrfs/delayed-inode.c:1410 btrfs_assert_delayed_root_empty+0x39/0x40 [btrfs]()
déc. 05 14:28:11 vajra kernel: Modules linked in: vfat fat uas usb_storage ccm rfcomm ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtabl
déc. 05 14:28:11 vajra kernel: snd_seq_device serio_raw memstick snd_pcm cfg80211 parport_pc rfkill parport wmi i2c_i801 snd_timer snd hp_accel lis3lv02d tpm_tis hp_wireles
déc. 05 14:28:11 vajra kernel: CPU: 3 PID: 613 Comm: btrfs-transacti Tainted: GW OE 3.17.4-301.fc21.x86_64 #1
déc. 05 14:28:11 vajra kernel: Hardware name: Hewlett-Packard HP EliteBook 820 G1/1991, BIOS L71 Ver. 01.12 06/25/2014
déc. 05 14:28:11 vajra kernel: 2dda6cf7 88022df6bdb8 8173f929
déc. 05 14:28:11 vajra kernel: 88022df6bdf0 810970ad 88013f6378c0
déc. 05 14:28:11 vajra kernel: 88003672f000 8801f7014b40
déc. 05 14:28:11 vajra kernel: Call Trace:
déc. 05 14:28:11 vajra kernel: [8173f929] dump_stack+0x45/0x56
déc. 05 14:28:11 vajra kernel: [810970ad] warn_slowpath_common+0x7d/0xa0
déc. 05 14:28:11 vajra kernel: [810971da] warn_slowpath_null+0x1a/0x20
déc. 05 14:28:11 vajra kernel: [a0300169] btrfs_assert_delayed_root_empty+0x39/0x40 [btrfs]
déc. 05 14:28:11 vajra kernel: [a02ab5e8] btrfs_commit_transaction+0x388/0x950 [btrfs]
déc. 05 14:28:11 vajra kernel: [a02a72b5] transaction_kthread+0x245/0x260 [btrfs]
déc. 05 14:28:11 vajra kernel: [a02a7070] ? btrfs_cleanup_transaction+0x550/0x550 [btrfs]
déc. 05 14:28:11 vajra kernel: [810b52fa] kthread+0xea/0x100
déc. 05 14:28:11 vajra kernel: [810b5210] ? kthread_create_on_node+0x1a0/0x1a0
déc. 05 14:28:11 vajra kernel: [81746a3c] ret_from_fork+0x7c/0xb0
déc. 05 14:28:11 vajra kernel: [810b5210] ? kthread_create_on_node+0x1a0/0x1a0
déc. 05 14:28:11 vajra kernel: ---[ end trace 6138896e6248d2bf ]---
déc. 05 14:28:12 vajra abrt-dump-journal-oops[1102]: abrt-dump-journal-oops: Found oopses: 1
déc. 05 14:28:12 vajra abrt-dump-journal-oops[1102]: abrt-dump-journal-oops: Creating problem directories
déc. 05 14:28:12 vajra abrt-server[11566]: Deleting problem directory oops-2014-12-05-14:28:12-1102-0 (dup of oops-2014-12-01-12:12:02-1332-0)
déc. 05 14:28:13 vajra abrt-dump-journal-oops[1102]: Reported 1 kernel oopses to Abrt
déc. 05 14:28:41 vajra kernel: ------------[ cut here ]------------
déc. 05 14:28:41 vajra kernel: WARNING: CPU: 3 PID: 613 at fs/btrfs/delayed-inode.c:1410 btrfs_assert_delayed_root_empty+0x39/0x40 [btrfs]()
déc. 05 14:28:41 vajra kernel: Modules linked in: vfat fat uas usb_storage ccm rfcomm ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtabl
déc. 05 14:28:41 vajra kernel: snd_seq_device serio_raw memstick snd_pcm cfg80211 parport_pc rfkill parport wmi i2c_i801 snd_timer snd hp_accel lis3lv02d tpm_tis hp_wireles
déc. 05 14:28:41 vajra kernel: CPU: 3 PID: 613 Comm: btrfs-transacti Tainted: GW OE 3.17.4-301.fc21.x86_64 #1
déc. 05 14:28:41 vajra kernel: Hardware name: Hewlett-Packard HP EliteBook 820 G1/1991, BIOS L71 Ver. 01.12 06/25/2014
déc. 05 14:28:41 vajra kernel: 2dda6cf7 88022df6bdb8 8173f929
déc. 05 14:28:41 vajra kernel: 88022df6bdf0 810970ad 88008364e0a0
déc. 05 14:28:41 vajra kernel: 88003672f000 8801f7014c30
déc. 05 14:28:41 vajra kernel: Call Trace:
déc. 05 14:28:41 vajra kernel: [8173f929] dump_stack+0x45/0x56
déc. 05 14:28:41 vajra kernel: [810970ad] warn_slowpath_common+0x7d/0xa0
déc. 05 14:28:41 vajra kernel: [810971da] warn_slowpath_null+0x1a/0x20
déc. 05 14:28:41 vajra kernel: [a0300169] btrfs_assert_delayed_root_empty+0x39/0x40 [btrfs]
déc. 05 14:28:41 vajra kernel: [a02ab5e8] btrfs_commit_transaction+0x388/0x950 [btrfs]
déc. 05 14:28:41 vajra kernel: [a02a72b5] transaction_kthread+0x245/0x260 [btrfs]
déc. 05 14:28:41 vajra kernel: [a02a7070] ? btrfs_cleanup_transaction+0x550/0x550 [btrfs]
déc. 05 14:28:41 vajra kernel: [810b52fa] kthread+0xea/0x100
déc. 05 14:28:41 vajra kernel: [810b5210] ? kthread_create_on_node+0x1a0/0x1a0
déc. 05 14:28:41 vajra kernel: [81746a3c] ret_from_fork+0x7c/0xb0
déc. 05 14:28:41 vajra kernel: [810b5210] ? kthread_create_on_node+0x1a0/0x1a0
déc. 05 14:28:41 vajra kernel: ---[ end trace 6138896e6248d2c0 ]---
déc. 05 14:28:42 vajra abrt-dump-journal-oops[1102]: abrt-dump-journal-oops: Found oopses: 1
déc. 05 14:28:42 vajra

Re: [RFC][PATCH v2] mount.btrfs helper

On Sun, Nov 30, 2014 at 5:57 PM, Dimitri John Ledkov x...@debian.org wrote:
> On 30 November 2014 at 22:31, cwillu cwi...@cwillu.com wrote:
>> In ubuntu, the initfs runs a btrfs dev scan, which should catch
>> anything that would be missed there.
>
> I'm sorry, udev rule(s) is not sufficient in the initramfs-less case,
> as outlined. In case of booting with initramfs, indeed, both Debian and
> Ubuntu include snippets there to run btrfs scan.

In an initramfs-less system, the root filesystem mount is done by the kernel, without calling any mount.btrfs. The mount helper has all the same problems that calling btrfs dev scan does, it's just being run by mount.

I definitely agree that assembling the filesystem from userland is somewhat awkward, and people that don't want initrds end up needing to jump through hoops to get things done.

But, the tools we have to avoid the hoops are initrds and udev, and I'd much rather spend time fixing filesystem bugs than recreating those tools. If people are having trouble with udev, or having trouble with tools in the initrd, let's contribute fixes to those projects instead.

For people that really really don't want initrds, pass the devices on the command line. If that isn't working, we'll fix it, but if you really want a scan, please try an initrd. You can even make one without any kernel modules, and then you don't have to recreate it until you want to update the userland in your initrd.

-chris

Re: [RFC][PATCH v2] mount.btrfs helper

On 5 December 2014 at 15:32, Chris Mason c...@fb.com wrote:
> On Sun, Nov 30, 2014 at 5:57 PM, Dimitri John Ledkov x...@debian.org wrote:
>> On 30 November 2014 at 22:31, cwillu cwi...@cwillu.com wrote:
>>> In ubuntu, the initfs runs a btrfs dev scan, which should catch
>>> anything that would be missed there.
>>
>> I'm sorry, udev rule(s) is not sufficient in the initramfs-less case,
>> as outlined. In case of booting with initramfs, indeed, both Debian
>> Ubuntu include snippets there to run btrfs scan.
>
> In an initramfs-less system, the root filesystem mount is done by the
> kernel, without calling any mount.btrfs. The mount helper has all the
> same problems that calling btrfs dev scan does, it's just being run by
> mount.

Sure. in my initramfs-less system case the root filesystem was not btrfs. Simply there was a btrfs filesystem defined in /etc/fstab.

> I definitely agree that assembling the filesystem from userland is
> somewhat awkward, and people that don't want initrds end up needing to
> jump through hoops to get things done.
>
> But, the tools we have to avoid the hoops are initrds and udev, and I'd
> much rather spend time fixing filesystem bugs than recreating those
> tools. If people are having trouble with udev, or having trouble with
> tools in the initrd, lets contribute fixes to those projects instead.
>
> For people that really really don't want initrds, pass the devices on
> the command line. If that isn't working, we'll fix it, but if you
> really want a scan, please try an initrd. You can even make one without
> any kernel modules, and then you don't have to recreate it until you
> want to update the userland in your initrd.

The other suggestion I received is to ship a systemd unit that does unconditional btrfs scan pre local filesystem target... =)

kernel-module-less initrd sounds cool, i'll experiment with that.

--
Regards,
Dimitri.

Re: btrfs device delete missing - Input/output error

On Fri, Dec 5, 2014 at 3:03 AM, Tomasz Chmielewski t...@virtall.com wrote:
>> The first way to try to recover the current volume would be to
>> overwrite LBA 2262535088 which you should only do with the
>> filesystem unmounted. That's the sector causing the read error.
>> If this is a 512 byte drive:
>>
>>   dd if=/dev/zero of=/dev/sdb count=1 seek=2262535088
>
> It is a 512 byte drive.
>
> Unfortunately overwriting that fails - which is quite weird, given
> that there are 0 reallocated sectors, according to SMART. Is there a
> way to determine what's stored there (i.e. if it's a file?).
>
> # dd if=/dev/zero of=/dev/sdb count=1 seek=2262535088
> dd: writing to `/dev/sdb': Input/output error
> 1+0 records in
> 0+0 records out
> 0 bytes (0 B) copied, 2.88439 s, 0.0 kB/s
>
> # dmesg -c
> [ 8177.713212] ata4.00: exception Emask 0x0 SAct 0x400 SErr 0x0 action 0x0
> [ 8177.713285] ata4.00: irq_stat 0x4008
> [ 8177.713349] ata4.00: failed command: READ FPDMA QUEUED

You're getting a read error when writing. This is expected when writing 512 bytes to a 4k sector. What do you get for

  parted /dev/sdb u s p

--
Chris Murphy

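To make the 512-byte versus 4k point concrete, here is a minimal sketch in Python (not from the thread). It decodes the LBA from the sense data bytes quoted in the log and works out which 4096-byte physical sector holds it. The 4096-byte physical sector size is Chris's assumption at this point in the thread, and the function names are hypothetical.

```python
def decode_lba(hex_bytes):
    """Decode a big-endian LBA given as space-separated hex bytes."""
    return int(hex_bytes.replace(" ", ""), 16)


def physical_block(lba, logical=512, physical=4096):
    """Return (first_lba, sectors_per_block, block_index) for the physical
    sector that contains the given logical block address."""
    per_block = physical // logical
    index = lba // per_block
    return index * per_block, per_block, index


# "86 db 8b b0" is the LBA field from the sense data and the Read(16) CDB.
lba = decode_lba("86 db 8b b0")
first, per_block, index = physical_block(lba)
print(lba, first, per_block, index)
```

This prints `2262535088 2262535088 8 282816886`: the failing LBA is the first of the 8 logical sectors in physical block 282816886. That agrees with the failed command in the log, which is a 4096-byte (8-sector) read starting at that LBA. Under these assumptions, a write covering the whole physical sector would correspond to dd with bs=4096 seek=282816886 count=1 rather than a single 512-byte write; as stated earlier in the thread, that should only be done with the filesystem unmounted, and it destroys whatever was stored there.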
Re: [RFC][PATCH v2] mount.btrfs helper

On Fri, Dec 05, 2014 at 04:01:37PM +, Dimitri John Ledkov wrote:
> On 5 December 2014 at 15:32, Chris Mason c...@fb.com wrote:
>> On Sun, Nov 30, 2014 at 5:57 PM, Dimitri John Ledkov x...@debian.org wrote:
>>> On 30 November 2014 at 22:31, cwillu cwi...@cwillu.com wrote:
>>>> In ubuntu, the initfs runs a btrfs dev scan, which should catch
>>>> anything that would be missed there.
>>>
>>> I'm sorry, udev rule(s) is not sufficient in the initramfs-less case,
>>> as outlined. In case of booting with initramfs, indeed, both Debian
>>> Ubuntu include snippets there to run btrfs scan.
>>
>> In an initramfs-less system, the root filesystem mount is done by the
>> kernel, without calling any mount.btrfs. The mount helper has all the
>> same problems that calling btrfs dev scan does, it's just being run
>> by mount.
>
> Sure. in my initramfs-less system case the root filesystem was not
> btrfs. Simply there was a btrfs filesystem defined in /etc/fstab.

So you could add a 'btrfs dev scan' before the fstab is going to be mounted. Either a local boot script or via some unit file.

We're looking for good reasons to justify the existence of the helper, but this is still not enough IMHO. I can see the convenience to do it automatically, but this assumes no udev available which is probably rare these days.

Re: btrfs device delete missing - Input/output error

On 2014-12-05 17:41, Chris Murphy wrote:
> You're getting a read error when writing. This is expected when
> writing 512 bytes to a 4k sector. What do you get for
>
>   parted /dev/sdb u s p

Right - so we're getting these:

  # parted /dev/sdb u s p
  Model: ATA ST3000DM001-9YN1 (scsi)
  Disk /dev/sdb: 5860533168s
  Sector size (logical/physical): 512B/4096B
  Partition Table: gpt

  Number  Start        End          Size         File system  Name  Flags
   5      2048s        4095s        2048s                           bios_grub
   1      4096s        67112959s    67108864s                       raid
   2      67112960s    68161535s    1048576s                        raid
   3      68161536s    2215645183s  2147483648s                     raid
   4      2215645184s  5860533134s  3644887951s  btrfs              raid

--
Tomasz Chmielewski
http://www.sslrack.com

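As a partial answer to the earlier "what's stored there" question, here is a minimal sketch in Python (not from the thread) that looks up which partition from the parted table above contains the failing LBA and how far into the partition it sits. The table is copied from the output above; `locate` is a hypothetical helper name.

```python
# (number, start_sector, end_sector) copied from the parted output above.
PARTITIONS = [
    (5, 2048, 4095),
    (1, 4096, 67112959),
    (2, 67112960, 68161535),
    (3, 68161536, 2215645183),
    (4, 2215645184, 5860533134),
]


def locate(lba, partitions=PARTITIONS, sector_size=512):
    """Return (partition_number, sector_offset, byte_offset) for the
    partition containing lba, or None if it lies outside all of them."""
    for number, start, end in partitions:
        if start <= lba <= end:
            offset = lba - start
            return number, offset, offset * sector_size
    return None


print(locate(2262535088))
```

This prints `(4, 46889904, 24007630848)`, so the bad sector lies in partition 4, the one parted reports as btrfs, about 24 GB from its start. That is only a physical offset within the partition. Mapping it to a btrfs logical address, and from there to a file or metadata block, needs the filesystem's chunk mapping and is not attempted in this sketch.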
abort device removal?

Hello all,

I had a 6-device array. Last night I added a 4TB device to it and then ran the command to remove an older 4TB device that still worked fine, leaving it to run overnight. Unfortunately, one of the OTHER devices completely failed while this was happening. It *looks* like btrfs did the right thing and stopped the move, except that the device being removed is still marked as 0 space in btrfs fi show. The delete command is still running, though iotop shows it's not actually reading or writing anything, and the lack of further moving messages in dmesg/kern.log seems to indicate that too.

So what I think I *need* to do is re-add the drive it's currently trying to remove, so I can delete the now non-functioning other drive without losing any data. My fear is that a reboot or unmount/remount will fail to mount the currently-being-removed drive as well, causing me to lose everything.

Here is some relevant info from the system:

  # uname -a
  Linux mytorrentflux1 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:53:56 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

  # btrfs --version
  Btrfs v3.17.3

  # btrfs fi show
  Label: 'completed'  uuid: 0d14bb0f-46cc-408e-9245-f06d50ec2da8
          Total devices 7 FS bytes used 7.60TiB
          devid    1 size 3.64TiB used 3.28TiB path /dev/mapper/fourtb1
          devid    2 size 3.64TiB used 3.29TiB path /dev/mapper/fourtb2
          devid    3 size 2.73TiB used 2.37TiB path /dev/mapper/threetb1
          devid    5 size 1.82TiB used 1.82TiB path /dev/mapper/twotb1
          devid    6 size 0.00B used 1.99TiB path /dev/mapper/fourtb3
          devid    7 size 2.73TiB used 2.22TiB path /dev/mapper/threetb2
          devid    8 size 3.64TiB used 240.29GiB path /dev/mapper/fourtb4

  Btrfs v3.17.3

  # btrfs fi df /mnt/completed/
  Data, RAID10: total=6.26TiB, used=6.26TiB
  Data, RAID1: total=1.33TiB, used=1.33TiB
  System, RAID10: total=96.00MiB, used=852.00KiB
  Metadata, RAID10: total=10.77GiB, used=9.90GiB
  Metadata, RAID1: total=5.00GiB, used=4.37GiB

fourtb4 is the new drive I just added. fourtb3 is the functioning drive I attempted to remove before threetb1 completely failed (smartctl can't even read anything from it, well, from the underlying device).

dmesg/kern.log is too large to attach; here are some important-looking excerpts (3 lines, often repeated):

  Dec 5 09:59:35 mytorrentflux1 kernel: [1549876.646751] btrfs: bdev /dev/mapper/threetb1 errs: wr 17599, rd 973, flush 0, corrupt 0, gen 0
  Dec 5 09:59:35 mytorrentflux1 kernel: [1549877.022291] lost page write due to I/O error on /dev/mapper/threetb1
  Dec 5 10:07:08 mytorrentflux1 kernel: [1550329.743294] btrfs_dev_stat_print_on_error: 264 callbacks suppressed

I appreciate any help or guidance I can get on this issue so that, hopefully, I don't lose data. Thanks much!

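Here is a minimal sketch in Python (not from the thread) for spotting the state described above in `btrfs fi show` output: a device whose size reads 0 while it still holds data. It assumes the line format shown in this message and that a zero size marks a removal in progress, which is the poster's interpretation. All function names are hypothetical.

```python
import re

# Matches "devid 6 size 0.00B used 1.99TiB path /dev/mapper/fourtb3",
# with or without spaces between "devid" and the number.
DEV_RE = re.compile(
    r"devid\s*(\d+)\s+size\s+(\S+)\s+used\s+(\S+)\s+path\s+(\S+)"
)
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def to_bytes(text):
    """Convert a size such as '1.99TiB' or '0.00B' to a byte count."""
    match = re.fullmatch(r"([\d.]+)(B|KiB|MiB|GiB|TiB)", text)
    if match is None:
        raise ValueError("unrecognised size: " + text)
    value, unit = match.groups()
    return float(value) * UNITS[unit]


def devices_being_removed(text):
    """Return (devid, path, used) for devices with size 0 but data left."""
    result = []
    for devid, size, used, path in DEV_RE.findall(text):
        if to_bytes(size) == 0 and to_bytes(used) > 0:
            result.append((int(devid), path, used))
    return result


SAMPLE = """devid 5 size 1.82TiB used 1.82TiB path /dev/mapper/twotb1
devid 6 size 0.00B used 1.99TiB path /dev/mapper/fourtb3
devid 8 size 3.64TiB used 240.29GiB path /dev/mapper/fourtb4"""

print(devices_being_removed(SAMPLE))
```

On the sample lines this reports only devid 6 (/dev/mapper/fourtb3), which still holds 1.99TiB. The sketch only reads the output; it does not say whether the removal can be cancelled safely.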
Re: Moving an entire subvol?

On Tue, Dec 02, 2014 at 08:41:35PM +0530, Shriramana Sharma wrote:
> That makes sense. Is there anywhere that the official SuSE recommended
> subvol layout is mentioned that I can refer to without having to start
> up an installer?

https://www.suse.com/documentation/sles-12/singlehtml/book_sle_admin/book_sle_admin.html#sec.snapper.setup

(section "Directories that are Excluded from Snapshots")

> I am now reading a SuSECon 2013 presentation by Nyers and Schnell but
> they are very generic about the recommendations.

There are some recommended defaults that should be ok, the configuration is flexible and the user can tune the settings later according to the usage pattern (expected amount of data changes between snapshots, frequency of changes).

Re: Possible to undo subvol delete?

On Thu, Dec 04, 2014 at 07:36:34PM +0530, Shriramana Sharma wrote:
> Well I don't know about you, but I'm just running an openSUSE 13.2
> system updated to Tumbleweed here, and even if I just type btrfs and
> hit Enter (no sudo, no btrfs commands) on my regular (non-root)
> prompt, I am getting:
>
> $ btrfs
> Absolute path to 'btrfs' is '/usr/sbin/btrfs', so running it may
> require superuser privileges (eg. root).
> $
>
> ... so what to say of btrfs subvol, whether followed by crea or del!

Oh I see. That must be some clever shell addon that prints that if you don't have /sbin in PATH and try to call a command from that path.

Re: Possible to undo subvol delete?

On Wed, Dec 03, 2014 at 02:54:14PM -0500, Zygo Blaxell wrote:
>> Even with the tty/interactive shell detection in place? Maybe I
>> understood the reference to lvm/mdadm tools wrong. My idea is that the
>> scripts would work as now, no prompts there.
>
> How do we reliably distinguish between running in a script
> interactively, running in a script non-interactively, and running
> interactively?

The first choice is to use the isatty() function, but this would not be enough to detect #1, which would require an extra option to force the interactive mode.

> I could easily run a script from the command line that creates and
> destroys subvols and runs for days or weeks (in fact, I quite often
> do). It would be a significant productivity hit if an upgrade made it
> stop every night waiting for confirmation from a user who went home
> hours ago and won't be back for hours more.

Yeah, we want to avoid such surprises. Besides isatty, there could be more shell magic to detect the interactive mode reliably, I don't know.

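The isatty-plus-override idea can be sketched as follows. This is a minimal illustration in Python, not the btrfs-progs implementation (which is written in C), and the option names `force_interactive` and `force_batch` are hypothetical.

```python
import io
import sys


def should_prompt(force_interactive=False, force_batch=False, stream=None):
    """Decide whether a destructive command should ask for confirmation.

    Explicit options win; otherwise fall back to checking whether the
    input stream is attached to a terminal.
    """
    if force_batch:
        return False
    if force_interactive:
        return True
    stream = sys.stdin if stream is None else stream
    try:
        return stream.isatty()
    except (AttributeError, ValueError):
        # No isatty() at all, or the stream is already closed.
        return False


# An in-memory stream stands in for redirected, non-terminal input.
piped = io.StringIO("")
print(should_prompt(stream=piped))
print(should_prompt(force_interactive=True, stream=piped))
```

The sketch also shows the limitation raised above: a long-running script started from a terminal still has a tty on stdin, so the isatty check alone would make it prompt. Only an explicit option (or some other signal) can tell that case apart from a person typing the command.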
Re: Possible to undo subvol delete?

David, I'm just running default openSUSE 13.2 now (had to reinstall for other reasons) and it's there. It's not something I added. Given that you're also on either openSUSE or SLED/SLES, I'd expect your system to act similarly as well. If not, it's downright curious. I guess I'll ask around on the openSUSE Forum...

--
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा

Re: Possible to undo subvol delete?

OK, so from https://forums.opensuse.org/showthread.php/440209-ifconfig I learnt that it's because /sbin, /usr/sbin etc. are not on the normal user's PATH on openSUSE (they are on Kubuntu). Adding them to PATH fixes the situation. (I wasn't even able to run ifconfig without giving the password. No idea why this is the openSUSE default...)

--
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा

Re: [RFC][PATCH v2] mount.btrfs helper

On 12/05/2014 05:41 PM, David Sterba wrote:
> We're looking for good reasons to justify the existence of the helper,
> but this is still not enough IMHO. I can see the convenience to do it
> automatically, but this assumes no udev available which is probably
> rare these days.

I have the following reasons to support a mount.btrfs helper:

1) it is a good place to check that everything is ok (see the thread about LVM snapshots, where dev.uuid conflicts arise),
2) it is a good place to issue a good error explanation (missing device),
3) it may handle cases like degraded mode, where the filesystem is not fully functional but, even degraded, still has some functionality.

On 12/05/2014 04:32 PM, Chris Mason wrote:
> I definitely agree that assembling the filesystem from userland is
> somewhat awkward, and people that don't want initrds end up needing to
> jump through hoops to get things done.
>
> But, the tools we have to avoid the hoops are initrds and udev, and I'd
> much rather spend time fixing filesystem bugs than recreating those
> tools. If people are having trouble with udev, or having trouble with
> tools in the initrd, lets contribute fixes to those projects instead.

Chris, I am a bit confused by your answer: the mount.btrfs helper is not a solution for the initrd-less system (of which I am not a fan anymore [*]). And I don't think that the awkwardness of btrfs is due to udev deficiencies.

Btrfs is new because it acts both as a filesystem and as a dm/md layer. We know that there are very good reasons to do that. But it also highlights new problems for which the old tools may not be the right solution.

See this from another point of view: md/dm have specific tools to assemble the disks. So why wouldn't btrfs need a specific tool?

BR
G.Baroncelli

[*] I hope not to start another flame war. I am not against the initrd-less system; but if you want a multi-device filesystem (with or without md/dm), you simply cannot rely on the kernel alone (IMHO).

--
gpg @keyserver.linux.it: Goffredo Baroncelli kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5

Possibility to have a transient snapshot?

This is just a random idea that popped into my mind while I was looking into hardening a filesystem against damage. It might be impractical, but the idea seems promising, and well suited to a snapshotting file system.

I'm sure some creative shell scripting could do something like this already, but I was looking for something more bulletproof.

The general idea would be to have a transient snapshot (with optional quota support) on top of a base snapshot (possibly read-only).

On system start/restart (whether clean or dirty), the transient snapshot would be flushed and recreated, so the system basically restarts from the base snapshot.

If desired, the transient snapshot could be promoted to a regular snapshot (say after a software upgrade).

If desired, a different base snapshot could be selected (although I'm sure the file system would have to be restarted to do this).

From a caching perspective, this could make a noticeable performance difference, since if you're running in a transient snapshot, the file system can be _extremely_ lazy about committing changes to disk.

For the optional quota support I mentioned: on an unattended box, if the quota gets exceeded, a system reboot would probably fully correct the system. (Presumably a log file got out of control in that situation.)

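The "creative scripting" variant mentioned above could look roughly like the following Python sketch. It only builds the command lines and does not run anything. The subvolume paths are hypothetical, and the sketch assumes the transient subvolume is not the one currently mounted when the reset happens (e.g. it runs from an early boot environment). It is an illustration of the idea, not an existing btrfs feature.

```python
def reset_commands(base, transient):
    """Commands to discard the transient subvolume and recreate it as a
    fresh writable snapshot of the base."""
    return [
        ["btrfs", "subvolume", "delete", transient],
        ["btrfs", "subvolume", "snapshot", base, transient],
    ]


def promote_commands(transient, new_base):
    """Command to keep the current transient state as a read-only
    snapshot, e.g. after a software upgrade."""
    return [["btrfs", "subvolume", "snapshot", "-r", transient, new_base]]


# Hypothetical layout: /mnt/pool/base and /mnt/pool/transient.
for command in reset_commands("/mnt/pool/base", "/mnt/pool/transient"):
    print(" ".join(command))
```

A real implementation would pass each list to something like subprocess.run with error checking. It would also not get the lazy-commit behaviour described above, which is the part that would need support inside the filesystem.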
Re: [PATCH V2][BTRFS-PROGS] Don't use LVM snapshot device
On 12/05/2014 08:26 AM, Duncan wrote:
> Goffredo Baroncelli posted on Thu, 04 Dec 2014 19:39:37 +0100 as excerpted:
>> To check if a device is an LVM snapshot, the udev device property 'DM_UDEV_LOW_PRIORITY_FLAG' is checked. If it is set to 1, the device has to be skipped.
>>
>> As a consequence, btrfs now also depends on libudev.
>
> Not being a coder I gotta ask...
>
> How does this patch deal with mdev (busybox) or static dev instead of udev? Does it gracefully degrade to legacy LVM-agnostic behavior?

My patch does not; of course we can add an #ifdef to make it optional.

[dropped the part related to systemd]

--
gpg @keyserver.linux.it: Goffredo Baroncelli kreijackATinwind.it Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
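For readers who want to see the property the patch keys on, the check can be approximated from a shell. This is a sketch, not the patch itself (which uses libudev from C): the function name is made up for illustration, and it simply scans the KEY=value lines that `udevadm info --query=property` prints.

```shell
# Sketch: decide whether a device-mapper node should be skipped, based on
# its udev properties. Reads "KEY=value" lines on stdin, in the format
# printed by `udevadm info --query=property`.
# has_low_priority_flag is a hypothetical helper name.
has_low_priority_flag() {
    while IFS= read -r line; do
        [ "$line" = "DM_UDEV_LOW_PRIORITY_FLAG=1" ] && return 0
    done
    return 1
}

# Intended use (the device name is only an example):
#   udevadm info --query=property --name=/dev/dm-3 | has_low_priority_flag \
#       && echo "skip: flagged low priority"

# Self-contained demonstration on canned property text.
sample='DEVNAME=/dev/dm-3
DM_UDEV_LOW_PRIORITY_FLAG=1
DM_NAME=vg0-snap'
if printf '%s\n' "$sample" | has_low_priority_flag; then
    verdict=skip
else
    verdict=scan
fi
echo "$verdict"
```

On a system without udev (mdev or a static /dev) there is no property database to query, which is the degradation question Duncan raises.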
Re: [RFC][PATCH v2] mount.btrfs helper
On Fri, Dec 5, 2014 at 1:15 PM, Goffredo Baroncelli kreij...@inwind.it wrote:
> On 12/05/2014 05:41 PM, David Sterba wrote:
>> We're looking for good reasons to justify the existence of the helper, but this is still not enough IMHO. I can see the convenience to do it automatically, but this assumes no udev available which is probably rare these days.
>
> I have the following reasons to support a mount.btrfs helper:
> 1) it is a good point at which to check that everything is ok (see the thread about LVM snapshots, which cause dev.uuid conflicts),
> 2) it is a good point at which to give a good error explanation (e.g. a missing device),
> 3) it can handle cases like degraded mode, where the filesystem is not fully functional but still has some functionality even when degraded.

Ok, these three things are worth improving, but I'd like to take a slightly different direction. Instead of recreating chunks of btrfs dev scan, let's extend btrfs dev scan to at the very least understand #1 and #2. As much as possible we want to be leveraging the data in udev instead of recreating that functionality.

#3 is a slightly different feature, but we can have an extended btrfs dev scan or show explain the state of the filesystem to you.

From there, if we really need a mount helper, it can either use a libbtrfs to hit the scan code or be a bash script.

Thanks for trying to smooth out our wrinkles in this area. It's definitely worth working on, I just want to make sure we recreate as little infrastructure as possible ;)

-chris
Re: [RFC][PATCH v2] mount.btrfs helper
Hi Chris,

On 12/05/2014 07:43 PM, Chris Mason wrote:
> On Fri, Dec 5, 2014 at 1:15 PM, Goffredo Baroncelli kreij...@inwind.it wrote:
>> On 12/05/2014 05:41 PM, David Sterba wrote:
>>> We're looking for good reasons to justify the existence of the helper, but this is still not enough IMHO. I can see the convenience to do it automatically, but this assumes no udev available which is probably rare these days.
>>
>> I have the following reasons to support a mount.btrfs helper:
>> 1) it is a good point at which to check that everything is ok (see the thread about LVM snapshots, which cause dev.uuid conflicts),
>> 2) it is a good point at which to give a good error explanation (e.g. a missing device),
>> 3) it can handle cases like degraded mode, where the filesystem is not fully functional but still has some functionality even when degraded.
>
> Ok, these three things are worth improving, but I'd like to take a slightly different direction. Instead of recreating chunks of btrfs dev scan, let's extend btrfs dev scan to at the very least understand #1 and #2. As much as possible we want to be leveraging the data in udev instead of recreating that functionality.
>
> #3 is a slightly different feature, but we can have an extended btrfs dev scan or show explain the state of the filesystem to you.

These are good suggestions.

> From there, if we really need a mount helper, it can either use a libbtrfs to hit the scan code or be a bash script.
>
> Thanks for trying to smooth out our wrinkles in this area. It's definitely worth working on, I just want to make sure we recreate as little infrastructure as possible ;)

This is an RFC because I am not sure about the right direction. My first goal is more to start a sane discussion than to provide a solution.

But I have to point out that btrfs device scan is usually started by udev, so there is no possibility to show [see] an error. Moreover, btrfs dev scan is started on a single device, from which it is impossible to check for dev.uuid conflicts... [unless you accept extending the analysis to all devices] [*]

Finally, if you fear that my mount helper recreates too much infrastructure... this is true, but it is an implementation problem; for now I am looking for a high-level solution.

Goffredo

[*] BTW, have a look at [PATCH V2][BTRFS-PROGS] Don't use LVM snapshot device, patch #5; this patch tries to add a check for dev.uuid conflicts, showing an error in this case...

> -chris

--
gpg @keyserver.linux.it: Goffredo Baroncelli kreijackATinwind.it Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
General Question: ctime, mtime, and xattrs
So I was reading the wiki on the internal layout. The INODE description for st_ctime says "Also updated when xattrs change."

Why isn't changing the xattrs a modification (st_mtime) event?

It just seems odd to me...
Re: Possibility to have a transient snapshot?
James West posted on Fri, 05 Dec 2014 12:27:39 -0600 as excerpted:

> I'm sure some creative shell scripting could do something like this already,

Indeed...

> but I was more looking for something more bulletproof.

See below...

> General idea would be to have a transient snapshot (optional quota support possibility here) on top of a base snapshot (possibly readonly). On system start/restart (whether clean or dirty), the transient snapshot would be flushed, and the system would restart the snapshot, basically restarting from the base snapshot.

So to this point we're effectively restoring from a golden image at boot, wiping the history of the last session away. I guess a lot of internet cafes do this sort of thing, either directly, or (these days) running from a VM, with multiple terminals, each logging into their own VM image.

> If desired, the transient snapshot could be promoted to a regular snapshot (say after a software upgrade).
>
> If desired, a different base snapshot could be selected (although I'm sure the file system would have to be restarted to do this)

So optionally make the transient image permanent, effectively upgrading it to the golden image (presumably with a fallback to the previous golden image if necessary).

> From a caching perspective, this could make a noticeable performance difference, since if you're running in a transient snapshot, the file system can be _extremely_ lazy about committing changes to disk.

Indeed.

I'm doing something here with a similar effect for root, except much less complicated and it doesn't require btrfs, tho btrfs snapshotting would be a useful if complicating variant.

Basically all I did is stick ro in the mount-options for root, so when it would normally be remounted rw, it's remounted with all the extra operating options, but still ro. I only switch it to rw when I'm actively doing system maintenance. Since I have all the other options I want loaded and in fstab already, all I have to do is remount rw, and it automatically picks up the compression, autodefrag, etc, from fstab and the previous mount.

Tho I don't use snapshots, instead preferring a backup of root, done manually when I've boot-tested the current config and believe it's stable enough, plus that it's time for a new backup. Since root is only 8 GiB in size, backup can and does consist of a simple mkfs on the backup (also 8 GiB), followed by mounting it and a mount-bind of root somewhere else so I can get at anything normally under a mountpoint and a straight copy won't accidentally stray into other filesystems, and copy everything over.

fstab is already a symlink (effectively) pointing to fstab.working, with an fstab.backup already prepared as well, so after the copy I switch the fstab symlink pointer on the backup and modify an ID file (making it easier to double-check what root is actually mounted). I then umount the backup and the bind-mount, reboot, and select the grub menu entry that sets the kernel commandline root= to the backup instead of working copy, and verify that the backup is actually bootable.

Thus I effectively have a working (normal root) and backup (backup root) golden image, with the working golden image mounted writable and updated whenever I update or modify system configuration, and the fallback image selectable from grub. Of course I have secondary and tertiary backups as well, tertiary not normally attached, just in case, as well.

Gets rid of a lot of headaches, since I don't have to worry about root being corrupted in the normal crash-case at all, and if the working root /is/ ever unbootable either due to bad update or corruption while mounted writable, I can always boot to several levels of backup and have the fully working system I'm used to, including access to all manpages, X, the internet, etc, just as it was when I did the backup, to use as a recovery image. =:^)

/home and /var/log (with others including the packages partition mounted only on demand) are of course mounted writable, with their own backups as well, but it's nice to have a fully functional and uncorrupted root complete with all tools and reference material, that I know will boot even if they're corrupted, to use when doing repair if they do get corrupted. =:^)

I'm using fully independent filesystems instead of subvolumes and snapshots, since it didn't make any sense to me to risk putting all my data eggs in a single filesystem basket, and have them all broken at once if the bottom falls out, given that subvolumes and snapshots are still on the same filesystem, such that if one subvolume/snapshot is corrupted and unmountable, there's a good chance they're all gone. Better to have partition and filesystem barriers in place to contain the damage, particularly when corruption on the writable /var/log could easily mean the read-only root that I'd otherwise be using to repair /var/log, is corrupted as well.

--
Duncan - List replies
Re: General Question: ctime, mtime, and xattrs
Ah... I've been thinking ctime is/was (still) create time. It seems that somewhere in the last couple decades it became change time; Or that I picked up that incorrect create time idea back in the UNIX Sys V R 3 days and just never had cause to think about it again...

Never mind. /sigh... What a maroon. 8-)

On 12/05/2014 03:03 PM, cwillu wrote:
> xattrs are commonly used to implement acls, which wouldn't typically be considered a content modification.
>
> On Dec 5, 2014 4:08 PM, Robert White rwh...@pobox.com wrote:
>> So I was reading the wiki on the internal layout. The INODE description says st_ctime. Also updated when xattrs change.
>>
>> Why isn't changing the xattrs a modification (st_mtime) event?
Re: General Question: ctime, mtime, and xattrs
On Fri, 5 Dec 2014 03:28:58 PM Robert White wrote:
> I've been thinking ctime is/was (still) create time. It seems that somewhere in the last couple decades it became change time; Or that I picked up that incorrect create time idea back in the UNIX Sys V R 3 days and just never had cause to think about it again...

Sadly there's never been a creation time in Linux that you can get with a standard system call. There was an attempt 4-5 years ago to get xstat merged that would include creation time from filesystems that support it (like ext4), but it never went anywhere (for a variety of reasons).

LWN article on the patch set: https://lwn.net/Articles/394298/

Linus knocking it back: https://lkml.org/lkml/2010/7/22/249

FreeBSD has:

    st_birthtim    Time when the inode was created.

No idea when that was added!

All the best,
Chris

--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: General Question: ctime, mtime, and xattrs
On Fri, Dec 05, 2014 at 03:28:58PM -0800, Robert White wrote:
> Ah... I've been thinking ctime is/was (still) create time. It seems that somewhere in the last couple decades it became change time; Or that I picked up that incorrect create time idea back in the UNIX Sys V R 3 days and just never had cause to think about it again...

v7 is the point where the third timestamp first appeared (v6 has only two - access and update). And their stat(2) says this:

    st_atime is the file was last read. For reasons of efficiency, it is not set when a directory is searched, although this would be more logical. st_mtime is the time the file was last written or created. It is not set by changes of owner, group, link count, or mode. st_ctime is set both both by writing and changing the i-node.

FWIW, their /usr/sys/sys/sys4.c has

    chmod()
    {
            register struct inode *ip;
            register struct a {
                    char    *fname;
                    int     fmode;
            } *uap;

            uap = (struct a *)u.u_ap;
            if ((ip = owner()) == NULL)
                    return;
            ip->i_mode &= ~07777;
            if (u.u_uid)
                    uap->fmode &= ~ISVTX;
            ip->i_mode |= uap->fmode&07777;
            ip->i_flag |= ICHG;
            if (ip->i_flag&ITEXT && (ip->i_mode&ISVTX)==0)
                    xrele(ip);
            iput(ip);
    }

and /usr/sys/sys/iget.c has, at the end of iupdat() (called on the final iput(), as well as on stat(2) and several other paths), this:

            if(ip->i_flag&IACC)
                    dp->di_atime = *ta;
            if(ip->i_flag&IUPD)
                    dp->di_mtime = *tm;
            if(ip->i_flag&ICHG)
                    dp->di_ctime = time;
            ip->i_flag &= ~(IUPD|IACC|ICHG);

(ta and tm are both equal to time in that call chain). IOW, chmod(2) definitely sets ctime.

IOW, ctime has never been create time; it's change time and it has been that way since its introduction.
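The same behaviour is easy to observe on a current system. The sketch below assumes GNU coreutils `stat` (its `-c` format with `%Y` for mtime and `%Z` for ctime, both in seconds since the epoch); other `stat` implementations use different flags. It changes only the mode of a scratch file and compares the timestamps before and after.

```shell
# Sketch: show that chmod(1) bumps ctime but leaves mtime alone.
# Assumes GNU stat: %Y = mtime, %Z = ctime, as seconds since the epoch.
f=$(mktemp)
echo data > "$f"
mtime_before=$(stat -c %Y "$f")
ctime_before=$(stat -c %Z "$f")

sleep 1            # let the clock move past the current second
chmod 640 "$f"     # inode (metadata) change only, file contents untouched

mtime_after=$(stat -c %Y "$f")
ctime_after=$(stat -c %Z "$f")
echo "mtime: $mtime_before -> $mtime_after"
echo "ctime: $ctime_before -> $ctime_after"
rm -f "$f"
```

On a filesystem with user xattr support, replacing the chmod with something like `setfattr -n user.test -v 1 "$f"` should show the same pattern, which is the xattr case the thread started from; that variant is not exercised here.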
Re: btrfs device delete missing - Input/output error
On Fri, Dec 5, 2014 at 10:28 AM, Tomasz Chmielewski t...@virtall.com wrote:
> On 2014-12-05 17:41, Chris Murphy wrote:
>> You're getting a read error when writing. This is expected when writing 512 bytes to a 4k sector. What do you get for parted /dev/sdb u s p
>
> Right - so we're getting these:
>
> # parted /dev/sdb u s p
> Model: ATA ST3000DM001-9YN1 (scsi)
> Disk /dev/sdb: 5860533168s
> Sector size (logical/physical): 512B/4096B

OK, it's a 4096-byte physical sector drive, so you have to run dd with bs=4096 and a matching seek value (seek counts in units of bs, not in 512-byte sectors).

--
Chris Murphy
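As a worked example of "the proper seek value": the sector number in the kernel log is in 512-byte logical sectors, so it has to be divided by 8 to address 4096-byte blocks. The sketch below only does the arithmetic and prints the resulting dd command; it does not write anything. The device name and LBA are the ones from this thread, and the variable names are illustrative.

```shell
# Sketch: convert a 512-byte LBA from the kernel log into a dd seek value
# for 4096-byte blocks. Nothing is written; the command is only printed.
lba=2262535088          # from "end_request: I/O error, dev sdb, sector ..."
logical=512             # logical sector size reported by parted
physical=4096           # physical sector size reported by parted
ratio=$((physical / logical))

seek=$((lba / ratio))   # index of the 4 KiB block that contains the LBA
offset=$((lba % ratio)) # position of the LBA inside that block (0..7)

echo "LBA $lba is logical sector $offset of physical block $seek"
echo "dd if=/dev/zero of=/dev/sdb bs=$physical count=1 seek=$seek"
```

Here the LBA falls exactly on a physical block boundary (offset 0), which matches the 8-sector read at that address in the kernel log. Running the printed command would overwrite 4096 bytes at that location, so, as said earlier in the thread, it should only be done with the filesystem unmounted.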
Re: Possibility to have a transient snapshot?
On Fri, Dec 5, 2014 at 11:27 AM, James West ja...@terminalsystems.com wrote:
> General idea would be to have a transient snapshot (optional quota support possibility here) on top of a base snapshot (possibly readonly). On system start/restart (whether clean or dirty), the transient snapshot would be flushed, and the system would restart the snapshot, basically restarting from the base snapshot.

Sounds similar to this idea:
http://0pointer.net/blog/revisiting-how-we-put-together-linux-systems.html

About 1/3 of the way down it gets to a proposal to use Btrfs as a way to get to a stateless system, which is basically what you want to be able to roll back to.

A variation on this that might serve the use case better is a seed device. You can either drop the added device that stores changes to the seed device, or the volume (seed+added device) can become another seed if you want to make the current state persistent at next boot.

And still another possibility is overlayfs, which isn't Btrfs specific.

--
Chris Murphy
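For orientation, the seed-device variant could be set up along the following lines. This is a hedged sketch that only prints a plan: SEED, SCRATCH and MNT are hypothetical names, and the commands (btrfstune -S, btrfs device add, mount -o remount,rw) are the usual ones for seed filesystems, shown here without any error handling.

```shell
# Sketch: the seed-device variant, printed as a plan rather than executed.
# SEED, SCRATCH and MNT are hypothetical names chosen for illustration.
SEED=${SEED:-/dev/sdX1}
SCRATCH=${SCRATCH:-/dev/sdY1}
MNT=${MNT:-/mnt/root}

seed_plan() {
    # 1. mark the golden filesystem as a seed (it then mounts read-only)
    # 2. mount the seed
    # 3. add a writable device that will receive all changes
    # 4. switch the combined volume to read-write
    cat <<EOF
btrfstune -S 1 $SEED
mount $SEED $MNT
btrfs device add $SCRATCH $MNT
mount -o remount,rw $MNT
EOF
}

seed_plan
```

Dropping the session then amounts to discarding the contents of the added device before the next boot, while keeping it corresponds to the "becomes another seed" case described above.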