ENOSPC on file deletion with 3.1.6
Hi,

After upgrading my kernel from 2.6.38 (which has worked fine for months) to 3.1.6, I got ENOSPC on recompiling gcc (even though df says there is 16G free of 50G; this is a raid1 setup, so in fact it's 8 of 25).

After this error, I tried to remove the compilation directory (with rm -r): this also gives ENOSPC. I am trying to work around this by first truncating files using echo > $file, but this fails for some files, again with ENOSPC. Also, removal of files is very slow even if it succeeds.

Moreover, any write operation on the file system now fails with ENOSPC. Reverting to my old kernel does not help: it now shows the same problem.

Is this a known issue? Is there a way to make this file system unstuck? (I have backups, but I'd like to preserve snapshot information if possible.) Should I try upgrading to an even newer kernel?

Kind regards,

Arie

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ENOSPC on file deletion with 3.1.6
Arie Peterson wrote (ao):
> After upgrading my kernel from 2.6.38 (which has worked fine for months) to 3.1.6, I got ENOSPC on recompiling gcc (even though df says there is 16G free of 50G; this is a raid1 setup, so in fact it's 8 of 25).
>
> After this error, I tried to remove the compilation directory (with rm -r): this also gives ENOSPC. I am trying to work around this by first truncating files using echo > $file, but this fails for some files, again with ENOSPC. Also, removal of files is very slow even if it succeeds.
>
> Moreover, any write operation on the file system now fails with ENOSPC. Reverting to my old kernel does not help: it now shows the same problem.
>
> Is this a known issue? Is there a way to make this file system unstuck? (I have backups, but I'd like to preserve snapshot information if possible.) Should I try upgrading to an even newer kernel?

Maybe your snapshots take up space. Can you show 'btrfs filesystem df /' ?

FWIW, I also had a disk full just a few days ago. Removed all snapshots and some big files, but to no avail. Likely the background cleanup took too much time. A reboot fixed this.

Sander

--
Humilis IT Services and Solutions
http://www.humilis.net
Re: ENOSPC on file deletion with 3.1.6
On Tuesday 03 January 2012 15:06:43 Sander wrote:
> Maybe your snapshots take up space. Can you show 'btrfs filesystem df /' ?

Data, RAID1: total=22.72GB, used=14.73GB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=2.25GB, used=1.88GB
Metadata: total=8.00MB, used=0.00

> FWIW, I also had a disk full just a few days ago. Removed all snapshots and some big files, but to no avail. Likely the background cleanup took too much time. A reboot fixed this.

OK, I'll keep this in mind. I'm a bit anxious to reboot, because I'm afraid booting will fail if the root file system cannot be written to.

> Sander

Thanks,

Arie
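[Editorial sketch, not part of the thread.] The 'btrfs filesystem df' figures above are easier to compare as ratios of used to allocated space per block-group type. The following is a minimal sketch that parses exactly the output format quoted in this message; the function names (parse_btrfs_df, usage_report) are hypothetical helpers, and treating KB/MB/GB as powers of 1024 is an assumption (the ratios do not depend on it when both values share a unit).

```python
import re

# Output copied from the message above.
BTRFS_DF_OUTPUT = """\
Data, RAID1: total=22.72GB, used=14.73GB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=2.25GB, used=1.88GB
Metadata: total=8.00MB, used=0.00
"""

# Assumption: units are binary multiples; a bare number means bytes.
UNITS = {"": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

LINE_RE = re.compile(
    r"^(?P<kind>[^:]+):\s*total=(?P<total>[\d.]+)(?P<tunit>[A-Z]*),\s*"
    r"used=(?P<used>[\d.]+)(?P<uunit>[A-Z]*)$"
)


def parse_btrfs_df(text):
    """Return a list of (kind, total_bytes, used_bytes) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a "kind: total=..., used=..." line
        total = float(match["total"]) * UNITS[match["tunit"]]
        used = float(match["used"]) * UNITS[match["uunit"]]
        rows.append((match["kind"], total, used))
    return rows


def usage_report(rows):
    """Map each block-group kind to the fraction of its allocation in use."""
    return {kind: (used / total if total else 0.0) for kind, total, used in rows}


for kind, fraction in usage_report(parse_btrfs_df(BTRFS_DF_OUTPUT)).items():
    print(f"{kind:<16} {fraction:6.1%} of allocated space used")
```

On these numbers the RAID1 metadata allocation is proportionally fuller than the RAID1 data allocation. Whether that is related to the ENOSPC errors is not established anywhere in this thread.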
Re: ENOSPC on file deletion with 3.1.6
Arie Peterson wrote (ao):
> On Tuesday 03 January 2012 15:06:43 Sander wrote:
> > Maybe your snapshots take up space. Can you show 'btrfs filesystem df /' ?
>
> Data, RAID1: total=22.72GB, used=14.73GB
> Data: total=8.00MB, used=0.00
> System, RAID1: total=8.00MB, used=12.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=2.25GB, used=1.88GB
> Metadata: total=8.00MB, used=0.00

Hm, not full.

> > FWIW, I also had a disk full just a few days ago. Removed all snapshots and some big files, but to no avail. Likely the background cleanup took too much time. A reboot fixed this.
>
> OK, I'll keep this in mind. I'm a bit anxious to reboot, because I'm afraid booting will fail if the root file system cannot be written to.

But you did already reboot as you said the old kernel exposed the same behavior?

Sander

--
Humilis IT Services and Solutions
http://www.humilis.net
Re: ENOSPC on file deletion with 3.1.6
On Tue, Jan 3, 2012 at 8:12 AM, Arie Peterson ar...@xs4all.nl wrote:
> On Tuesday 03 January 2012 15:06:43 Sander wrote:
> > Maybe your snapshots take up space. Can you show 'btrfs filesystem df /' ?
>
> Data, RAID1: total=22.72GB, used=14.73GB
> Data: total=8.00MB, used=0.00
> System, RAID1: total=8.00MB, used=12.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=2.25GB, used=1.88GB
> Metadata: total=8.00MB, used=0.00
>
> > FWIW, I also had a disk full just a few days ago. Removed all snapshots and some big files, but to no avail. Likely the background cleanup took too much time. A reboot fixed this.
>
> OK, I'll keep this in mind. I'm a bit anxious to reboot, because I'm afraid booting will fail if the root file system cannot be written to.

I'd probably run a 'btrfs fi balance /'; it should be able to recover the space. I'd typically be a little anxious if it was a large filesystem, as it's not interruptible except via the power button (in principle it shouldn't matter, but...), but given that your filesystem is quite small, it shouldn't take more than an hour or so.
Btrfs partition lost after RAID1 mirror disk failure?
Hi,

I'm running Ubuntu with kernel 2.6.38 on a fileserver system. One of the disks in a RAID1 configuration failed (/dev/sdc), and since then I haven't been able to access the btrfs filesystem on the remaining disk (/dev/sdb).

root@midnite:~/src/btrfs-progs-unstable# ./btrfsck /dev/sdb
No valid Btrfs found on /dev/sdb
root@midnite:~/src/btrfs-progs-unstable# ./btrfsck -s 1 /dev/sdb
using SB copy 1, bytenr 67108864
No valid Btrfs found on /dev/sdb
root@midnite:~/src/btrfs-progs-unstable# ./btrfsck -s 2 /dev/sdb
using SB copy 2, bytenr 274877906944
No valid Btrfs found on /dev/sdb

(This was using a btrfsck compiled from the only git repo I could find which was responding: http://git.darksatanic.net/repo/btrfs-progs-unstable.git version v0.19-102-g2482539)

I include below a list of commands which were executed around the time of the disk failure, attempting to mount the single remaining device (which is on /dev/sdb, and the failed disk was on /dev/sdc). I'm pretty sure I didn't destroy anything in the process, but who knows, which is why I include the list.

Any help appreciated in recovering the partition on /dev/sdb and accessing the data.
Thanks,

Dan G

  527 btrfs device scan
  610 btrfs device scan
  611 btrfs device show
  612 btrfs fi df
  613 btrfs fi df -h
  614 btrfs fi df nuvat
  615 btrfs fi show
  648 btrfs device scan
  649 btrfs
  650 btrfs fi df
  651 btrfs fi df nuvat
  652 btrfs fi show
  653 btrfs fi df /nuvat/
  654 btrfs fi show
  665 vi /usr/share/initramfs-tools/modules.d/btrfs
 1136 btrfs
 1137 btrfs device scan
 1139 btrfs device scan
 1140 man btrfs
 1141 btrfs device scan /dev/sdc
 1143 btrfs filesystem df /nuvat/
 1144 btrfsck
 1145 btrfsck /dev/sdc
 1146 btrfsck /nuvat
 1147 btrfsctl --help
 1148 btrfsctl -a
 1149 btrfsctl -A /dev/sdc
 1150 btrfs-show
 1197 btrfs-show
 1198 btrfsck
 1199 btrfsck /dev/sdb
 1200 btrfsck /dev/sdc
 1201 btrfsck /dev/sdv
 1202 btrfsck /dev/sdb
 1203 btrfstune
 1204 btrfsctl
 1205 btrfsctl -a
 1286 btrfs-vol
 1287 btrfs filesystem show
 1290 btrfs device scan
 1292 btrfsck
 1293 btrfsck /dev/sdb
 1299 btrfsctl
 1300 btrfsctl -a
 1304 btrfsck -h
 1305 btrfsck --help
 1306 btrfsck
 1307 btrfsck /dev/sdc
 1308 btrfsck /dev/sdb
 1309 btrfs-show
 1310 btrfs-show nuvat
 1311 btrfs-vol
 1312 dpkg -l | grep btrfs
 1313 apt-get install btrfs-tools
 1314 btrfsctl
 1315 btrfsctl -c
 1316 btrfsctl -A
 1317 btrfsctl -A /dev/sdb
 1318 btrfsctl -d
 1319 btrfsctl -d /nuvat/
 1320 btrfsctl -d /dev/sdb
 1321 btrfs-show
 1322 btrfs-show --help
 1323 btrfs-show /dev/sdb
 1326 btrfs-vol
 1327 btrfs-vol -a
 1328 btrfs-vol -a /nuvat
 1329 btrfs-vol -a asdasd /nuvat
 1330 btrfs-vol -a missing /nuvat
 1331 btrfs-vol -a /dev/sdc /nuvat
 1332 btrfs-vol -a /dev/sdb /nuvat
 1334 btrfs-vol -a missing /nuvat
 1335 btrfs
 1336 btrfs device /dev/sdc /nuvat
 1337 btrfs device add /dev/sdc /nuvat
 1338 btrfs device delete /dev/sdc /nuvat
 1339 btrfs fi show
 1340 btrfs fi show /nuvat
 1341 btrfs fi show nuvat
 1342 btrfs filesystem show nuvat
 1343 btrfs filesystem show
 1344 btrfsctl -a
 1345 btrfs device scan
 1346 btrfs filesystem show all
 1348 btrfs-show /dev/sdb
 1352 btrfsck /dev/sdb
 1355 btrfsck
 1356 btrfsck -s
 1357 btrfsck -s 1
 1358 btrfsck -s 1 /dev/sdb
 1360 apt-cache search btrfs
 1361 btrfs filesystem show
 1374 btrfs
 1375 btrfs device scan
 1376 btrfs fi show
 1377 history | grep btrfs
 1387 btrfs-vol
 1388 btrfsck /dev/sdb
 1389 btrfs subvolume
 1390 btrfs fi show
 1391 dpkg -l | grep btrfs
 1392 apt-cache search btrfs
 1393 git clone http://git.darksatanic.net/repo/btrfs-progs-unstable.git/btrfs-progs__git
 1394 cd btrfs-progs__git/
 1398 ./btrfs device scan
 1399 ./btrfsck
 1400 ./btrfsck /dev/sdb
 1401 ./btrfsctl -a
Re: ENOSPC on file deletion with 3.1.6
On Tuesday 03 January 2012 15:22:58 Sander wrote:
> Hm, not full.
>
> > OK, I'll keep this in mind. I'm a bit anxious to reboot, because I'm afraid booting will fail if the root file system cannot be written to.
>
> But you did already reboot as you said the old kernel exposed the same behavior?

You are right; the full history of events was:

(- compile new kernel (3.1.6))
- boot new kernel;
- recompile gcc: problem occurs;
- solve problem by removing compilation directory;
- boot old kernel;
- recompile gcc: problem occurs for this kernel as well.

After trying to remove the problematic files for some time, I took the chance and rebooted. After the reboot, the file system still gave ENOSPC on any write operation. However, it was able to boot anyway, and now the removal of the problematic files went much faster and without new ENOSPC. The compilation directory was completely removed, and immediately afterwards, the file system became writeable again.

Sander, thanks for your help.

I am still curious whether this is a known problem, and whether upgrading to a newer kernel might prevent it from recurring. (I was planning to wait for 3.2 to be released and included in Gentoo's repository; maybe I shouldn't wait for this...)

Kind regards,

Arie
NULL Pointer Dereference While Scrubbing
I've recently run into a kernel NULL pointer dereference while scrubbing a partition that had picked up an error. I'm running kernel 3.2.0-rc7. I'd had a power outage, and noticed an error in a partition when running btrfsck after reboot:

# ./btrfsck /dev/sdb5
root 5 inode 19772 errors 400
found 3123032064 bytes used err is 1
total csum bytes: 2476808
total tree bytes: 586649600
total fs tree bytes: 554622976
btree space waste bytes: 145500448
file data blocks allocated: 2536382464
 referenced 5143969792
Btrfs Btrfs v0.19-dirty

I ran scrub (even though this partition is formatted with single data and metadata) to attempt to clear the error. My system froze for about 30 seconds (no HD activity, no mouse or keyboard movement), then the scrub proceeded to run, but with the following errors in dmesg:

[ 3683.056829] [ cut here ]
[ 3683.056848] WARNING: at lib/kref.c:34 kref_get+0x20/0x30()
[ 3683.056851] Hardware name:
[ 3683.056853] Modules linked in: nvidia(P) nvidia_agp i2c_nforce2
[ 3683.056861] Pid: 4349, comm: btrfs-readahead Tainted: P O 3.2.0-rc7-git-local+ #1
[ 3683.056865] Call Trace:
[ 3683.056873] [c102ca5d] warn_slowpath_common+0x6d/0xa0
[ 3683.056878] [c13acbc0] ? kref_get+0x20/0x30
[ 3683.056882] [c13acbc0] ? kref_get+0x20/0x30
[ 3683.056886] [c102caad] warn_slowpath_null+0x1d/0x20
[ 3683.056889] [c13acbc0] kref_get+0x20/0x30
[ 3683.056898] [c135b89d] reada_pick_zone+0x11d/0x160
[ 3683.056903] [c135c491] reada_start_machine_worker+0x201/0x2f0
[ 3683.056910] [c133a4f9] worker_loop+0x89/0x370
[ 3683.056914] [c133a470] ? btrfs_queue_worker+0x250/0x250
[ 3683.056919] [c10468f4] kthread+0x74/0x80
[ 3683.056922] [c1046880] ? kthread_worker_fn+0x110/0x110
[ 3683.056929] [c1766036] kernel_thread_helper+0x6/0xd
[ 3683.056932] ---[ end trace f9c0e14dc17013ed ]---
[ 3683.056941] BUG: unable to handle kernel NULL pointer dereference at 00e4
[ 3683.056947] IP: [c13aeeb4] radix_tree_delete+0x14/0x250
[ 3683.056953] *pde =
[ 3683.056956] Oops: [#1]
[ 3683.056960] Modules linked in: nvidia(P) nvidia_agp i2c_nforce2
[ 3683.056965]
[ 3683.056968] Pid: 4349, comm: btrfs-readahead Tainted: PW O 3.2.0-rc7-git-local+ #1/MS-6570
[ 3683.056973] EIP: 0060:[c13aeeb4] EFLAGS: 00010282 CPU: 0
[ 3683.056977] EIP is at radix_tree_delete+0x14/0x250
[ 3683.056980] EAX: 00e4 EBX: f4cdf300 ECX: 0002c000 EDX: 00e4
[ 3683.056983] ESI: c135b650 EDI: f4e0c400 EBP: f0851ec4 ESP: f0851e70
[ 3683.056986] DS: 007b ES: 007b FS: GS: SS: 0068
[ 3683.056990] Process btrfs-readahead (pid: 4349, ti=f085 task=f4f0a9a0 task.ti=f085)
[ 3683.056993] Stack:
[ 3683.056994] 0022 0002c000 00e4 c175dfe7 c18aaeb0 f0851e94 f0851e9c c102c9ea
[ 3683.057001] c18aaeb0 c17013ed c18f8971 f0851ec4 c102ca6d c18a740e c1aeb034 0022
[ 3683.057007] c13acbc0 c13acbc0 f4cdf300 c135b650 f4e0c400 f0851ed0 c135b66e f4cdf334
[ 3683.057008] Call Trace:
[ 3683.057008] [c175dfe7] ? printk+0x18/0x1a
[ 3683.057008] [c102c9ea] ? print_oops_end_marker+0x2a/0x30
[ 3683.057008] [c17013ed] ? xdr_process_buf+0x1d/0x1e0
[ 3683.057008] [c102ca6d] ? warn_slowpath_common+0x7d/0xa0
[ 3683.057008] [c13acbc0] ? kref_get+0x20/0x30
[ 3683.057008] [c13acbc0] ? kref_get+0x20/0x30
[ 3683.057008] [c135b650] ? reada_peer_zones_set_lock+0x60/0x60
[ 3683.057008] [c135b66e] reada_zone_release+0x1e/0x30
[ 3683.057008] [c13acb6c] kref_put+0x2c/0x60
[ 3683.057008] [c135b7b3] reada_pick_zone+0x33/0x160
[ 3683.057008] [c135c441] reada_start_machine_worker+0x1b1/0x2f0
[ 3683.057008] [c133a4f9] worker_loop+0x89/0x370
[ 3683.057008] [c133a470] ? btrfs_queue_worker+0x250/0x250
[ 3683.057008] [c10468f4] kthread+0x74/0x80
[ 3683.057008] [c1046880] ? kthread_worker_fn+0x110/0x110
[ 3683.057008] [c1766036] kernel_thread_helper+0x6/0xd
[ 3683.057008] Code: 45 e0 8b 5d d8 83 c0 01 89 03 8b 45 e4 8d 65 f4 5b 5e 5f 5d c3 66 90 55 89 e5 57 56 53 83 ec 48 89 55 b0 89 c2 8b 4d b0 89 45 b4 8b 00 3b 0c 85 48 c1 9e c1 0f 87 9d 01 00 00 8b 5a 08 85 c0 89
[ 3683.057008] EIP: [c13aeeb4] radix_tree_delete+0x14/0x250 SS:ESP 0068:f0851e70
[ 3683.057008] CR2: 00e4
[ 3683.057149] ---[ end trace f9c0e14dc17013ee ]---
[ 3684.019436] checksum error at logical 24911872 on dev /dev/sdb5, sector 48656: metadata leaf (level 0) in tree 5
[ 3684.019445] checksum error at logical 24911872 on dev /dev/sdb5, sector 48656: metadata leaf (level 0) in tree 5
[ 3684.019451] btrfs: unable to fixup (regular) error at logical 24911872
[ 3684.675210] checksum error at logical 36139008 on dev /dev/sdb5, sector 70584: metadata leaf (level 0) in tree 5
[ 3684.675219] checksum error at logical 36139008 on dev /dev/sdb5, sector 70584: metadata leaf (level 0) in tree 5
[ 3684.675226] btrfs: unable to fixup (regular) error at logical 36139008
[ 3684.990958] checksum error at logical 43667456 on dev /dev/sdb5, sector 85288: metadata leaf (level 0) in tree 5
[ 3684.990967] checksum error at logical 43667456 on
Re: btrfsprogs source code
On Tue, 2012-01-03 at 23:26 +0530, debit2...@gmail.com wrote:
> Hi Everyone,
>
> I am very new to this mailing list and very much interested in getting into the internals of the BTRFS file system. I was looking for the mkfs.btrfs source code so that I can start understanding how the disk is formatted with the btrfs file system.
>
> Can anyone point me to a place to download the btrfsprogs source code?

The best way to get the btrfs-progs source is probably via git; Chris Mason's repository for it can be found at http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git

--
Calvin Walton calvin.wal...@kepstin.ca
Re: fstrim on BTRFS
On Thu, Dec 29, 2011 at 12:02:48PM +0800, Li Zefan wrote:
> Martin Steigerwald wrote:
> > Hi!
> >
> > With 3.2-rc4 (probably earlier), Ext4 seems to remember what areas it trimmed:
> >
> > merkaba:~ fstrim -v /boot
> > /boot: 224657408 bytes were trimmed
> > merkaba:~ fstrim -v /boot
> > /boot: 0 bytes were trimmed
> >
> > But BTRFS does not:
> >
> > merkaba:~ fstrim -v /
> > /: 4431613952 bytes were trimmed
> > merkaba:~ fstrim -v /
> > /: 4341846016 bytes were trimmed
> >
> > Is it planned to add this feature to BTRFS as well?
>
> There's no such plan, but it's do-able, and I can take care of it.
>
> There's an issue though. Whether we want to store TRIMMED information on disk? ext4 doesn't do this, so the first fstrim will be slow though you've done fstrim in previous mount.

I'd rather not store the trim status on disk. The extra trims don't have a huge cost, and since some devices have a large granularity for trims, they may ignore the trim until it tosses a larger contiguous area of the disk.

I'd be fine with a flag to the in-memory free extent struct that indicates if it has been trimmed down to the device.

-chris
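[Editorial sketch, not part of the thread.] The idea in the reply above, a per-extent "already trimmed" flag that lives only in memory, can be modelled in a few lines. This is a toy model under stated assumptions, not btrfs code: FreeExtent, FreeSpaceCache and the discard callback are hypothetical names invented for illustration, and the real in-kernel free-space structures are not shown in this thread.

```python
from dataclasses import dataclass


@dataclass
class FreeExtent:
    """A free range on the device. Hypothetical stand-in for a free extent struct."""
    start: int
    length: int
    trimmed: bool = False  # in-memory only: never written to disk, reset on every mount


class FreeSpaceCache:
    """Toy in-memory free-space tracker (assumption: one instance per mount)."""

    def __init__(self, extents):
        self.extents = list(extents)

    def free(self, start, length):
        # Newly freed space has not been discarded yet, so the flag starts cleared.
        self.extents.append(FreeExtent(start, length))

    def trim(self, discard):
        """Discard every free extent not yet trimmed; return the bytes trimmed."""
        trimmed_bytes = 0
        for extent in self.extents:
            if extent.trimmed:
                continue  # already sent to the device during this mount
            discard(extent.start, extent.length)
            extent.trimmed = True
            trimmed_bytes += extent.length
        return trimmed_bytes


# Usage: the second trim in the same "mount" has nothing left to do.
issued = []
cache = FreeSpaceCache([FreeExtent(0, 4096), FreeExtent(8192, 8192)])
first = cache.trim(lambda start, length: issued.append((start, length)))
second = cache.trim(lambda start, length: issued.append((start, length)))
print(first, second)
```

Because the flag is not persisted, a fresh FreeSpaceCache built after a remount would trim everything again, which matches the trade-off described in the quoted messages: repeated trims within one mount become cheap, while the first trim after a mount still covers all free space.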
Re: Btrfs partition lost after RAID1 mirror disk failure?
On Tue, Jan 3, 2012 at 8:44 AM, Dan Garton dan.gar...@gmail.com wrote:
> [...]
> 1327 btrfs-vol -a
> 1328 btrfs-vol -a /nuvat
> 1329 btrfs-vol -a asdasd /nuvat
> 1330 btrfs-vol -a missing /nuvat
> 1331 btrfs-vol -a /dev/sdc /nuvat
> 1332 btrfs-vol -a /dev/sdb /nuvat
> 1334 btrfs-vol -a missing /nuvat
> [...]

these look destructive to me ... adding the wrong devices and the existing devices back to the current array? IIRC you should have `-r missing`, but in general, do not use the btrfsctl utility at all -- it won't have as much visibility/exception-handling/recovery as the `btrfs` utility.

at what point did your FS become inaccessible? your command history suggests it was working until shortly after these commands ... :-(

--
C Anthony