Re: volume broken? btrfsck fails
On Wed, 04.08.10 21:30 Chris Mason chris.ma...@oracle.com wrote:
> On Wed, Aug 04, 2010 at 08:48:40PM +0200, Thomas Kuther wrote:
>> On Tue, 06.07.10 20:16 Chris Mason chris.ma...@oracle.com wrote:
>>> On Sat, Jun 26, 2010 at 03:15:04PM -0700, Yee-Ting Li wrote:
>>>> Hi,
>>>>
>>>> i think my btrfs volume is hosed. it mounts okay, but iostat shows /dev/sdg on 100% load. dmesg shows lots of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the filesystem freezes).
>>>>
>>>> the machine had crashed (prob from some other process), and upon reboot i've been experiencing this problem since.
>>>>
>>>> can anyone provide any guidance in how to proceed?
>>>
>>> These are definitely corruptions, and they probably came from the crash. Can you tell me more about the crash? (Power failure, what is the storage underneath etc, what are the write cache settings). We don't expect these kinds of corruptions to happen.
>>>
>>> Yan Zheng is making a lot of progress on btrfsck, but I don't think you'll want to be one of the first testers there. I can definitely help copy things off if you're having trouble accessing the FS.
>>>
>>> -chris
>>
>> Hello Chris,
>>
>> sorry if I'm hijacking this thread. I got a similar problem, probably caused by a system crash due to faulty/badly timed memory DIMMs. The system suddenly hardlocked during write activity.
>>
>> - kernel is 2.6.35
>> - btrfs on top of a md raid5, which looks healthy. Desktop SATA disks.
>>
>> # cat /proc/mdstat|grep -A1 md0
>> md0 : active raid5 sdb1[0] sdd1[1] sdc1[2]
>>       2930271872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
>> # btrfsck
>> usage: btrfsck dev
>> Btrfs v0.19-16-g075587c-dirty
>> # btrfsck /dev/md0
>> parent transid verify failed on 2419218964480 wanted 127839 found 127260
>> parent transid verify failed on 2419218964480 wanted 127839 found 127260
>> parent transid verify failed on 2419218915328 wanted 127839 found 127260
>> parent transid verify failed on 2419218915328 wanted 127839 found 127260
>> parent transid verify failed on 2419214266368 wanted 127839 found 127837
>> parent transid verify failed on 2419214266368 wanted 127839 found 127837
>> parent transid verify failed on 2419214266368 wanted 127839 found 127837
>> Segmentation fault
>>
>> Mount endlessly loops, like explained in this thread.
>>
>> If there is a way, I would really like some aid copying the data off. The backup is quite out of date, shame on me.
>
> No problem, I'll get a test patch out in the morning.
>
> -chris

Hi Chris,

did you find the time to get that patch done meanwhile? I'm willing to test. Seems more people get this error after power outages, suspending or similar.

Thanks in advance.
~Thomas
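For anyone comparing their own btrfsck or dmesg output with the lines quoted above, here is a minimal Python sketch that groups the 'parent transid verify failed' messages by block and reports how far the found generation lags behind the wanted one. The function name and the returned tuple layout are made up for this illustration; only the message format is taken from the output in this thread.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   parent transid verify failed on 2419218964480 wanted 127839 found 127260
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def summarize_transid_failures(text):
    """Return {block: (wanted, found, gap, count)} for every failing block.

    `gap` is wanted - found, i.e. how many generations the block on disk
    lags behind what its parent expects. This helper is hypothetical and
    not part of btrfs-progs.
    """
    counts = defaultdict(int)
    details = {}
    for match in TRANSID_RE.finditer(text):
        block, wanted, found = (int(group) for group in match.groups())
        counts[block] += 1
        details[block] = (wanted, found)
    return {
        block: (wanted, found, wanted - found, counts[block])
        for block, (wanted, found) in details.items()
    }


SAMPLE = """\
parent transid verify failed on 2419218964480 wanted 127839 found 127260
parent transid verify failed on 2419218964480 wanted 127839 found 127260
parent transid verify failed on 2419214266368 wanted 127839 found 127837
parent transid verify failed on 2419214266368 wanted 127839 found 127837
parent transid verify failed on 2419214266368 wanted 127839 found 127837
"""

if __name__ == "__main__":
    summary = summarize_transid_failures(SAMPLE)
    for block, (wanted, found, gap, count) in sorted(summary.items()):
        print(f"block {block}: wanted {wanted} found {found} "
              f"(gap {gap}, seen {count}x)")
```

A small gap only says that the block was written a few transactions before the one its parent expects; the script does not say anything about which files are affected.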
Re: volume broken? btrfsck fails
On Tue, 06.07.10 20:16 Chris Mason chris.ma...@oracle.com wrote:
> On Sat, Jun 26, 2010 at 03:15:04PM -0700, Yee-Ting Li wrote:
>> Hi,
>>
>> i think my btrfs volume is hosed. it mounts okay, but iostat shows /dev/sdg on 100% load. dmesg shows lots of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the filesystem freezes).
>>
>> the machine had crashed (prob from some other process), and upon reboot i've been experiencing this problem since.
>>
>> can anyone provide any guidance in how to proceed?
>
> These are definitely corruptions, and they probably came from the crash. Can you tell me more about the crash? (Power failure, what is the storage underneath etc, what are the write cache settings). We don't expect these kinds of corruptions to happen.
>
> Yan Zheng is making a lot of progress on btrfsck, but I don't think you'll want to be one of the first testers there. I can definitely help copy things off if you're having trouble accessing the FS.
>
> -chris

Hello Chris,

sorry if I'm hijacking this thread. I got a similar problem, probably caused by a system crash due to faulty/badly timed memory DIMMs. The system suddenly hardlocked during write activity.

- kernel is 2.6.35
- btrfs on top of a md raid5, which looks healthy. Desktop SATA disks.

# cat /proc/mdstat|grep -A1 md0
md0 : active raid5 sdb1[0] sdd1[1] sdc1[2]
      2930271872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
# btrfsck
usage: btrfsck dev
Btrfs v0.19-16-g075587c-dirty
# btrfsck /dev/md0
parent transid verify failed on 2419218964480 wanted 127839 found 127260
parent transid verify failed on 2419218964480 wanted 127839 found 127260
parent transid verify failed on 2419218915328 wanted 127839 found 127260
parent transid verify failed on 2419218915328 wanted 127839 found 127260
parent transid verify failed on 2419214266368 wanted 127839 found 127837
parent transid verify failed on 2419214266368 wanted 127839 found 127837
parent transid verify failed on 2419214266368 wanted 127839 found 127837
Segmentation fault

Mount endlessly loops, like explained in this thread.

If there is a way, I would really like some aid copying the data off. The backup is quite out of date, shame on me.

Best regards,
Thomas
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: volume broken? btrfsck fails
On Wed, Aug 04, 2010 at 08:48:40PM +0200, Thomas Kuther wrote:
> On Tue, 06.07.10 20:16 Chris Mason chris.ma...@oracle.com wrote:
>> On Sat, Jun 26, 2010 at 03:15:04PM -0700, Yee-Ting Li wrote:
>>> Hi,
>>>
>>> i think my btrfs volume is hosed. it mounts okay, but iostat shows /dev/sdg on 100% load. dmesg shows lots of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the filesystem freezes).
>>>
>>> the machine had crashed (prob from some other process), and upon reboot i've been experiencing this problem since.
>>>
>>> can anyone provide any guidance in how to proceed?
>>
>> These are definitely corruptions, and they probably came from the crash. Can you tell me more about the crash? (Power failure, what is the storage underneath etc, what are the write cache settings). We don't expect these kinds of corruptions to happen.
>>
>> Yan Zheng is making a lot of progress on btrfsck, but I don't think you'll want to be one of the first testers there. I can definitely help copy things off if you're having trouble accessing the FS.
>>
>> -chris
>
> Hello Chris,
>
> sorry if I'm hijacking this thread. I got a similar problem, probably caused by a system crash due to faulty/badly timed memory DIMMs. The system suddenly hardlocked during write activity.
>
> - kernel is 2.6.35
> - btrfs on top of a md raid5, which looks healthy. Desktop SATA disks.
>
> # cat /proc/mdstat|grep -A1 md0
> md0 : active raid5 sdb1[0] sdd1[1] sdc1[2]
>       2930271872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
> # btrfsck
> usage: btrfsck dev
> Btrfs v0.19-16-g075587c-dirty
> # btrfsck /dev/md0
> parent transid verify failed on 2419218964480 wanted 127839 found 127260
> parent transid verify failed on 2419218964480 wanted 127839 found 127260
> parent transid verify failed on 2419218915328 wanted 127839 found 127260
> parent transid verify failed on 2419218915328 wanted 127839 found 127260
> parent transid verify failed on 2419214266368 wanted 127839 found 127837
> parent transid verify failed on 2419214266368 wanted 127839 found 127837
> parent transid verify failed on 2419214266368 wanted 127839 found 127837
> Segmentation fault
>
> Mount endlessly loops, like explained in this thread.
>
> If there is a way, I would really like some aid copying the data off. The backup is quite out of date, shame on me.

No problem, I'll get a test patch out in the morning.

-chris
Re: volume broken? btrfsck fails
so after leaving the array for a while, with the disk churning away for a few days, it stopped. i copied some files off the disk (everything seems okay) and decided to unmount and run btrfsck again - this time i get a different error: $ sudo /usr/local/bin/btrfsck /dev/sdf failed to read /dev/sr0 parent transid verify failed on 2703919247360 wanted 9066 found 7543 parent transid verify failed on 2703914500096 wanted 9066 found 7543 parent transid verify failed on 2703873781760 wanted 9074 found 9022 parent transid verify failed on 2703877693440 wanted 9070 found 9062 parent transid verify failed on 2703921868800 wanted 9066 found 7543 parent transid verify failed on 2703922647040 wanted 9066 found 7543 parent transid verify failed on 2703919247360 wanted 9066 found 7543 parent transid verify failed on 270391922 wanted 9066 found 7543 parent transid verify failed on 2703917125632 wanted 9066 found 7543 parent transid verify failed on 2703879294976 wanted 9075 found 9055 parent transid verify failed on 2703883194368 wanted 9075 found 9057 parent transid verify failed on 2703922688000 wanted 9066 found 7543 parent transid verify failed on 2703873781760 wanted 9074 found 9022 parent transid verify failed on 2703877693440 wanted 9070 found 9062 parent transid verify failed on 2703921868800 wanted 9066 found 7543 parent transid verify failed on 2703922647040 wanted 9066 found 7543 parent transid verify failed on 2703919247360 wanted 9066 found 7543 parent transid verify failed on 270391922 wanted 9066 found 7543 bad block 2703873781760 Extent back ref already exists for 365342720 parent 0 root 2 Extent back ref already exists for 2221870616576 parent 0 root 2 Extent back ref already exists for 383959040 parent 0 root 2 Extent back ref already exists for 367714304 parent 0 root 2 Extent back ref already exists for 706744320 parent 0 root 2 Extent back ref already exists for 368672768 parent 0 root 2 Extent back ref already exists for 315338752 parent 0 root 2 Extent back 
ref already exists for 377356288 parent 0 root 2 Extent back ref already exists for 368914432 parent 0 root 2 Extent back ref already exists for 369807360 parent 0 root 2 Extent back ref already exists for 2221957713920 parent 0 root 2 Extent back ref already exists for 370139136 parent 0 root 2 Extent back ref already exists for 369811456 parent 0 root 2 Extent back ref already exists for 370122752 parent 0 root 2 Extent back ref already exists for 365936640 parent 0 root 2 Extent back ref already exists for 2221948424192 parent 0 root 2 Extent back ref already exists for 3624002596864 parent 0 root 2 Extent back ref already exists for 706789376 parent 0 root 2 Extent back ref already exists for 2703778734080 parent 0 root 2 Extent back ref already exists for 372252672 parent 0 root 2 Extent back ref already exists for 372109312 parent 0 root 2 Extent back ref already exists for 372989952 parent 0 root 2 Extent back ref already exists for 373657600 parent 0 root 2 Extent back ref already exists for 374521856 parent 0 root 2 Extent back ref already exists for 374628352 parent 0 root 2 Extent back ref already exists for 374976512 parent 0 root 2 Extent back ref already exists for 2221948403712 parent 0 root 2 Extent back ref already exists for 375586816 parent 0 root 2 Extent back ref already exists for 375906304 parent 0 root 2 Extent back ref already exists for 376639488 parent 0 root 2 Extent back ref already exists for 706818048 parent 0 root 2 Extent back ref already exists for 383778816 parent 0 root 2 Extent back ref already exists for 377626624 parent 0 root 2 leaf parent key incorrect 2703874203648 bad block 2703874203648 leaf 080487424 items 37 free space 1183 generation 10279 owner 2 fs uuid ea7ea0b3-bc42-4b0c-9173-346df61d4454 chunk uuid 886b0dfb-fa34-49c7-9ab0-2589603f8ae4 item 0 key (364388352 EXTENT_ITEM 4096) itemoff 3944 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200172044288) level 0 tree block backref root 7 
item 1 key (364392448 EXTENT_ITEM 4096) itemoff 3893 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200220258304) level 0 tree block backref root 7 item 2 key (364396544 EXTENT_ITEM 4096) itemoff 3842 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200179384320) level 0 tree block backref root 7 item 3 key (364400640 EXTENT_ITEM 4096) itemoff 3791 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80 200220258304) level 0 tree block backref root 7 item 4 key (364404736 EXTENT_ITEM 4096) itemoff 3740 itemsize 51 extent refs 1 gen 1061 flags 2 tree block key (18446744073709551606 80
Re: volume broken? btrfsck fails
On Wed, Jul 07, 2010 at 10:39:48PM -0400, Daniel Kozlowski wrote:
>>> Looks like we're looping on a single block. What happens when you dmesg -n1 to cut down on the console traffic?
>>
>> Nothing changes, I still have endless repeats of
>> parent transid verify failed on 1682586464256 wanted 285114 found 11257
>>
>>> If that doesn't help we can change it to spit a stack trace to figure out where the looping is happening. We should be erroring out instead of hitting it over and over again.
>>
>> In my kernel noviceness i tried attaching gdb to the btrfs-endio-met, however apparently you can't attach gdb to a kernel thread like that. If you could assist me in obtaining a call trace I will gladly attempt to resolve the matter.
>
> Ok I had some free time and decided to exercise my googlefoo and came up with this trace
>
> parent transid verify failed on 3241193205760 wanted 285287 found 281382
> Pid: 2163, comm: mount Not tainted 2.6.35-0.23.rc3.git6.fc14.x86_64 #1
> Call Trace:
> [a047c376] verify_parent_transid+0xb7/0xfe [btrfs]
> [a047c4f2] btrfs_buffer_uptodate+0x49/0x59 [btrfs]
> [a04686a2] read_block_for_search+0x8f/0x289 [btrfs]
> [a046d554] btrfs_search_slot+0x3ae/0x513 [btrfs]
> [a0470ece] btrfs_read_block_groups+0x73/0x526 [btrfs]
> [8149b0a3] ? _raw_spin_unlock+0x2b/0x2f
> [a0469f56] ? btrfs_root_node+0x2a/0x32 [btrfs]
> [a047d287] ? find_and_setup_root+0xab/0xbc [btrfs]
> [a04800eb] open_ctree+0xf19/0x143a [btrfs]
> [a0467960] btrfs_get_sb+0x1ce/0x40b [btrfs]
> [810e9cfd] ? free_pages+0x49/0x4e
> [8112c9f9] vfs_kern_mount+0xbd/0x19b
> [8112cb3f] do_kern_mount+0x4d/0xed
> [81143742] do_mount+0x776/0x7ed
> [81143841] sys_mount+0x88/0xc2
> [81009c32] system_call_fastpath+0x16/0x1b

Ok, so we're never getting out of mount. A recent change to read_block_for_search is causing this problem. We're looping over and over again because it is returning -EAGAIN instead of -EIO.

Thanks for nailing this trace down, I'll get a fix in for the looping. I'm afraid it won't bring back the filesystem though, you'll end up failing in mount.

Would you like some help copying the data off?

-chris
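The difference between the two return codes can be illustrated with a toy model in Python. This is not the kernel code: `read_block`, `search` and the attempt cap are hypothetical names invented for the sketch, and the only point is that a caller which retries on -EAGAIN will spin forever on a block that can never verify, while -EIO makes it give up at once.

```python
import errno

MAX_ATTEMPTS = 5  # illustrative cap so the demo terminates; a real retry loop has none


def read_block(stale_error):
    """Hypothetical stand-in for reading a block whose transid never verifies.

    It always fails and reports the failure with the given errno value.
    """
    return -stale_error


def search(stale_error, max_attempts=MAX_ATTEMPTS):
    """Retry while the read reports -EAGAIN, stop on any other result.

    Returns (last_result, attempts_made).
    """
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        ret = read_block(stale_error)
        if ret == -errno.EAGAIN:
            # "Try again" on a permanently bad block means looping forever.
            continue
        return ret, attempts
    return -errno.EAGAIN, attempts


if __name__ == "__main__":
    print("EIO:   ", search(errno.EIO))
    print("EAGAIN:", search(errno.EAGAIN))
```

With -EIO the loop exits after a single attempt and the error can be passed up to mount; with -EAGAIN it only stops here because of the artificial cap.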
Re: volume broken? btrfsck fails
On 11 Jul 2010, at 17:43, Chris Mason wrote:
> Was this after a fresh mkfs? Clearly things are very corrupt on this original drive. It would be a good test case for Yan Zheng's new fsck code, but first I'd like to figure out if you're still seeing the old corruption or if you've started over.

nope, same disk as before when the btrfsck exited with:

btrfsck: disk-io.c:410: find_and_setup_root: Assertion `!(ret)' failed.

the strange thing was that i'm pretty sure that btrfs crashed the system a couple of times (hung). after reboot the mounted drive would basically churn away for hours and spit out lots of the parent transid messages. but after a while it stops and everything seems fine again.

i don't mind losing files on the disk array, but it would be nice if it could tell me the actual filenames which are corrupt.

Yee.
Re: volume broken? btrfsck fails
On 8 July 2010 01:21, Daniel Kozlowski dan.kozlow...@gmail.com wrote:
> On Tue, Jul 6, 2010 at 8:19 PM, Chris Mason chris.ma...@oracle.com wrote:
>>> I am also having the same problem with a slightly different setup. In My case I cannot mount the filesystem.
>>
>> What is your hardware setup here? Including write cache settings. Did you have crashes with 2.6.35-rc1 or rc2?
>
> My setup is:
>
> Eight hard drives: four 1TB drives, four 500GB drives
> All drives are connected through a 3ware Inc 9550SX SATA-II RAID PCI-X card
> The card is configured to export all drives, essentially acting as a SATA port multiplier (drives show up sdb - sdi)
> Drives are configured in btrfs raid0
> Filesystem is mounted using: mount -t btrfs /dev/sdb /opt
>
> I have been able to lock up the system on
> 2.6.33.5-124.fc13.x86_64
> 2.6.35-0.13.rc3.git2.fc14.x86_64
> 2.6.35-0.23.rc3.git6.fc14.x86_64
> and 2.6.35-0.23.rc3.git6.fc14.x86_64 with a DKMS build of the btrfs module (Btrfs v0.19-16-g075587c-dirty)
>
> If you would like me to pull out another version of the kernel or roll back specific commits from the kernel module I can.
>
> I have been able to get different responses from different versions:
> 2.6.33.* - This will mount the volume but will hang shortly after mounting when reading data from the filesystem (ls /opt); writes a bunch of transid verify failed messages, hangs on ls
> 2.6.34.* - Will not mount at all, still gives the transid verify failed, hangs on mount
>
>> Looks like we're looping on a single block. What happens when you dmesg -n1 to cut down on the console traffic?
>
> Nothing changes, I still have endless repeats of
> parent transid verify failed on 1682586464256 wanted 285114 found 11257
>
>> If that doesn't help we can change it to spit a stack trace to figure out where the looping is happening. We should be erroring out instead of hitting it over and over again.
>
> In my kernel noviceness i tried attaching gdb to the btrfs-endio-met, however apparently you can't attach gdb to a kernel thread like that. If you could assist me in obtaining a call trace I will gladly attempt to resolve the matter.

For grabbing kernel backtraces:

$ sudo -s
# dmesg -c > /dev/null
# echo t > /proc/sysrq-trigger
# dmesg > backtraces.txt

(there are other ways with

The problem is that you'll be taking instantaneous snapshots, which may or may not be representative of the main looping, but over a few shots should be.

Thanks,
  Daniel
--
Daniel J Blueman
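Since each sysrq dump is only an instantaneous snapshot, one way to see what the loop is spending its time in is to take several dumps and count how often each function shows up. Below is a rough Python sketch of that idea, assuming the dumps were saved to text files such as backtraces.txt and that frames look like `symbol+0xoff/0xlen` as in the traces posted in this thread. The helper name and regular expression are made up for the illustration.

```python
import re
import sys
from collections import Counter

# Matches the "symbol+0xoffset/0xsize" part of a kernel stack frame, e.g.
#   [a04686a2] read_block_for_search+0x8f/0x289 [btrfs]
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+")


def count_frames(dmesg_text):
    """Count how many times each function name appears in the stack frames."""
    return Counter(FRAME_RE.findall(dmesg_text))


SAMPLE = """\
Call Trace:
[a047c376] verify_parent_transid+0xb7/0xfe [btrfs]
[a04686a2] read_block_for_search+0x8f/0x289 [btrfs]
[a046d554] btrfs_search_slot+0x3ae/0x513 [btrfs]
Call Trace:
[a04686a2] read_block_for_search+0x8f/0x289 [btrfs]
[a046d554] btrfs_search_slot+0x3ae/0x513 [btrfs]
[8149b0a3] ? _raw_spin_unlock+0x2b/0x2f
"""

if __name__ == "__main__":
    # Pass one or more saved dumps on the command line, or run without
    # arguments to use the built-in sample.
    if len(sys.argv) > 1:
        text = ""
        for path in sys.argv[1:]:
            with open(path) as handle:
                text += handle.read()
    else:
        text = SAMPLE
    for name, hits in count_frames(text).most_common(10):
        print(f"{hits:6d}  {name}")
```

Functions that appear in nearly every snapshot are the likely location of the loop. The counts include frames from every task in the dump, so filtering for the mount process first would give a cleaner picture.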
Re: volume broken? btrfsck fails
>> Looks like we're looping on a single block. What happens when you dmesg -n1 to cut down on the console traffic?
>
> Nothing changes, I still have endless repeats of
> parent transid verify failed on 1682586464256 wanted 285114 found 11257
>
>> If that doesn't help we can change it to spit a stack trace to figure out where the looping is happening. We should be erroring out instead of hitting it over and over again.
>
> In my kernel noviceness i tried attaching gdb to the btrfs-endio-met, however apparently you can't attach gdb to a kernel thread like that. If you could assist me in obtaining a call trace I will gladly attempt to resolve the matter.

Ok I had some free time and decided to exercise my googlefoo and came up with this trace

parent transid verify failed on 3241193205760 wanted 285287 found 281382
Pid: 2163, comm: mount Not tainted 2.6.35-0.23.rc3.git6.fc14.x86_64 #1
Call Trace:
[a047c376] verify_parent_transid+0xb7/0xfe [btrfs]
[a047c4f2] btrfs_buffer_uptodate+0x49/0x59 [btrfs]
[a04686a2] read_block_for_search+0x8f/0x289 [btrfs]
[a046d554] btrfs_search_slot+0x3ae/0x513 [btrfs]
[a0470ece] btrfs_read_block_groups+0x73/0x526 [btrfs]
[8149b0a3] ? _raw_spin_unlock+0x2b/0x2f
[a0469f56] ? btrfs_root_node+0x2a/0x32 [btrfs]
[a047d287] ? find_and_setup_root+0xab/0xbc [btrfs]
[a04800eb] open_ctree+0xf19/0x143a [btrfs]
[a0467960] btrfs_get_sb+0x1ce/0x40b [btrfs]
[810e9cfd] ? free_pages+0x49/0x4e
[8112c9f9] vfs_kern_mount+0xbd/0x19b
[8112cb3f] do_kern_mount+0x4d/0xed
[81143742] do_mount+0x776/0x7ed
[81143841] sys_mount+0x88/0xc2
[81009c32] system_call_fastpath+0x16/0x1b

Dan Kozlowski
--
S.D.G.
Re: volume broken? btrfsck fails
On Tue, Jul 6, 2010 at 8:19 PM, Chris Mason chris.ma...@oracle.com wrote:
>> I am also having the same problem with a slightly different setup. In My case I cannot mount the filesystem.
>
> What is your hardware setup here? Including write cache settings. Did you have crashes with 2.6.35-rc1 or rc2?

My setup is:

Eight hard drives: four 1TB drives, four 500GB drives
All drives are connected through a 3ware Inc 9550SX SATA-II RAID PCI-X card
The card is configured to export all drives, essentially acting as a SATA port multiplier (drives show up sdb - sdi)
Drives are configured in btrfs raid0
Filesystem is mounted using: mount -t btrfs /dev/sdb /opt

I have been able to lock up the system on
2.6.33.5-124.fc13.x86_64
2.6.35-0.13.rc3.git2.fc14.x86_64
2.6.35-0.23.rc3.git6.fc14.x86_64
and 2.6.35-0.23.rc3.git6.fc14.x86_64 with a DKMS build of the btrfs module (Btrfs v0.19-16-g075587c-dirty)

If you would like me to pull out another version of the kernel or roll back specific commits from the kernel module I can.

I have been able to get different responses from different versions:
2.6.33.* - This will mount the volume but will hang shortly after mounting when reading data from the filesystem (ls /opt); writes a bunch of transid verify failed messages, hangs on ls
2.6.34.* - Will not mount at all, still gives the transid verify failed, hangs on mount

> Looks like we're looping on a single block. What happens when you dmesg -n1 to cut down on the console traffic?

Nothing changes, I still have endless repeats of
parent transid verify failed on 1682586464256 wanted 285114 found 11257

> If that doesn't help we can change it to spit a stack trace to figure out where the looping is happening. We should be erroring out instead of hitting it over and over again.

In my kernel noviceness i tried attaching gdb to the btrfs-endio-met, however apparently you can't attach gdb to a kernel thread like that. If you could assist me in obtaining a call trace I will gladly attempt to resolve the matter.

Dan Kozlowski
--
S.D.G.
Re: volume broken? btrfsck fails
On Sat, Jun 26, 2010 at 03:15:04PM -0700, Yee-Ting Li wrote:
> Hi,
>
> i think my btrfs volume is hosed. it mounts okay, but iostat shows /dev/sdg on 100% load. dmesg shows lots of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the filesystem freezes).
>
> the machine had crashed (prob from some other process), and upon reboot i've been experiencing this problem since.
>
> can anyone provide any guidance in how to proceed?

These are definitely corruptions, and they probably came from the crash. Can you tell me more about the crash? (Power failure, what is the storage underneath etc, what are the write cache settings). We don't expect these kinds of corruptions to happen.

Yan Zheng is making a lot of progress on btrfsck, but I don't think you'll want to be one of the first testers there. I can definitely help copy things off if you're having trouble accessing the FS.

-chris
Re: volume broken? btrfsck fails
On 6 Jul 2010, at 17:16, Chris Mason wrote:
> These are definitely corruptions, and they probably came from the crash. Can you tell me more about the crash? (Power failure, what is the storage underneath etc, what are the write cache settings). We don't expect these kinds of corruptions to happen.

i think what happened was that the power got pulled accidentally. at the time i had a drive (sde) on an external usb controller. the other two drives are internal on a nForce 730i chipset. they are all 2TB WD drives (combination of EADS and EARS drives). according to hdparm all the drives have write-caching on.

> Yan Zheng is making a lot of progress on btrfsck, but I don't think you'll want to be one of the first testers there. I can definitely help copy things off if you're having trouble accessing the FS.

i'm performing rsyncs at the moment to get some of the data off. i can read the drive fine, but after a while (i guess when something tries to access the corrupt file) i get the dmesgs again, and high cpu on the two btrfs-transacti and btrfs-endio-met threads.

is there a way i can determine the actual filenames that may be corrupt?

also, as i'm not using the /dev/sde drive (btrfs-show gives used 0.00TB) as i didn't do a balance after i installed it - is there a way i can degrade the array to recover that disk and keep the array with just two disks? then i will have enough storage to copy the 'good' files off :)

once i have a replica, then i can test whatever code you'd like to throw at me :)

cheers,

Yee.
Re: volume broken? btrfsck fails
On 1 Jul 2010, at 05:51, Daniel Kozlowski wrote:
> I am also having the same problem with a slightly different setup. In My case I cannot mount the filesystem. mount, btrfs-endio-met and kblockd/0 will all continually run until the system freezes up and requires a power cycle.

have you tried mounting with '-o degraded'?

having monitored the system for a while, i also think that in fact it's btrfs that's killing my system. i'm on ubuntu 10.4 with:

$ uname -a
Linux htpc 2.6.32-22-server #36-Ubuntu SMP Thu Jun 3 20:38:33 UTC 2010 x86_64 GNU/Linux

using the default kernel module, but git'd out the tools.

following the other thread 'Is there a more aggressive fixer than btrfsck?' i suspect that we'll just have to wait until some actual fsck operations are available for btrfs :(

on my system, it's btrfs-endio-met (only 1 out of 4) and btrfs-transacti (1 out of 2) that is taking up all the cpu/io wait cycles.

i wonder if it's only certain files on the array that are hosed; if that's the case is there a way i can map the kernel messages to a real filename? i don't mind losing the odd file on this array, but i don't fancy copying it all over to somewhere else (yeah-yeah, up to date backups blah blah!) - i figured given the momentum btrfs was gaining it would be much more stable than this :(

Yee.
Re: volume broken? btrfsck fails
Yee-Ting Li yee379 at gmail.com writes:
> Hi,
>
> i think my btrfs volume is hosed. it mounts okay, but iostat shows /dev/sdg on 100% load. dmesg shows lots of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the filesystem freezes).
>
> the machine had crashed (prob from some other process), and upon reboot i've been experiencing this problem since.
>
> can anyone provide any guidance in how to proceed?
>
> cheers,
>
> Yee.

I am also having the same problem with a slightly different setup. In My case I cannot mount the filesystem. mount, btrfs-endio-met and kblockd/0 will all continually run until the system freezes up and requires a power cycle.

I have both the kernel module and the tools checked out from git, so if you have any ideas on fixes I can build them and test it out.

here is some information about my setup

[r...@solution ~]# uname -a
Linux solution.bcig 2.6.35-0.13.rc3.git2.fc14.x86_64 #1 SMP Mon Jun 28 19:27:35 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
[r...@solution ~]#
[r...@solution ~]# btrfs-show
Label: store  uuid: 4ba1cc6b-e12a-454a-a064-f4019312c063
        Total devices 7 FS bytes used 1.15TB
        devid 1 size 931.51GB used 415.55GB path /dev/sdb
        devid 2 size 931.51GB used 518.50GB path /dev/sdc
        devid 3 size 931.51GB used 342.04GB path /dev/sdd
        devid 4 size 931.51GB used 523.54GB path /dev/sde
        devid 5 size 465.76GB used 402.54GB path /dev/sdf
        devid 6 size 465.76GB used 382.54GB path /dev/sdg
        devid 7 size 465.76GB used 367.54GB path /dev/sdh

Btrfs v0.19-16-g075587c-dirty
[r...@solution ~]#
[r...@solution ~]# tail -n 12 /var/log/messages
Jul 1 04:47:03 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: verify_parent_transid: 9244 callbacks suppressed
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
Jul 1 04:47:08 solution kernel: parent transid verify failed on 1682196926464 wanted 285263 found 283510
[r...@solution ~]#
volume broken? btrfsck fails
Hi, i think my btrfs volume is hosed it mounts okay, but iostat shows /dev/sdg on 100% load. dmesg shows lots of 'parent transid verify failed on x wanted y found z'. then after a while i can't read from it (access to the filesystem freezes). the machine had crashed (prob from some other process), and upon reboot i've been experience this problem since. can anyone provide any guidance in how to proceed? cheers, Yee. $ sudo /usr/local/bin/btrfs-show failed to read /dev/sr0 Label: none uuid: ea7ea0b3-bc42-4b0c-9173-346df61d4454 Total devices 3 FS bytes used 3.56TB devid3 size 1.82TB used 0.00 path /dev/sde devid1 size 1.82TB used 1.82TB path /dev/sdf devid2 size 1.82TB used 1.82TB path /dev/sdg Btrfs v0.19-16-g075587c $ sudo /usr/local/bin/btrfsck /dev/sdf failed to read /dev/sr0 parent transid verify failed on 2703873638400 wanted 9074 found 9016 parent transid verify failed on 2703884750848 wanted 9074 found 9055 parent transid verify failed on 2703884763136 wanted 9074 found 9060 parent transid verify failed on 2703883599872 wanted 9074 found 9034 parent transid verify failed on 2703920717824 wanted 9066 found 7543 parent transid verify failed on 2703912325120 wanted 9066 found 7543 parent transid verify failed on 2703912034304 wanted 9066 found 7543 parent transid verify failed on 2703881900032 wanted 9071 found 9060 parent transid verify failed on 2703881793536 wanted 9069 found 9057 bad block 2703860367360 Extent back ref already exists for 2703873536000 parent 0 root 2 bad block 2703860621312 bad block 2703861547008 Extent back ref already exists for 2703876689920 parent 0 root 2 Extent back ref already exists for 2703881900032 parent 0 root 2 Extent back ref already exists for 2703879290880 parent 0 root 2 Extent back ref already exists for 2703873753088 parent 0 root 2 parent transid verify failed on 2703921885184 wanted 9066 found 7543 parent transid verify failed on 2703921889280 wanted 9066 found 7543 parent transid verify failed on 2703879036928 wanted 
9069 found 9061 parent transid verify failed on 2703881867264 wanted 9075 found 9065 parent transid verify failed on 2703873536000 wanted 9074 found 9062 parent transid verify failed on 2703883190272 wanted 9075 found 9061 parent transid verify failed on 2703869997056 wanted 9073 found 9060 parent transid verify failed on 2703922012160 wanted 9066 found 7543 parent transid verify failed on 2703921975296 wanted 9066 found 7543 parent transid verify failed on 2703867707392 wanted 9071 found 9060 parent transid verify failed on 2703922679808 wanted 9066 found 7543 parent transid verify failed on 2703922032640 wanted 9066 found 7543 parent transid verify failed on 2703881891840 wanted 9075 found 9057 parent transid verify failed on 2703882297344 wanted 9075 found 9061 parent transid verify failed on 2703884488704 wanted 9074 found 9057 parent transid verify failed on 2703884353536 wanted 9074 found 9057 parent transid verify failed on 2703884365824 wanted 9074 found 9055 parent transid verify failed on 2703921500160 wanted 9066 found 7543 parent transid verify failed on 2703883177984 wanted 9075 found 9061 parent transid verify failed on 2703921487872 wanted 9066 found 7543 parent transid verify failed on 2703922683904 wanted 9066 found 7543 parent transid verify failed on 2703873753088 wanted 9074 found 9062 parent transid verify failed on 2703874314240 wanted 9074 found 9056 Extent back ref already exists for 2703865823232 parent 0 root 2 Extent back ref already exists for 2703866810368 parent 0 root 2 Extent back ref already exists for 2703866986496 parent 0 root 2 Extent back ref already exists for 2703867031552 parent 0 root 2 Extent back ref already exists for 2703867625472 parent 0 root 2 Extent back ref already exists for 2703867609088 parent 0 root 2 Extent back ref already exists for 2703868829696 parent 0 root 2 Extent back ref already exists for 2703869734912 parent 0 root 2 Extent back ref already exists for 2703870255104 parent 0 root 2 Extent back ref 
already exists for 2703870562304 parent 0 root 2 Extent back ref already exists for 2703871201280 parent 0 root 2 Extent back ref already exists for 2703871168512 parent 0 root 2 Extent back ref already exists for 2703873040384 parent 0 root 2 Extent back ref already exists for 2703872610304 parent 0 root 2 Extent back ref already exists for 2703874686976 parent 0 root 2 Extent back ref already exists for 2703873318912 parent 0 root 2 Extent back ref already exists for 2703873740800 parent 0 root 2 Extent back ref already exists for 2703874465792 parent 0 root 2 Extent back ref already exists for 2703876370432 parent 0 root 2 Extent back ref already exists for 2703877046272 parent 0 root 2 Extent back ref already exists for 2703877050368 parent 0 root 2 Extent back ref already exists for 2703878647808 parent 0 root 2 Extent back ref already exists for 2703876407296 parent 0 root 2 Extent back ref