Re: btrfs-transaction blocked for more than 120 seconds
Dear Chris!

Certainly: I have 3 HDDs, all of them WD20EARS. Originally I wanted to let btrfs handle all 3 devices directly without partitions, but this was impossible, as at least /boot needed to be ext4, at least back when I set up the server. Back then btrfs also had no raid5-like functionality, so I decided to put good old partitions, md RAIDs and LVM on the disks and to use btrfs just as plain filesystems on the logical volumes provided by LVM.

On the WD disks I thus created 2 partitions each: the first, sdX1, is ~500 MiB; the rest, about 2.0 TB (1.82 TiB), is one partition, sdX2.

I built a RAID1 on the 3 small partitions sdX1 with ext4 for /boot; each disk is bootable, with grub installed into the MBR. I combined the 3 large partitions into a RAID5 of size 3.64 TiB.

/proc/mdstat reads:

md0 : active raid1 sda1[5] sdb1[4] sdc1[3]
      498676 blocks super 1.2 [3/3] [UUU]

md1 : active raid5 sda2[5] sdb2[4] sdc2[3]
      3904907520 blocks super 1.2 level 5, 8k chunk, algorithm 2 [3/3] [UUU]

The information you requested:

# sudo mdadm -D /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Thu Jul 14 18:49:25 2011
     Raid Level : raid5
     Array Size : 3904907520 (3724.01 GiB 3998.63 GB)
  Used Dev Size : 1952453760 (1862.01 GiB 1999.31 GB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Sun Jan 5 22:07:22 2014
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 8K

           Name : freedom:1 (local to host freedom)
           UUID : 44b72520:a78af6f7:dba13fb3:2203127d
         Events : 576884

    Number   Major   Minor   RaidDevice State
       4       8       18        0      active sync   /dev/sdb2
       5       8        2        1      active sync   /dev/sda2
       3       8       34        2      active sync   /dev/sdc2

I use the RAID5 md1 as the physical volume for LVM. pvdisplay gives:

  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               MAIN
  PV Size               3.64 TiB / not usable 2.06 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              953346
  Free PE               6274
  Allocated PE          947072
  PV UUID               WcuEx8-ehJL-xHdf-ElwF-b9s3-dlmM-KZlDNG

I keep a reserve of 6274 extents of 4 MiB each (about 24.5 GiB) in case one of the logical volumes runs out of space...

I created the following logical volumes, named after their intended mountpoints:

  --- Logical volume ---
  LV Path                /dev/MAIN/ROOT
  LV Name                ROOT
  VG Name                MAIN
  LV UUID                kURJks-xHox-73B5-n02x-eZfS-agDD-n1dtAm
  LV Write Access        read/write
  LV Creation host, time ,
  LV Status              available
  # open                 1
  LV Size                19.31 GiB
  Current LE             4944
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           252:0

and similar:

  --- Logical volume ---
  LV Path  /dev/MAIN/SWAP:  1.8 GB
  LV Path  /dev/MAIN/HOME:  18.6 GB
  LV Path  /dev/MAIN/TMP:   9.3 GB
  LV Path  /dev/MAIN/DATA1: 2.6 TB
  LV Path  /dev/MAIN/DATA2: 0.9 TB

As filesystem I chose btrfs during installation from an Ubuntu Server image (I don't recall which; it might have been 11.10 or 12.04) for all logical volumes except swap, of course.

Is there any other information I can supply?

regards,
Sulla

--
Cogito cogito ergo cogito sum.
Ambrose Bierce

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
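
The sizes quoted in this message can be cross-checked with a little shell arithmetic. This is only a sketch: the numbers are copied from the mdadm and pvdisplay output above, and the two function names are made up for illustration.

```shell
#!/bin/sh
# Figures copied from the mdadm -D and pvdisplay output in the message.
used_dev_kib=1952453760   # "Used Dev Size" of one RAID member, in KiB
raid_devices=3            # "Raid Devices"
free_pe=6274              # "Free PE"
pe_size_mib=4             # "PE Size" in MiB

# RAID5 spends one member's worth of space on parity,
# so the usable size is (members - 1) * member size.
raid5_usable_kib() {
    echo $(( ($2 - 1) * $1 ))
}

# Size of the unallocated LVM reserve in MiB.
free_reserve_mib() {
    echo $(( $1 * $2 ))
}

raid5_usable_kib "$used_dev_kib" "$raid_devices"
free_reserve_mib "$free_pe" "$pe_size_mib"
```

The first result should match the "Array Size" line reported by mdadm; the second is the reserve in MiB, which can be divided by 1024 to get GiB.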
Re: btrfs-transaction blocked for more than 120 seconds
Thanks Chris! Thanks for your support.

> echo 120 > /sys/block/sdX/device/timeout

The timeout is 30 for my HDDs. I'm well aware that the WD Green HDDs are not the perfect ones for servers, but they were cheaper - and quieter - than the Black ones for servers. I'll get the Red ones next, though. ;-)

> You also need to schedule regular scrubs at the md level as well.

Ubuntu does that once a month.

> cat /sys/block/mdX/mismatch_cnt

On my machine this resides in /sys/devices/virtual/block/md1/md/mismatch_cnt. The count is zero.

> The workload is presumably small file sizes, like a mail server?

Yes. It serves as a mail server (maildir format), but also as a Samba file server with quite big files...

btrfs ran fine for more than a year, so I'm not sure how reproducible the problem is...

I don't really wish to install or compile custom kernels, to be honest. I'm not sure how problematic they might be during the next do-release-upgrade...

Sulla

--
Russian Roulette is not the same without a gun and baby when it's love, if it's not rough, it isn't fun, fun.
Lady GaGa, Pokerface
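
The two sysfs checks discussed in this message can be wrapped in a small read-only script. This is a rough sketch, not something from the thread: the function names and the SYSFS variable are invented here, and the paths assume the usual /sys/block/sdX/device/timeout and /sys/block/mdX/md/mismatch_cnt layout.

```shell
#!/bin/sh
# SYSFS defaults to the real /sys; it can be pointed at a test directory.
SYSFS="${SYSFS:-/sys}"

# Print "<disk> <timeout in seconds>" for every sdX disk.
show_disk_timeouts() {
    for f in "$SYSFS"/block/sd*/device/timeout; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        disk=${f#"$SYSFS"/block/}        # strip the leading path
        disk=${disk%%/*}                 # keep only the device name
        printf '%s %s\n' "$disk" "$(cat "$f")"
    done
}

# Print "<array> <mismatch_cnt>" for every md array.
show_md_mismatches() {
    for f in "$SYSFS"/block/md*/md/mismatch_cnt; do
        [ -r "$f" ] || continue
        md=${f#"$SYSFS"/block/}
        md=${md%%/*}
        printf '%s %s\n' "$md" "$(cat "$f")"
    done
}
```

Both functions only read values. Changing the timeout, as in the quoted echo command, needs root and does not survive a reboot.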
Re: btrfs-transaction blocked for more than 120 seconds
Oh gosh, I don't know what went wrong with my btrfs root filesystem, and I probably never will:

The sudo btrfs balance start / was running fine for about 4 or 5 hours at a system load of ~3, when btrfs balance status / told me the balancing was on its way and had completed 19 out of 23 chunks. At this moment the system load started to increase and increase and increase, and when it reached 147 (!!) (while top was showing me NOTHING was going on) I reset the computer.

TTY1 showed some kernel panics and btrfs bug messages, but those messages were lost because they never made it to disk.

Fortunately my RAID5 stayed in sync and everything was fine. The system also booted, but with the same 120+ second hangs as before. The system was unusable, as e.g. all IMAP logins timed out.

So
* I booted into a live CD
* mounted a backup disk
* cp-ed all files of the root fs to the backup disk (it could read them flawlessly)
* formatted the root partition to ext4 (yes, I feel sad about it)
* cp-ed all root files from the backup disk to the ext4 root filesystem
* struck the subvol=@ boot argument from /boot/grub/grub.cfg
* and rebooted my server.

How I love linux! Wouldn't be possible with M$!!

Now it's running fine again, and the system is as responsive as it should be. No clue 'bout what went wrong, though.

I still have /home and the huge data partitions on btrfs and plan to leave it so. While it would not be difficult to put /home on ext4, it would be a major effort to cp the ~3 TB of data off and back onto the disks...

Thanx for your support,
Sulla
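
The grub.cfg step from the list above could be done with sed instead of by hand. The sketch below is a guess at what that edit looks like: it assumes the argument appears as the token rootflags=subvol=@ on the kernel line (the usual Ubuntu form; the message only says "subvol=@"), and the function name is made up.

```shell
#!/bin/sh
# Read kernel command lines on stdin and drop a "rootflags=subvol=@" token.
# Assumption: the token is preceded by whitespace and followed by
# whitespace or end of line, so "subvol=@home" is left alone.
strip_subvol_arg() {
    sed -E 's/[[:space:]]+rootflags=subvol=@([[:space:]]|$)/\1/g'
}

# Illustrative input only; the kernel path and root= value are hypothetical.
echo "linux /vmlinuz root=/dev/mapper/MAIN-ROOT ro rootflags=subvol=@ quiet" \
    | strip_subvol_arg
```

It is safer to run this on a copy of grub.cfg first and compare the result with diff before replacing the real file.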
Re: btrfs-transaction blocked for more than 120 seconds
Dear Duncan!

Thanks very much for your exhaustive answer.

Hm, I also thought of fragmentation, although I don't think it is very likely, as my server doesn't serve things that typically cause fragmentation. It is a mail server (but only maildir format), a file server for Windows clients (huge files that hardly ever get rewritten), a server for TV recordings (but it only copies recordings from a sat receiver after they have been recorded, so no heavy rewriting here), a tiny web server and all kinds of such things, but not storage for huge databases, virtual machines or a target for filesharing clients. It does, however, serve as a target for a hardlink-based backup program run on Windows PCs, but only once per month or so, so that shouldn't be too much.

The problem must lie somewhere on the root partition itself, because the system is already slow before mounting the fat data partitions.

I'll give the defragmentation a try. But

# sudo btrfs filesystem defrag -r

doesn't work, because -r is an unknown option (I'm running Btrfs v0.20-rc1 on an Ubuntu 3.11.0-14-generic kernel).

I'm doing a

# sudo btrfs filesystem defrag /

on the root directory at the moment.

Question: will this defragment everything or just the root fs, and will I need to run a defragment on /home as well, as /home is a separate btrfs filesystem?

I've also added the autodefrag mount option and will do a mount -a after the defragmentation.

I've considered a

# sudo btrfs balance start

as well; would this do any good? How close should I let the data fill the partition? The large data partitions are 85% used, root is 70% used. Is this safe or should I add space?

Thanx,
Wolfgang
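
For btrfs-progs versions without defrag -r, one common workaround is to let find walk the tree and hand the files to btrfs filesystem defragment. The sketch below assumes that approach; the function name and the DEFRAG_CMD override are invented here so the walk can be dry-run with echo first.

```shell
#!/bin/sh
# Defragment every regular file below a directory.
# -xdev keeps find on one filesystem, so a separately mounted /home
# (and, on btrfs, other subvolumes) would need their own run.
# DEFRAG_CMD is deliberately left unquoted so it splits into words.
defrag_tree() {
    dir=$1
    find "$dir" -xdev -type f \
        -exec ${DEFRAG_CMD:-btrfs filesystem defragment} {} +
}
```

Setting DEFRAG_CMD=echo and calling defrag_tree / only prints the file names that would be processed; without the override the real command runs and needs root.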
btrfs-transaction blocked for more than 120 seconds
Dear all!

On my Ubuntu Server 13.10 I use a RAID5 block device consisting of 3 WD20EARS drives. On top of this I built an LVM, and in this LVM I use quite normal partitions /, /home and swap (/boot resides on a RAID1), and also a custom /data partition. Everything (except /boot and swap) is on btrfs.

Sometimes my system hangs for quite some time (top shows a high wait percentage), then runs on normally. I get kernel messages in /var/log/syslog, see below. I am unable to make any sense of the kernel messages; there is no reference to the filesystem or drive affected (at least I cannot find one).

Question: What is happening here?
* Is a HDD failing? (SMART looks good, however.)
* Is something wrong with my btrfs filesystem? With which one?
* How can I find the cause?

thanks,
Wolfgang

Dec 31 12:27:49 freedom kernel: [ 4681.264112] INFO: task btrfs-transacti:529 blocked for more than 120 seconds.
Dec 31 12:27:49 freedom kernel: [ 4681.264239] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 31 12:27:49 freedom kernel: [ 4681.264367] btrfs-transacti D 88013fc14580 0 529 2 0x
Dec 31 12:27:49 freedom kernel: [ 4681.264377] 880138345e10 0046 880138345fd8 00014580
Dec 31 12:27:49 freedom kernel: [ 4681.264386] 880138345fd8 00014580 880135615dc0 880132fb6a00
Dec 31 12:27:49 freedom kernel: [ 4681.264393] 880133f45800 880138345e30 880137ee2000 880137ee2070
Dec 31 12:27:49 freedom kernel: [ 4681.264402] Call Trace:
Dec 31 12:27:49 freedom kernel: [ 4681.264418] [816eaa79] schedule+0x29/0x70
Dec 31 12:27:49 freedom kernel: [ 4681.264477] [a032a57d] btrfs_commit_transaction+0x34d/0x980 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.264487] [81085580] ? wake_up_atomic_t+0x30/0x30
Dec 31 12:27:49 freedom kernel: [ 4681.264517] [a0321be5] transaction_kthread+0x1a5/0x240 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.264548] [a0321a40] ? verify_parent_transid+0x150/0x150 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.264557] [810847b0] kthread+0xc0/0xd0
Dec 31 12:27:49 freedom kernel: [ 4681.264565] [810846f0] ? kthread_create_on_node+0x120/0x120
Dec 31 12:27:49 freedom kernel: [ 4681.264573] [816f566c] ret_from_fork+0x7c/0xb0
Dec 31 12:27:49 freedom kernel: [ 4681.264580] [810846f0] ? kthread_create_on_node+0x120/0x120
Dec 31 12:27:49 freedom kernel: [ 4681.264610] INFO: task kworker/u4:0:9975 blocked for more than 120 seconds.
Dec 31 12:27:49 freedom kernel: [ 4681.264722] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 31 12:27:49 freedom kernel: [ 4681.264847] kworker/u4:0 D 88013fd14580 0 9975 2 0x
Dec 31 12:27:49 freedom kernel: [ 4681.264861] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-4)
Dec 31 12:27:49 freedom kernel: [ 4681.264865] 8800a8739538 0046 8800a8739fd8 00014580
Dec 31 12:27:49 freedom kernel: [ 4681.264873] 8800a8739fd8 00014580 8801351e5dc0 8801351e5dc0
Dec 31 12:27:49 freedom kernel: [ 4681.264880] 880134c5e6a8 880134c5e6b0 880134c5e6b8
Dec 31 12:27:49 freedom kernel: [ 4681.264887] Call Trace:
Dec 31 12:27:49 freedom kernel: [ 4681.264895] [816eaa79] schedule+0x29/0x70
Dec 31 12:27:49 freedom kernel: [ 4681.264902] [816ec465] rwsem_down_write_failed+0x105/0x1e0
Dec 31 12:27:49 freedom kernel: [ 4681.264911] [8136257d] ? __rwsem_do_wake+0xdd/0x160
Dec 31 12:27:49 freedom kernel: [ 4681.264918] [81369763] call_rwsem_down_write_failed+0x13/0x20
Dec 31 12:27:49 freedom kernel: [ 4681.264927] [816e9e7d] ? down_write+0x2d/0x30
Dec 31 12:27:49 freedom kernel: [ 4681.264956] [a030fbe0] cache_block_group+0x290/0x3b0 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.264963] [81085580] ? wake_up_atomic_t+0x30/0x30
Dec 31 12:27:49 freedom kernel: [ 4681.264991] [a0317d48] find_free_extent+0xa38/0xac0 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.265022] [a0317ef2] btrfs_reserve_extent+0xa2/0x1c0 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.265056] [a033103d] __cow_file_range+0x15d/0x4a0 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.265090] [a0331efa] cow_file_range+0x8a/0xd0 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.265122] [a0332290] run_delalloc_range+0x350/0x390 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.265158] [a0346bf1] ? find_lock_delalloc_range.constprop.42+0x1d1/0x1f0 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.265194] [a0348764] __extent_writepage+0x304/0x750 [btrfs]
Dec 31 12:27:49 freedom kernel: [ 4681.265202] [8109a1d5] ? set_next_entity+0x95/0xb0
Dec 31 12:27:49 freedom kernel: [ 4681.265212] [810115c6] ? __switch_to+0x126/0x4b0
Dec 31 12:27:49 freedom kernel: [