Re: fs was hung

2018-02-24 Thread Rich Rauenzahn
straightforward! :) :) On Wed, Feb 21, 2018 at 8:50 PM, Rich Rauenzahn <rraue...@gmail.com> wrote: > I have a mount point that became ... hung for lack of a better word. > > I was doing a LARGE sort of a file in it, using temporary files in that > directory, and the > sys

Re: fs was hung and sort of full

2018-02-23 Thread Rich Rauenzahn
It is still hanging after all. Currently, it hangs on mount: [154422.778624] mount D0 27894 27742 0x0084 [154422.778625] Call Trace: [154422.779018] __schedule+0x28d/0x880 [154422.779494] schedule+0x36/0x80 [154422.779886] io_schedule+0x16/0x40 [154422.780288]

fs was hung and sort of full

2018-02-21 Thread Rich Rauenzahn
I have a mount point that became ... hung for lack of a better word. I was doing a LARGE sort of a file in it, using temporary files in that directory, and the system hung and probably the watchdog kicked in, reset the system. Upon reboot, the filesystem hung if you touched it. Processes were

Re: kernel hangs during balance

2017-12-23 Thread Rich Rauenzahn
I finally got a full stack trace via sysrq. A lot of stacks seem to end in page fault -- I wonder if it's because my page file is a loopback on btrfs. https://pastebin.com/GyWAu1EP $ cat /proc/cmdline BOOT_IMAGE=/vmlinuz-4.14.8-1.el7.elrepo.x86_64 root=UUID=35f0ce3f-0902-47a3-8ad8-86179d1f3e3a

Re: kernel hangs during balance

2017-12-20 Thread Rich Rauenzahn
I switched to the LT kernel because of this issue. I was running mainline and thought that LT would get me stability. I can switch back to LT while we RCA. At the risk of changing two things, I could add that (scsi_mod.use_blk_mq=n) to my boot and also switch back to ML. I do notice that disk

Re: kernel hangs during balance

2017-12-20 Thread Rich Rauenzahn
4681.220565] [] kthread_worker_fn+0x7b/0x170 Dec 20 01:14:51 tendo [64681.221056] [] ? kthread_create_on_node+0x1a0/0x1a0 Dec 20 01:14:51 tendo [64681.221533] [] kthread+0xe5/0x100 Dec 20 01:14:51 tendo [64681.222009] [] ? kthread_park+0x60/0x60 Dec 20 01:14:51 tendo [64681.222485] [] ret_from

Re: kernel hangs during balance

2017-12-19 Thread Rich Rauenzahn
On Tue, Dec 19, 2017 at 9:14 AM, Hans van Kranenburg wrote: > Just wild first guess... are you also using btrfs send/receive > functionality where the system having problems is the sending part? No. >>> Every night I'm getting a kernel hang, sometimes caught by

Re: kernel hangs during balance

2017-12-19 Thread Rich Rauenzahn
What's also confusing is I just ran a manual balance on the fs using defaults (which are aggressive) and it completed with no problems. It smells more like a race condition than a particular corruption. On Tue, Dec 19, 2017 at 8:09 AM, Rich Rauenzahn <rraue...@gmail.com> wrote: > I'

kernel hangs during balance

2017-12-19 Thread Rich Rauenzahn
I'm running 4.4.106-1.el7.elrepo.x86_64 and I do a btrfs balance every night. Every night I'm getting a kernel hang, sometimes caught by my watchdog, sometimes not. Last night's hang was on the balance of DATA on / at 70. I'm not sure how to further trace this down to help you -- the console by

Re: WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-20 Thread Rich Rauenzahn
On 9/20/2017 9:58 AM, Rich Rauenzahn wrote: What's the most direct way to do that?  (Was about to risk breaking the mirror and repartitioning!  I'd rather not!) Hmm -- maybe this worked: $ sudo btrfs filesystem resize -1m /.MEDIA/ Resize '/.MEDIA/' of '-1m' No, doesn't seem to have worked

Re: WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-20 Thread Rich Rauenzahn
On 9/19/2017 10:39 PM, Qu Wenruo wrote: In v4.13 kernel, the newly added devices are in fact rounded down. But existing device doesn't get the round down. So it's recommended to resize (shrink) your fs for very small size to fix it if you don't want to wait for the kernel fix. What's
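The "round down" remark above can be illustrated with a little arithmetic. This is only a sketch: it assumes the warning fires when a device's total size is not a multiple of the sector size, and it assumes a 4 KiB sector size. The function names and the example size are hypothetical and are not taken from the kernel source or from the thread.

```python
# Assumption: 4 KiB sector size; the real value depends on the filesystem.
SECTOR_SIZE = 4096


def round_down(value, alignment):
    """Return the largest multiple of `alignment` that is <= `value`."""
    return value - (value % alignment)


def shrink_needed(total_bytes, alignment=SECTOR_SIZE):
    """Return how many bytes must be trimmed to make `total_bytes` aligned."""
    return total_bytes - round_down(total_bytes, alignment)


# Hypothetical device size: ten full sectors plus a 512-byte tail.
example_total = 10 * SECTOR_SIZE + 512

print("aligned size:", round_down(example_total, SECTOR_SIZE))
print("bytes to trim:", shrink_needed(example_total))
```

Under these assumptions, any shrink that removes at least the unaligned tail and leaves an aligned size would bring the device back to an aligned total, which would be why a very small resize is suggested.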

Re: WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-20 Thread Rich Rauenzahn
For the warning -- is there any way to add in the filesystem/disk causing the issue?  I didn't see any identifier in the message that told me which it was. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More

Re: WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-19 Thread Rich Rauenzahn
On 9/19/2017 5:31 PM, Qu Wenruo wrote: On 2017年09月19日 23:56, Rich Rauenzahn wrote: [    4.747377] WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs] Is that line the following WARN_ON()? --- static inline void btrfs_set_device_total_bytes(struct

WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-19 Thread Rich Rauenzahn
I've filed a bug on this kernel trace -- I get hundreds of these a day. I'd like to make them go away. https://bugzilla.kernel.org/show_bug.cgi?id=196949 [4.747356] [ cut here ] [4.747377] WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559

Re: btrfs_remove_chunk call trace?

2017-09-12 Thread Rich Rauenzahn
On 9/11/2017 1:35 PM, Duncan wrote: Rich Rauenzahn posted on Sun, 10 Sep 2017 22:45:50 -0700 as excerpted: ...and can it be related to the Samsung 840 SSD's not supporting NCQ Trim? (Although I can't tell which device this trace is from -- it could be a mechanical Western Digital.) On Sun

Re: btrfs_remove_chunk call trace?

2017-09-10 Thread Rich Rauenzahn
...and can it be related to the Samsung 840 SSD's not supporting NCQ Trim? (Although I can't tell which device this trace is from -- it could be a mechanical Western Digital.) On Sun, Sep 10, 2017 at 10:16 PM, Rich Rauenzahn <rraue...@gmail.com> wrote: > Is this something to be concer

btrfs_remove_chunk call trace?

2017-09-10 Thread Rich Rauenzahn
Is this something to be concerned about? I'm running the latest mainline kernel on CentOS 7. [ 1338.882288] [ cut here ] [ 1338.883058] WARNING: CPU: 2 PID: 790 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs] [ 1338.883809] Modules linked in: xt_nat veth

btrfs corrupt message when mounting (but still mounts and no scrub errors.)

2017-08-21 Thread Rich Rauenzahn
I'm getting messages like this when mounting: [4.034300] BTRFS info (device sdg3): bdev /dev/sdg3 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0 [4.034828] BTRFS info (device sdg3): bdev /dev/sdf3 errs: wr 0, rd 0, flush 0, corrupt 68, gen 0 But it mounts fine, and scrub says it is fine: $
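For anyone wanting to watch these counters over time, here is a minimal sketch that pulls the per-device numbers out of log lines shaped like the two quoted above. The regular expression is written against those two lines only and is an assumption about the general format; the function name `parse_error_counters` is hypothetical. As far as I know, these counters are cumulative and stay non-zero after the underlying problem is repaired, which would be consistent with a clean scrub.

```python
import re

# Matches the per-device counter portion of the mount-time log line.
# Assumption: the field order is always wr, rd, flush, corrupt, gen.
ERRS_RE = re.compile(
    r"bdev (?P<bdev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_error_counters(log_text):
    """Return {device: {counter_name: value}} for every matching log line."""
    counters = {}
    for match in ERRS_RE.finditer(log_text):
        fields = match.groupdict()
        bdev = fields.pop("bdev")
        counters[bdev] = {name: int(value) for name, value in fields.items()}
    return counters


# The two lines quoted in the message above.
sample = """\
[4.034300] BTRFS info (device sdg3): bdev /dev/sdg3 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
[4.034828] BTRFS info (device sdg3): bdev /dev/sdf3 errs: wr 0, rd 0, flush 0, corrupt 68, gen 0
"""

for bdev, stats in parse_error_counters(sample).items():
    # Report only the counters that are non-zero for each device.
    nonzero = {name: value for name, value in stats.items() if value}
    print(bdev, nonzero)
```

Feeding it the output of dmesg or the journal would give one dictionary per device, which can be compared between boots to see whether the corrupt count is still growing.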

Re: btrfs full, but not full, can't rebalance

2015-07-12 Thread Rich Rauenzahn
Just a final note -- I'm finally back in person with the CentOS 7 server and so booted it to the latest kernel-ml from elrepo. It is a 4.1 kernel. But while still remote with the older 3.10 kernel, I also tried doing a 'mount -oremount,clear_cache /' I can't swear it helped, but things did seem

Re: btrfs full, but not full, can't rebalance

2015-07-03 Thread Rich Rauenzahn
I'm still seeing periodic issues here. The filesystem will go full occasionally even though there appears to be plenty of space. Running another rebalance seems to fix it ... but I don't see why the system thinks it needs to be rebalanced. # while ! btrfs balance start /; do btrfs fi show /;

Re: btrfs full, but not full, can't rebalance

2015-07-03 Thread Rich Rauenzahn
thanks to Donald for getting me to this point! On Thu, Jul 2, 2015 at 10:57 PM, Rich Rauenzahn rraue...@gmail.com wrote: Yes, I tried that -- and adding the loopback device. # btrfs device add /dev/loop1 / Performing full device TRIM (5.00GiB) ... # btrfs fi show / Label: 'centos7' uuid

Re: btrfs full, but not full, can't rebalance

2015-07-03 Thread Rich Rauenzahn
: Because this is raid1 I believe you need another for that to work. On Fri, Jul 3, 2015 at 12:57 AM, Rich Rauenzahn rraue...@gmail.com wrote: Yes, I tried that -- and adding the loopback device. # btrfs device add /dev/loop1 / Performing full device TRIM (5.00GiB) ... # btrfs fi show / Label

Re: btrfs full, but not full, can't rebalance

2015-07-02 Thread Rich Rauenzahn
donaldwhpear...@gmail.com wrote: Have you seen this article? I think the interesting part for you is the balance cannot run because the filesystem is full heading. http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html On Fri, Jul 3, 2015 at 12:32 AM, Rich

btrfs full, but not full, can't rebalance

2015-07-02 Thread Rich Rauenzahn
Running on CentOS7 ... / got full, I removed the files, but it still thinks it is full. I've tried following the FAQ, even adding a loopback device during the rebalance. # btrfs fi show / Label: 'centos7' uuid: 35f0ce3f-0902-47a3-8ad8-86179d1f3e3a Total devices 2 FS bytes used 24.27GiB

Re: How do I make 'btrfs scrub' report errors via email?

2015-06-13 Thread Rich Rauenzahn
On 6/13/2015 6:48 AM, crocket wrote: I can check the result of 'btrfs scrub' later, but I don't want to take time to actually check it. Does anyone know how to make 'btrfs scrub' report errors via email? It seems google doesn't know. I use this job in /etc/cron.d/btrfs.scrub.cron : 0 0 * *

checksum error on vmem file

2015-06-05 Thread Rich Rauenzahn
I get this error occasionally on the same file. I suspect it is some interaction between vmware server and btrfs. Besides this error, I have no ill effects. Is this a known issue? Is it something that ought to be investigated to improve btrfs? I might be able to help find the right folks at

Re: Two uncorrectable errors across RAID1 at same logical block?

2014-10-11 Thread Rich Rauenzahn
think I'm picking a file in /tmp with that name, which is a block I dumped. Renamed it. $ sudo ./btrfs-debug-tree /dev/sdf3 | grep 58464632832 Now it returns nothing... On Sat, Oct 11, 2014 at 1:52 AM, Liu Bo bo.li@oracle.com wrote: On Thu, Oct 09, 2014 at 09:58:03AM -0700, Rich Rauenzahn

Re: Two uncorrectable errors across RAID1 at same logical block?

2014-10-09 Thread Rich Rauenzahn
On 10/9/2014 12:13 AM, Liu Bo wrote: sudo ./btrfs inspect-internal logical-resolve -v 58464632832 / $ sudo ./btrfs inspect-internal logical-resolve -v 58464632832 / ioctl ret=0, total_size=4096, bytes_left=4080, bytes_missing=0, cnt=0, missed=0 I also tried -P and -s 1 Also

Re: Two uncorrectable errors across RAID1 at same logical block?

2014-10-08 Thread Rich Rauenzahn
On 10/8/2014 7:20 AM, Liu Bo wrote: On Mon, Oct 06, 2014 at 07:18:06PM -0700, Rich Rauenzahn wrote: On 10/6/2014 7:05 PM, Liu Bo wrote: btrfs inspect-internal logical-resolve 58464632832 $ sudo btrfs inspect-internal logical-resolve 58464632832 / ...no output? Hmm...have you tried

Re: Two uncorrectable errors across RAID1 at same logical block?

2014-10-06 Thread Rich Rauenzahn
On 10/6/2014 7:05 PM, Liu Bo wrote: btrfs inspect-internal logical-resolve 58464632832 $ sudo btrfs inspect-internal logical-resolve 58464632832 / ...no output?