btrfs hang on rsync in linux 4.16.8

2018-05-17 Thread E V
Running an rsync -av --del as the only process hitting my btrfs backup filesystem. rsync is now stuck and so is all other access to the filesystem. Looking at ps it seems the btrfs-cleaner is running, so maybe that deadlocked with the rsync. Stack for the rsync: [<0>]

Hang/deadlock in 4.14.40 running rsync

2018-05-14 Thread E V
Running an rsync to copy data onto a btrfs filesystem used for backups. It appears to deadlock/hang. The rsync process stops doing any IO, and no other IO is ongoing against the filesystem. The rsync is in state D in ps and is unkillable even with kill -9. The /proc/<pid>/stack for the rsync: []
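
For anyone wanting to collect the same information, here is a minimal Python sketch that lists processes in state D and prints their kernel stacks. It is not from the original post: the helper names (parse_stat, blocked_tasks, kernel_stack) are made up for illustration, it only looks at each process's main thread, and reading /proc/<pid>/stack normally needs root.

```python
import os

def parse_stat(stat_line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc="/proc"):
    """Yield (pid, comm) for processes in uninterruptible sleep (state D)."""
    try:
        entries = os.listdir(proc)
    except OSError:
        return  # no procfs at this path
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited or stat was unreadable
        if state == "D":
            yield pid, comm

def kernel_stack(pid, proc="/proc"):
    """Return the contents of /proc/<pid>/stack (usually root only)."""
    with open(os.path.join(proc, str(pid), "stack")) as f:
        return f.read()

if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
        try:
            print(kernel_stack(pid))
        except OSError as err:
            print("  could not read stack:", err)
```

Running it as root while the rsync is hung should print the stuck rsync and any btrfs kernel threads that are blocked alongside it.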

Re: Status of FST and mount times

2018-02-21 Thread E V
On Wed, Feb 21, 2018 at 9:49 AM, Ellis H. Wilson III wrote: > On 02/20/2018 08:49 PM, Qu Wenruo wrote: > > On 2018-02-16 22:12, Ellis H. Wilson III wrote: >> >> $ sudo btrfs-debug-tree -t chunk /dev/sdb | grep CHUNK_ITEM | wc -l >> 3454 > > >>

Re: btrfs-cleaner / snapshot performance analysis

2018-02-13 Thread E V
On Mon, Feb 12, 2018 at 10:37 AM, Ellis H. Wilson III wrote: > On 02/11/2018 01:24 PM, Hans van Kranenburg wrote: >> >> Why not just use `btrfs fi du ` now and then and >> update your administration with the results? .. Instead of putting the >> burden of keeping track of

Re: Slow mounting raid1

2017-08-01 Thread E V
On Tue, Aug 1, 2017 at 2:43 AM, Leonidas Spyropoulos wrote:
> Hi Duncan,
> Thanks for your answer
In general I think btrfs takes time proportional to the size of your metadata to mount. Bigger and/or fragmented metadata leads to longer mount times. My big backup fs with

Re: [PATCH v2 3/4] btrfs: Add zstd support

2017-06-30 Thread E V
On Thu, Jun 29, 2017 at 3:41 PM, Nick Terrell wrote: > Add zstd compression and decompression support to BtrFS. zstd at its > fastest level compresses almost as well as zlib, while offering much > faster compression and decompression, approaching lzo speeds. > > I benchmarked

enospc_debug seems to need more info, or btrfs file usage bug?

2017-05-19 Thread E V
$ grep btrfs /proc/mounts
/dev/sdb /mirror btrfs rw,noatime,compress=zlib,space_cache=v2,enospc_debug,subvolid=5,subvol=/ 0 0
Filesystem on 4.9.18 went read-only w/ ENOSPC. Now it is close to full, but can't really tell why it actually filled up at that time. Here's the dmesg: [1491582.099306]
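
As a rough illustration of checking which options a btrfs filesystem is actually mounted with (for example, confirming that enospc_debug and space_cache=v2 took effect), the /proc/mounts line above can be parsed as sketched below. The function names are hypothetical and not part of any btrfs tool.

```python
def parse_mounts_line(line):
    """Split one /proc/mounts line into its fields, with options as a dict."""
    device, mountpoint, fstype, opts, _dump, _passno = line.split()
    options = {}
    for opt in opts.split(","):
        key, _, value = opt.partition("=")
        # Bare flags such as noatime map to True; key=value pairs keep the value.
        options[key] = value or True
    return {"device": device, "mountpoint": mountpoint,
            "fstype": fstype, "options": options}

def btrfs_mounts(path="/proc/mounts"):
    """Return parsed entries for every btrfs filesystem listed in `path`."""
    with open(path) as f:
        entries = [parse_mounts_line(line) for line in f if line.strip()]
    return [e for e in entries if e["fstype"] == "btrfs"]

if __name__ == "__main__":
    line = ("/dev/sdb /mirror btrfs rw,noatime,compress=zlib,space_cache=v2,"
            "enospc_debug,subvolid=5,subvol=/ 0 0")
    print(parse_mounts_line(line)["options"])
```

Splitting on whitespace is safe here because /proc/mounts escapes spaces in paths as \040.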

Way to force allocation of more metadata?

2017-02-16 Thread E V
It would be nice if there was an easy way to tell btrfs to allocate another metadata chunk. For example, the below fs is full due to exhausted metadata:
Device size: 1013.28GiB
Device allocated: 1013.28GiB
Device unallocated: 2.00MiB
Device
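
The situation in those numbers (everything allocated, almost nothing unallocated, so no room for a new metadata chunk) can be detected mechanically. The following is only a sketch: it parses "label: size" lines in the style shown above, and the 1 GiB threshold in chunk_starved is an arbitrary assumption for illustration, not a btrfs constant.

```python
import re

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30,
         "TiB": 2**40, "PiB": 2**50}
LINE = re.compile(
    r"^\s*(?P<label>[^:]+):\s*(?P<num>[\d.]+)(?P<unit>[KMGTP]iB|B)\s*$")

def parse_overall(text):
    """Map labels such as 'Device unallocated' to a size in bytes."""
    fields = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:  # lines without a plain size (ratios, truncated lines) are skipped
            fields[m["label"].strip()] = float(m["num"]) * UNITS[m["unit"]]
    return fields

def chunk_starved(fields, threshold=2**30):
    """True if unallocated space is below `threshold` bytes (assumed 1 GiB)."""
    return fields.get("Device unallocated", 0) < threshold

if __name__ == "__main__":
    sample = ("Device size: 1013.28GiB\n"
              "Device allocated: 1013.28GiB\n"
              "Device unallocated: 2.00MiB\n")
    fields = parse_overall(sample)
    print(fields, chunk_starved(fields))
```

With only 2.00MiB unallocated, the check reports the filesystem as unable to grow either data or metadata, which matches the complaint in the post.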

Re: Metadata balance fails ENOSPC

2016-12-01 Thread E V
I've frequently seen free space cache corruption lead to phantom ENOSPC. You could try clearing the space cache, and/or mounting with nospace_cache. On Thu, Dec 1, 2016 at 6:55 AM, Stefan Priebe - Profihost AG wrote: > > Am 01.12.2016 um 09:12 schrieb Andrei Borzenkov: >>

Re: [Bug 186671] New: OOM on system with just rsync running 32GB of ram 30GB of pagecache

2016-11-25 Thread E V
24078188 3451400666760 08:55:01 AM 2064900 30930356 93.74234800 26480996 2730384 3.36 24009244 5044012 50028 On Tue, Nov 22, 2016 at 9:48 AM, Vlastimil Babka <vba...@suse.cz> wrote: > On 11/22/2016 02:58 PM, E V wrote: >> System OOM'd seve

Re: [Bug 186671] New: OOM on system with just rsync running 32GB of ram 30GB of pagecache

2016-11-22 Thread E V
System OOM'd several times last night with 4.8.10, I attached the page_owner output from a morning cat ~8 hours after OOM's to the bugzilla case, split and compressed to fit under the 5M attachment limit. Let me know if you need anything else. On Fri, Nov 18, 2016 at 10:02 AM, E V <eli

Re: [Bug 186671] New: OOM on system with just rsync running 32GB of ram 30GB of pagecache

2016-11-18 Thread E V
to be the root cause of the panic so I haven't spent any time looking into that as of yet, Thanks, -Eli On Fri, Nov 18, 2016 at 6:54 AM, Tetsuo Handa <penguin-ker...@i-love.sakura.ne.jp> wrote: > On 2016/11/18 6:49, Vlastimil Babka wrote: >> On 11/16/2016 02:39 PM, E V wrote: >>>

Re: [Bug 186671] New: OOM on system with just rsync running 32GB of ram 30GB of pagecache

2016-11-16 Thread E V
System panic'd overnight running 4.9rc5 & rsync. Attached a photo of the stack trace, and the 38 call traces in a 2 minute window shortly before, to the bugzilla case for those not on its e-mail list: https://bugzilla.kernel.org/show_bug.cgi?id=186671 On Mon, Nov 14, 2016 at 3:56 PM,

Re: [Bug 186671] New: OOM on system with just rsync running 32GB of ram 30GB of pagecache

2016-11-14 Thread E V
Babka <vba...@suse.cz> wrote: > On 11/14/2016 02:27 PM, E V wrote: >> System is an intel dual socket Xeon E5620, 7500/5520/5500/X58 ICH10 >> family according to lspci. Anyways 4.8.4 OOM'd while I was gone. I'll >> download the current 4.9rc and reboot, but in the mean

Re: [Bug 186671] New: OOM on system with just rsync running 32GB of ram 30GB of pagecache

2016-11-14 Thread E V
:0kB [737778.768375] oom_reaper: reaped process 3718 (dsm_om_connsvcd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB On Fri, Nov 4, 2016 at 5:00 PM, Vlastimil Babka <vba...@suse.cz> wrote: > On 11/04/2016 03:13 PM, E V wrote: >> After the system panic'd yesterday I booted back into 4.8

Re: [Bug 186671] New: OOM on system with just rsync running 32GB of ram 30GB of pagecache

2016-11-04 Thread E V
After the system panic'd yesterday I booted back into 4.8.4 and restarted the rsyncs. I'm away on vacation next week, so when I get back I'll get rc4 or rc5 and try again. In the meantime here's data from the system running 4.8.4 without problems for about a day. I'm not familiar with xxd and

linux 4.9-rc3 OOM's anyone else?

2016-11-03 Thread E V
Looks like 4.9-rc3 still has the OOM issue since 4.7 as far as I can tell. Anyone else still seeing it? I created a kernel bug report for it: https://bugzilla.kernel.org/show_bug.cgi?id=186671. Not sure if it's btrfs that's making it worse or what not. -- To unsubscribe from this list: send the

dmesg traces from enospc_debug & kernel 4.9-rc3

2016-11-01 Thread E V
Testing 4.9-rc3 on my big backup system running rsync. Saw 10 traces over the last day; I broke them down into groups by where they differ, with a full trace at the bottom for reference. Hopefully useful to somebody. Everything is running fine so far. Can send all the full traces if someone is

4.8rc8 & OOM panic

2016-09-28 Thread E V
I just booted my backup box with 4.8rc8 and started an rsync onto btrfs and it panic'd with OOM a couple hours later. I thought the OOM problems from 4.7 were supposed to be fixed in 4.8, or did I get that wrong? No users or anything else on the system.

Re: Thoughts on btrfs RAID-1 for cold storage/archive?

2016-09-16 Thread E V
will be offsite, and hopefully only be needed if the checksum check on the data retrieved from the 1st drive fails (hopefully very infrequently). On Fri, Sep 16, 2016 at 7:45 AM, Austin S. Hemmelgarn <ahferro...@gmail.com> wrote: > On 2016-09-15 22:58, Duncan wrote: >> >> E V posted o

Thoughts on btrfs RAID-1 for cold storage/archive?

2016-09-15 Thread E V
I'm investigating using btrfs for archiving old data and offsite storage: essentially put 2 drives in btrfs RAID-1, copy the data to the filesystem and then unmount, remove a drive and take it to an offsite location. Remount the other drive -o ro,degraded until my system's slots fill up, then

Re: linux 4.7.2 & btrfs & rsync & OOM gone crazy

2016-08-29 Thread E V
Didn't go so well unfortunately, system ended up panicking: Out of memory and no killable processes. So I guess I'll be staying on 4.6 for a bit longer. On Sat, Aug 27, 2016 at 9:57 AM, E V <eliven...@gmail.com> wrote: > OOM killer still killed rsync with swappiness 0, so rebuilt and &

Re: linux 4.7.2 & btrfs & rsync & OOM gone crazy

2016-08-27 Thread E V
<to...@pipebreaker.pl> wrote: > On Fri, Aug 26, 2016 at 03:00:48PM -0400, E V wrote: >> Just upgraded from 4.6.5 to 4.7.2 for my btrfs backup server with 32GB >> of ram. Only thing that runs on it is an rsync of an NFS filesystem >> to the local btrfs. Cached mem tends

Re: linux 4.7.2 & btrfs & rsync & OOM gone crazy

2016-08-26 Thread E V
.pl> wrote: > On Fri, Aug 26, 2016 at 03:00:48PM -0400, E V wrote: >> Just upgraded from 4.6.5 to 4.7.2 for my btrfs backup server with 32GB >> of ram. Only thing that runs on it is an rsync of an NFS filesystem >> to the local btrfs. Cached mem tends to hang out around

linux 4.7.2 & btrfs & rsync & OOM gone crazy

2016-08-26 Thread E V
Just upgraded from 4.6.5 to 4.7.2 for my btrfs backup server with 32GB of ram. Only thing that runs on it is an rsync of an NFS filesystem to the local btrfs. Cached mem tends to hang out around 26-30GB, but with 4.7.2 the OOM is now going crazy and trying to kill whatever it can including my ssh

Re: Cannot balance FS (No space left on device)

2016-06-15 Thread E V
In my experience phantom ENOSPC messages are frequently due to the free space cache being corrupt. Mounting with nospace_cache or space_cache=v2 may help. On Wed, Jun 15, 2016 at 6:59 AM, ojab // wrote: > On Fri, Jun 10, 2016 at 2:58 PM, ojab // wrote: >> [Please CC

kernel 4.5.5 & space_cache=v2 early enospc, forced read-only

2016-05-20 Thread E V
Just trying space_cache=v2 on my big backup btrfs, mounted via space_cache=v2,enospc_debug,nofail,noatime,compress=zlib. Looks like something got confused during an rsync which then quickly propagated up to forcing the fs read-only in the long stack traces below. I'll be happy to test the new

RE: btrfs forced readonly + errno=-28 No space left

2016-04-21 Thread E V
> we use btrfs subvolumes for rsync-based backups. During backups btrfs often fails with "No space left" error and goes to readonly mode (dmesg output is below) while there's still plenty of unallocated space
I have the same use case and the same issue with no real solution that I've found.

Re: linux 4.4.3 oops on aborted transaction, forces FS read-only

2016-03-28 Thread E V
, but the rsync now completes every time without the fs once being forced read-only. On Fri, Mar 4, 2016 at 9:14 AM, E V <eliven...@gmail.com> wrote: > Looks like the transaction abort ends up causing the no space, if > that's at all helpful. Lots of free space seems to be irrelevant.

linux 4.4.3 oops on aborted transaction, forces FS read-only

2016-03-04 Thread E V
Looks like the transaction abort ends up causing the no space, if that's at all helpful. Lots of free space seems to be irrelevant. Any chance this will be getting better soon? Seems to happen to me a lot these days, and adding space doesn't change anything. [282713.823416] WARNING: CPU: 4 PID:

Re: 4.2.5 forced read-only -ENOSPC w/ free space

2015-11-04 Thread E V
FYI, 4.1.12 completed the big rsync without issues. Guess I'm using longterm for now. On Mon, Nov 2, 2015 at 9:53 AM, E V <eliven...@gmail.com> wrote: > During an rsync, 20TB unallocated space. Currently, no snapshots. > Should I try 4.1.12, or 4.3? > dmesg: > [122014.436612] BT

4.2.5 forced read-only -ENOSPC w/ free space

2015-11-02 Thread E V
During an rsync, 20TB unallocated space. Currently, no snapshots. Should I try 4.1.12, or 4.3? dmesg:
[122014.436612] BTRFS: error (device sde) in btrfs_run_delayed_refs:2781: errno=-28 No space left
[122014.436615] BTRFS info (device sde): forced readonly
[122014.436624] BTRFS: error (device sde)
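
When sifting through a long dmesg for these events, a small parser can pull out where each abort happened. The sketch below assumes the exact message layout quoted above ("BTRFS: error (device X) in func:line: errno=N text"); other kernel versions word these lines differently, so the pattern is an assumption and may need adjusting.

```python
import errno
import re

# Pattern tied to the message layout quoted in this post (an assumption
# about the format, not a stable kernel interface).
BTRFS_ERR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BTRFS: error \(device (?P<dev>[^)]+)\) in "
    r"(?P<func>\w+):(?P<line>\d+): errno=(?P<errno>-?\d+)\s*(?P<msg>[^\[\n]*)"
)

def btrfs_errors(log):
    """Return one dict per complete 'BTRFS: error ... errno=' message in `log`."""
    results = []
    for m in BTRFS_ERR.finditer(log):
        results.append({
            "time": float(m["ts"]),
            "device": m["dev"],
            "function": m["func"],
            "line": int(m["line"]),
            "errno": int(m["errno"]),
            "message": m["msg"].strip(),
        })
    return results

if __name__ == "__main__":
    sample = (
        "[122014.436612] BTRFS: error (device sde) in "
        "btrfs_run_delayed_refs:2781: errno=-28 No space left\n"
        "[122014.436615] BTRFS info (device sde): forced readonly\n"
    )
    for err in btrfs_errors(sample):
        is_enospc = -err["errno"] == errno.ENOSPC
        print(err["function"], err["errno"], "ENOSPC" if is_enospc else "")
```

Truncated or informational lines (such as the "forced readonly" one) are skipped because they carry no function name or errno.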

Re: Kernel 4.1.4 btrfs forced read-only during rsync

2015-08-24 Thread E V
of snapshots to make it free something and now rsync is running again fine. On Thu, Aug 20, 2015 at 10:07 AM, E V eliven...@gmail.com wrote: Thanks for looking, let me know if you need any other info. I haven't touched the system yet, but it appears I'll need to unmount to btrfs check or mount rw

Re: Kernel 4.1.4 btrfs forced read-only during rsync

2015-08-20 Thread E V
Thanks for looking, let me know if you need any other info. I haven't touched the system yet, but it appears I'll need to unmount to btrfs check or mount rw,recovery to try and get it working again. Who knows what will happen then? I can leave it as is for a few days if it will help any diagnosis.

Kernel 4.1.4 btrfs forced read-only during rsync

2015-08-19 Thread E V
linux 4.1.4 forced read-only during an rsync, complaining about lack of space, with ~30TB free. Filesystem has 6 snapshots, basically 3 incremental rsyncs of 2 different external filesystems. Not sure how to proceed: balance -dusage=5 and then try to remount? Doesn't balance need rw? # btrfs file

Re: BTRFS balance segfault, where to go from here

2014-10-28 Thread E V
I've seen deadlocks on 3.16.3. Personally, I'm staying with 3.14 until something newer stabilizes; I haven't had any issues with it. You might want to try the latest 3.14, though I think there should be a new one pretty soon with quite a few btrfs patches. On Tue, Oct 28, 2014 at 7:33 AM, Stephan

deadlock with 3.16.3

2014-10-09 Thread E V
running wheezy Debian 3.16.3-2~bpo70+1 system has locked up 2 nights in a row running rsync copying from remote to a ~100TB btrfs. Only job running on the server, no interactive users or anything. soft locks showed up in kern.log across many CPUs shortly before system became non-responsive. First

What's enospc_debug telling me?

2014-03-28 Thread E V
Doing a data balance profile conversion raid0->single fails with ENOSPC, even given lots of space. So tried out enospc_debug:
Mar 28 08:48:44 btrfs: relocating block group 579997815275520 flags 9
Mar 28 08:58:06 space_info 1 has 1574929309696 free, is not full
Mar 28 08:58:06 space_info

balance failure on large files

2014-03-28 Thread E V
balance start -dprofiles=raid0,convert=single,vrange=0..580 fails given lots of space. Looking at the paths of the failed block groups via btrfs inspect-internal logical-resolve, I see they are 7 largish files, from 4G-133GB in size. So looks like the profile conversion code hits a

Re: Bug in profile balance not present in normal balance

2014-03-25 Thread E V
What about unallocated space, that is, the difference between total space and used space, per device, as reported by btrfs filesystem show? File show gives:
Total devices 3 FS bytes used 60.22TiB
devid 1 size 30.01TiB used 17.61TiB path /dev/sdb
devid 2 size

Getting profile conversion to finish

2014-03-04 Thread E V
I'm having difficulty getting a profile conversion from raid0 -> single to finish. Is there a way to tell the filesystem to stop all new allocations for the raid0 profile, before the full balance finishes? Otherwise I feel like I'm in a Sisyphean push of blocks into higher block numbers. Since the

Fwd: no space left on balance

2013-12-03 Thread E V
Looks like the same issue as kernel bugzilla 66221, just a different config: Debian wheezy with hand-built 3.11.9. Ideas for fixing would be great. I guess I'll try deleting a couple snapshots and try again.
btrfs balance start -dconvert=single /mirror
ERROR: error during balancing '/mirror' - No