Re: [PATCH V5 RESEND] Btrfs: enchanse raid1/10 balance heuristic

2018-09-20 Thread Peter Becker
I like the idea. Do you have any benchmarks for this change? The general logic looks good to me.

Re: Transaction aborted (error -28) btrfs_run_delayed_refs*0x163/0x190

2018-07-12 Thread Peter Chant
On 07/12/2018 07:10 AM, Nikolay Borisov wrote: > > > On 10.07.2018 10:04, Pete wrote: >> I've just had the error in the subject which caused the file system to >> go read-only. >> >> Further part of error message: >> WARNING: CPU: 14 PID: 1351 at fs/btrfs/extent-tree.c:3076 >>

Re: [PATCH 04/10] locking: export osq_lock()/osq_unlock()

2018-05-18 Thread Peter Zijlstra
On Fri, May 18, 2018 at 07:32:05AM -0400, Kent Overstreet wrote: > It does strike me that the whole optimistic spin algorithm > (mutex_optimistic_spin() and rwsem_optimistic_spin()) are ripe for factoring > out. They've been growing more optimizations I see, and the optimizations > mostly >

Re: [PATCH 04/10] locking: export osq_lock()/osq_unlock()

2018-05-18 Thread Peter Zijlstra
On Fri, May 18, 2018 at 06:18:04AM -0400, Kent Overstreet wrote: > On Fri, May 18, 2018 at 11:52:04AM +0200, Peter Zijlstra wrote: > > On Fri, May 18, 2018 at 03:49:06AM -0400, Kent Overstreet wrote: > > > > No.. and most certainly not without a _very_ good reason. >

Re: [PATCH 03/10] locking: bring back lglocks

2018-05-18 Thread Peter Zijlstra
On Fri, May 18, 2018 at 06:13:53AM -0400, Kent Overstreet wrote: > On Fri, May 18, 2018 at 11:51:02AM +0200, Peter Zijlstra wrote: > > On Fri, May 18, 2018 at 03:49:04AM -0400, Kent Overstreet wrote: > > > bcachefs makes use of them - also, add a proper lg_lock_init() >

Re: [PATCH 04/10] locking: export osq_lock()/osq_unlock()

2018-05-18 Thread Peter Zijlstra
On Fri, May 18, 2018 at 03:49:06AM -0400, Kent Overstreet wrote: No.. and most certainly not without a _very_ good reason. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at

Re: [PATCH 03/10] locking: bring back lglocks

2018-05-18 Thread Peter Zijlstra
On Fri, May 18, 2018 at 03:49:04AM -0400, Kent Overstreet wrote: > bcachefs makes use of them - also, add a proper lg_lock_init() Why?! lglocks are horrid things, we got rid of them for a reason. They have terrifying worst-case preemption-off latencies. Why can't you use something like per-cpu

[PATCH] btrfs-progs: build: Do not use cp -a to install files

2018-04-04 Thread Peter Kjellerstedt
rather than be owned by root. Signed-off-by: Peter Kjellerstedt <peter.kjellerst...@axis.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 92cfe7b5..0e8bfd98 100644 --- a/Makefile +++ b/Makefile @@ -578,7 +578,7 @@ install:

Kernel warning - not sure if this is important

2018-03-19 Thread Peter Chant
I got this kernel warning overnight. Possibly during or after a dedup using duperemove. I'm not sure if that is relevant. Seems to relate to fs/btrfs/backref.c line 1266. Don't know if it is important. Thought I'd post it just in case. I'm afraid this is a screen-shot in the old-fashioned

Re: btrfs-cleaner / snapshot performance analysis

2018-02-09 Thread Peter Grandi
> I am trying to better understand how the cleaner kthread > (btrfs-cleaner) impacts foreground performance, specifically > during snapshot deletion. My experience so far has been that > it can be dramatically disruptive to foreground I/O. That's such a warmly innocent and optimistic question!

Re: [PATCH 0/2] Policy to balance read across mirrored devices

2018-01-31 Thread Peter Becker
es to use this as performance tuning. at least the feature with the devid. Thanks Austin, Thanks Anand 2018-01-31 17:11 GMT+01:00 Austin S. Hemmelgarn <ahferro...@gmail.com>: > On 2018-01-31 09:52, Peter Becker wrote: >> >> This is all clear. My question referes to "use

Re: [PATCH 0/2] Policy to balance read across mirrored devices

2018-01-31 Thread Peter Becker
stripe to use] = [prefer stripes present on read_mirror_policy devids] > [fallback to pid % stripe count] Perhaps I'm not able to express myself in English, or did I misunderstand you? 2018-01-31 15:26 GMT+01:00 Anand Jain <anand.j...@oracle.com>: > > > On 01/31/2018 06:47 PM,

Re: [PATCH 0/2] Policy to balance read across mirrored devices

2018-01-30 Thread Peter Becker
A little question about mount -o read_mirror_policy=. How would this work with RAID1 over 3 or 4 HDDs? In particular, if the desired block is not available on device . Could I repeat this option like the device option to specify an order/priority, like this: mount -o read_mirror_policy=

Loan money for individuals and professionals in less than 72 hours

2018-01-25 Thread Peter Schuster
to answer questions. Thank you for contacting me by e-mail at: klaus.peterschus...@outlook.de Kind regards. Peter Schuster Financial Bank https://firstfinancialsa.com/de

Business Possibility

2018-01-04 Thread Peter Deng
Hello there, My name is Peter Deng, a South African citizen and a friend of Mrs Mugabe's sister. I got your contact through a Korean business online directory. I represent the interest of Mrs Mugabe, who wishes to move a total amount of $19 million into a safe account owned by a trusted business man

Re: Btrfs reserve metadata problem

2018-01-02 Thread Peter Grandi
> When testing Btrfs with fio 4k random write, That's an exceptionally narrowly defined workload. Also it is narrower than that, because it must be without 'fsync' after each write, or else there would be no accumulation of dirty blocks in memory at all. > I found that volume with smaller free

Re: Unexpected raid1 behaviour

2017-12-19 Thread Peter Grandi
[ ... ] > The advantage of writing single chunks when degraded, is in > the case where a missing device returns (is readded, > intact). Catching up that device with the first drive, is a > manual but simple invocation of 'btrfs balance start > -dconvert=raid1,soft -mconvert=raid1,soft' The
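The quoted catch-up step can be wrapped in a tiny dry-run helper. This is only a sketch: the mount point /mnt/data is a placeholder of mine, and with RUN unset the helper merely prints the command instead of running it.

```shell
# Dry-run wrapper around the catch-up command quoted above. With RUN
# unset it only prints the command; set RUN= (empty) to really run it.
# The mount point argument is hypothetical.
catch_up_raid1() {
  ${RUN-echo} btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft "$1"
}
catch_up_raid1 /mnt/data
```

The 'soft' modifier makes the convert filters skip chunks that already have the target profile, so only the 'single' chunks written while degraded get rewritten.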

Re: Unexpected raid1 behaviour

2017-12-18 Thread Peter Grandi
>> The fact is, the only cases where this is really an issue is >> if you've either got intermittently bad hardware, or are >> dealing with external > Well, the RAID1+ is all about the failing hardware. >> storage devices. For the majority of people who are using >> multi-device setups, the

Re: Unexpected raid1 behaviour

2017-12-18 Thread Peter Grandi
>> I haven't seen that, but I doubt that it is the radical >> redesign of the multi-device layer of Btrfs that is needed to >> give it operational semantics similar to those of MD RAID, >> and that I have vaguely described previously. > I agree that btrfs volume manager is incomplete in view of >

Re: Unexpected raid1 behaviour

2017-12-17 Thread Peter Grandi
"Duncan"'s reply is slightly optimistic in parts, so some further information... [ ... ] > Basically, at this point btrfs doesn't have "dynamic" device > handling. That is, if a device disappears, it doesn't know > it. That's just the consequence of what is a completely broken conceptual

Re: [PATCH 0/7] retry write on error

2017-12-03 Thread Peter Grandi
> [ ... ] btrfs incorporates disk management which is actually a > version of md layer, [ ... ] As far as I know Btrfs has no disk management, and was wisely designed without any, just like MD: Btrfs volumes and MD sets can be composed from "block devices", not disks, and block devices are quite

Re: [PATCH 0/7] retry write on error

2017-11-28 Thread Peter Grandi
>>> If the underlying protocol doesn't support retry and there >>> are some transient errors happening somewhere in our IO >>> stack, we'd like to give an extra chance for IO. >> A limited number of retries may make sense, though I saw some >> long stalls after retries on bad disks. Indeed! One

Re: Fixed subject: updatedb does not index separately mounted btrfs subvolumes

2017-11-05 Thread Peter Grandi
>> The issue is that updatedb by default will not index bind >> mounts, but by default on Fedora and probably other distros, >> put /home on a subvolume and then mount that subvolume which >> is in effect a bind mount. > > So the issue isn't /home being btrfs (as you said in the > subject), but

Re: defragmenting best practice?

2017-11-01 Thread Peter Grandi
> Another one is to find the most fragmented files first or all > files of at least 1M with at least, say, 100 fragments as in: > find "$HOME" -xdev -type f -size +1M -print0 | xargs -0 filefrag \ > | perl -n -e 'print "$1\0" if (m/(.*): ([0-9]+) extents/ && $2 > 100)' \ > | xargs -0 btrfs fi

Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-01 Thread Peter Grandi
[ ... ] > The poor performance has existed from the beginning of using > BTRFS + KDE + Firefox (almost 2 years ago), at a point when > very few snapshots had yet been created. A comparison system > running similar hardware as well as KDE + Firefox (and LVM + > EXT4) did not have the performance

Re: defragmenting best practice?

2017-11-01 Thread Peter Grandi
> When defragmenting individual files on a BTRFS filesystem with > COW, I assume reflinks between that file and all snapshots are > broken. So if there are 30 snapshots on that volume, that one > file will suddenly take up 30 times more space... [ ... ] Defragmentation works by effectively making

Re: defragmenting best practice?

2017-10-31 Thread Peter Grandi
> I'm following up on all the suggestions regarding Firefox performance > on BTRFS. [ ... ] I haven't read that yet, so maybe I am missing something, but I use Firefox with Btrfs all the time and I haven't got issues. [ ... ] > 1. BTRFS snapshots have proven to be too useful (and too important

RE: SLES 11 SP4: can't mount btrfs

2017-10-26 Thread Peter Grandi
>> But it could simply be that you have forgotten to refresh the >> 'initramfs' with 'mkinitrd' after modifying the '/etc/fstab'. > I finally managed it. I'm pretty sure having changed > /boot/grub/menu.lst, but somehow changes got lost/weren't > saved ? So the next thing to check would indeed

RE: SLES 11 SP4: can't mount btrfs

2017-10-26 Thread Peter Grandi
> I formatted the / partition with Btrfs again and could restore > the files from a backup. Everything seems to be there, I can > mount the Btrfs manually. [ ... ] But SLES finds, from where I > don't know, a UUID (see screenshot). This UUID is commented out > in fstab and replaced by

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
[ ... ] >> are USB drives really that unreliable [ ... ] [ ... ] > There are similar SATA chips too (occasionally JMicron and > Marvell for example are somewhat less awesome than they could > be), and practically all Firewire bridge chips of old "lied" a > lot [ ... ] > That plus Btrfs is

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
[ ... ] >>> Oh please, please a bit less silliness would be welcome here. >>> In a previous comment on this tedious thread I had written: > If the block device abstraction layer and lower layers work > correctly, Btrfs does not have problems of that sort when > adding new devices;

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
> [ ... ] when writes to a USB device fail due to a temporary > disconnection, the kernel can actually recognize that a write > error happened. [ ... ] Usually, but who knows? Maybe half transfer gets written; maybe the data gets written to the wrong address; maybe stuff gets written but failure

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
> [ ... ] However, the disappearance of the device doesn't get > propagated up to the filesystem correctly, Indeed, sometimes it does, sometimes it does not, in part because of chipset bugs, in part because the USB protocol signaling side does not handle errors well even if the chipset were bug

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
[ ... ] >> Oh please, please a bit less silliness would be welcome here. >> In a previous comment on this tedious thread I had written: >> > If the block device abstraction layer and lower layers work >> > correctly, Btrfs does not have problems of that sort when >> > adding new devices;

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-18 Thread Peter Grandi
> [ ... ] After all, btrfs would just have to discard one copy > of each chunk. [ ... ] One more thing that is not clear to me > is the replication profile of a volume. I see that balance can > convert chunks between profiles, for example from single to > raid1, but I don't see how the default

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-18 Thread Peter Grandi
>> I forget sometimes that people insist on storing large >> volumes of data on unreliable storage... Here obviously "unreliable" is used on the sense of storage that can work incorrectly, not in the sense of storage that can fail. > In my opinion the unreliability of the storage is the exact >

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-14 Thread Peter Grandi
> A few years ago I tried to use a RAID1 mdadm array of a SATA > and a USB disk, which lead to strange error messages and data > corruption. That's common, quite a few reports of similar issues in previous entries in this mailing list and for many other filesystems. > I did some searching back

Re: btrfs errors over NFS

2017-10-13 Thread Peter Grandi
>> TL;DR: ran into some btrfs errors and weird behaviour, but >> things generally seem to work. Just posting some details in >> case it helps devs or other users. [ ... ] I've run into a >> btrfs error trying to do a -j8 build of android on a btrfs >> filesystem exported over NFSv3. [ ... ] I

Re: What means "top level" in "btrfs subvolume list" ?

2017-09-30 Thread Peter Grandi
> I am trying to figure out what "top level" means in the > output of "btrfs sub list" The terminology (and sometimes the detailed behaviour) of Btrfs is not extremely consistent, I guess because of permissive editorship of the design, in a "let 1000 flowers bloom" sort of fashion so that does

Re: Btrfs performance with small blocksize on SSD

2017-09-26 Thread Peter Grandi
> i run a few performance tests comparing mdadm, hardware raid > and the btrfs raid. Fantastic beginning already! :-) > I noticed that the performance I have seen over the years a lot of messages like this where there is a wanton display of amusing misuses of terminology, of which the misuse of

Re: how to run balance successfully (No space left on device)?

2017-09-18 Thread Peter Becker
I'm not sure if it would help, but maybe you could try adding an 8GB (or more) USB flash drive to the pool and try to start balance. If it works out, you can throw it out of the pool after that.
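The suggestion above can be sketched as a dry-run script; /dev/sdX and /mnt/pool are placeholders of mine, and with RUN unset the commands are only printed, never executed.

```shell
# Dry-run sketch of the temporary-device workaround: add a spare
# device so balance has free space to work with, then remove it.
temp_device_balance() {
  dev=$1 mnt=$2
  ${RUN-echo} btrfs device add "$dev" "$mnt"     # lend the pool free space
  ${RUN-echo} btrfs balance start "$mnt"         # balance now has room
  ${RUN-echo} btrfs device remove "$dev" "$mnt"  # evict the stick again
}
temp_device_balance /dev/sdX /mnt/pool
```

The device remove step itself relocates the chunks that landed on the stick back onto the remaining devices, so it only succeeds once the pool has room for them.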

Re: A user cannot remove his readonly snapshots?!

2017-09-16 Thread Peter Grandi
[ ... ] > I can delete normal subvolumes but not the readonly snapshots: It is because of ordinary permissions for both subvolumes and snapshots: tree$ btrfs sub create /fs/sda7/sub Create subvolume '/fs/sda7/sub' tree$ chmod a-w /fs/sda7/sub tree$ btrfs sub del /fs/sda7/sub

Re: A user cannot remove his readonly snapshots?!

2017-09-15 Thread Peter Grandi
> [ ... ] mounted with option user_subvol_rm_allowed [ ... ] > root can delete this snapshot, but not the user. Why? [ ... ] Ordinary permissions still apply both to 'create' and 'delete': tree$ sudo mkdir /fs/sda7/dir tree$ btrfs sub create /fs/sda7/dir/sub ERROR: cannot access
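The permission point in these two replies can be demonstrated without btrfs at all, since deletion by an unprivileged user goes through the ordinary VFS write-permission check on the parent directory. A minimal sketch with plain directories (everything happens under a fresh mktemp path; no btrfs volume is touched):

```shell
# Demonstrates that removing an entry needs write permission on the
# parent directory -- the same check that denies an unprivileged
# 'btrfs sub del' under a write-protected parent. (Run as root the
# check is bypassed, since root skips permission checks.)
demo_parent_perm() {
  d=$(mktemp -d)
  mkdir -p "$d/parent/sub"
  chmod a-w "$d/parent"                  # parent now write-protected
  if rmdir "$d/parent/sub" 2>/dev/null; then
    echo "removed (running as root?)"
  else
    echo "denied: parent not writable"
  fi
  chmod u+w "$d/parent"                  # restore so cleanup succeeds
  rm -rf "$d"
}
demo_parent_perm
```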

Re: defragmenting best practice?

2017-09-15 Thread Peter Grandi
[ ... ] Case #1 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu cow2 storage -> guest BTRFS filesystem SQL table row insertions per second: 1-2 Case #2 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu raw storage -> guest EXT4 filesystem

Re: defragmenting best practice?

2017-09-15 Thread Peter Grandi
> Case #1 > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu cow2 storage > -> guest BTRFS filesystem > SQL table row insertions per second: 1-2 "Doctor, if I stab my hand with a fork it hurts a lot: can you cure that?" > Case #2 > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs ->

Re: snapshots of encrypted directories?

2017-09-15 Thread Peter Becker
2017-09-15 12:01 GMT+02:00 Ulli Horlacher : > On Fri 2017-09-15 (06:45), Andrei Borzenkov wrote: > >> The actual question is - do you need to mount each individual btrfs >> subvolume when using encfs? > > And even worse it goes with ecryptfs: I do not know at all how

Re: generic name for volume and subvolume root?

2017-09-10 Thread Peter Grandi
> As I am writing some documentation abount creating snapshots: > Is there a generic name for both volume and subvolume root? Yes, it is from the UNIX side 'root directory' and from the Btrfs side 'subvolume'. Like some other things Btrfs, its terminology is often inconsistent, but "volume"

Re: test if a subvolume is a snapshot?

2017-09-08 Thread Peter Grandi
> How can I test if a subvolume is a snapshot? [ ... ] This question is based on the assumption that "snapshot" is a distinct type of subvolume and not just an operation that creates a subvolume with reflinked contents. Unfortunately Btrfs does indeed make snapshots a distinct type of

Re: 4.13: No space left with plenty of free space (/home/kernel/COD/linux/fs/btrfs/extent-tree.c:6989 __btrfs_free_extent.isra.62+0xc2c/0xdb0)

2017-09-08 Thread Peter Grandi
[ ... ] > [233787.921018] Call Trace: > [233787.921031] ? btrfs_merge_delayed_refs+0x62/0x550 [btrfs] > [233787.921039] __btrfs_run_delayed_refs+0x6f0/0x1380 [btrfs] > [233787.921047] btrfs_run_delayed_refs+0x6b/0x250 [btrfs] > [233787.921054] btrfs_write_dirty_block_groups+0x158/0x390 [btrfs]

Re: Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]

2017-09-07 Thread Peter Becker
2017-09-07 16:37 GMT+02:00 Marco Lorenzo Crociani : [...] > I got: > > 00-49: 1 > 50-79: 0 > 80-89: 0 > 90-99: 1 > 100:25540 > > this means that fs has only one block group used under 50% and 1 between 90 > and 99% while the rest are all full? > yes ..

Re: Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]

2017-09-07 Thread Peter Becker
You can check the usage of each block group with the following scripts. If there are many blockgroups with low usage you should run btrfs balance -musage= -dusage= /data cd /tmp wget https://raw.githubusercontent.com/kdave/btrfs-progs/master/btrfs-debugfs chmod +x btrfs-debugfs stats=$(sudo
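The histogram format quoted in this thread (buckets 00-49, 50-79, 80-89, 90-99, 100) can be reproduced on synthetic data with a small awk sketch; this is not the actual btrfs-debugfs code, just an illustration of the bucketing.

```shell
# Hedged sketch of the usage histogram: input lines are
# "<used_bytes> <total_bytes>", one block group per line (the sample
# data below is synthetic, not read from a real filesystem).
usage_histogram() {
  awk '{ p = int($1 * 100 / $2)
         if      (p < 50)  a["00-49"]++
         else if (p < 80)  a["50-79"]++
         else if (p < 90)  a["80-89"]++
         else if (p < 100) a["90-99"]++
         else              a["100"]++ }
       END { split("00-49 50-79 80-89 90-99 100", k, " ")
             for (i = 1; i <= 5; i++) printf "%s: %d\n", k[i], a[k[i]] + 0 }'
}
printf '10 100\n95 100\n100 100\n' | usage_histogram
```

Many groups landing in the low buckets is the signal that a balance with usage filters would compact them.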

Re: speed up big btrfs volumes with ssds

2017-09-04 Thread Peter Grandi
>>> [ ... ] Currently without any ssds i get the best speed with: >>> - 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices >>> and using btrfs as raid 0 for data and metadata on top of >>> those 4 raid 5. [ ... ] the write speed is not as good as i >>> would like - especially for random

Re: speed up big btrfs volumes with ssds

2017-09-04 Thread Peter Grandi
>> [ ... ] Currently the write speed is not as good as i would >> like - especially for random 8k-16k I/O. [ ... ] > [ ... ] So this 60TB is then 20 4TB disks or so and the 4x 1GB > cache is simply not very helpful I think. The working set > doesn't fit in it I guess. If there is mostly single or

Re: read-only for no good reason on 4.9.30

2017-09-04 Thread Peter Grandi
> [ ... ] I ran "btrfs balance" and then it started working > correctly again. It seems that a btrfs filesystem if left > alone will eventually get fragmented enough that it rejects > writes [ ... ] Free space will get fragmented, because Btrfs has a 2-level allocator scheme (chunks within

Re: speed up big btrfs volumes with ssds

2017-09-03 Thread Peter Grandi
> [ ... ] - needed volume size is 60TB I wonder how long that takes to 'scrub', 'balance', 'check', 'subvolume delete', 'find', etc. > [ ... ] 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" > devices and using btrfs as raid 0 for data and metadata on top > of those 4 raid 5. [ ... ] the

Re: number of subvolumes

2017-08-24 Thread Peter Grandi
>> Using hundreds or thousands of snapshots is probably fine >> mostly. As I mentioned previously, with a link to the relevant email describing the details, the real issue is reflinks/backrefs. Usually subvolume and snapshots involve them. > We find that typically apt is very slow on a machine

Re: number of subvolumes

2017-08-23 Thread Peter Grandi
> This is a vanilla SLES12 installation: [ ... ] Why does SUSE > ignore this "not too many subvolumes" warning? As in many cases with Btrfs "it's complicated" because of the interaction of advanced features among themselves and the chosen implementation and properties of storage; anisotropy

Re: user snapshots

2017-08-23 Thread Peter Grandi
> So, still: What is the problem with user_subvol_rm_allowed? As usual, it is complicated: mostly that while subvol creation is very cheap, subvol deletion can be very expensive. But then so can be creating many snapshots, as in this: https://www.spinics.net/lists/linux-btrfs/msg62760.html

Re: netapp-alike snapshots?

2017-08-22 Thread Peter Grandi
[ ... ] It is beneficial to not have snapshots in-place. With a local directory of snapshots, [ ... ] Indeed and there is a fair description of some options for subvolume nesting policies here which may be interesting to the original poster:

Re: finding root filesystem of a subvolume?

2017-08-22 Thread Peter Grandi
[ ... ] >> There is no fixed relationship between the root directory >> inode of a subvolume and the root directory inode of any >> other subvolume or the main volume. > Actually, there is, because it's inherently rooted in the > hierarchy of the volume itself. That root inode for the >

Re: netapp-alike snapshots?

2017-08-22 Thread Peter Becker
.de>: > On Tue 2017-08-22 (15:44), Peter Becker wrote: >> I use: https://github.com/jf647/btrfs-snap >> >> 2017-08-22 15:22 GMT+02:00 Ulli Horlacher <frams...@rus.uni-stuttgart.de>: >> > With Netapp/waffle you have automatic hourly/daily/weekly snapshots.

Re: finding root filesystem of a subvolume?

2017-08-22 Thread Peter Grandi
> How do I find the root filesystem of a subvolume? > Example: > root@fex:~# df -T > Filesystem Type 1K-blocks Used Available Use% Mounted on > - -1073740800 104244552 967773976 10% /local/.backup/home [ ... ] > I know, the root filesystem is /local, That question is

Re: netapp-alike snapshots?

2017-08-22 Thread Peter Becker
I use: https://github.com/jf647/btrfs-snap 2017-08-22 15:22 GMT+02:00 Ulli Horlacher : > With Netapp/waffle you have automatic hourly/daily/weekly snapshots. > You can find these snapshots in every local directory (readonly). > Example: > > framstag@fex:/sw/share:

Re: slow btrfs with a single kworker process using 100% CPU

2017-08-16 Thread Peter Grandi
>>> I've one system where a single kworker process is using 100% >>> CPU sometimes a second process comes up with 100% CPU >>> [btrfs-transacti]. [ ... ] >> [ ... ]1413 Snapshots. I'm deleting 50 of them every night. But >> btrfs-cleaner process isn't running / consuming CPU currently. Reminder

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Peter Grandi
[ ... ] >>> Snapshots work fine with nodatacow, each block gets CoW'ed >>> once when it's first written to, and then goes back to being >>> NOCOW. >>> The only caveat is that you probably want to defrag either >>> once everything has been rewritten, or right after the >>> snapshot. >> I thought

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Peter Grandi
[ ... ] > But I've talked to some friend at the local super computing > centre and they have rather general issues with CoW at their > virtualisation cluster. Amazing news! :-) > Like SUSE's snapper making many snapshots leading the storage > images of VMs apparently to explode (in terms of

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Peter Grandi
> We use the crcs to catch storage gone wrong, [ ... ] And that's an opportunistically feasible idea given that current CPUs can do that in real-time. > [ ... ] It's possible to protect against all three without COW, > but all solutions have their own tradeoffs and this is the setup > we chose.

Re: Btrfs + compression = slow performance and high cpu usage

2017-08-01 Thread Peter Grandi
[ ... ] > This is the "storage for beginners" version, what happens in > practice however depends a lot on specific workload profile > (typical read/write size and latencies and rates), caching and > queueing algorithms in both Linux and the HA firmware. To add a bit of slightly more advanced

Re: Btrfs + compression = slow performance and high cpu usage

2017-08-01 Thread Peter Grandi
>> [ ... ] a "RAID5 with 128KiB writes and a 768KiB stripe >> size". [ ... ] several back-to-back 128KiB writes [ ... ] get >> merged by the 3ware firmware only if it has a persistent >> cache, and maybe your 3ware does not have one, > KOS: No I don't have persistent cache. Only the 512 Mb cache

Re: Btrfs + compression = slow performance and high cpu usage

2017-08-01 Thread Peter Grandi
> Peter, I don't think the filefrag is showing the correct > fragmentation status of the file when the compression is used. As reported in a previous message, the output of 'filefrag -v' can be used to see what is going on: >>>> filefrag /mnt/sde3/testfile

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-31 Thread Peter Grandi
> [ ... ] It is hard for me to see a speed issue here with > Btrfs: for comparison I have done a simple test with a both a > 3+1 MD RAID5 set with a 256KiB chunk size and a single block > device on "contemporary" 1T/2TB drives, capable of sequential > transfer rates of 150-190MB/s: [ ... ] The

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-31 Thread Peter Grandi
[ ... ] > Also added: Feeling very generous :-) today, adding these too: soft# mkfs.btrfs -mraid10 -draid10 -L test5 /dev/sd{b,c,d,e}3 [ ... ] soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdb3 /mnt/test5 soft# rm -f /mnt/test5/testfile soft# /usr/bin/time dd

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-31 Thread Peter Grandi
[ ... ] > grep 'model name' /proc/cpuinfo | sort -u > model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz Good, contemporary CPU with all accelerations. > The sda device is a hardware RAID5 consisting of 4x8TB drives. [ ... ] > Strip Size : 256 KB So the full RMW data

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-28 Thread Peter Grandi
In addition to my previous "it does not happen here" comment, if someone is reading this thread, there are some other interesting details: > When the compression is turned off, I am able to get the > maximum 500-600 mb/s write speed on this disk (raid array) > with minimal cpu usage. No details

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-28 Thread Peter Grandi
> I am stuck with a problem of btrfs slow performance when using > compression. [ ... ] That to me looks like an issue with speed, not performance, and in particular with PEBCAK issues. As to high CPU usage, when you find a way to do both compression and checksumming without using much CPU time,

Re: kernel btrfs file system wedged -- is it toast?

2017-07-21 Thread Peter Grandi
> [ ... ] announce loudly and clearly to any potential users, in > multiple places (perhaps a key announcement in a few places > and links to that announcement from many places, https://btrfs.wiki.kernel.org/index.php/Gotchas#Having_many_subvolumes_can_be_very_slow > ... DO expect to first have

Re: Exactly what is wrong with RAID5/6

2017-06-21 Thread Peter Grandi
> [ ... ] This will make some filesystems mostly RAID1, negating > all space savings of RAID5, won't it? [ ... ] RAID5/RAID6/... don't merely save space, more precisely they trade lower resilience and a more anisotropic and smaller performance envelope to gain lower redundancy (= save space).

Re: does using different uid/gid/forceuid/... mount options for different subvolumes work / does fuse.bindfs play nice with btrfs?

2017-06-20 Thread Peter Grandi
> I intend to provide different "views" of the data stored on > btrfs subvolumes. e.g. mount a subvolume in location A rw; > and ro in location B while also overwriting uids, gids, and > permissions. [ ... ] That's not how UNIX/Linux permissions and ACLs are supposed to work, perhaps you should

Fwd: confusing "no space left" -- how to troubleshoot and "be prepared"?

2017-05-18 Thread Peter Becker
2017-05-18 15:41 GMT+02:00 Yaroslav Halchenko : > > our python-based program crashed with > > File > "/home/yoh/proj/datalad/datalad/venv-tests/local/lib/python2.7/site-packages/gitdb/stream.py", > line 695, in write > os.write(self._fd, data) > OSError: [Errno 28] No

Re: Struggling with file system slowness

2017-05-04 Thread Peter Grandi
> Trying to peg down why I have one server that has > btrfs-transacti pegged at 100% CPU for most of the time. Too little information. Is IO happening at the same time? Is compression on? Deduplicated? Lots of subvolumes? SSD? What kind of workload and file size/distribution profile? Typical

Re: Ded

2017-05-03 Thread Peter Grandi
> I have a btrfs filesystem mounted at /btrfs_vol/ Every N > minutes, I run bedup for deduplication of data in /btrfs_vol > Inside /btrfs_vol, I have several subvolumes (consider this as > home directories of several users) I have set individual > qgroup limits for each of these subvolumes. [ ...

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-29 Thread Peter Grandi
>> [ ... ] these extents are all over the place, they're not >> contiguous at all. 4K here, 4K there, 4K over there, back to >> 4K here next to this one, 4K over there...12K over there, 500K >> unwritten, 4K over there. This seems not so consequential on >> SSD, [ ... ] > Indeed there were recent

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-29 Thread Peter Grandi
> [ ... ] Instead, you can use raw files (preferably sparse unless > there's both nocow and no snapshots). Btrfs does natively everything > you'd gain from qcow2, and does it better: you can delete the master > of a cloned image, deduplicate them, deduplicate two unrelated images; > you can turn

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-28 Thread Peter Grandi
> [ ... ] these extents are all over the place, they're not > contiguous at all. 4K here, 4K there, 4K over there, back to > 4K here next to this one, 4K over there...12K over there, 500K > unwritten, 4K over there. This seems not so consequential on > SSD, [ ... ] Indeed there were recent

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-28 Thread Peter Grandi
>> The gotcha though is there's a pile of data in the journal >> that would never make it to rsyslogd. If you use journalctl >> -o verbose you can see some of this. > You can send *all the info* to rsyslogd via imjournal > http://www.rsyslog.com/doc/v8-stable/configuration/modules/imjournal.html

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-28 Thread Peter Grandi
> [ ... ] And that makes me wonder whether metadata > fragmentation is happening as a result. But in any case, > there's a lot of metadata being written for each journal > update compared to what's being added to the journal file. [ > ... ] That's the "wandering trees" problem in COW filesystems,

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-28 Thread Peter Grandi
> Old news is that systemd-journald journals end up pretty > heavily fragmented on Btrfs due to COW. This has been discussed before in detail indeed here, but also here: http://www.sabi.co.uk/blog/15-one.html?150203#150203 > While journald uses chattr +C on journal files now, COW still >


Re: [PATCH 0/5] v2: block subsystem refcounter conversions

2017-04-21 Thread Peter Zijlstra
On Fri, Apr 21, 2017 at 08:03:13AM -0600, Jens Axboe wrote: > You have it so easy - the code is completely standalone, building a > small test framework around it and measuring performance in _user space_ > is trivial. Something like this you mean:

Re: About free space fragmentation, metadata write amplification and (no)ssd

2017-04-08 Thread Peter Grandi
> [ ... ] This post is way too long [ ... ] Many thanks for your report, it is really useful, especially the details. > [ ... ] using rsync with --link-dest to btrfs while still > using rsync, but with btrfs subvolumes and snapshots [1]. [ > ... ] Currently there's ~35TiB of data present on the

Re: btrfs filesystem keeps allocating new chunks for no apparent reason

2017-04-07 Thread Peter Grandi
[ ... ] >>> I've got a mostly inactive btrfs filesystem inside a virtual >>> machine somewhere that shows interesting behaviour: while no >>> interesting disk activity is going on, btrfs keeps >>> allocating new chunks, a GiB at a time. [ ... ] > Because the allocator keeps walking forward every

Re: Do different btrfs volumes compete for CPU?

2017-04-04 Thread Peter Grandi
> [ ... ] I tried to use eSATA and ext4 first, but observed > silent data corruption and irrecoverable kernel hangs -- > apparently, SATA is not really designed for external use. SATA works for external use, eSATA works well, but what really matters is the chipset of the adapter card. In my

Re: Shrinking a device - performance?

2017-04-01 Thread Peter Grandi
[ ... ] >>> $ D='btrfs f2fs gfs2 hfsplus jfs nilfs2 reiserfs udf xfs' >>> $ find $D -name '*.ko' | xargs size | sed 's/^ *//;s/ .*\t//g' >>> text filename >>> 832719 btrfs/btrfs.ko >>> 237952 f2fs/f2fs.ko >>> 251805 gfs2/gfs2.ko >>> 72731 hfsplus/hfsplus.ko >>> 171623

Re: Do different btrfs volumes compete for CPU?

2017-04-01 Thread Peter Grandi
>> Approximately 16 hours ago I've run a script that deleted >> >~100 snapshots and started quota rescan on a large >> USB-connected btrfs volume (5.4 of 22 TB occupied now). That "USB-connected" is a rather bad idea. On the IRC channel #Btrfs whenever someone reports odd things happening I ask

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
> [ ... ] what the signifigance of the xargs size limits of > btrfs might be. [ ... ] So what does it mean that btrfs has a > higher xargs size limit than other file systems? [ ... ] Or > does the lower capacity for argument length for hfsplus > demonstrate it is the superior file system for

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
>>> My guess is that very complex risky slow operations like >>> that are provided by "clever" filesystem developers for >>> "marketing" purposes, to win box-ticking competitions. >>> That applies to those system developers who do know better; >>> I suspect that even some filesystem developers

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
>> [ ... ] CentOS, Redhat, and Oracle seem to take the position >> that very large data subvolumes using btrfs should work >> fine. But I would be curious what the rest of the list thinks >> about 20 TiB in one volume/subvolume. > To be sure I'm a biased voice here, as I have multiple >

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
> Can you try to first dedup the btrfs volume? This is probably > out of date, but you could try one of these: [ ... ] Yep, > that's probably a lot of work. [ ... ] My recollection is that > btrfs handles deduplication differently than zfs, but both of > them can be very, very slow But the big

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
>>> The way btrfs is designed I'd actually expect shrinking to >>> be fast in most cases. [ ... ] >> The proposed "move whole chunks" implementation helps only if >> there are enough unallocated chunks "below the line". If regular >> 'balance' is done on the filesystem there will be some, but
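The "balance first so there are unallocated chunks below the line" idea above can be sketched as a small helper (a hedged sketch under stated assumptions: `shrink_fs` is a hypothetical name, the filesystem is mounted btrfs, and the `-dusage=50` filter is an illustrative threshold, not a recommendation):

```shell
#!/bin/sh
# Hypothetical helper: repack partially-used data chunks with balance
# before shrinking, so a "move whole chunks" shrink has free chunks to
# relocate into instead of copying extents one by one.
shrink_fs() {
    mnt="$1"        # mount point of the btrfs filesystem
    shrink_by="$2"  # amount to shrink by, e.g. 10G
    # Rewrite data chunks that are at most 50% used, freeing whole chunks.
    btrfs balance start -dusage=50 "$mnt" || return 1
    # Shrink the underlying device allocation by the requested amount.
    btrfs filesystem resize "-$shrink_by" "$mnt"
}
# Example (not run here): shrink_fs /mnt 10G
```

This only defines the helper; actually running it requires root and a real btrfs mount, and a shrink remains a slow, risky operation as the thread discusses.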

Re: Shrinking a device - performance?

2017-03-30 Thread Peter Grandi
>> My guess is that very complex risky slow operations like that are >> provided by "clever" filesystem developers for "marketing" purposes, >> to win box-ticking competitions. That applies to those system >> developers who do know better; I suspect that even some filesystem >> developers are

Re: Shrinking a device - performance?

2017-03-30 Thread Peter Grandi
> I’ve glazed over on “Not only that …” … can you make youtube > video of that :)) [ ... ] It’s because I’m special :* Well played again, that's a fairly credible impersonation of a node.js/mongodb developer :-). > On a real note thank’s [ ... ] to much of open source stuff is > based on short
