I like the idea.
Do you have any benchmarks for this change?
The general logic looks good to me.
On 07/12/2018 07:10 AM, Nikolay Borisov wrote:
>
>
> On 10.07.2018 10:04, Pete wrote:
>> I've just had the error in the subject which caused the file system to
>> go read-only.
>>
>> Further part of error message:
>> WARNING: CPU: 14 PID: 1351 at fs/btrfs/extent-tree.c:3076
>>
On Fri, May 18, 2018 at 07:32:05AM -0400, Kent Overstreet wrote:
> It does strike me that the whole optimistic spin algorithms
> (mutex_optimistic_spin() and rwsem_optimistic_spin()) are ripe for factoring
> out. They've been growing more optimizations I see, and the optimizations
> mostly
>
On Fri, May 18, 2018 at 06:18:04AM -0400, Kent Overstreet wrote:
> On Fri, May 18, 2018 at 11:52:04AM +0200, Peter Zijlstra wrote:
> > On Fri, May 18, 2018 at 03:49:06AM -0400, Kent Overstreet wrote:
> >
> > No.. and most certainly not without a _very_ good reason.
On Fri, May 18, 2018 at 06:13:53AM -0400, Kent Overstreet wrote:
> On Fri, May 18, 2018 at 11:51:02AM +0200, Peter Zijlstra wrote:
> > On Fri, May 18, 2018 at 03:49:04AM -0400, Kent Overstreet wrote:
> > > bcachefs makes use of them - also, add a proper lg_lock_init()
>
On Fri, May 18, 2018 at 03:49:06AM -0400, Kent Overstreet wrote:
No.. and most certainly not without a _very_ good reason.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at
On Fri, May 18, 2018 at 03:49:04AM -0400, Kent Overstreet wrote:
> bcachefs makes use of them - also, add a proper lg_lock_init()
Why?! lglocks are horrid things, we got rid of them for a reason. They
have terrifying worst-case preemption-off latencies.
Why can't you use something like per-cpu
rather than be owned by root.
Signed-off-by: Peter Kjellerstedt <peter.kjellerst...@axis.com>
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 92cfe7b5..0e8bfd98 100644
--- a/Makefile
+++ b/Makefile
@@ -578,7 +578,7 @@ install:
I got this kernel warning overnight. Possibly during or after a dedup
using duperemove. I'm not sure if that is relevant. Seems to relate to
fs/btrfs/backref.c line 1266.
Don't know if it is important. Thought I'd post it just in case.
I'm afraid this is a screenshot in the old-fashioned
> I am trying to better understand how the cleaner kthread
> (btrfs-cleaner) impacts foreground performance, specifically
> during snapshot deletion. My experience so far has been that
> it can be dramatically disruptive to foreground I/O.
That's such a warmly innocent and optimistic question!
es to use this as
performance tuning. At least the feature with the devid.
Thanks Austin,
Thanks Anand
2018-01-31 17:11 GMT+01:00 Austin S. Hemmelgarn <ahferro...@gmail.com>:
> On 2018-01-31 09:52, Peter Becker wrote:
>>
>> This is all clear. My question refers to "use
stripe to use] = [prefer stripes present on read_mirror_policy
devids] > [fallback to pid % stripe count]
Perhaps I'm not able to express myself well in English, or did I misunderstand you?
2018-01-31 15:26 GMT+01:00 Anand Jain <anand.j...@oracle.com>:
>
>
> On 01/31/2018 06:47 PM,
A little question about mount -o read_mirror_policy=.
How would this work with RAID1 over 3 or 4 HDD's?
In particular, if the desired block is not available on device .
Could I repeat this option, like the device option, to specify an
order/priority like this:
mount -o read_mirror_policy=
> When testing Btrfs with fio 4k random write,
That's an exceptionally narrowly defined workload. Also it is
narrower than that, because it must be without 'fsync' after
each write, or else there would be no accumulation of dirty
blocks in memory at all.
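The workload described above corresponds roughly to a fio job like the following (illustrative values; the original post did not give its exact job file, so the filename and sizes here are assumptions):

```ini
; illustrative fio job, not the original poster's exact file
[btrfs-4k-randwrite]
filename=/mnt/btrfs/testfile
rw=randwrite
bs=4k
size=1g
ioengine=libaio
iodepth=16
; buffered I/O, so dirty blocks accumulate in memory
direct=0
; no fsync between writes (0 disables periodic fsync)
fsync=0
```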
> I found that volume with smaller free
[ ... ]
> The advantage of writing single chunks when degraded, is in
> the case where a missing device returns (is readded,
> intact). Catching up that device with the first drive, is a
> manual but simple invocation of 'btrfs balance start
> -dconvert=raid1,soft -mconvert=raid1,soft' The
>> The fact is, the only cases where this is really an issue is
>> if you've either got intermittently bad hardware, or are
>> dealing with external
> Well, the RAID1+ is all about the failing hardware.
>> storage devices. For the majority of people who are using
>> multi-device setups, the
>> I haven't seen that, but I doubt that it is the radical
>> redesign of the multi-device layer of Btrfs that is needed to
>> give it operational semantics similar to those of MD RAID,
>> and that I have vaguely described previously.
> I agree that btrfs volume manager is incomplete in view of
>
"Duncan"'s reply is slightly optimistic in parts, so some
further information...
[ ... ]
> Basically, at this point btrfs doesn't have "dynamic" device
> handling. That is, if a device disappears, it doesn't know
> it.
That's just the consequence of what is a completely broken
conceptual
> [ ... ] btrfs incorporates disk management which is actually a
> version of md layer, [ ... ]
As far as I know Btrfs has no disk management, and was wisely
designed without any, just like MD: Btrfs volumes and MD sets
can be composed from "block devices", not disks, and block
devices are quite
>>> If the underlying protocol doesn't support retry and there
>>> are some transient errors happening somewhere in our IO
>>> stack, we'd like to give an extra chance for IO.
>> A limited number of retries may make sense, though I saw some
>> long stalls after retries on bad disks.
Indeed! One
>> The issue is that updatedb by default will not index bind
>> mounts, but by default on Fedora and probably other distros,
>> put /home on a subvolume and then mount that subvolume which
>> is in effect a bind mount.
>
> So the issue isn't /home being btrfs (as you said in the
> subject), but
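If indexing such a /home is wanted anyway, the relevant knob (assuming the mlocate implementation of updatedb, as on Fedora) is in /etc/updatedb.conf:

```conf
# /etc/updatedb.conf (mlocate; assumption: the distro uses mlocate).
# The default "yes" is what skips bind mounts, and hence a /home that is
# a mounted subvolume as described above.
PRUNE_BIND_MOUNTS = "no"
```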
> Another one is to find the most fragmented files first or all
> files of at least 1M with at least say 100 fragments as in:
> find "$HOME" -xdev -type f -size +1M -print0 | xargs -0 filefrag \
> | perl -n -e 'print "$1\0" if (m/(.*): ([0-9]+) extents/ && $2 > 100)' \
> | xargs -0 btrfs fi
[ ... ]
> The poor performance has existed from the beginning of using
> BTRFS + KDE + Firefox (almost 2 years ago), at a point when
> very few snapshots had yet been created. A comparison system
> running similar hardware as well as KDE + Firefox (and LVM +
> EXT4) did not have the performance
> When defragmenting individual files on a BTRFS filesystem with
> COW, I assume reflinks between that file and all snapshots are
> broken. So if there are 30 snapshots on that volume, that one
> file will suddenly take up 30 times more space... [ ... ]
Defragmentation works by effectively making
> I'm following up on all the suggestions regarding Firefox performance
> on BTRFS. [ ... ]
I haven't read that yet, so maybe I am missing something, but I
use Firefox with Btrfs all the time and I haven't got issues.
[ ... ]
> 1. BTRFS snapshots have proven to be too useful (and too important
>> But it could simply be that you have forgotten to refresh the
>> 'initramfs' with 'mkinitrd' after modifying the '/etc/fstab'.
> I finally managed it. I'm pretty sure having changed
> /boot/grub/menu.lst, but somehow changes got lost/weren't
> saved ?
So the next thing to check would indeed
> I formatted the / partition with Btrfs again and could restore
> the files from a backup. Everything seems to be there, I can
> mount the Btrfs manually. [ ... ] But SLES finds from where I
> don't know a UUID (see screenshot). This UUID is commented out
> in fstab and replaced by
[ ... ]
>> are USB drives really that unreliable [ ... ]
[ ... ]
> There are similar SATA chips too (occasionally JMicron and
> Marvell for example are somewhat less awesome than they could
> be), and practically all Firewire bridge chips of old "lied" a
> lot [ ... ]
> That plus Btrfs is
[ ... ]
>>> Oh please, please a bit less silliness would be welcome here.
>>> In a previous comment on this tedious thread I had written:
> If the block device abstraction layer and lower layers work
> correctly, Btrfs does not have problems of that sort when
> adding new devices;
> [ ... ] when writes to a USB device fail due to a temporary
> disconnection, the kernel can actually recognize that a write
> error happened. [ ... ]
Usually, but who knows? Maybe half the transfer gets written; maybe
the data gets written to the wrong address; maybe stuff gets
written but failure
> [ ... ] However, the disappearance of the device doesn't get
> propagated up to the filesystem correctly,
Indeed, sometimes it does, sometimes it does not, in part
because of chipset bugs, in part because the USB protocol
signaling side does not handle errors well even if the chipset
were bug
[ ... ]
>> Oh please, please a bit less silliness would be welcome here.
>> In a previous comment on this tedious thread I had written:
>> > If the block device abstraction layer and lower layers work
>> > correctly, Btrfs does not have problems of that sort when
>> > adding new devices;
> [ ... ] After all, btrfs would just have to discard one copy
> of each chunk. [ ... ] One more thing that is not clear to me
> is the replication profile of a volume. I see that balance can
> convert chunks between profiles, for example from single to
> raid1, but I don't see how the default
>> I forget sometimes that people insist on storing large
>> volumes of data on unreliable storage...
Here obviously "unreliable" is used in the sense of storage that
can work incorrectly, not in the sense of storage that can fail.
> In my opinion the unreliability of the storage is the exact
>
> A few years ago I tried to use a RAID1 mdadm array of a SATA
> and a USB disk, which lead to strange error messages and data
> corruption.
That's common, quite a few reports of similar issues in previous
entries in this mailing list and for many other filesystems.
> I did some searching back
>> TL;DR: ran into some btrfs errors and weird behaviour, but
>> things generally seem to work. Just posting some details in
>> case it helps devs or other users. [ ... ] I've run into a
>> btrfs error trying to do a -j8 build of android on a btrfs
>> filesystem exported over NFSv3. [ ... ]
I
> I am trying to figure out which means "top level" in the
> output of "btrfs sub list"
The terminology (and sometimes the detailed behaviour) of Btrfs
is not extremely consistent, I guess because of permissive
editorship of the design, in a "let 1000 flowers bloom" sort
of fashion so that does
> i run a few performance tests comparing mdadm, hardware raid
> and the btrfs raid.
Fantastic beginning already! :-)
> I noticed that the performance
I have seen over the years a lot of messages like this where
there is a wanton display of amusing misuses of terminology, of
which the misuse of
I'm not sure if it would help, but maybe you could try adding an 8GB
(or more) USB flash drive to the pool and try to start a balance.
If it works out, you can remove it from the pool after that.
[ ... ]
> I can delete normal subvolumes but not the readonly snapshots:
It is because of ordinary permissions for both subvolumes and
snapshots:
tree$ btrfs sub create /fs/sda7/sub
Create subvolume '/fs/sda7/sub'
tree$ chmod a-w /fs/sda7/sub
tree$ btrfs sub del /fs/sda7/sub
> [ ... ] mounted with option user_subvol_rm_allowed [ ... ]
> root can delete this snapshot, but not the user. Why? [ ... ]
Ordinary permissions still apply both to 'create' and 'delete':
tree$ sudo mkdir /fs/sda7/dir
tree$ btrfs sub create /fs/sda7/dir/sub
ERROR: cannot access
[ ... ]
Case #1
2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs
-> qemu cow2 storage -> guest BTRFS filesystem
SQL table row insertions per second: 1-2
Case #2
2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs
-> qemu raw storage -> guest EXT4 filesystem
> Case #1
> 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu cow2 storage
> -> guest BTRFS filesystem
> SQL table row insertions per second: 1-2
"Doctor, if I stab my hand with a fork it hurts a lot: can you
cure that?"
> Case #2
> 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs ->
2017-09-15 12:01 GMT+02:00 Ulli Horlacher :
> On Fri 2017-09-15 (06:45), Andrei Borzenkov wrote:
>
>> The actual question is - do you need to mount each individual btrfs
>> subvolume when using encfs?
>
> And even worse it goes with ecryptfs: I do not know at all how
> As I am writing some documentation abount creating snapshots:
> Is there a generic name for both volume and subvolume root?
Yes, it is from the UNIX side 'root directory' and from the
Btrfs side 'subvolume'. As with other things in Btrfs, the
terminology is often inconsistent, but "volume"
> How can I test if a subvolume is a snapshot? [ ... ]
This question is based on the assumption that "snapshot" is a
distinct type of subvolume and not just an operation that
creates a subvolume with reflinked contents.
Unfortunately Btrfs does indeed make snapshots a distinct type
of
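For what it's worth, one observable difference is the Parent UUID field of 'btrfs subvolume show': snapshots record the source subvolume's UUID there, while 'subvolume create' leaves it as '-'. A small parsing sketch (field name and layout as observed in btrfs-progs around v4.x; treat that as an assumption and check your version's output):

```shell
# parse_parent_uuid reads 'btrfs subvolume show' output on stdin and
# prints the Parent UUID field; '-' suggests the subvolume is not a
# snapshot (behaviour as observed, not a stable documented interface).
parse_parent_uuid() {
    awk -F':' '/Parent UUID/ { gsub(/^[ \t]+/, "", $2); print $2 }'
}
# usage (needs root and a btrfs mount):
#   if [ "$(btrfs subvolume show /mnt/sub | parse_parent_uuid)" = "-" ]; then
#       echo "not a snapshot"
#   fi
```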
[ ... ]
> [233787.921018] Call Trace:
> [233787.921031] ? btrfs_merge_delayed_refs+0x62/0x550 [btrfs]
> [233787.921039] __btrfs_run_delayed_refs+0x6f0/0x1380 [btrfs]
> [233787.921047] btrfs_run_delayed_refs+0x6b/0x250 [btrfs]
> [233787.921054] btrfs_write_dirty_block_groups+0x158/0x390 [btrfs]
2017-09-07 16:37 GMT+02:00 Marco Lorenzo Crociani:
[...]
> I got:
>
> 00-49: 1
> 50-79: 0
> 80-89: 0
> 90-99: 1
> 100:25540
>
> this means that fs has only one block group used under 50% and 1 between 90
> and 99% while the rest are all full?
>
Yes ..
You can check the usage of each block group with the following
script. If there are many block groups with low usage, you should run:
btrfs balance -musage= -dusage= /data
cd /tmp
wget https://raw.githubusercontent.com/kdave/btrfs-progs/master/btrfs-debugfs
chmod +x btrfs-debugfs
stats=$(sudo
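A sketch of the kind of bucketing shown earlier in the thread, assuming btrfs-debugfs prints one "block group ... usage 0.NN" line per block group with the usage fraction as the last field (check your copy's output format, this is an assumption):

```shell
# Bucket block groups by percentage used, mirroring the 00-49/50-79/...
# histogram quoted above. Input: btrfs-debugfs block-group lines on stdin.
usage_histogram() {
    awk '/^block group/ {
        pct = $NF * 100
        if      (pct < 50)  b["00-49"]++
        else if (pct < 80)  b["50-79"]++
        else if (pct < 90)  b["80-89"]++
        else if (pct < 100) b["90-99"]++
        else                b["100"]++
    }
    END { for (k in b) print k ": " b[k] }'
}
# usage: sudo ./btrfs-debugfs -b /data | usage_histogram
```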
>>> [ ... ] Currently without any ssds i get the best speed with:
>>> - 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices
>>> and using btrfs as raid 0 for data and metadata on top of
>>> those 4 raid 5. [ ... ] the write speed is not as good as i
>>> would like - especially for random
>> [ ... ] Currently the write speed is not as good as i would
>> like - especially for random 8k-16k I/O. [ ... ]
> [ ... ] So this 60TB is then 20 4TB disks or so and the 4x 1GB
> cache is simply not very helpful I think. The working set
> doesn't fit in it I guess. If there is mostly single or
> [ ... ] I ran "btrfs balance" and then it started working
> correctly again. It seems that a btrfs filesystem if left
> alone will eventually get fragmented enough that it rejects
> writes [ ... ]
Free space will get fragmented, because Btrfs has a 2-level
allocator scheme (chunks within
> [ ... ] - needed volume size is 60TB
I wonder how long that takes to 'scrub', 'balance', 'check',
'subvolume delete', 'find', etc.
> [ ... ] 4x HW Raid 5 with 1GB controller memory of 4TB 3,5"
> devices and using btrfs as raid 0 for data and metadata on top
> of those 4 raid 5. [ ... ] the
>> Using hundreds or thousands of snapshots is probably fine
>> mostly.
As I mentioned previously, with a link to the relevant email
describing the details, the real issue is reflinks/backrefs.
Subvolumes and snapshots usually involve them.
> We find that typically apt is very slow on a machine
> This is a vanilla SLES12 installation: [ ... ] Why does SUSE
> ignore this "not too many subvolumes" warning?
As in many cases with Btrfs "it's complicated" because of the
interaction of advanced features among themselves and the chosen
implementation and properties of storage; anisotropy
> So, still: What is the problem with user_subvol_rm_allowed?
As usual, it is complicated: mostly that while subvol creation
is very cheap, subvol deletion can be very expensive. But then
so can creating many snapshots, as in this:
https://www.spinics.net/lists/linux-btrfs/msg62760.html
[ ... ]
It is beneficial to not have snapshots in-place. With a local
directory of snapshots, [ ... ]
Indeed and there is a fair description of some options for
subvolume nesting policies here which may be interesting to the
original poster:
[ ... ]
>> There is no fixed relationship between the root directory
>> inode of a subvolume and the root directory inode of any
>> other subvolume or the main volume.
> Actually, there is, because it's inherently rooted in the
> hierarchy of the volume itself. That root inode for the
>
.de>:
> On Tue 2017-08-22 (15:44), Peter Becker wrote:
>> I use: https://github.com/jf647/btrfs-snap
>>
>> 2017-08-22 15:22 GMT+02:00 Ulli Horlacher <frams...@rus.uni-stuttgart.de>:
>> > With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
> How do I find the root filesystem of a subvolume?
> Example:
> root@fex:~# df -T
> Filesystem Type 1K-blocks Used Available Use% Mounted on
> -          -    1073740800 104244552 967773976  10% /local/.backup/home
[ ... ]
> I know, the root filesystem is /local,
That question is
I use: https://github.com/jf647/btrfs-snap
2017-08-22 15:22 GMT+02:00 Ulli Horlacher :
> With Netapp/waffle you have automatic hourly/daily/weekly snapshots.
> You can find these snapshots in every local directory (readonly).
> Example:
>
> framstag@fex:/sw/share:
>>> I've one system where a single kworker process is using 100%
>>> CPU sometimes a second process comes up with 100% CPU
>>> [btrfs-transacti]. [ ... ]
>> [ ... ]1413 Snapshots. I'm deleting 50 of them every night. But
>> btrfs-cleaner process isn't running / consuming CPU currently.
Reminder
[ ... ]
>>> Snapshots work fine with nodatacow, each block gets CoW'ed
>>> once when it's first written to, and then goes back to being
>>> NOCOW.
>>> The only caveat is that you probably want to defrag either
>>> once everything has been rewritten, or right after the
>>> snapshot.
>> I thought
[ ... ]
> But I've talked to some friend at the local super computing
> centre and they have rather general issues with CoW at their
> virtualisation cluster.
Amazing news! :-)
> Like SUSE's snapper making many snapshots leading the storage
> images of VMs apparently to explode (in terms of
> We use the crcs to catch storage gone wrong, [ ... ]
And that's an opportunistically feasible idea given that current
CPUs can do that in real-time.
> [ ... ] It's possible to protect against all three without COW,
> but all solutions have their own tradeoffs and this is the setup
> we chose.
[ ... ]
> This is the "storage for beginners" version, what happens in
> practice however depends a lot on specific workload profile
> (typical read/write size and latencies and rates), caching and
> queueing algorithms in both Linux and the HA firmware.
To add a bit of slightly more advanced
>> [ ... ] a "RAID5 with 128KiB writes and a 768KiB stripe
>> size". [ ... ] several back-to-back 128KiB writes [ ... ] get
>> merged by the 3ware firmware only if it has a persistent
>> cache, and maybe your 3ware does not have one,
> KOS: No I don't have persistent cache. Only the 512 Mb cache
> Peter, I don't think the filefrag is showing the correct
> fragmentation status of the file when the compression is used.
As reported in a previous message, the output of 'filefrag -v'
can be used to see what is going on:
>>>> filefrag /mnt/sde3/testfile
> [ ... ] It is hard for me to see a speed issue here with
> Btrfs: for comparison I have done a simple test with a both a
> 3+1 MD RAID5 set with a 256KiB chunk size and a single block
> device on "contemporary" 1T/2TB drives, capable of sequential
> transfer rates of 150-190MB/s: [ ... ]
The
[ ... ]
> Also added:
Feeling very generous :-) today, adding these too:
soft# mkfs.btrfs -mraid10 -draid10 -L test5 /dev/sd{b,c,d,e}3
[ ... ]
soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdb3 /mnt/test5
soft# rm -f /mnt/test5/testfile
soft# /usr/bin/time dd
[ ... ]
> grep 'model name' /proc/cpuinfo | sort -u
> model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
Good, contemporary CPU with all accelerations.
> The sda device is a hardware RAID5 consisting of 4x8TB drives.
[ ... ]
> Strip Size : 256 KB
So the full RMW data
In addition to my previous "it does not happen here" comment, if
someone is reading this thread, there are some other interesting
details:
> When the compression is turned off, I am able to get the
> maximum 500-600 mb/s write speed on this disk (raid array)
> with minimal cpu usage.
No details
> I am stuck with a problem of btrfs slow performance when using
> compression. [ ... ]
That to me looks like an issue with speed, not performance, and
in particular with PEBCAK issues.
As to high CPU usage, when you find a way to do both compression
and checksumming without using much CPU time,
> [ ... ] announce loudly and clearly to any potential users, in
> multiple places (perhaps a key announcement in a few places
> and links to that announcement from many places,
https://btrfs.wiki.kernel.org/index.php/Gotchas#Having_many_subvolumes_can_be_very_slow
> ... DO expect to first have
> [ ... ] This will make some filesystems mostly RAID1, negating
> all space savings of RAID5, won't it? [ ... ]
RAID5/RAID6/... don't merely save space, more precisely they
trade lower resilience and a more anisotropic and smaller
performance envelope to gain lower redundancy (= save space).
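For concreteness on the space half of that trade, with hypothetical equal-sized devices:

```shell
# Usable fraction of raw capacity for n equal devices (illustrative):
#   raid1 keeps 2 copies of everything -> 1/2, independent of n
#   raid5 gives one device's worth of space to parity -> (n-1)/n
awk 'BEGIN { n = 4; printf "raid1: %.2f  raid5: %.2f\n", 1/2, (n-1)/n }'
# prints: raid1: 0.50  raid5: 0.75
```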
--
> I intend to provide different "views" of the data stored on
> btrfs subvolumes. e.g. mount a subvolume in location A rw;
> and ro in location B while also overwriting uids, gids, and
> permissions. [ ... ]
That's not how UNIX/Linux permissions and ACLs are supposed to
work, perhaps you should
2017-05-18 15:41 GMT+02:00 Yaroslav Halchenko :
>
> our python-based program crashed with
>
> File
> "/home/yoh/proj/datalad/datalad/venv-tests/local/lib/python2.7/site-packages/gitdb/stream.py",
> line 695, in write
> os.write(self._fd, data)
> OSError: [Errno 28] No space left on device
> Trying to peg down why I have one server that has
> btrfs-transacti pegged at 100% CPU for most of the time.
Too little information. Is IO happening at the same time? Is
compression on? Deduplicated? Lots of subvolumes? SSD? What kind
of workload and file size/distribution profile?
Typical
> I have a btrfs filesystem mounted at /btrfs_vol/ Every N
> minutes, I run bedup for deduplication of data in /btrfs_vol
> Inside /btrfs_vol, I have several subvolumes (consider this as
> home directories of several users) I have set individual
> qgroup limits for each of these subvolumes. [ ...
>> [ ... ] these extents are all over the place, they're not
>> contiguous at all. 4K here, 4K there, 4K over there, back to
>> 4K here next to this one, 4K over there...12K over there, 500K
>> unwritten, 4K over there. This seems not so consequential on
>> SSD, [ ... ]
> Indeed there were recent
> [ ... ] Instead, you can use raw files (preferably sparse unless
> there's both nocow and no snapshots). Btrfs does natively everything
> you'd gain from qcow2, and does it better: you can delete the master
> of a cloned image, deduplicate them, deduplicate two unrelated images;
> you can turn
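A minimal sketch of the raw-sparse-file approach described above (the directory is a stand-in path; chattr +C only has effect on btrfs and must be set on the still-empty directory so new image files inherit NOCOW):

```shell
# Sketch, assumptions flagged inline: paths are illustrative.
d=$(mktemp -d)                       # stand-in for e.g. /var/lib/vmimages
chattr +C "$d" 2>/dev/null || true   # no-op elsewhere; NOCOW inherit on btrfs
truncate -s 40G "$d/guest.img"       # sparse raw image: allocates no space yet
stat -c '%s bytes, %b blocks' "$d/guest.img"
```

Snapshots of such a file still COW the blocks once, so (as noted elsewhere in this thread) either avoid snapshotting the image directory or defrag afterwards.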
> [ ... ] these extents are all over the place, they're not
> contiguous at all. 4K here, 4K there, 4K over there, back to
> 4K here next to this one, 4K over there...12K over there, 500K
> unwritten, 4K over there. This seems not so consequential on
> SSD, [ ... ]
Indeed there were recent
>> The gotcha though is there's a pile of data in the journal
>> that would never make it to rsyslogd. If you use journalctl
>> -o verbose you can see some of this.
> You can send *all the info* to rsyslogd via imjournal
> http://www.rsyslog.com/doc/v8-stable/configuration/modules/imjournal.html
> [ ... ] And that makes me wonder whether metadata
> fragmentation is happening as a result. But in any case,
> there's a lot of metadata being written for each journal
> update compared to what's being added to the journal file. [
> ... ]
That's the "wandering trees" problem in COW filesystems,
> Old news is that systemd-journald journals end up pretty
> heavily fragmented on Btrfs due to COW.
This has been discussed before in detail indeed here, but also
here: http://www.sabi.co.uk/blog/15-one.html?150203#150203
> While journald uses chattr +C on journal files now, COW still
>
On Fri, Apr 21, 2017 at 08:03:13AM -0600, Jens Axboe wrote:
> You have it so easy - the code is completely standalone, building a
> small test framework around it and measuring performance in _user space_
> is trivial.
Something like this you mean:
> [ ... ] This post is way too long [ ... ]
Many thanks for your report, it is really useful, especially the
details.
> [ ... ] using rsync with --link-dest to btrfs while still
> using rsync, but with btrfs subvolumes and snapshots [1]. [
> ... ] Currently there's ~35TiB of data present on the
[ ... ]
>>> I've got a mostly inactive btrfs filesystem inside a virtual
>>> machine somewhere that shows interesting behaviour: while no
>>> interesting disk activity is going on, btrfs keeps
>>> allocating new chunks, a GiB at a time.
[ ... ]
> Because the allocator keeps walking forward every
> [ ... ] I tried to use eSATA and ext4 first, but observed
> silent data corruption and irrecoverable kernel hangs --
> apparently, SATA is not really designed for external use.
SATA works for external use, eSATA works well, but what really
matters is the chipset of the adapter card.
In my
[ ... ]
>>> $ D='btrfs f2fs gfs2 hfsplus jfs nilfs2 reiserfs udf xfs'
>>> $ find $D -name '*.ko' | xargs size | sed 's/^ *//;s/ .*\t//g'
>>> text    filename
>>> 832719 btrfs/btrfs.ko
>>> 237952 f2fs/f2fs.ko
>>> 251805 gfs2/gfs2.ko
>>> 72731 hfsplus/hfsplus.ko
>>> 171623
>> Approximately 16 hours ago I've run a script that deleted
>> >~100 snapshots and started quota rescan on a large
>> USB-connected btrfs volume (5.4 of 22 TB occupied now).
That "USB-connected" is a rather bad idea. On the IRC channel
#Btrfs whenever someone reports odd things happening I ask
> [ ... ] what the signifigance of the xargs size limits of
> btrfs might be. [ ... ] So what does it mean that btrfs has a
> higher xargs size limit than other file systems? [ ... ] Or
> does the lower capacity for argument length for hfsplus
> demonstrate it is the superior file system for
>>> My guess is that very complex risky slow operations like
>>> that are provided by "clever" filesystem developers for
>>> "marketing" purposes, to win box-ticking competitions.
>>> That applies to those system developers who do know better;
>>> I suspect that even some filesystem developers
>> [ ... ] CentOS, Redhat, and Oracle seem to take the position
>> that very large data subvolumes using btrfs should work
>> fine. But I would be curious what the rest of the list thinks
>> about 20 TiB in one volume/subvolume.
> To be sure I'm a biased voice here, as I have multiple
>
> Can you try to first dedup the btrfs volume? This is probably
> out of date, but you could try one of these: [ ... ] Yep,
> that's probably a lot of work. [ ... ] My recollection is that
> btrfs handles deduplication differently than zfs, but both of
> them can be very, very slow
But the big
>>> The way btrfs is designed I'd actually expect shrinking to
>>> be fast in most cases. [ ... ]
>> The proposed "move whole chunks" implementation helps only if
>> there are enough unallocated chunks "below the line". If regular
>> 'balance' is done on the filesystem there will be some, but
>> My guess is that very complex risky slow operations like that are
>> provided by "clever" filesystem developers for "marketing" purposes,
>> to win box-ticking competitions. That applies to those system
>> developers who do know better; I suspect that even some filesystem
>> developers are
> I’ve glazed over on “Not only that …” … can you make youtube
> video of that :)) [ ... ] It’s because I’m special :*
Well played again, that's a fairly credible impersonation of a
node.js/mongodb developer :-).
> On a real note thanks [ ... ] too much of open source stuff is
> based on short