For btrfs,
Raid5 can't go below 2 devs, not 3;
Raid6 can't go below 3 devs, not 4.
Signed-off-by: Gui Hecheng
---
ioctl.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/ioctl.h b/ioctl.h
index c3ee270..6742ba6 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -489,9 +489,9 @@ static
This introduces dedup state and relative operations to mark and unmark
the dedup data range, it'll be used in later patches.
Signed-off-by: Liu Bo
---
fs/btrfs/extent_io.c | 14 ++
fs/btrfs/extent_io.h | 5 +
2 files changed, 19 insertions(+)
diff --git a/fs/btrfs/extent_io.c b
If the ordered extent had an IOERR or something else went wrong we need to
return the space for this ordered extent back to the allocator, but if the
extent is marked as a dedup one, we don't free the space because we just
use the existing space instead of allocating new space.
Signed-off-by: Liu
Hello,
This is the 10th attempt for in-band data dedupe, based on the Linux _3.14_ kernel.
Data deduplication is a specialized data compression technique for eliminating
duplicate copies of repeating data.[1]
This patch set is also related to "Content based storage" in project ideas[2],
it introduces i
The dedup ref is quite a special one, it is just used to store the hash value
of the extent and cannot be used to find data, so we skip it during backref
walking.
Signed-off-by: Liu Bo
---
fs/btrfs/backref.c| 9 +
fs/btrfs/relocation.c | 3 +++
2 files changed, 12 insertions(+)
diff
The main part of data dedup.
This introduces a FORMAT CHANGE.
Btrfs provides online(inband/synchronous) and block-level dedup.
It maps naturally to btrfs's block back-reference, which enables us
to store multiple copies of data as single copy with references
on that copy.
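To make the "single copy with references" idea concrete, here is a minimal userspace sketch in C. It is not the btrfs implementation or its on-disk format: the toy block size, the in-memory `store` array, the FNV-1a hash and the names `toy_hash` and `dedup_write` are all assumptions chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 8   /* toy block size (hypothetical, not a btrfs value) */
#define MAX_BLOCKS 16  /* capacity of the toy in-memory store */

struct stored_block {
    uint64_t hash;                  /* content hash of the block */
    unsigned int refs;              /* number of writes sharing this copy */
    unsigned char data[BLOCK_SIZE]; /* the single stored copy */
};

static struct stored_block store[MAX_BLOCKS];
static int nr_stored;

/* FNV-1a, used here only as a stand-in for a real content hash. */
static uint64_t toy_hash(const unsigned char *p, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/*
 * "Write" one block. If an identical block is already stored, take another
 * reference on it instead of storing a second copy. Returns the index of
 * the stored copy, or -1 if the toy store is full.
 */
static int dedup_write(const unsigned char *data)
{
    uint64_t h = toy_hash(data, BLOCK_SIZE);

    for (int i = 0; i < nr_stored; i++) {
        /* Compare the data as well, so a hash collision cannot alias blocks. */
        if (store[i].hash == h &&
            memcmp(store[i].data, data, BLOCK_SIZE) == 0) {
            store[i].refs++;
            return i;
        }
    }
    if (nr_stored == MAX_BLOCKS)
        return -1;

    store[nr_stored].hash = h;
    store[nr_stored].refs = 1;
    memcpy(store[nr_stored].data, data, BLOCK_SIZE);
    return nr_stored++;
}
```

Writing the same eight bytes twice returns the same index both times and leaves one stored copy with `refs == 2`; a different block gets its own entry.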
The workflow is
(1) wr
We need to reset @refs_to_drop to 1 when we're going to delete the last
special dedup reference, otherwise we can trigger (@refs < @refs_to_drop)
and end up with transaction abortion.
Signed-off-by: Liu Bo
---
fs/btrfs/extent-tree.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/ex
With the special dedup reference, in the case of (refs == 1) in
__btrfs_free_extent, we'll actually free the extent, so pinned_bytes of it
should not be added to that global counter.
Signed-off-by: Liu Bo
---
fs/btrfs/extent-tree.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
d
This adds deduplication subcommands, 'btrfs dedup command ',
including enable/disable/on/off.
- btrfs dedup enable
Create the dedup tree, and it's the very first step when you're going to use
the dedup feature.
- btrfs dedup disable
Delete the dedup tree, after this we're not able to use dedup an
While removing a file with dedup extents, we could have a great number of
delayed refs pending to process, and these refs refer to dropping
a ref of the extent, which is of BTRFS_DROP_DELAYED_REF type.
But in order to prevent an extent's ref count from going down to zero when
there still are pendin
In the case of dedupe, btrfs will produce a large number of delayed refs, and
processing them can very likely eat all of the space reserved in
global_block_rsv, and we'll end up with transaction abortion due to ENOSPC.
I tried several different ways to reserve more space for global_block_rsv to
hope
Because of dedupe, data space info cannot reflect how much data has
been written, in order to get global_rsv more proper, use total_bytes
instead.
Signed-off-by: Liu Bo
---
fs/btrfs/extent-tree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/
This adds a dedup flag and dedup hash into ordered extent so that
we can insert dedup extents to dedup tree at endio time.
The benefit is simplicity, we don't need to fall back to cleanup dedup
structures if the write is cancelled for some reason.
Signed-off-by: Liu Bo
---
fs/btrfs/ordered-dat
This is a preparation step for online/inband dedup tree.
It introduces dedup tree and its relatives, including hash driver and
some structures.
Signed-off-by: Liu Bo
---
fs/btrfs/ctree.h | 73
fs/btrfs/disk-io.c | 36
The dedup reference is a special kind of delayed refs, and the delayed refs
are batched to be processed later.
If we find a matched dedup extent, then we queue an ADD delayed ref on it within
endio work, but there is already a DROP delayed ref queued,
t1 t2
Checking for dedup references needs to allocate memory so it cannot
be run within spin_lock, otherwise it will end up with heavy deadlock.
Signed-off-by: Liu Bo
---
fs/btrfs/extent-tree.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/exte
The operations consist of finding matched items, adding new items and
removing items.
Signed-off-by: Liu Bo
---
fs/btrfs/ctree.h | 9 +++
fs/btrfs/file-item.c | 210 +++
2 files changed, 219 insertions(+)
diff --git a/fs/btrfs/ctree.h b/fs/b
It's unnecessary to do qgroups accounting without enabling quota.
Signed-off-by: Liu Bo
---
fs/btrfs/ctree.c | 2 +-
fs/btrfs/delayed-ref.c | 18 ++
fs/btrfs/qgroup.c | 3 +++
3 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs
So far we have 4 commands to control dedup behaviour,
- btrfs dedup enable
Create the dedup tree, and it's the very first step when you're going to use
the dedup feature.
- btrfs dedup disable
Delete the dedup tree, after this we're not able to use dedup any more unless
you enable it again.
- btr
When encountering memory pressure, testers have run into the following
lockdep warning. It was caused by __link_block_group calling kobject_add
with the groups_sem held. kobject_add calls kvasprintf with GFP_KERNEL,
which gets us into reclaim context. The kobject doesn't actually need
to be added u
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
On 4/9/14, 12:05 PM, Filipe David Manana wrote:
> On Wed, Mar 26, 2014 at 6:11 PM, Jeff Mahoney
> wrote:
>> When encountering memory pressure, testers have run into the
>> following lockdep warning. It was caused by __link_block_group
>> calling kobje
Marc MERLIN posted on Wed, 09 Apr 2014 09:51:34 -0700 as excerpted:
> But since we're talking about this, is btrfsck ever supposed to return
> clean on a clean filesystem?
FWIW, it seems to return clean here, on everything I've tried it on.
But I run relatively small partitions (the biggest is I
On 04/09/2014 12:51 PM, Marc MERLIN wrote:
On Wed, Apr 09, 2014 at 11:46:13AM -0400, Chris Mason wrote:
Downloading the image now. I'd just run a readonly btrfsck /dev/xxx
https://urldefense.proofpoint.com/v1/url?u=http://marc.merlins.org/tmp/btrfs-raid0-image-fsck.txt&k=ZVNjlDMF0FElm4dQtryO4
Hi,
I'm experiencing the following error when performing btrfsck --repair on
the damaged filesystem (some files indicate 'stale NFS handle' errors
when accessing).
checking extents
parent transid verify failed on 19964 wanted 1868 found 1586
parent transid verify failed on 19964 wanted 18
On Mon, Apr 07, 2014 at 01:00:02PM -0700, Marc MERLIN wrote:
> On Mon, Apr 07, 2014 at 03:32:13PM -0400, Chris Mason wrote:
> > >You're recommending that I try btrfs-next on a 3.15 pre kernel, correct?
> > >If so would it be likely to fix my filesystem and let me go back to a
> > >stable 3.14? (I'm
Thanks for the BUG_ON() fix here.
Strangely, I'm now seeing EIO returned for reads following the second
clone-range.
Please see the subsequent xfstests patch.
Cheers, David
With kernel commit 00fdf13a2e9f313a044288aa59d3b8ec29ff904a, the first
clone-range overwrite attempt now fails with EOPNOTSUPP.
FIXME: The second clone-range causes EIO on subsequent read attempts.
Signed-off-by: David Disseldorp
---
tests/btrfs/035 | 10 ++
tests/btrfs/035.out | 5
On Wed, Apr 09, 2014 at 11:46:13AM -0400, Chris Mason wrote:
> Downloading the image now. I'd just run a readonly btrfsck /dev/xxx
http://marc.merlins.org/tmp/btrfs-raid0-image-fsck.txt (6MB)
I admit to not knowing how to read that output, I've only ever seen
thousands of lines of output from it
On Wed, Mar 26, 2014 at 6:11 PM, Jeff Mahoney wrote:
> When encountering memory pressure, testers have run into the following
> lockdep warning. It was caused by __link_block_group calling kobject_add
> with the groups_sem held. kobject_add calls kvasprintf with GFP_KERNEL,
> which gets us into re
On 04/09/2014 11:42 AM, Marc MERLIN wrote:
On Tue, Apr 08, 2014 at 10:31:39PM -0700, Marc MERLIN wrote:
On Tue, Apr 08, 2014 at 09:31:25PM -0700, Marc MERLIN wrote:
On Tue, Apr 08, 2014 at 07:49:14PM -0400, Chris Mason wrote:
On 04/08/2014 06:09 PM, Marc MERLIN wrote:
I forgot to add that
On Tue, Apr 08, 2014 at 10:31:39PM -0700, Marc MERLIN wrote:
> On Tue, Apr 08, 2014 at 09:31:25PM -0700, Marc MERLIN wrote:
> > On Tue, Apr 08, 2014 at 07:49:14PM -0400, Chris Mason wrote:
> > >
> > >
> > > On 04/08/2014 06:09 PM, Marc MERLIN wrote:
> > > >I forgot to add that while I'm not sure
Hi Josef,
2014-03-28 2:56 GMT+08:00 Josef Bacik :
> This was done to allow NO_COW to continue to be NO_COW after relocation but it
> is not right. When relocating we will convert blocks to FULL_BACKREF that we
> relocate. We can leave some of these full backref blocks behind if they are
> not
>
When the csum tree is empty, our leaf (path->nodes[0]) has a number
of items equal to 0 and since btrfs_header_nritems() returns an
unsigned integer (and so is our local nritems variable) the following
comparison always evaluates to false:
if (path->slots[0] >= nritems - 1) {
As the casting
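The unsigned wrap-around described here can be reproduced in a small standalone C sketch. The helper names below are hypothetical, and the guarded variant is only one possible way to handle the empty-leaf case; it is not taken from the patch itself.

```c
#include <stdint.h>

/*
 * Mirrors the quoted comparison: with nritems == 0, the unsigned
 * expression nritems - 1 wraps around to UINT32_MAX, so the test is
 * false for any ordinary slot value.
 */
static int is_last_slot_buggy(int slot, uint32_t nritems)
{
    return (uint32_t)slot >= nritems - 1;
}

/*
 * One possible guard (an assumption, not necessarily the actual fix):
 * treat an empty leaf explicitly before doing the subtraction.
 */
static int is_last_slot_guarded(int slot, uint32_t nritems)
{
    return nritems == 0 || (uint32_t)slot >= nritems - 1;
}
```

With `slot == 0` and `nritems == 0`, the first helper returns 0 because 0 is compared against `UINT32_MAX`, while the guarded one returns 1. For a non-empty leaf both helpers agree.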
To ease finding bugs during development related to modifying btree leaves
in such a way that it makes its items not sorted by key anymore. Since this
is an expensive check, it's only enabled if CONFIG_BTRFS_FS_CHECK_INTEGRITY
is set, which isn't meant to be enabled for regular users.
Signed-off-by
Swâmi Petaramesh posted on Wed, 09 Apr 2014 13:15:24 +0200 as excerpted:
> In the quest for BTRFS and performance, and having received the advice
> to "chattr +C" my akonadi DB directory to make it noCow, I would like to
> be sure about what will happen when I take a snapshot of the concerned
> BT
On Wed, Apr 09, 2014 at 01:15:24PM +0200, Swâmi Petaramesh wrote:
> Hi,
>
> In the quest for BTRFS and performance, and having received the advice to
> "chattr +C" my akonadi DB directory to make it noCow, I would like to be sure
> about what will happen when I take a snapshot of the concerned B
In close_ctree(), after we have stopped all workers, there may still be
some read requests (for example readahead) to submit, and this *may* trigger
an oops that user reported before:
kernel BUG at fs/btrfs/async-thread.c:619!
By hacking the code, I can reproduce this problem with one CPU available.
W
Hi,
In the quest for BTRFS and performance, and having received the advice to
"chattr +C" my akonadi DB directory to make it noCow, I would like to be sure
about what will happen when I take a snapshot of the concerned BTRFS
subvolume.
1/ Being noCow, will the database be modified in the snaps
On Fri, 4 Apr 2014 10:02:27 AM Swâmi Petaramesh wrote:
> However I'm still concerned with chronic BTRFS dreadful performance and
> still find that BTRFS degrades much over time even with periodic defrag
> and "best practices" etc.
That's odd, I've been running it on laptops with SSDs since 2009
When running this script:
dd if=/dev/zero of=seed-disk.img bs=1M seek=1k count=0
dd if=/dev/zero of=test-disk.img bs=1M seek=1k count=0
# Make image of seed device
mkfs.btrfs seed-disk.img
seed_dev=`losetup -f --show seed-disk.img`
mount $seed_dev /mnt/tmp
touch /mnt/tmp/a
umount /mnt/tmp
losetup
On Mon, 7 Apr 2014 11:11:11 AM Austin S Hemmelgarn wrote:
> This is because every other filesystem (except ZFS) doesn't use COW
> semantics.
There is an interesting article on LWN at the moment (subscriber only for the
next day or two, but if you can afford it I'd suggest considering subscribing
On Wed, Apr 09, 2014 at 06:10:40PM +0800, Liu Bo wrote:
> This adds deduplication subcommands, 'btrfs dedup command ',
> including enable/disable/on/off.
>
> - btrfs dedup enable
> Create the dedup tree, and it's the very first step when you're going to use
> the dedup feature.
>
> - btrfs dedup
This adds deduplication subcommands, 'btrfs dedup command ',
including enable/disable/on/off.
- btrfs dedup enable
Create the dedup tree, and it's the very first step when you're going to use
the dedup feature.
- btrfs dedup disable
Delete the dedup tree, after this we're not able to use dedup an
As mentioned at the beginning, I prefer to remove the device remove test
in testcase, then the patch can be reverted without any complaint from me.
That's a wrong approach as well.
The test case 003 is a very real case at the data centers.
Especially the iscsi/san luns go offline/online all t
Sorry, there is a typo in the subject line, thanks cwillu for pointing it out.
s/quata/quota/g.
-liubo
On Wed, Apr 09, 2014 at 03:08:29PM +0800, Liu Bo wrote:
> It's unnecessary to do qgroups accounting without enabling quota.
>
> Signed-off-by: Liu Bo
> ---
> fs/btrfs/ctree.c | 2 +-
On Sat, Apr 5, 2014 at 10:54 AM, Anders Aagaard wrote:
> Hi
>
> I just recently repartitioned my harddrive, and in the process switched from
> ext4+ecryptfs to dm-crypt and btrfs. I'm on ubuntu 14.04, using kernel
> 3.14.0-031400-generic. I'm using an Intel SSD, which btrfs detects (ssd mode
> ena
Because of dedupe, data space info cannot reflect how much data has
been written, in order to get global_rsv more proper, use total_bytes
instead.
Signed-off-by: Liu Bo
---
fs/btrfs/extent-tree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/
The dedup ref is quite a special one, it is just used to store the hash value
of the extent and cannot be used to find data, so we skip it during backref
walking.
Signed-off-by: Liu Bo
---
fs/btrfs/backref.c| 9 +
fs/btrfs/relocation.c | 3 +++
2 files changed, 12 insertions(+)
diff
The main part of data dedup.
This introduces a FORMAT CHANGE.
Btrfs provides online(inband/synchronous) and block-level dedup.
It maps naturally to btrfs's block back-reference, which enables us
to store multiple copies of data as single copy with references
on that copy.
The workflow is
(1) wr
With the special dedup reference, in the case of (refs == 1) in
__btrfs_free_extent, we'll actually free the extent, so pinned_bytes of it
should not be added to that global counter.
Signed-off-by: Liu Bo
---
fs/btrfs/extent-tree.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
d
In the case of dedupe, btrfs will produce a large number of delayed refs, and
processing them can very likely eat all of the space reserved in
global_block_rsv, and we'll end up with transaction abortion due to ENOSPC.
I tried several different ways to reserve more space for global_block_rsv to
hope
If the ordered extent had an IOERR or something else went wrong we need to
return the space for this ordered extent back to the allocator, but if the
extent is marked as a dedup one, we don't free the space because we just
use the existing space instead of allocating new space.
Signed-off-by: Liu
This adds deduplication subcommands, 'btrfs dedup command ',
including enable/disable/on/off.
- btrfs dedup enable
Create the dedup tree, and it's the very first step when you're going to use
the dedup feature.
- btrfs dedup disable
Delete the dedup tree, after this we're not able to use dedup an
This is a preparation step for online/inband dedup tree.
It introduces dedup tree and its relatives, including hash driver and
some structures.
Signed-off-by: Liu Bo
---
fs/btrfs/ctree.h | 73
fs/btrfs/disk-io.c | 36
This adds a dedup flag and dedup hash into ordered extent so that
we can insert dedup extents to dedup tree at endio time.
The benefit is simplicity, we don't need to fall back to cleanup dedup
structures if the write is cancelled for some reason.
Signed-off-by: Liu Bo
---
fs/btrfs/ordered-dat
While removing a file with dedup extents, we could have a great number of
delayed refs pending to process, and these refs refer to dropping
a ref of the extent, which is of BTRFS_DROP_DELAYED_REF type.
But in order to prevent an extent's ref count from going down to zero when
there still are pendin
We need to reset @refs_to_drop to 1 when we're going to delete the last
special dedup reference, otherwise we can trigger (@refs < @refs_to_drop)
and end up with transaction abortion.
Signed-off-by: Liu Bo
---
fs/btrfs/extent-tree.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/ex
Checking for dedup references needs to allocate memory so it cannot
be run within spin_lock, otherwise it will end up with heavy deadlock.
Signed-off-by: Liu Bo
---
fs/btrfs/extent-tree.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/exte
So far we have 4 commands to control dedup behaviour,
- btrfs dedup enable
Create the dedup tree, and it's the very first step when you're going to use
the dedup feature.
- btrfs dedup disable
Delete the dedup tree, after this we're not able to use dedup any more unless
you enable it again.
- btr
The dedup reference is a special kind of delayed refs, and the delayed refs
are batched to be processed later.
If we find a matched dedup extent, then we queue an ADD delayed ref on it within
endio work, but there is already a DROP delayed ref queued,
t1 t2
Hello,
This is the 9th attempt for in-band data dedupe.
Data deduplication is a specialized data compression technique for eliminating
duplicate copies of repeating data.[1]
This patch set is also related to "Content based storage" in project ideas[2],
it introduces inband data deduplication for bt
It's unnecessary to do qgroups accounting without enabling quota.
Signed-off-by: Liu Bo
---
fs/btrfs/ctree.c | 2 +-
fs/btrfs/delayed-ref.c | 18 ++
fs/btrfs/qgroup.c | 3 +++
3 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs
The operations consist of finding matched items, adding new items and
removing items.
Signed-off-by: Liu Bo
---
fs/btrfs/ctree.h | 9 +++
fs/btrfs/file-item.c | 210 +++
2 files changed, 219 insertions(+)
diff --git a/fs/btrfs/ctree.h b/fs/b
This introduces dedup state and relative operations to mark and unmark
the dedup data range, it'll be used in later patches.
Signed-off-by: Liu Bo
---
fs/btrfs/extent_io.c | 14 ++
fs/btrfs/extent_io.h | 5 +
2 files changed, 19 insertions(+)
diff --git a/fs/btrfs/extent_io.c b
64 matches