On Thu, Jan 6, 2011 at 12:36 AM, Josef Bacik jo...@redhat.com wrote:
Here are patches to do offline deduplication for Btrfs. It works well for the
cases it's expected to. I'm looking for feedback on the ioctl interface and
such; I'm well aware there are missing features for the userspace app
I have been thinking a lot about de-duplication for a backup application
I am writing. I wrote a little script to figure out how much it would
save me. For my laptop home directory, about 100 GiB of data, it was a
couple of percent, depending a bit on the size of the chunks. With 4 KiB
chunks, I
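The "little script" mentioned above isn't included in the excerpt; a minimal sketch of such a savings estimator (my own illustration, assuming fixed-size chunking and SHA-256 digests — chunk size is a parameter, 4 KiB by default as in the message) might look like:

```python
import hashlib
import os
import sys

def dedup_savings(root, chunk_size=4096):
    """Estimate how much space fixed-size-chunk dedup would save under `root`.

    Returns (total_bytes_scanned, duplicate_bytes): every chunk whose digest
    has been seen before counts as duplicated, i.e. dedupable.
    """
    seen = set()
    total = dup = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                f = open(path, "rb")
            except OSError:
                continue  # skip unreadable files
            with f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
                    digest = hashlib.sha256(chunk).digest()
                    if digest in seen:
                        dup += len(chunk)  # a copy of a chunk we already saw
                    else:
                        seen.add(digest)
    return total, dup

if __name__ == "__main__" and len(sys.argv) > 1:
    total, dup = dedup_savings(sys.argv[1])
    if total:
        print(f"{dup / total:.1%} of {total} bytes duplicated")
```

Run against a home directory, this reports the kind of "couple of percent" figure described above; larger chunk sizes find fewer matches because a single differing byte spoils the whole chunk.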
Chris Mason wrote:
Excerpts from Gordan Bobic's message of 2011-01-05 12:42:42 -0500:
Josef Bacik wrote:
Basically I think online dedup is a huge waste of time and completely useless.
I couldn't disagree more. First, let's consider what is the
general-purpose use-case of data deduplication.
On Thu, Jan 06, 2011 at 10:37:46AM +0100, Tomasz Chmielewski wrote:
I have been thinking a lot about de-duplication for a backup application
I am writing. I wrote a little script to figure out how much it would
save me. For my laptop home directory, about 100 GiB of data, it was a
couple of
Spelic wrote:
On 01/06/2011 02:03 AM, Gordan Bobic wrote:
That's just alarmist. AES is being cryptanalyzed because everything
uses it. And the news of its insecurity is somewhat exaggerated (for
now at least).
Who cares... the fact of not being much used is a benefit for RIPEMD /
Tomasz Chmielewski wrote:
I have been thinking a lot about de-duplication for a backup application
I am writing. I wrote a little script to figure out how much it would
save me. For my laptop home directory, about 100 GiB of data, it was a
couple of percent, depending a bit on the size of the
This patch comes from Forced readonly mounts on errors ideas.
As we know, this is the first step in being more fault tolerant of disk
corruptions instead of just using BUG() statements.
The major content:
- add a framework for generating errors that should result in filesystems
going
Gordan Bobic wrote:
Josef Bacik wrote:
Basically I think online dedup is a huge waste of time and completely
useless.
I couldn't disagree more. First, let's consider what is the
general-purpose use-case of data deduplication. What are the resource
requirements to perform it? How do these
Simon Farnsworth wrote:
The basic idea is to use fanotify/inotify (whichever of the notification
systems works for this) to track which inodes have been written to. It can
then mmap() the changed data (before it's been dropped from RAM) and do the
same process as an offline dedupe (hash,
Gordan Bobic wrote:
Simon Farnsworth wrote:
The basic idea is to use fanotify/inotify (whichever of the notification
systems works for this) to track which inodes have been written to. It
can then mmap() the changed data (before it's been dropped from RAM) and
do the same process as an
On Thursday, January 06, 2011 05:48:18 am you wrote:
Can you elaborate what you're talking about here? How does the length of
a directory name affect alignment of file block contents? I don't see
how variability of length matters, other than to make things a lot more
complicated.
I'm saying in
Peter A wrote:
On Thursday, January 06, 2011 05:48:18 am you wrote:
Can you elaborate what you're talking about here? How does the length of
a directory name affect alignment of file block contents? I don't see
how variability of length matters, other than to make things a lot more
complicated.
On Thu, Jan 06, 2011 at 12:18:34PM +0000, Simon Farnsworth wrote:
Gordan Bobic wrote:
Josef Bacik wrote:
snip
Then again, for a lot of use-cases there are perhaps better ways to
achieve the target goal than deduping on FS level, e.g. snapshotting or
something like fl-cow:
Ondřej Bílka wrote:
Then again, for a lot of use-cases there are perhaps better ways to
achieve the target goal than deduping on FS level, e.g. snapshotting or
something like fl-cow:
http://www.xmailserver.org/flcow.html
As far as VMs are concerned, fl-cow is a poor replacement for deduping.
Depends on
Tomasz Torcz wrote:
On Thu, Jan 06, 2011 at 02:19:04AM +0100, Spelic wrote:
CPU can handle considerably more than 250 block hashings per
second. You could argue that this changes in cases of sequential
I/O on big files, but a 1.86 GHz Core2 can churn through
111MB/s of SHA256, which even
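A throughput figure like the one above is easy to reproduce. Here is a small benchmark sketch (my illustration, using Python's hashlib rather than whatever the original poster measured with; absolute numbers will differ, but it shows the shape of the measurement):

```python
import hashlib
import time

def sha256_throughput(total_mb=64, block_size=4096):
    """Measure SHA-256 hashing throughput in MB/s.

    Hashes `total_mb` megabytes in `block_size` chunks, mimicking
    per-block hashing as a dedup implementation would do it.
    """
    block = b"\0" * block_size
    n = total_mb * 1024 * 1024 // block_size
    start = time.perf_counter()
    for _ in range(n):
        hashlib.sha256(block).digest()
    elapsed = time.perf_counter() - start
    return total_mb / elapsed

if __name__ == "__main__":
    print(f"{sha256_throughput():.0f} MB/s of 4 KiB-block SHA-256")
```

On modern hardware this comfortably exceeds the 111 MB/s cited for a 2011-era Core2, which supports the point that hashing is rarely the bottleneck next to disk reads.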
On Thursday, January 06, 2011 09:00:47 am you wrote:
Peter A wrote:
I'm saying in a filesystem it doesn't matter - if you bundle everything
into a backup stream, it does. Think of tar. 512 byte alignment. I tar
up a directory with 8TB total size. No big deal. Now I create a new,
empty
Peter A wrote:
On Thursday, January 06, 2011 09:00:47 am you wrote:
Peter A wrote:
I'm saying in a filesystem it doesn't matter - if you bundle everything
into a backup stream, it does. Think of tar. 512 byte alignment. I tar
up a directory with 8TB total size. No big deal. Now I create a
On Thu, Jan 06, 2011 at 02:41:28PM +, Gordan Bobic wrote:
Ondřej Bílka wrote:
Then again, for a lot of use-cases there are perhaps better ways to
achieve the target goal than deduping on FS level, e.g. snapshotting or
something like fl-cow:
http://www.xmailserver.org/flcow.html
As VM
On 05.12.2010, Milan Broz wrote:
It still seems like dmcrypt with its parallel processing is just a
trigger for another bug in 37-rc.
To come back to this: my 3 systems (XFS filesystem) running the latest
dm-crypt-scale-to-multiple-cpus patch from Andi Kleen/Milan Broz have
not shown a
On Thursday, January 06, 2011 10:07:03 am you wrote:
I'd be interested to see the evidence of the variable length argument.
I have a sneaky suspicion that it actually falls back to 512 byte
blocks, which are much more likely to align, when more sensibly sized
blocks fail. The downside is that
On Thursday 06 of January 2011 10:51:04 Mike Hommey wrote:
On Thu, Jan 06, 2011 at 10:37:46AM +0100, Tomasz Chmielewski wrote:
I have been thinking a lot about de-duplication for a backup application
I am writing. I wrote a little script to figure out how much it would
save me. For my
I am trying to understand how btrfs works with Raid1.
Is it possible to create the filesystem with -m raid1 -d raid1 in which
there is only one device available when the filesystem is created? Is
it possible to refer to a second device as missing
The use case I am thinking of is converting
On Thu, 2011-01-06 at 04:10 +0300, Vasiliy G Tolstov wrote:
Hello. I have two questions:
1) When can btrfs be used in systems that may be rebooted unexpectedly?
Is btrfs fsck ready to use? After a power failure or hard reboot, can the
file system be damaged in a way that can't be corrected?
If you turn
Is there a reason one cannot mount a btrfs and its byte-for-byte
copy in parallel, or is the driver just acting silly?
Mini example:
# dd if=/dev/zero of=fs1 bs=1 count=1 seek=$((1024*1024*1024-1))
# mkfs.btrfs fs1
# cp --sparse=always fs1 fs2
# mkdir 1 2
# mount fs1 1 -o loop
# mount fs2 2 -o
Just a quick update, I've dropped the hashing stuff in favor of doing a memcmp
in the kernel to make sure the data is still the same. The thing that takes a
while is reading the data up from disk, so doing a memcmp of the entire buffer
isn't that big of a deal, not to mention there's a possibility
This program very basically does dedup. It searches a directory recursively and
scans all of the files looking for 64k extents and hashing them to figure out if
there are any duplicates. After that it calls the btrfs same extent ioctl for
all of the duplicates in order to dedup the space on
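The steps that program takes — scan recursively, hash 64k extents, group candidates, then (as the update above notes) verify the bytes really match before deduping — can be sketched in userspace like this. This is my own illustration, not the posted program; the final same-extent ioctl call itself is omitted:

```python
import hashlib
import os
from collections import defaultdict

EXTENT = 64 * 1024  # the 64k extent size the scanner works in

def find_duplicate_extents(root):
    """Hash every 64k extent under `root`; return candidate duplicate groups.

    Result maps digest -> list of (path, offset) locations, keeping only
    digests that occur more than once.
    """
    candidates = defaultdict(list)
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                f = open(path, "rb")
            except OSError:
                continue  # skip unreadable files
            with f:
                offset = 0
                while True:
                    data = f.read(EXTENT)
                    if not data:
                        break
                    candidates[hashlib.sha256(data).digest()].append((path, offset))
                    offset += len(data)
    return {d: locs for d, locs in candidates.items() if len(locs) > 1}

def extents_equal(loc_a, loc_b):
    """Byte-compare two candidate extents, as the kernel-side memcmp would."""
    (path_a, off_a), (path_b, off_b) = loc_a, loc_b
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        fa.seek(off_a)
        fb.seek(off_b)
        return fa.read(EXTENT) == fb.read(EXTENT)
```

Only groups that survive the byte-compare would be handed to the same-extent ioctl; the hash merely narrows the candidate set cheaply.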
This adds the ability for userspace to tell btrfs which extents match each other.
You pass in
-a logical offset
-a length
-a list of file descriptors with their logical offset
and this ioctl will split up the extent on the target file and then link all of
the files with the target file's extent
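The exact argument struct isn't shown in the excerpt. Purely as an illustration of marshalling the three described inputs (target logical offset, length, and an array of {fd, logical offset} pairs), a hypothetical little-endian packing could look like this; the real layout belongs to the kernel patch and may well differ:

```python
import struct

def pack_same_extent_args(logical_offset, length, dest_files):
    """Pack a hypothetical same-extent ioctl argument buffer.

    dest_files: list of (fd, logical_offset) pairs. Header is three u64s
    (offset, length, count); each entry is an s64 fd plus a u64 offset.
    This layout is an assumption, not the patch's actual struct.
    """
    header = struct.pack("<QQQ", logical_offset, length, len(dest_files))
    body = b"".join(struct.pack("<qQ", fd, off) for fd, off in dest_files)
    return header + body
```

Whatever the real layout, the shape is the same: a fixed header followed by a variable-length array of per-file records, which is why the ioctl can link an arbitrary number of files to the target file's extent in one call.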
Excerpts from Peter A's message of 2011-01-05 22:58:36 -0500:
On Wednesday, January 05, 2011 08:19:04 pm Spelic wrote:
I'd just make it always use the fs block size. No point in making it
variable.
Agreed. What is the reason for variable block size?
First post on this list - I
I am setting up a backup server for the garage, to back up my HTPC in case of
theft or fire. The HTPC has a 4TB RAID10 array (mdadm, JFS), and will be
connected to the backup server using GB ethernet. The backup server will have
a 4TB BTRFS RAID0 array. Debian Testing running on both.
I
Hi list
These traces appeared on my CEPH nodes (node 1 and 2) while running some iozone
tests on the client.
Let me know if you need any other details.
- Jan
[18565.087014] [ cut here ]
[18565.088005] kernel BUG at fs/btrfs/inode.c:6403!
[18565.088005] invalid opcode:
This adds the ability for userspace to tell btrfs which extents match each other.
You pass in
-a logical offset
-a length
-a list of file descriptors with their logical offset
and this ioctl will split up the extent on the target file and then link all of
the files with the target file's extent
On Thu, Jan 6, 2011 at 9:35 AM, Carl Cook cac...@quantum-sci.com wrote:
I am setting up a backup server for the garage, to back up my HTPC in case of
theft or fire. The HTPC has a 4TB RAID10 array (mdadm, JFS), and will be
connected to the backup server using GB ethernet. The backup server
On 01/06/11 19:35, Hugo Mills wrote:
On Thu, Jan 06, 2011 at 05:36:48PM +0100, Ivan Labáth wrote:
Is there a reason one cannot mount a btrfs and its byte-for-byte
copy in parallel, or is the driver just acting silly?
Probably because both filesystems have identical UUIDs (and labels)
2011/1/6 Freddie Cash fjwc...@gmail.com:
On Thu, Jan 6, 2011 at 9:35 AM, Carl Cook cac...@quantum-sci.com wrote:
I am setting up a backup server for the garage, to back up my HTPC in case
of theft or fire. The HTPC has a 4TB RAID10 array (mdadm, JFS), and will be
connected to the backup
On Thu, Jan 6, 2011 at 11:33 AM, Marcin Kuk marcin@gmail.com wrote:
Rsync is good, but not for all cases. Be aware of database files -
you should do snapshot filesystem before rsyncing.
We script a dump of all databases before the rsync runs, so we get
both text and binary backups. If
Hi,
Does btrfs support atomic file data replaces? Basically, the atomic
variant of this:
// old state
open(O_TRUNC)
write() // 0+ times
close()
// new state
--
Olaf
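The usual answer to the question above, on any POSIX filesystem and not btrfs-specific, is that the open/write/close sequence shown is not atomic, but writing a temporary file and rename()-ing it over the old name is: readers see either the old contents or the new, never a mix. A sketch:

```python
import os
import tempfile

def atomic_replace(path, data):
    """Atomically replace the contents of `path` with `data`.

    Writes to a temp file in the same directory (rename must not cross
    filesystems), fsyncs it, then rename()s it over the target name.
    rename() over an existing name is atomic on POSIX filesystems.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # data must hit disk before the rename
        os.rename(tmp, path)
    except BaseException:
        os.unlink(tmp)  # clean up the temp file on any failure
        raise
```

Note that this replaces the whole file and allocates a new inode; it does not give atomic in-place updates of a byte range.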
On Thu, Jan 6, 2011 at 1:47 PM, Freddie Cash fjwc...@gmail.com wrote:
On Thu, Jan 6, 2011 at 11:33 AM, Marcin Kuk marcin@gmail.com wrote:
Rsync is good, but not for all cases. Be aware of database files -
you should do snapshot filesystem before rsyncing.
We script a dump of all
On Fri, Jan 7, 2011 at 12:35 AM, Carl Cook cac...@quantum-sci.com wrote:
I want to keep a duplicate copy of the HTPC data, on the backup server
Is there a BTRFS tool that would do this?
AFAIK zfs is the only opensource filesystem today that can transfer
block-level delta between two snapshots,
On Thu, Jan 6, 2011 at 12:07 PM, C Anthony Risinger anth...@extof.me wrote:
On Thu, Jan 6, 2011 at 1:47 PM, Freddie Cash fjwc...@gmail.com wrote:
On Thu, Jan 6, 2011 at 11:33 AM, Marcin Kuk marcin@gmail.com wrote:
Rsync is good, but not for all cases. Be aware of database files -
you
On Thu, Jan 6, 2011 at 2:13 PM, Freddie Cash fjwc...@gmail.com wrote:
On Thu, Jan 6, 2011 at 12:07 PM, C Anthony Risinger anth...@extof.me wrote:
On Thu, Jan 6, 2011 at 1:47 PM, Freddie Cash fjwc...@gmail.com wrote:
On Thu, Jan 6, 2011 at 11:33 AM, Marcin Kuk marcin@gmail.com wrote:
Rsync
Unfortunately, we don't use btrfs or LVM on remote servers, so there's
no snapshotting available during the backup run. In a perfect world,
btrfs would be production-ready, ZFS would be available on Linux, and
we'd no longer need the abomination called LVM. :)
As a matter of fact, ZFS _IS_
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Thu, 6 Jan 2011 20:59:08 GMT
bugzilla-dae...@bugzilla.kernel.org wrote:
https://bugzilla.kernel.org/show_bug.cgi?id=26242
Summary: BUG: unable to handle kernel NULL pointer
On Thu, Jan 6, 2011 at 1:06 PM, Gordan Bobic gor...@bobich.net wrote:
Unfortunately, we don't use btrfs or LVM on remote servers, so there's
no snapshotting available during the backup run. In a perfect world,
btrfs would be production-ready, ZFS would be available on Linux, and
we'd no
On Thu 06 January 2011 11:16:49 Freddie Cash wrote:
Just run rsync on the backup server, tell it to connect via ssh to the
remote server, and rsync / (root filesystem) into /backups/htpc/ (or
whatever directory you want). Use an exclude file to exclude the
directories you don't want backed up
On Thu 06 January 2011 12:12:13 Fajar A. Nugraha wrote:
With other filesystems, something like rsync + LVM snapshot is
probably your best bet, and it doesn't really care what filesystem you
use.
I'm not running LVM though. Is this where the snapshotting ability comes from?
--
To unsubscribe
On Thu 06 January 2011 12:07:17 C Anthony Risinger wrote:
as for the DB stuff, you definitely need to snapshot _before_ rsync. roughly:
) read lock and flush tables
) snapshot
) unlock tables
) mount snapshot
) rsync from snapshot
i.e. the same as what's needed for LVM:
On 01/06/2011 06:35 PM, Carl Cook wrote:
I want to keep a duplicate copy of the HTPC data, on the backup
server, and I think a regular full file copy is not optimal and may
take days to do. So I'm looking for a way to sync the arrays at some
interval. Ideally the sync would scan the HTPC
It seems to me that we leak the memory allocated to 'value' in
btrfs_get_acl() if the call to posix_acl_from_xattr() fails.
Here's a patch that attempts to correct that problem.
Signed-off-by: Jesper Juhl j...@chaosbits.net
---
acl.c |4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
On Thu, Jan 6, 2011 at 1:42 PM, Carl Cook cac...@quantum-sci.com wrote:
On Thu 06 January 2011 11:16:49 Freddie Cash wrote:
Also with this system, I'm concerned that if there is corruption on the
HTPC, it could be propagated to the backup server. Is there some way to
address this?
On 01/06/2011 09:44 PM, Carl Cook wrote:
On Thu 06 January 2011 12:07:17 C Anthony Risinger wrote:
as for the DB stuff, you definitely need to snapshot _before_ rsync. roughly:
) read lock and flush tables
) snapshot
) unlock tables
) mount snapshot
) rsync from snapshot
i.e. the same as
On Thu, Jan 6, 2011 at 1:44 PM, Carl Cook cac...@quantum-sci.com wrote:
On Thu 06 January 2011 12:07:17 C Anthony Risinger wrote:
as for the DB stuff, you definitely need to snapshot _before_ rsync.
roughly:
) read lock and flush tables
) snapshot
) unlock tables
) mount snapshot
)
On Thu 06 January 2011 13:58:41 Freddie Cash wrote:
Simplest solution is to write a script to create a mysqldump of all
databases into a directory, add that to cron so that it runs at the
same time every day, 10-15 minutes before the rsync run is done. That
way, rsync to the backup server
On 01/06/2011 10:26 PM, Carl Cook wrote:
On Thu 06 January 2011 13:58:41 Freddie Cash wrote:
Simplest solution is to write a script to create a mysqldump of all
databases into a directory, add that to cron so that it runs at the
same time every day, 10-15 minutes before the rsync run is done.
On Thu 06 January 2011 14:26:30 Carl Cook wrote:
According To Doyle...
Er, Hoyle...
I am trying to create a multi-device BTRFS system using two identical drives.
I want them to be raid 0 for no redundancy, and a total of 4TB.
But in the wiki it says nothing about using fdisk to set up the drive
On Fri, Jan 7, 2011 at 5:26 AM, Carl Cook cac...@quantum-sci.com wrote:
On Thu 06 January 2011 13:58:41 Freddie Cash wrote:
Simplest solution is to write a script to create a mysqldump of all
databases into a directory, add that to cron so that it runs at the
same time every day, 10-15 minutes
After doing something silly (not sure what yet) with a server's
4-drive btrfs raid1 root, I've been booting off a 5th drive (also on
btrfs) while poking at the original array. I've found that errors
triggered by poking around on the mounted-but-broken 4-drive raid (on
/mnt) cause the system to
[50010.838804] [ cut here ]
[50010.838931] kernel BUG at fs/btrfs/inode.c:1616!
[50010.839053] invalid opcode: [#1] PREEMPT SMP
[50010.839185] last sysfs file: /sys/module/nf_conntrack/parameters/hashsize
[50010.839307] CPU 0
[50010.839313] Modules linked in:
A similar bug has been reported by Kenneth Lakin [kennethla...@gmail.com] last
week. It's related to my check-in (git commit
914ee295af418e936ec20a08c1663eaabe4cd07a). I am looking into it now.
I found some suspicious code in prepare_pages (fs/btrfs/file.c):
start_pos = pos
I have checked the latest mkfs code. If the page size is 4k, the sector size will
be 4k too. So at least for x86 hardware, page size and sector size will always be the
same.
-Original Message-
From: linux-btrfs-ow...@vger.kernel.org
[mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of Zhong,