Re: [RFC] btrfs auto snapshot
On Thu, Aug 18, 2011 at 12:38 AM, Matthias G. Eckermann m...@suse.com wrote: Ah, sure. Sorry. Packages for blocxx for: Fedora_14 Fedora_15 RHEL-5 RHEL-6 SLE_11_SP1 openSUSE_11.4 openSUSE_Factory are available in the openSUSE buildservice at: http://download.opensuse.org/repositories/home:/mge1512:/snapper/ Hi Matthias, I'm testing your packages on top of RHEL6 + kernel 3.2.7. A small suggestion, you should include /etc/sysconfig/snapper in the package (at least for RHEL6, haven't tested the other ones). Even if it just contains SNAPPER_CONFIGS= Thanks, Fajar -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 3.3 restripe between different raid levels
On Wed, Feb 22, 2012 at 07:14:44PM +, Alex wrote:

[Referring to https://lkml.org/lkml/2012/1/17/381], and perhaps I'm a bit previous, but what is the command sequence to change the raid levels? Wouldn't mind being pointed to a git manual if better for you.

Look at http://article.gmane.org/gmane.comp.file-systems.btrfs/15211. The only syntax change that has been merged since then is that you can invoke balancing commands without the 'filesystem' prefix, so instead of 'btrfs fi balance' you can (and should) use 'btrfs balance'. To get the code you can pull the for-chris branch from my repo:

git://github.com/idryomov/btrfs-progs.git for-chris

I'll add proper man pages shortly.

Thanks,
Ilya
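For readers finding this thread later, a restripe invocation under the merged syntax might look like the sketch below. This is an assumption based on the balance-filter interface discussed above, not text from Ilya's mail; the mount point /mnt and the raid1 target profile are placeholders.

```shell
# Sketch: convert data and metadata block groups to a new RAID profile
# with balance filters. -dconvert applies to data chunks, -mconvert to
# metadata. Requires root and a mounted multi-device btrfs filesystem.
convert_to_raid1() {
    local mnt="$1"
    btrfs balance start -dconvert=raid1 -mconvert=raid1 "$mnt"
    btrfs balance status "$mnt"   # balance runs in-kernel; poll for progress
}
```

The function is only a sketch; run it as e.g. `convert_to_raid1 /mnt` on a filesystem with enough devices for the target profile.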
Re: Is there any data recovery tool?
OK. On Wed, Feb 22, 2012 at 8:58 PM, Duncan 1i5t5.dun...@cox.net wrote:

qasdfgtyuiop posted on Tue, 21 Feb 2012 20:11:06 +0800 as excerpted:

I'm using GNU/Linux with a btrfs root. My filesystem was created with the command mkfs.btrfs /dev/sda. Today I was trying to install Microsoft Windows 7 on /dev/sdb, a 16GB eSATA SSD. After the installation, I found that Windows had created a hidden NTFS partition called System Reserved on the first 100MB of my /dev/sda and that my btrfs filesystem was lost! I have searched Google for help but got no useful information. Are there any data recovery tools?

The btrfs kernel option says:

Btrfs filesystem (EXPERIMENTAL) Unstable disk format

Its description says in part:

Btrfs is highly experimental, and THE DISK FORMAT IS NOT YET FINALIZED. You should say N here unless you are interested in testing Btrfs with non-critical data. [...] If unsure, say N.

The front page and getting started pages of the wiki (see URL below) also heavily emphasize the development aspect and backups, and the source code section has this to say:

Warning: Btrfs evolves very quickly; do not test it unless:
- You have good backups and you have tested the restore capability
- You have a backup installation that you can switch to when something breaks
- You are willing to report any issues you find
- You can apply patches and compile the latest btrfs code against your kernel (quite easy with git and dkms, see below)
- You acknowledge that btrfs may eat your data
Backups! Backups! Backups!
Given all that, any data you store on btrfs is by definition not particularly important, either because you have it backed up in a more stable format elsewhere (which might be the net, or local), or because the data really /isn't/ particularly important to you in the first place, or you'd have made and tested backups before putting it on the, after all, still experimental and under heavy development btrfs in the first place. (Naturally, always test recovery from your backups, as an untested backup is worse than none, since it's likely to give you a false sense of security.)

Thus, you shouldn't need to worry about a data recovery tool, since you can either simply restore from backups (and since you tested recovery, you're already familiar with the recovery procedures), or the data was simply garbage you were using for testing and didn't care about losing anyway.

Nevertheless, yes, there's a recovery tool, naturally experimental just like the filesystem itself at this point, but there is one. Testing and suggestions for improvements, especially with patches, will be welcomed.

It seems you need to read up on the wiki, which covers this among other things. There's an older version on btrfs.wiki.kernel.org, but that's not updated ATM due to restrictions in place since the kernel.org break-in some months ago. The temporary (but six months and counting, I believe) replacement is at btrfs.ipv5.de:

http://btrfs.ipv5.de/index.php?title=Main_Page

The restore and find-root commands from btrfs-progs are specifically covered on this page:

http://btrfs.ipv5.de/index.php?title=Restore

If you wish to try a newer copy of btrfs-progs (after all, it's all still in development, and bugs are fixed all the time), you'll also want to read:

http://btrfs.ipv5.de/index.php?title=Getting_started#Compiling_Btrfs_from_sources

-- Duncan - List replies preferred. No HTML msgs.
Every nonfree program has a lord, a master -- and if you use the program, he is your master.
Richard Stallman
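As a concrete sketch of the recovery flow Duncan points at: the wiki's restore tool reads files off a filesystem that no longer mounts, without writing to it. The command names follow the wiki page referenced above (older btrfs-progs ship them as standalone `restore` and `find-root` binaries, newer ones as `btrfs restore`); the device and destination paths here are hypothetical.

```shell
# Copy whatever is still readable off a damaged btrfs device into a
# destination directory on a healthy filesystem. If the current tree
# root is unusable, find-root can suggest older root locations to try.
rescue_files() {
    local dev="$1" dest="$2"   # e.g. /dev/sdc1 and /mnt/rescue (placeholders)
    mkdir -p "$dest"
    btrfs restore "$dev" "$dest"
}
```

Note that restore is read-only with respect to the damaged device, which is exactly what you want before experimenting further.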
Re: [PATCH 1/2] treewide: fix memory corruptions when TASK_COMM_LEN != 16
On Thursday 2012-02-23 10:57, Andrew Morton wrote:

But there's more,

24931 ?  S  0:00  \_ [btrfs-endio-met]
                  \_ [kconservative/5]
                  \_ [ext4-dio-unwrit]

[with a wondersome patch:]

$ grep Name /proc/{29431,29432}/stat*
/proc/29431/status:Name: btrfs-endio-meta-1
/proc/29432/status:Name: btrfs-endio-meta-write-1
Name: kconservative/512
Name: ext4-dio-unwritten

doh. The fix for that is to have less clueless btrfs developers.

And truncate their names to SUNWbtfs, ORCLintg and EXT4diou? I think not :)
Re: [RFC] btrfs auto snapshot
The autosnap code will be available either at the end of this week or early next week, and what you will notice is that autosnap snapshots are named using a uuid. The main reasons to drop time-stamp based names are:

- A test (clicking the Take-snapshot button) which took more than one snapshot per second was failing.
- A more descriptive creation time is available using a command line option, as in the example below:

# btrfs su list -t tag=@minute,parent=/btrfs/sv1 /btrfs
/btrfs/.autosnap/6c0dabfa-5ddb-11e1-a8c1-0800271feb99 Thu Feb 23 13:01:18 2012 /btrfs/sv1 @minute
/btrfs/.autosnap/5669613e-5ddd-11e1-a644-0800271feb99 Thu Feb 23 13:15:01 2012 /btrfs/sv1 @minute

As of now the code for time-stamp based autosnap snapshot names is commented out; if more people want time-stamp based names, I don't mind doing it that way. Please do let me know.

Thanks, Anand

On Thursday 23,February,2012 06:37 PM, Hubert Kario wrote:

On Wednesday 17 of August 2011 10:15:46 Anand Jain wrote:

btrfs auto snapshot feature will include: Initially: [snip] - snapshot destination will be subvol/.btrfs/snapshot@time and snapshot/.btrfs/snapshot@time for subvolume and snapshot respectively

Is there some reason not to use the format used by the shadow_copy2 overlay for Samba (the one providing Shadow Volume Copy functionality for Windows clients)? You get the current date in this format like this:

@GMT-`date -u '+%Y.%m.%d-%H.%M.%S'`

For example: @GMT-2012.02.23-10.34.32

This way, when the volume is exported using Samba, you can easily export past copies too, without creating links.

Regards,
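The shadow_copy2-style name Hubert describes is pure string formatting, so generating it costs nothing at snapshot time; a minimal sketch:

```shell
# Build a Samba shadow_copy2-compatible snapshot name from the current
# UTC time, e.g. @GMT-2012.02.23-10.34.32.
snapname="@GMT-$(date -u '+%Y.%m.%d-%H.%M.%S')"
echo "$snapname"
```

Because the name sorts lexicographically in time order, plain `ls` on the snapshot directory already lists snapshots chronologically.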
Re: 3.3 restripe between different raid levels
Thank you very much, Ilya.
Re: [RFC] btrfs auto snapshot
On Thursday 23 of February 2012 20:02:38 Anand Jain wrote:

The autosnap code will be available either at the end of this week or early next week, and what you will notice is that autosnap snapshots are named using a uuid. The main reasons to drop time-stamp based names are:

- A test (clicking the Take-snapshot button) which took more than one snapshot per second was failing.
- A more descriptive creation time is available using a command line option, as in the example below:

# btrfs su list -t tag=@minute,parent=/btrfs/sv1 /btrfs
/btrfs/.autosnap/6c0dabfa-5ddb-11e1-a8c1-0800271feb99 Thu Feb 23 13:01:18 2012 /btrfs/sv1 @minute
/btrfs/.autosnap/5669613e-5ddd-11e1-a644-0800271feb99 Thu Feb 23 13:15:01 2012 /btrfs/sv1 @minute

As of now the code for time-stamp based autosnap snapshot names is commented out; if more people want time-stamp based names, I don't mind doing it that way. Please do let me know.

I'd say that having it as a configuration option (Samba-style snapshot naming vs. uuid based) would be sufficient. The question remains what the default should be.

That being said, what use-case would require snapshots taken more often than every second? I doubt that you can actually do snapshots every second on a busy file system, let alone more often. On a lightly-used one they will be identical and just clutter the name-space.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
Re: [PATCH] Btrfs: clear the extent uptodate bits during parent transid failures
On Thu, Feb 23, 2012 at 10:12:26AM +0800, Liu Bo wrote:

On 02/23/2012 01:43 AM, Chris Mason wrote:

Normally I just toss patches into git, but this one is pretty subtle and I wanted to send it around for extra review.

QA at Oracle did a test where they unplugged one drive of a btrfs raid1 mirror for a while and then plugged it back in. The end result is that we have a whole bunch of out-of-date blocks on the bad mirror. The btrfs parent transid pointers are supposed to detect these bad blocks, and then we're supposed to read from the good copy instead.

The good news is we did detect the bad blocks. The bad news is we didn't jump over to the good mirror. This patch explains why:

Author: Chris Mason chris.ma...@oracle.com
Date: Wed Feb 22 12:36:24 2012 -0500

Btrfs: clear the extent uptodate bits during parent transid failures

If btrfs reads a block and finds a parent transid mismatch, it clears the uptodate flags on the extent buffer and the pages inside it. But we only clear the uptodate bits in the state tree if the block straddles more than one page. This is from an old optimization to reduce contention on the extent state tree. But it is buggy because the code that retries a read from a different copy of the block is going to find the uptodate state bits set and skip the IO.

The end result of the bug is that we'll never actually read the good copy (if there is one).

The fix here is to always clear the uptodate state bits, which is safe because this code is only called when the parent transid fails.

Reviewed-by: Liu Bo liubo2...@cn.fujitsu.com

Thanks!
Or we can be safer:

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index fcf77e1..c1fe25d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3859,8 +3859,12 @@ int clear_extent_buffer_uptodate(struct extent_io_tree *tree,
 	}
 	for (i = 0; i < num_pages; i++) {
 		page = extent_buffer_page(eb, i);
-		if (page)
+		if (page) {
+			u64 start = (u64)page->index << PAGE_CACHE_SHIFT;
+			u64 end = start + PAGE_CACHE_SIZE - 1;
 			ClearPageUptodate(page);
+			clear_extent_uptodate(tree, start, end, NULL, GFP_NOFS);
+		}
 	}
 	return 0;
 }

Hmmm, I'm not sure this is safer. Our readpage trusts the extent uptodate bits unconditionally, so we should really clear them unconditionally as well.

-chris
Re: [PATCH 1/2] treewide: fix memory corruptions when TASK_COMM_LEN != 16
On Thu, 23 Feb 2012 12:19:28 +0100 (CET) Jan Engelhardt jeng...@medozas.de wrote:

On Thursday 2012-02-23 10:57, Andrew Morton wrote:

But there's more,

24931 ?  S  0:00  \_ [btrfs-endio-met]
                  \_ [kconservative/5]
                  \_ [ext4-dio-unwrit]

[with a wondersome patch:]

$ grep Name /proc/{29431,29432}/stat*
/proc/29431/status:Name: btrfs-endio-meta-1
/proc/29432/status:Name: btrfs-endio-meta-write-1
Name: kconservative/512
Name: ext4-dio-unwritten

doh. The fix for that is to have less clueless btrfs developers.

And truncate their names to SUNWbtfs, ORCLintg and EXT4diou? I think not :)

Teach ps(1) to look in /proc/pid/status for kernel threads?
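Andrew's suggestion is easy to prototype from userspace: the Name: field of /proc/<pid>/status is where the (patched, untruncated) name would come from. A minimal sketch, reading our own shell's entry as a stand-in for a kernel thread:

```shell
# Extract the task name from the Name: line of /proc/self/status.
# For a kernel thread you would substitute its pid for "self".
name=$(awk '/^Name:/ {print $2; exit}' /proc/self/status)
echo "$name"
```

A ps(1) patched along these lines would do the same lookup only for processes it already identifies as kernel threads, to avoid a per-process file read in the common case.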
Re: [PATCH 1/2] treewide: fix memory corruptions when TASK_COMM_LEN != 16
On Thursday 2012-02-23 18:30, Andrew Morton wrote:

Teach ps(1) to look in /proc/pid/status for kernel threads?

To what end? The name in /proc/pid/status was also limited to TASK_COMM_LEN.
Strange performance degradation when COW writes happen at fixed offsets
Hi, my kernel version is 32-bit 3.2.0-rc5 and I am using btrfs-tools 0.19.

I was having performance issues with BTRFS with fragmentation and HDDs, so I decided to switch to an SSD to see if these would go away. Performance was much better, but at times I would see a freeze happen which I can't really explain. The CPU would spike up to 100% at times. I decided to try to reproduce this. Though it may or may not be related, while testing BTRFS performance I encountered this interesting problem where performance would depend on whether a file is freshly copied onto a BTRFS filesystem or obtained via COW children. This is all happening on a Crucial M4 SSD, so something in the SSD firmware could be causing the issue, but I feel it's related to BTRFS metadata.

Here is the test:
1. Write a fresh large file to the file system, called A
2. Make a reflink of A (COW copy B)
3. Modify a set of random blocks on B
4. Remove A
5. Repeat 2-5 but use the newly produced B as the new A

Expected results: each step takes an equal amount of time to complete on an SSD, because there is no fragmentation involved and the system is in the same state at #2, since there's always only one file on the filesystem.

I used a 1GB file as my source. I repeated the tests using different algorithms for the write in step #3 above.

Algorithm 1 (random): Write 8 bytes randomly
Algorithm 2 (fixed): Write first 8 bytes and continue at 50k offsets
Algorithm 3 (incremental): Write first 8 bytes at offset = random(50k), then continue at 50k offsets

For each test, there were 40k writes total. The algorithm is in the Java code below.

The following is observed with each iteration ONLY when using algorithm #3:
1. Over time, the time to modify the file increases
2. Over time, the time to make the reflink copy increases
3. Over time, the time to remove the file increases
4. The first few writes take less than normal time to complete.
Data for 1st/5th/10th/15th/20th iteration:

Algorithm 1 and 2:
  Write: always 6s
  Copy: always 0.5s
  Remove: always 0.10s

Algorithm 3:
  Write: 2/6/9/10/11.5
  Copy: 0.5/3/4.5/5.5/6
  Remove: 0.1/1/2/2/2

As you can see, things degrade and taper off after the 10th iteration. This probably has to do with the 4k block size being near 50k/10. I don't think this has to do with SSD garbage collection, because I ran these tests multiple times.

To use this script, cd into an empty directory on a btrfs filesystem and run it with incremental as the argument. You can use the other modes to confirm expected behavior. Script used to reproduce the bug:

#!/bin/bash
mode=$1
if [ -z "$mode" ]; then
    echo "Usage $0 incremental|random|fixed"
    exit -1
fi
mode=$1
src=`pwd`/test/src
dst=`pwd`/test/dst
srcfile=$src/test.tar
dstfile=$dst/test.tar
mkdir -p $src
mkdir -p $dst
filesize=100MB
# build a 1GB file from a smaller download. You can tweak filesize and the
# loop below for lower bandwidth
if [ ! -f $srcfile ]; then
    cd $src
    if [ ! -f $srcfile.dl ]; then
        wget http://download.thinkbroadband.com/${filesize}.zip --output-document=$srcfile.dl
    fi
    rm -rf tarbase
    mkdir tarbase
    for i in {1..10}; do
        cp --reflink=always $srcfile.dl tarbase/$i.dl
    done
    tar -cvf $srcfile tarbase
    rm -rf tarbase
fi
cat << END > $src/FileTest.java
import java.io.IOException;
import java.io.RandomAccessFile;

public class FileTest {
    public static final int BLOCK_SIZE = 5;
    public static final int MAX_ITERATIONS = 4;

    public static void main(String args[]) throws IOException {
        String mode = args[0];
        RandomAccessFile f = new RandomAccessFile(args[1], "rw");
        //int offset = 0;
        int i;
        int offset = new java.util.Random().nextInt(BLOCK_SIZE); // initializer ONLY for incremental mode
        for (i = 0; i < MAX_ITERATIONS; i++) {
            try {
                int writeOffset;
                if (mode.equals("incremental")) {
                    writeOffset = new java.util.Random().nextInt(offset + i * BLOCK_SIZE);
                } else { // mode equals "random"
                    writeOffset = new java.util.Random().nextInt(((int)f.length() - 100));
                    offset = writeOffset; // for reporting it at the end
                }
                f.seek(writeOffset);
                f.writeBytes("DEADBEEF");
            } catch (java.io.IOException e) {
                System.out.println("EOF");
                break;
            }
        }
        System.out.print("Last offset=" + offset);
        System.out.println(". Made " + i + " random writes.");
        f.close();
    }
}
END
cd $src
javac FileTest.java
/usr/bin/time --format 'rm: %E' rm -rf $dst/*
cp --reflink=always $srcfile.dl $dst/1.tst
cd $dst
for i in {1..20}; do
    echo -n $i.
    i_plus=`expr $i + 1`
    /usr/bin/time --format 'write: %E' java -cp $src FileTest $mode $i.tst
    /usr/bin/time --format 'cp:%E' cp --reflink=always $i.tst $i_plus.tst
    /usr/bin/time --format
Re: Set nodatacow per file?
On 02/13/2012 04:17 PM, Ralf-Peter Rohbeck wrote:

Hello, is it possible to set nodatacow on a per-file basis? I couldn't find anything. If not, wouldn't that be a great feature to get around the performance issues with VM and database storage? Of course cloning should still cause COW.

Hello,

Going back to the original question from Ralf, I wanted to share my experience. Yesterday I set up KVM+qemu and set -z -C with David's 'fileflags' utility on the VM image file. I was very pleased with the results: a Red Hat 6 Minimal installation completed in 10 minutes, whereas it was taking 'forever' the last time I tried it some 4 months ago. Writes during installation were very moderate. Performance of the VM is excellent. Installing some big packages with yum inside the VM goes very quickly, with speed indistinguishable from that of bare-metal installs.

I am not quite sure whether this improvement should be attributed to the nocow and nocompress flags or to the overall improvement of btrfs (I am on a 3.3-rc4 kernel), but KVM is definitely more than usable on btrfs now. I am yet to test the install speed and performance without those flags set.

best
~dima
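For anyone without David's 'fileflags' tool: on reasonably current kernels the per-file NOCOW bit can also be set with stock chattr. This is a sketch under that assumption; note that the flag only takes effect reliably on files that have no data extents yet (or on a directory, so new files inherit it).

```shell
# Create a VM image with per-file nodatacow on btrfs. chattr +C must be
# applied while the file is still empty; truncate then sizes the image
# without writing any data, so the flag sticks.
make_nocow_image() {
    local img="$1" size="$2"   # e.g. /var/lib/libvirt/images/vm.raw 20G
    touch "$img"
    chattr +C "$img"
    truncate -s "$size" "$img"
}
```

Setting the attribute on the images directory instead makes every subsequently created image NOCOW without per-file bookkeeping.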
Re: git resources
(When we have this I shall update the btrfs wiki)

As promised, an article was posted here some time back:

Writing patch for btrfs
http://btrfs.ipv5.de/index.php?title=Writing_patch_for_btrfs

It sat in my mail drafts for a long time while kernel.org was down; sorry for the delay.

-Anand
Re: [RFC] btrfs auto snapshot
Thanks for the inputs. There is no clear winner as of now. Let me keep the uuid for now; if more sysadmins feel a timestamp is better, we could devise it that way.

-Anand
Re: [RFC] btrfs auto snapshot
I'd like to vote for timestamp/timestamp-uuid as a sysadmin. The timestamp allows for easy conversion from clients' wants to actual commands: "I need my data from two days ago" is easy when I have timestamps to use.

On 2/23/2012 10:05 PM, Anand Jain wrote:

Thanks for the inputs. There is no clear winner as of now. Let me keep the uuid for now; if more sysadmins feel a timestamp is better, we could devise it that way.

-Anand
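With timestamp names, the "data from two days ago" request really does reduce to string matching. A sketch with GNU date, assuming the Samba-style @GMT naming scheme proposed earlier in the thread (the snapshot directory path is a placeholder):

```shell
# Compute the name prefix of snapshots taken two days ago; globbing the
# snapshot directory against it then lists the candidate snapshots.
wanted="@GMT-$(date -u -d '2 days ago' '+%Y.%m.%d')"
echo "$wanted"
# e.g.: ls -d /btrfs/.autosnap/${wanted}-*
```

With uuid names, the same request needs a metadata lookup per snapshot (as in the `btrfs su list -t` example above) rather than a one-line glob.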
Re: Strange performance degradation when COW writes happen at fixed offsets
Nik Markovic posted on Thu, 23 Feb 2012 20:31:02 -0600 as excerpted:

I noticed a few errors in the script that I used. I corrected it and it seems that degradation is occurring even with fully random writes:

I don't have an SSD, but is it possible that you're simply seeing erase-block-related degradation due to multi-write-block-sized erase-blocks?

It seems to me that when originally written to the btrfs-on-SSD, the file will likely be written block-sequentially enough that the file as a whole takes up relatively few erase-blocks. As you COW-write individual blocks, they'll be written elsewhere, perhaps all the changed blocks to a new erase-block, perhaps each to a different erase-block. As you increase the successive COW generation count, the file's filesystem/write blocks will be spread through more and more erase-blocks (basically fragmentation, but of the SSD-critical type), thus affecting modification and removal time but not read time.

IIRC I saw a note about this on the wiki, in regard to the nodatacow mount-option. Let's see if I can find it again. Hmm... yes:

http://btrfs.ipv5.de/index.php?title=Getting_started#Mount_Options

In particular this (for nodatacow; read the rest, as there are additional implications):

Performance gain is usually 5% unless the workload is random writes to large database files, where the difference can become very large.

In addition to nodatacow, see the note on the autodefrag option. IOW, with the repeated generations of random writes to COW copies, you're apparently triggering a COW worst-case fragmentation situation.
It shouldn't affect read time much on an SSD, but it certainly will affect copy and erase time, as the data and metadata (which as you'll recall is 2X by default on btrfs) get written to more and more blocks that need updating at copy/erase time. That /might/ be the problem triggering the freezes you noted that set off the original investigation as well, if the SSD firmware is running out of erase-blocks and having to pause access while it rearranges data to allow operations to continue. Since your original issue on rotating-rust drives was fragmentation, rewriting would seem to be something you do quite a lot of, triggering different but similar-cause issues on SSDs as well.

FWIW, with that sort of database-style workload, large files constantly random-change rewritten, something like XFS might be more appropriate than btrfs. See the recent XFS presentations (were they at ScaleX or LinuxConf.au? both happened about the same time and were covered in the same LWN weekly edition) as covered a couple of weeks ago on LWN for more.

-- Duncan - List replies preferred. No HTML msgs.
Every nonfree program has a lord, a master -- and if you use the program, he is your master.
Richard Stallman
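The wiki options Duncan quotes are mount-time knobs, so they can be tried on an existing filesystem without reformatting. A sketch under that assumption (the mount point is a placeholder; note that nodatacow at mount scope affects the whole filesystem, unlike a per-file flag):

```shell
# Turn on autodefrag, which detects small random writes and queues the
# affected files for background defragmentation. Requires root.
remount_autodefrag() {
    local mnt="$1"
    mount -o remount,autodefrag "$mnt"
}
# Equivalent persistent form in /etc/fstab (device/mount point are examples):
# /dev/sdb1  /mnt  btrfs  defaults,autodefrag  0  0
```

For the COW-aging test in this thread, rerunning the script on a filesystem remounted with autodefrag would show directly whether background defragmentation flattens the degradation curve.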