Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-02-05 Thread Tomasz Kusmierz

On 16/01/13 09:21, Bernd Schubert wrote:

On 01/16/2013 12:32 AM, Tom Kusmierz wrote:


p.s. Bizarre that when I fill an ext4 partition with test data everything
checks out OK (CRC over all files), but with Chris' tool it gets
corrupted - for both the crappy Adaptec PCIe controller and the motherboard's
built-in one. Also, since the course of history has proven that my testing
facilities are crap - any suggestions on how I can test RAM, CPU &
controller would be appreciated.


Similar issues were the reason we wrote ql-fstest at q-leap. Maybe 
you could try that? You can easily see the pattern of the corruption 
with that. But maybe Chris' stress.sh also provides it.
Anyway, yesterday I added support for specifying min and max file sizes, as 
before it only used 1MiB to 1GiB sizes... It's a bit cryptic with 
bits, though; I will improve that later.

https://bitbucket.org/aakef/ql-fstest/downloads


Cheers,
Bernd


PS: But see my other thread, using ql-fstest I yesterday entirely 
broke a btrfs test file system resulting in kernel panics.


Hi,

It's been a while, but I think I should provide a definitive answer, or 
simply say what the cause of the whole problem was:


It was a printer!

Long story short, I was going nuts trying to diagnose which bit of my 
server was going bad, and eventually I was down to blaming the interface 
card that connects the hot-swappable disks to the mobo / PCIe controllers. When 
I got back from my holiday I sat in front of the server and decided to 
go with ql-fstest, which very nicely reports errors with a very 
low lag (~2 minutes) after they occur. At that point my printer 
kicked in with a self-clean and an error showed up after ~2 minutes 
- so I restarted the printer, and while it was going through its own POST 
with self-clean another error showed up. The issue turned out to be 
that I was using one of those fantastic PCI 4-port Ethernet cards and the 
printer was connected directly to it - after moving it and everything else to 
a switch, all the problems and issues went away. At the moment I've been running the 
server for 2 weeks without any corruptions, random kernel btrfs 
crashes, etc.



Anyway, I wanted to thank Chris and the rest of the btrfs dev people again for 
this fantastic filesystem, which let me discover what a stupid setup I was 
running and how deep into shit I had put myself.


CHEERS LADS !



Tom.


Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-02-05 Thread Chris Mason
On Tue, Feb 05, 2013 at 03:16:34AM -0700, Tomasz Kusmierz wrote:
 On 16/01/13 09:21, Bernd Schubert wrote:
  On 01/16/2013 12:32 AM, Tom Kusmierz wrote:
 
  p.s. Bizarre that when I fill an ext4 partition with test data everything
  checks out OK (CRC over all files), but with Chris' tool it gets
  corrupted - for both the crappy Adaptec PCIe controller and the motherboard's
  built-in one. Also, since the course of history has proven that my testing
  facilities are crap - any suggestions on how I can test RAM, CPU &
  controller would be appreciated.
 
  Similar issues were the reason we wrote ql-fstest at q-leap. Maybe 
  you could try that? You can easily see the pattern of the corruption 
  with that. But maybe Chris' stress.sh also provides it.
  Anyway, yesterday I added support for specifying min and max file sizes, as 
  before it only used 1MiB to 1GiB sizes... It's a bit cryptic with 
  bits, though; I will improve that later.
  https://bitbucket.org/aakef/ql-fstest/downloads
 
 
  Cheers,
  Bernd
 
 
  PS: But see my other thread, using ql-fstest I yesterday entirely 
  broke a btrfs test file system resulting in kernel panics.
 
 Hi,
 
 It's been a while, but I think I should provide a definitive answer, or 
 simply say what the cause of the whole problem was:
 
 It was a printer!
 
 Long story short, I was going nuts trying to diagnose which bit of my 
 server was going bad, and eventually I was down to blaming the interface 
 card that connects the hot-swappable disks to the mobo / PCIe controllers. When 
 I got back from my holiday I sat in front of the server and decided to 
 go with ql-fstest, which very nicely reports errors with a very 
 low lag (~2 minutes) after they occur. At that point my printer 
 kicked in with a self-clean and an error showed up after ~2 minutes 
 - so I restarted the printer, and while it was going through its own POST 
 with self-clean another error showed up. The issue turned out to be 
 that I was using one of those fantastic PCI 4-port Ethernet cards and the 
 printer was connected directly to it - after moving it and everything else to 
 a switch, all the problems and issues went away. At the moment I've been running the 
 server for 2 weeks without any corruptions, random kernel btrfs 
 crashes, etc.

Wow, I've never heard that one before.  You might want to try a
different 4 port card and/or report it to the driver maintainer.  That
shouldn't happen ;)

ql-fstest looks neat, I'll check it out (thanks Bernd).
 
-chris



Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-02-05 Thread Roman Mamedov
On Tue, 05 Feb 2013 10:16:34 +
Tomasz Kusmierz tom.kusmi...@gmail.com wrote:

 that I was using one of those fantastic PCI 4-port Ethernet cards and the 
 printer was connected directly to it - after moving it and everything else to 
 a switch, all the problems and issues went away. At the moment I've been running the 
 server for 2 weeks without any corruptions, random kernel btrfs 
 crashes, etc.

If moving the printer over to a switch helped, perhaps it is indeed an
electrical interference problem, but if your card is an old one from Sun, keep
in mind that they also have some problems with DMA on machines with large
amounts of RAM:

  sunhme experiences corrupt packets if machine has more than 2GB of memory
  https://bugzilla.kernel.org/show_bug.cgi?id=10790

Not hard to envision a horror story scenario where a rogue network card would
shred your filesystem buffer cache with network packets DMAed all over it,
like bullets from a machine gun :) But in reality afaik IOMMU is supposed to
protect against this.

-- 
With respect,
Roman




Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-02-05 Thread Tomasz Kusmierz

On 05/02/13 12:49, Chris Mason wrote:

On Tue, Feb 05, 2013 at 03:16:34AM -0700, Tomasz Kusmierz wrote:

On 16/01/13 09:21, Bernd Schubert wrote:

On 01/16/2013 12:32 AM, Tom Kusmierz wrote:


p.s. Bizarre that when I fill an ext4 partition with test data everything
checks out OK (CRC over all files), but with Chris' tool it gets
corrupted - for both the crappy Adaptec PCIe controller and the motherboard's
built-in one. Also, since the course of history has proven that my testing
facilities are crap - any suggestions on how I can test RAM, CPU &
controller would be appreciated.

Similar issues were the reason we wrote ql-fstest at q-leap. Maybe
you could try that? You can easily see the pattern of the corruption
with that. But maybe Chris' stress.sh also provides it.
Anyway, yesterday I added support for specifying min and max file sizes, as
before it only used 1MiB to 1GiB sizes... It's a bit cryptic with
bits, though; I will improve that later.
https://bitbucket.org/aakef/ql-fstest/downloads


Cheers,
Bernd


PS: But see my other thread, using ql-fstest I yesterday entirely
broke a btrfs test file system resulting in kernel panics.

Hi,

It's been a while, but I think I should provide a definitive answer, or
simply say what the cause of the whole problem was:

It was a printer!

Long story short, I was going nuts trying to diagnose which bit of my
server was going bad, and eventually I was down to blaming the interface
card that connects the hot-swappable disks to the mobo / PCIe controllers. When
I got back from my holiday I sat in front of the server and decided to
go with ql-fstest, which very nicely reports errors with a very
low lag (~2 minutes) after they occur. At that point my printer
kicked in with a self-clean and an error showed up after ~2 minutes
- so I restarted the printer, and while it was going through its own POST
with self-clean another error showed up. The issue turned out to be
that I was using one of those fantastic PCI 4-port Ethernet cards and the
printer was connected directly to it - after moving it and everything else to
a switch, all the problems and issues went away. At the moment I've been running the
server for 2 weeks without any corruptions, random kernel btrfs
crashes, etc.

Wow, I've never heard that one before.  You might want to try a
different 4 port card and/or report it to the driver maintainer.  That
shouldn't happen ;)

ql-fstest looks neat, I'll check it out (thanks Bernd).
  
-chris


I forgot to mention that the server sits on a UPS while the printer is 
connected directly to the mains - thinking about it, that creates a ground-shift 
effect, since nothing on a cheap PSU has a real ground. But anyway, this is 
not the fault of the 4-port card: I tried moving the printer to a cheap NE2000 
card and to the motherboard-integrated one, and the effect was the same. 
Diagnostics were also very problematic because, besides the corruption 
on the HDDs, memtest was returning corruptions in RAM (on very rare 
occasions), and a CPU test was returning a corruption roughly once a day. 
I've replaced nearly everything in this server - including the PSU (with a 1400W 
one from my dev rig) - to NO difference. I should also mention that 
this printer is a colour laser printer with 4 drums to clean, so I 
would assume that it produces enough static electricity to power a small 
cattle.


p.s. It shouldn't be a driver issue, since the errors in RAM were 1-4 bits 
wide, located in the same 32-bit word - hence I think a single transfer had to 
be corrupted, rather than a whole Ethernet packet shoved into random memory.



Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-02-05 Thread Tomasz Kusmierz

On 05/02/13 13:46, Roman Mamedov wrote:

On Tue, 05 Feb 2013 10:16:34 +
Tomasz Kusmierz tom.kusmi...@gmail.com wrote:


that I was using one of those fantastic PCI 4-port Ethernet cards and the
printer was connected directly to it - after moving it and everything else to
a switch, all the problems and issues went away. At the moment I've been running the
server for 2 weeks without any corruptions, random kernel btrfs
crashes, etc.

If moving the printer over to a switch helped, perhaps it is indeed an
electrical interference problem, but if your card is an old one from Sun, keep
in mind that they also have some problems with DMA on machines with large
amounts of RAM:

   sunhme experiences corrupt packets if machine has more than 2GB of memory
   https://bugzilla.kernel.org/show_bug.cgi?id=10790

Not hard to envision a horror story scenario where a rogue network card would
shred your filesystem buffer cache with network packets DMAed all over it,
like bullets from a machine gun :) But in reality afaik IOMMU is supposed to
protect against this.

As I said in my reply to Chris, it was definitely an electrical issue. Back 
in the days when Cat5 Ethernet was a novelty I learned a simple lesson 
the hard way - don't skimp, always separate with a switch. I learned it on 
networks where the parties were not necessarily powered from the same circuit, or 
even the same supply phase. Since this setup is limited to my home I violated 
my own old rule - and it backfired on me.


Anyway, thanks for the info on sunhme - WOW!


Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-15 Thread Lars Weber

Hi,

I had a similar scenario to Tomasz's:
- Started with a single 3TB disk.
- Filled the 3TB disk with a lot of files (more than 30 of them 10-30GB)
- Added 2x 1.5TB disks
- # btrfs balance start -dconvert=raid1 -mconvert=raid1 $MOUNT
- # btrfs scrub start $MOUNT
- # btrfs scrub status $MOUNT

scrub status for $ID
scrub started at Tue Jan 15 07:10:15 2013 and finished after 24020 seconds

total bytes scrubbed: 4.30TB with 0 errors

So at least it is not a general bug in btrfs - maybe this helps you...

# uname -a
Linux n40l 3.7.2 #1 SMP Sun Jan 13 11:46:56 CET 2013 x86_64 GNU/Linux
# btrfs version
Btrfs v0.20-rc1-37-g91d9ee

Regards
Lars

On 14.01.2013 17:34, Chris Mason wrote:

On Mon, Jan 14, 2013 at 09:32:25AM -0700, Tomasz Kusmierz wrote:

On 14/01/13 15:57, Chris Mason wrote:

On Mon, Jan 14, 2013 at 08:22:36AM -0700, Tomasz Kusmierz wrote:

On 14/01/13 14:59, Chris Mason wrote:

On Mon, Jan 14, 2013 at 04:09:47AM -0700, Tomasz Kusmierz wrote:

Hi,

Since I had some free time over Christmas, I decided to conduct a few
tests on btrfs to see how it will cope with real-life storage for
normal gray users, and I've found that the filesystem will always mess up
your files that are larger than 10GB.

Hi Tom,

I'd like to nail down the test case a little better.

1) Create on one drive, fill with data
2) Add a second drive, convert to raid1
3) find corruptions?

What happens if you start with two drives in raid1?  In other words, I'm
trying to see if this is a problem with the conversion code.

-chris

Ok, my description might be a bit enigmatic, so to cut a long story short
the tests are:
1) create a single-drive default btrfs volume on a single partition -
fill with test data - scrub - admire errors.
2) create a raid1 (-d raid1 -m raid1) volume with two partitions on
separate disks, each the same size etc. - fill with test data - scrub -
admire errors.
3) create a raid10 (-d raid10 -m raid1) volume with four partitions on
separate disks, each the same size etc. - fill with test data - scrub -
admire errors.

all disks are the same age + size + model ... two different batches to avoid
simultaneous failures.

Ok, so we have two possible causes.  #1 btrfs is writing garbage to your
disks.  #2 something in your kernel is corrupting your data.

Since you're able to see this 100% of the time, let's assume that if #2
were true, we'd be able to trigger it on other filesystems.

So, I've attached an old friend, stress.sh.  Use it like this:

stress.sh -n 5 -c <your source directory> -s <your btrfs mount point>

It will run in a loop with 5 parallel processes and make 5 copies of
your data set into the destination.  It will run forever until there are
errors.  You can use a higher process count (-n) to force more
concurrency and use more ram.  It may help to pin down all but 2 or 3 GB
of your memory.

What I'd like you to do is find a data set and command line that makes
the script find errors on btrfs.  Then, try the same thing on xfs or
ext4 and let it run at least twice as long.  Then report back ;)

-chris


Chris,

Will do, just please remember that 2TB of test data on consumer-grade
SATA drives will take a while to test :)

Many thanks.  You might want to start with a smaller data set, 20GB or
so total.

-chris



--
ADC-Ingenieurbüro Wiedemann | In der Borngasse 12 | 57520 Friedewald | Tel: 
02743-930233 | Fax: 02743-930235 | www.adc-wiedemann.de
GF: Dipl.-Ing. Hendrik Wiedemann | Umsatzsteuer-ID: DE 147979431



Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-15 Thread Tom Kusmierz

On 14/01/13 16:34, Chris Mason wrote:

On Mon, Jan 14, 2013 at 09:32:25AM -0700, Tomasz Kusmierz wrote:

On 14/01/13 15:57, Chris Mason wrote:

On Mon, Jan 14, 2013 at 08:22:36AM -0700, Tomasz Kusmierz wrote:

On 14/01/13 14:59, Chris Mason wrote:

On Mon, Jan 14, 2013 at 04:09:47AM -0700, Tomasz Kusmierz wrote:

Hi,

Since I had some free time over Christmas, I decided to conduct a few
tests on btrfs to see how it will cope with real-life storage for
normal gray users, and I've found that the filesystem will always mess up
your files that are larger than 10GB.

Hi Tom,

I'd like to nail down the test case a little better.

1) Create on one drive, fill with data
2) Add a second drive, convert to raid1
3) find corruptions?

What happens if you start with two drives in raid1?  In other words, I'm
trying to see if this is a problem with the conversion code.

-chris

Ok, my description might be a bit enigmatic, so to cut a long story short
the tests are:
1) create a single-drive default btrfs volume on a single partition -
fill with test data - scrub - admire errors.
2) create a raid1 (-d raid1 -m raid1) volume with two partitions on
separate disks, each the same size etc. - fill with test data - scrub -
admire errors.
3) create a raid10 (-d raid10 -m raid1) volume with four partitions on
separate disks, each the same size etc. - fill with test data - scrub -
admire errors.

all disks are the same age + size + model ... two different batches to avoid
simultaneous failures.

Ok, so we have two possible causes.  #1 btrfs is writing garbage to your
disks.  #2 something in your kernel is corrupting your data.

Since you're able to see this 100% of the time, let's assume that if #2
were true, we'd be able to trigger it on other filesystems.

So, I've attached an old friend, stress.sh.  Use it like this:

stress.sh -n 5 -c <your source directory> -s <your btrfs mount point>

It will run in a loop with 5 parallel processes and make 5 copies of
your data set into the destination.  It will run forever until there are
errors.  You can use a higher process count (-n) to force more
concurrency and use more ram.  It may help to pin down all but 2 or 3 GB
of your memory.

What I'd like you to do is find a data set and command line that makes
the script find errors on btrfs.  Then, try the same thing on xfs or
ext4 and let it run at least twice as long.  Then report back ;)

-chris


Chris,

Will do, just please remember that 2TB of test data on consumer-grade
SATA drives will take a while to test :)

Many thanks.  You might want to start with a smaller data set, 20GB or
so total.

-chris


Chris & all,

Sorry for not replying for so long, but Chris' old friend stress.sh 
has proven that all my storage is affected by this bug, and the first 
thing to do was to bring everything down before the corruption could spread any 
further. Anyway, for the subject's sake: the btrfs stress failed after 2h, the ext4 
stress failed after 8h (according to time ./stress.sh blablabla) 
- so it might be related to the fact that ext4 always seemed slower on my machine 
than btrfs.



Anyway, I wanted to use this opportunity to thank Chris and everybody 
involved in btrfs development - your file system found a hidden bug in my 
setup that would have stayed there until it had pretty much corrupted 
everything. I don't even want to think how much my main storage got 
corrupted over time (ext4 over LVM over md RAID 5).


p.s. Bizarre that when I fill an ext4 partition with test data everything 
checks out OK (CRC over all files), but with Chris' tool it gets 
corrupted - for both the crappy Adaptec PCIe controller and the motherboard's 
built-in one. Also, since the course of history has proven that my testing 
facilities are crap - any suggestions on how I can test RAM, CPU & 
controller would be appreciated.





Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-15 Thread Chris Mason
On Tue, Jan 15, 2013 at 04:32:10PM -0700, Tom Kusmierz wrote:
 Chris & all,
 
 Sorry for not replying for so long, but Chris' old friend stress.sh 
 has proven that all my storage is affected by this bug, and the first 
 thing to do was to bring everything down before the corruption could spread any 
 further. Anyway, for the subject's sake: the btrfs stress failed after 2h, the ext4 
 stress failed after 8h (according to time ./stress.sh blablabla) 
 - so it might be related to the fact that ext4 always seemed slower on my machine 
 than btrfs.

Ok, great.  These problems are really hard to debug, and I'm glad we've
nailed it down to the lower layers.

 
 
 Anyway, I wanted to use this opportunity to thank Chris and everybody 
 involved in btrfs development - your file system found a hidden bug in my 
 setup that would have stayed there until it had pretty much corrupted 
 everything. I don't even want to think how much my main storage got 
 corrupted over time (ext4 over LVM over md RAID 5).
 
 p.s. Bizarre that when I fill an ext4 partition with test data everything 
 checks out OK (CRC over all files), but with Chris' tool it gets 
 corrupted - for both the crappy Adaptec PCIe controller and the motherboard's 
 built-in one.

One really hard part of tracking down corruptions is that our boxes have
so much ram right now that they are often hidden by the page cache.  My
first advice is to boot with much less ram (1G/2G) or pin down all your
ram for testing.  A problem that triggers in 10 minutes is a billion
times easier to figure out than one that triggers in 8 hours.
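(One way to do the former is the mem= kernel boot parameter - a minimal sketch, assuming a GRUB2/Ubuntu-style setup; the 2G limit is only illustrative:)

# /etc/default/grub - cap usable RAM at 2 GiB so a huge page cache
# can't mask corruptions for long
GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=2G"

# regenerate the GRUB config and reboot (Debian/Ubuntu style)
sudo update-grub && sudo reboot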

 Also, since the course of history has proven that my testing 
 facilities are crap - any suggestions on how I can test RAM, CPU & 
 controller would be appreciated.

Step one is to figure out if you've got a CPU/memory problem or an IO problem.
memtest is often able to find CPU and memory problems, but if you pass
memtest I like to use gcc for extra hard testing.

If you have the ram, make a copy of the linux kernel tree in /dev/shm or
any ramdisk/tmpfs mount.  Then run make -j ; make clean in a loop until
your box either crashes, gcc reports an internal compiler error, or 16
hours go by.  Your loop will need to check for failed makes and stop
once you get the first failure.
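(A minimal sketch of such a loop, assuming a configured kernel tree has already been copied to /dev/shm/linux - the path, job count and log file name are illustrative:)

#!/bin/bash
# Rebuild a kernel tree held in RAM over and over; stop on the first failed
# build (e.g. a gcc internal compiler error) or after 16 hours.
cd /dev/shm/linux || exit 1            # assumed location of the kernel tree
end=$(( $(date +%s) + 16*3600 ))       # 16-hour budget

while [ "$(date +%s)" -lt "$end" ]; do
    if ! make -j"$(nproc)" > build.log 2>&1; then
        echo "Build failed - check build.log for an internal compiler error"
        exit 1
    fi
    make clean > /dev/null
done
echo "16 hours of clean builds - CPU/RAM look OK, time to look at the IO stack"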

Hopefully that will catch it.  Otherwise we need to look at the IO
stack.

-chris


btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Tomasz Kusmierz

Hi,

Since I had some free time over Christmas, I decided to conduct a few 
tests on btrfs to see how it will cope with real-life storage for 
normal gray users, and I've found that the filesystem will always mess up 
your files that are larger than 10GB.


Long story:
I used my set of data that I've got nicely backed up on a personal 
RAID 5 to populate the btrfs volumes: music, SLR pics and video (and just a 
few documents). The disks used in the test are all 2TB WD Green disks.


1. First I started by creating btrfs (4k blocks) on one disk, filling 
it up and then adding a second disk - convert to raid1 through balance - 
convert to raid10 through balance (see the command sketch after this list). 
Unfortunately converting to raid1 failed - because of CRC errors in 49 files 
that were bigger than 10GB. At this point I was a bit spooked that my 
controllers were failing or that the drives had some bad sectors. Tested 
everything (took a few days) and it turns out that there is no apparent 
issue with the hardware (no bad sectors or I/O errors down to the disks).
2. At this point I thought "cool, this will be a perfect test case for 
scrub to show its magical power!". Created raid1 over two volumes - 
tried scrubbing - FAIL ... It turns out that magically I've got corrupted 
CRCs in the exact same two logical locations on two different disks (~34 
files > 10GB affected), hence scrub can't do anything with it. It only 
reports them as uncorrectable errors.
3. Performed the same test on a raid10 setup (still 4k blocks). Same results 
(just a different file count).
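(The single-disk -> raid1 -> raid10 conversion in step 1 corresponds roughly to the commands below; device names and the mount point are illustrative, not taken from the report:)

# start with a single-device filesystem and fill it with test data
mkfs.btrfs /dev/sdb1
mount /dev/sdb1 /mnt/test

# add a second device, then convert data and metadata to raid1
btrfs device add /dev/sdc1 /mnt/test
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/test

# add two more devices, then convert data to raid10
btrfs device add /dev/sdd1 /dev/sde1 /mnt/test
btrfs balance start -dconvert=raid10 /mnt/test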


Ok, time to dig more into this, because it starts to get intriguing. I'm 
running Ubuntu Server 12.10 (64-bit) with the stock kernel, so my next step 
was to get a 3.7.1 kernel + new btrfs tools straight from the git repo.
Unfortunately 1 & 2 & 3 still produce the same results, corrupt CRCs only 
in files > 10GB.
At this point I thought "fine, maybe if I enlarge the allocation block 
it will mean fewer blocks are needed for a big file to fit in, resulting in 
properly storing those" - time for 16K leaves :) (-n 16K -l 16K); 
sectors are still 4K for known reasons :P. Well, it does exactly the 
same thing - 1 & 2 & 3, same results, big files get automagically corrupted.



Something about the test data:
music - files not more than 200MB (typical mix of mp3 & aac), 10K files 
give or take.
pics - not more than 20MB (typical point & shoot + DSLR), 6K files give or 
take.
video1 - collection of little ones sized more than 300MB, less than 
1.5GB, ~400 files

video2 - collection of 5GB - 18GB files, ~400 files

I guess that stating that only files > 10GB are affected is a long 
shot, but so far I've not seen a file smaller than 10GB affected (I was not 
really thorough about checking sizes, but all affected files I've 
checked were more than 10GB)


ps. As a footnote I'll add that I've tried shuffling tests 1, 2 & 3 
without video2 and it all worked just fine.


If you've got any ideas for a workaround (other than ZFS :D) I'm happy 
to try them out.


Tom.


btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Tomasz Kusmierz

Hi,

Since I had some free time over Christmas, I decided to conduct a few 
tests on btrfs to see how it will cope with real-life storage for 
normal gray users, and I've found that the filesystem will always mess up 
your files that are larger than 10GB.


Long story:
I used my set of data that I've got nicely backed up on a personal 
RAID 5 to populate the btrfs volumes: music, SLR pics and video (and just a 
few documents). The disks used in the test are all 2TB WD Green disks.


1. First I started by creating btrfs (4k blocks) on one disk, filling 
it up and then adding a second disk - convert to raid1 through balance - 
convert to raid10 through balance. Unfortunately converting to raid1 
failed - because of CRC errors in 49 files that were bigger than 10GB. At 
this point I was a bit spooked that my controllers were failing or 
that the drives had some bad sectors. Tested everything (took a few days) and 
it turns out that there is no apparent issue with the hardware (no bad 
sectors or I/O errors down to the disks).
2. At this point I thought "cool, this will be a perfect test case for 
scrub to show its magical power!". Created raid1 over two volumes - 
tried scrubbing - FAIL ... It turns out that magically I've got corrupted 
CRCs in the exact same two logical locations (~34 files > 10GB affected).
3. Performed the same test on a raid10 setup (still 4k blocks). Same results 
(just a different file count).


Ok, time to dig more into this, because it starts to get intriguing. I'm 
running Ubuntu Server 12.10 with the stock kernel, so my next step was to 
get a 3.7.1 kernel + new btrfs tools straight from the git repo.
Unfortunately 1 & 2 & 3 still produce the same results, corrupt CRCs only 
in files > 10GB.
At this point I thought "fine, maybe if I enlarge the allocation block 
it will mean fewer blocks are needed for a big file to fit in, resulting in 
properly storing those" - time for 16K leaves :) (-n 16K -l 16K); 
sectors are still 4K for known reasons :P. Well, it does exactly the 
same thing - 1 & 2 & 3, same results, big files get automagically corrupted.



Something about the test data:
music - files not more than 200MB (typical mix of mp3 & aac), 10K files 
give or take.
pics - not more than 20MB (typical point & shoot + DSLR), 6K files give or 
take.
video1 - collection of little ones sized more than 300MB, less than 
1.5GB, ~400 files

video2 - collection of 5GB - 18GB files, ~400 files

I guess that stating that only files > 10GB are affected is a long 
shot, but so far I've not seen a file smaller than 10GB affected (I was not 
really thorough about checking sizes, but all affected files I've 
checked were more than 10GB)


ps. As a footnote I'll add that I've tried shuffling tests 1, 2 & 3 
without video2 and it all worked just fine.


If you've got any ideas for a workaround (other than ZFS :D) I'm happy 
to try them out.


Tom.


Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Roman Mamedov
Hello,

On Mon, 14 Jan 2013 11:17:17 +
Tomasz Kusmierz tom.kusmi...@gmail.com wrote:

 this point I was a bit spooked that my controllers were failing or 

Which controller manufacturer/model?

-- 
With respect,
Roman

~~~
Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free.




Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Tomasz Kusmierz

On 14/01/13 11:25, Roman Mamedov wrote:

Hello,

On Mon, 14 Jan 2013 11:17:17 +
Tomasz Kusmierz tom.kusmi...@gmail.com wrote:


this point I was a bit spooked that my controllers were failing or

Which controller manufacturer/model?

Well, this is a home server (which I prefer to tinker on). Two 
controllers were used: the motherboard's built-in one, and a crappy Adaptec PCIe one.


00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI 
SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]

02:00.0 RAID bus controller: Adaptec Serial ATA II RAID 1430SA (rev 02)


ps. MoBo is: ASUS M4A79T Deluxe


Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Chris Mason
On Mon, Jan 14, 2013 at 04:09:47AM -0700, Tomasz Kusmierz wrote:
 Hi,
 
 Since I had some free time over Christmas, I decided to conduct a few 
 tests on btrfs to see how it will cope with real-life storage for 
 normal gray users, and I've found that the filesystem will always mess up 
 your files that are larger than 10GB.

Hi Tom,

I'd like to nail down the test case a little better.

1) Create on one drive, fill with data
2) Add a second drive, convert to raid1
3) find corruptions?

What happens if you start with two drives in raid1?  In other words, I'm
trying to see if this is a problem with the conversion code.

-chris


Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Tomasz Kusmierz

On 14/01/13 14:59, Chris Mason wrote:

On Mon, Jan 14, 2013 at 04:09:47AM -0700, Tomasz Kusmierz wrote:

Hi,

Since I had some free time over Christmas, I decided to conduct a few
tests on btrfs to see how it will cope with real-life storage for
normal gray users, and I've found that the filesystem will always mess up
your files that are larger than 10GB.

Hi Tom,

I'd like to nail down the test case a little better.

1) Create on one drive, fill with data
2) Add a second drive, convert to raid1
3) find corruptions?

What happens if you start with two drives in raid1?  In other words, I'm
trying to see if this is a problem with the conversion code.

-chris
Ok, my description might be a bit enigmatic, so to cut a long story short 
the tests are:
1) create a single-drive default btrfs volume on a single partition - 
fill with test data - scrub - admire errors.
2) create a raid1 (-d raid1 -m raid1) volume with two partitions on 
separate disks, each the same size etc. - fill with test data - scrub - 
admire errors.
3) create a raid10 (-d raid10 -m raid1) volume with four partitions on 
separate disks, each the same size etc. - fill with test data - scrub - 
admire errors.


all disks are the same age + size + model ... two different batches to avoid 
simultaneous failures.
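(For reference, the three setups correspond roughly to the commands below; device names and the mount point are illustrative, not taken from the thread:)

# 1) single-drive btrfs with default options
mkfs.btrfs /dev/sdb1

# 2) two partitions, data and metadata both raid1
mkfs.btrfs -d raid1 -m raid1 /dev/sdb1 /dev/sdc1

# 3) four partitions, data raid10, metadata raid1
mkfs.btrfs -d raid10 -m raid1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

# in each case: mount, fill with test data, then scrub
mount /dev/sdb1 /mnt/test
btrfs scrub start /mnt/test
btrfs scrub status /mnt/test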



Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Chris Mason
On Mon, Jan 14, 2013 at 08:22:36AM -0700, Tomasz Kusmierz wrote:
 On 14/01/13 14:59, Chris Mason wrote:
  On Mon, Jan 14, 2013 at 04:09:47AM -0700, Tomasz Kusmierz wrote:
  Hi,
 
  Since I had some free time over Christmas, I decided to conduct a few
  tests on btrfs to see how it will cope with real-life storage for
  normal gray users, and I've found that the filesystem will always mess up
  your files that are larger than 10GB.
  Hi Tom,
 
  I'd like to nail down the test case a little better.
 
  1) Create on one drive, fill with data
  2) Add a second drive, convert to raid1
  3) find corruptions?
 
  What happens if you start with two drives in raid1?  In other words, I'm
  trying to see if this is a problem with the conversion code.
 
  -chris
 Ok, my description might be a bit enigmatic, so to cut a long story short 
 the tests are:
 1) create a single-drive default btrfs volume on a single partition - 
 fill with test data - scrub - admire errors.
 2) create a raid1 (-d raid1 -m raid1) volume with two partitions on 
 separate disks, each the same size etc. - fill with test data - scrub - 
 admire errors.
 3) create a raid10 (-d raid10 -m raid1) volume with four partitions on 
 separate disks, each the same size etc. - fill with test data - scrub - 
 admire errors.
 
 all disks are the same age + size + model ... two different batches to avoid 
 simultaneous failures.

Ok, so we have two possible causes.  #1 btrfs is writing garbage to your
disks.  #2 something in your kernel is corrupting your data.

Since you're able to see this 100% of the time, let's assume that if #2
were true, we'd be able to trigger it on other filesystems.

So, I've attached an old friend, stress.sh.  Use it like this:

stress.sh -n 5 -c <your source directory> -s <your btrfs mount point>

It will run in a loop with 5 parallel processes and make 5 copies of
your data set into the destination.  It will run forever until there are
errors.  You can use a higher process count (-n) to force more
concurrency and use more ram.  It may help to pin down all but 2 or 3 GB
of your memory.

What I'd like you to do is find a data set and command line that makes
the script find errors on btrfs.  Then, try the same thing on xfs or
ext4 and let it run at least twice as long.  Then report back ;)

-chris

#!/bin/bash -
# -*- Shell-script -*-
#
# Copyright (C) 1999 Bibliotech Ltd., 631-633 Fulham Rd., London SW6 5UQ.
#
# $Id: stress.sh,v 1.2 1999/02/10 10:58:04 rich Exp $
#
# Change log:
#
# $Log: stress.sh,v $
# Revision 1.2  1999/02/10 10:58:04  rich
# Use cp instead of tar to copy.
#
# Revision 1.1  1999/02/09 15:13:38  rich
# Added first version of stress test program.
#

# Stress-test a file system by doing multiple
# parallel disk operations. This does everything
# in MOUNTPOINT/stress.

nconcurrent=50
content=/usr/doc
stagger=yes

while getopts c:n:s c; do
case $c in
c)
content=$OPTARG
;;
n)
nconcurrent=$OPTARG
;;
s)
stagger=no
;;
*)
echo 'Usage: stress.sh [-options] MOUNTPOINT'
echo 'Options: -c Content directory'
echo ' -n Number of concurrent accesses (default: 4)'
echo ' -s Avoid staggering start times'
exit 1
;;
esac
done

shift $(($OPTIND-1))
if [ $# -ne 1 ]; then
echo 'For usage: stress.sh -?'
exit 1
fi

mountpoint=$1

echo 'Number of concurrent processes:' $nconcurrent
echo 'Content directory:' $content '(size:' `du -s $content | awk '{print $1}'` 'KB)'

# Check the mount point is really a mount point.

#if [ `df | awk '{print $6}' | grep ^$mountpoint\$ | wc -l` -lt 1 ]; then
#echo $mountpoint: This doesn\'t seem to be a mountpoint. Try not
#echo to use a trailing / character.
#exit 1
#fi

# Create the directory, if it doesn't exist.

if [ ! -d $mountpoint/stress ]; then
rm -rf $mountpoint/stress
if ! mkdir $mountpoint/stress; then
echo "Problem creating $mountpoint/stress directory. Do you have sufficient"
echo "access permissions?"
exit 1
fi
fi

echo Created $mountpoint/stress directory.

# Construct MD5 sums over the content directory.

echo -n "Computing MD5 sums over content directory: "
( cd $content && find . -type f -print0 | xargs -0 md5sum | sort -o $mountpoint/stress/content.sums )
echo done.

# Start the stressing processes.

echo -n Starting stress test processes: 

pids=

p=1
while [ $p -le $nconcurrent ]; do
echo -n $p 

(

# Wait for all processes to start up.
if [ $stagger = yes ]; then
sleep $((10*$p))
else
sleep 10
fi

while true; do

# Remove old directories.
echo -n D$p 
rm -rf $mountpoint/stress/$p

# Copy content -> partition.
echo -n W$p 
mkdir $mountpoint/stress/$p
base=`basename $content`

#( cd $content && tar cf - . ) | ( cd $mountpoint/stress/$p && tar xf - )
cp -dRx $content 

Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Roman Mamedov
On Mon, 14 Jan 2013 15:22:36 +
Tomasz Kusmierz tom.kusmi...@gmail.com wrote:

 1) create a single-drive default btrfs volume on a single partition - 
 fill with test data - scrub - admire errors.

Did you try ruling out btrfs as the cause of the problem? Maybe something else
in your system is corrupting data, and btrfs just lets you know about that.

I.e. on the same drive, create an Ext4 filesystem, copy some data to it which
has known checksums (use md5sum or cfv to generate them in advance for data
that is on another drive and is waiting to be copied); copy to that drive,
flush caches, verify checksums of files at the destination.
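(A minimal sketch of that kind of check, assuming the reference data lives under /data/src and the ext4 filesystem under test is mounted at /mnt/ext4test - both paths are illustrative:)

# checksum the source in advance
( cd /data/src && find . -type f -print0 | xargs -0 md5sum > /tmp/src.md5 )

# copy to the filesystem under test, then drop caches so the verify
# re-reads from disk rather than from RAM
cp -a /data/src/. /mnt/ext4test/
sync
echo 3 > /proc/sys/vm/drop_caches    # needs root

# verify the copies against the precomputed checksums
( cd /mnt/ext4test && md5sum -c /tmp/src.md5 )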

-- 
With respect,
Roman

~~~
Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free.




Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Tomasz Kusmierz

On 14/01/13 15:57, Chris Mason wrote:

On Mon, Jan 14, 2013 at 08:22:36AM -0700, Tomasz Kusmierz wrote:

On 14/01/13 14:59, Chris Mason wrote:

On Mon, Jan 14, 2013 at 04:09:47AM -0700, Tomasz Kusmierz wrote:

Hi,

Since I had some free time over Christmas, I decided to conduct a few
tests on btrfs to see how it will cope with real-life storage for
normal gray users, and I've found that the filesystem will always mess up
your files that are larger than 10GB.

Hi Tom,

I'd like to nail down the test case a little better.

1) Create on one drive, fill with data
2) Add a second drive, convert to raid1
3) find corruptions?

What happens if you start with two drives in raid1?  In other words, I'm
trying to see if this is a problem with the conversion code.

-chris

Ok, my description might be a bit enigmatic, so to cut a long story short
the tests are:
1) create a single-drive default btrfs volume on a single partition -
fill with test data - scrub - admire errors.
2) create a raid1 (-d raid1 -m raid1) volume with two partitions on
separate disks, each the same size etc. - fill with test data - scrub -
admire errors.
3) create a raid10 (-d raid10 -m raid1) volume with four partitions on
separate disks, each the same size etc. - fill with test data - scrub -
admire errors.

all disks are the same age + size + model ... two different batches to avoid
simultaneous failures.

Ok, so we have two possible causes.  #1 btrfs is writing garbage to your
disks.  #2 something in your kernel is corrupting your data.

Since you're able to see this 100% of the time, let's assume that if #2
were true, we'd be able to trigger it on other filesystems.

So, I've attached an old friend, stress.sh.  Use it like this:

stress.sh -n 5 -c <your source directory> -s <your btrfs mount point>

It will run in a loop with 5 parallel processes and make 5 copies of
your data set into the destination.  It will run forever until there are
errors.  You can use a higher process count (-n) to force more
concurrency and use more ram.  It may help to pin down all but 2 or 3 GB
of your memory.

What I'd like you to do is find a data set and command line that makes
the script find errors on btrfs.  Then, try the same thing on xfs or
ext4 and let it run at least twice as long.  Then report back ;)

-chris


Chris,

Will do, just please remember that 2TB of test data on consumer-grade 
SATA drives will take a while to test :)






Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Tomasz Kusmierz

On 14/01/13 16:20, Roman Mamedov wrote:

On Mon, 14 Jan 2013 15:22:36 +
Tomasz Kusmierz tom.kusmi...@gmail.com wrote:


1) create a single-drive default btrfs volume on a single partition -
fill with test data - scrub - admire errors.

Did you try ruling out btrfs as the cause of the problem? Maybe something else
in your system is corrupting data, and btrfs just lets you know about that.

I.e. on the same drive, create an Ext4 filesystem, copy some data to it which
has known checksums (use md5sum or cfv to generate them in advance for data
that is on another drive and is waiting to be copied); copy to that drive,
flush caches, verify checksums of files at the destination.


Hi Roman,

Chris just provided his good old friend stress.sh, which should do exactly that. 
So I'll dive into more testing :)


Tom.


Re: btrfs for files > 10GB = random spontaneous CRC failure.

2013-01-14 Thread Chris Mason
On Mon, Jan 14, 2013 at 09:32:25AM -0700, Tomasz Kusmierz wrote:
 On 14/01/13 15:57, Chris Mason wrote:
  On Mon, Jan 14, 2013 at 08:22:36AM -0700, Tomasz Kusmierz wrote:
  On 14/01/13 14:59, Chris Mason wrote:
  On Mon, Jan 14, 2013 at 04:09:47AM -0700, Tomasz Kusmierz wrote:
  Hi,
 
  Since I had some free time over Christmas, I decided to conduct a few
  tests on btrfs to see how it will cope with real-life storage for
  normal gray users, and I've found that the filesystem will always mess up
  your files that are larger than 10GB.
  Hi Tom,
 
  I'd like to nail down the test case a little better.
 
  1) Create on one drive, fill with data
  2) Add a second drive, convert to raid1
  3) find corruptions?
 
  What happens if you start with two drives in raid1?  In other words, I'm
  trying to see if this is a problem with the conversion code.
 
  -chris
  Ok, my description might be a bit enigmatic, so to cut a long story short
  the tests are:
  1) create a single-drive default btrfs volume on a single partition -
  fill with test data - scrub - admire errors.
  2) create a raid1 (-d raid1 -m raid1) volume with two partitions on
  separate disks, each the same size etc. - fill with test data - scrub -
  admire errors.
  3) create a raid10 (-d raid10 -m raid1) volume with four partitions on
  separate disks, each the same size etc. - fill with test data - scrub -
  admire errors.
 
  all disks are the same age + size + model ... two different batches to avoid
  simultaneous failures.
  Ok, so we have two possible causes.  #1 btrfs is writing garbage to your
  disks.  #2 something in your kernel is corrupting your data.
 
  Since you're able to see this 100% of the time, let's assume that if #2
  were true, we'd be able to trigger it on other filesystems.
 
  So, I've attached an old friend, stress.sh.  Use it like this:
 
  stress.sh -n 5 -c <your source directory> -s <your btrfs mount point>
 
  It will run in a loop with 5 parallel processes and make 5 copies of
  your data set into the destination.  It will run forever until there are
  errors.  You can use a higher process count (-n) to force more
  concurrency and use more ram.  It may help to pin down all but 2 or 3 GB
  of your memory.
 
  What I'd like you to do is find a data set and command line that makes
  the script find errors on btrfs.  Then, try the same thing on xfs or
  ext4 and let it run at least twice as long.  Then report back ;)
 
  -chris
 
 Chris,
 
 Will do, just please remember that 2TB of test data on consumer-grade 
 SATA drives will take a while to test :)

Many thanks.  You might want to start with a smaller data set, 20GB or
so total.

-chris
