Re: Linux Software RAID 5 + XFS Multi-Benchmarks / 10 Raptors Again

2008-01-16 Thread Al Boldi
Justin Piszcz wrote: For these benchmarks I timed how long it takes to extract a standard 4.4 GiB DVD: Settings: Software RAID 5 with the following settings (until I change those too): Base setup: blockdev --setra 65536 /dev/md3 echo 16384 > /sys/block/md3/md/stripe_cache_size echo
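
For reference, a minimal sketch of the base setup quoted above, assuming /dev/md3 is the RAID5 array (the > redirect appears to have been stripped by the archive):

  # read-ahead on the md device, counted in 512-byte sectors
  blockdev --setra 65536 /dev/md3
  # raid5 stripe cache, in pages per device
  echo 16384 > /sys/block/md3/md/stripe_cache_size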

Re: Linux Software RAID 5 + XFS Multi-Benchmarks / 10 Raptors Again

2008-01-16 Thread Al Boldi
Justin Piszcz wrote: On Wed, 16 Jan 2008, Al Boldi wrote: Also, can you retest using dd with different block-sizes? I can do this, moment.. I know about oflag=direct but I choose to use dd with sync and measure the total time it takes. /usr/bin/time -f %E -o ~/$i=chunk.txt bash -c 'dd
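
A minimal sketch of the timing approach described here (buffered dd followed by sync, versus oflag=direct); the target path, block size, and file size are placeholders, not the exact script from the thread:

  # buffered write, timed including the final sync
  /usr/bin/time -f %E bash -c 'dd if=/dev/zero of=/mnt/md3/test.bin bs=1M count=4400; sync'
  # the oflag=direct alternative mentioned above, bypassing the page cache
  dd if=/dev/zero of=/mnt/md3/test.bin bs=1M count=4400 oflag=direct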

[RFD] Layering: Use-Case Composers (was: DRBD - what is it, anyways? [compare with e.g. NBD + MD raid])

2007-08-12 Thread Al Boldi
Lars Ellenberg wrote: meanwhile, please, anyone interested: the drbd paper for LinuxConf Eu 2007 is finalized. http://www.drbd.org/fileadmin/drbd/publications/drbd8.linux-conf.eu.2007.pdf it does not give too much implementation detail (would be inappropriate for conference proceedings,

Re: [RFD] Layering: Use-Case Composers (was: DRBD - what is it, anyways? [compare with e.g. NBD + MD raid])

2007-08-12 Thread Al Boldi
Evgeniy Polyakov wrote: Al Boldi ([EMAIL PROTECTED]) wrote: Look at ZFS; it illegally violates layering by combining md/dm/lvm with the fs, but it does this based on a realistic understanding of the problems involved, which enables it to improve performance, flexibility, and functionality

Re: bonnie++ benchmarks for ext2,ext3,ext4,jfs,reiserfs,xfs,zfs on software raid 5

2007-07-30 Thread Al Boldi
Justin Piszcz wrote: CONFIG: Software RAID 5 (400GB x 6): Default mkfs parameters for all filesystems. Kernel was 2.6.21 or 2.6.22, did these awhile ago. Hardware was SATA with PCI-e only, nothing on the PCI bus. ZFS was userspace+fuse of course. Wow! Userspace and still that efficient.
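
As a rough sketch, a bonnie++ run of the kind summarized in these results might look like the following; the mount point and user are assumptions, and 16g matches the working-set size shown in the result lines:

  # 16 GiB working set on the mounted RAID5 filesystem, run as an unprivileged user
  bonnie++ -d /mnt/raid5 -s 16g -u nobody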

Re: [RFH] Partition table recovery

2007-07-22 Thread Al Boldi
Theodore Tso wrote: On Sun, Jul 22, 2007 at 07:10:31AM +0300, Al Boldi wrote: Sounds great, but it may be advisable to hook this into the partition modification routines instead of mkfs/fsck. Which would mean that the partition manager could ask the kernel to instruct its fs subsystem

Re: [RFH] Partition table recovery

2007-07-21 Thread Al Boldi
Theodore Tso wrote: On Sat, Jul 21, 2007 at 07:54:14PM +0200, Rene Herman wrote: sfdisk -d already works most of the time. Not as a verbatim tool (I actually semi-frequently use a sfdisk -d /dev/hda | sfdisk invocation as a way to _rewrite_ the CHS fields to other values after changing
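
A minimal sketch of the sfdisk-based dump/restore being discussed, using /dev/hda as in the quote:

  # dump the partition table to a plain-text description
  sfdisk -d /dev/hda > hda.sfdisk
  # later, feed it back to restore (or rewrite) the table
  sfdisk /dev/hda < hda.sfdisk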

Re: [RFH] Partition table recovery

2007-07-20 Thread Al Boldi
Jeffrey V. Merkey wrote: Al Boldi wrote: As always, a good friend of mine managed to scratch my partition table by cat'ing /dev/full into /dev/sda. I was able to push him out of the way, but at least the first 100MB are gone. I can probably live without the first partition, but there are many

Re: [RFH] Partition table recovery

2007-07-20 Thread Al Boldi
Dave Young wrote: On 7/20/07, Al Boldi [EMAIL PROTECTED] wrote: As always, a good friend of mine managed to scratch my partition table by cat'ing /dev/full into /dev/sda. I was able to push him out of the way, but /dev/null ? at least the first 100MB are gone. I can probably live

Re: [RFH] Partition table recovery

2007-07-20 Thread Al Boldi
James Lamanna wrote: On 7/19/07, Al Boldi [EMAIL PROTECTED] wrote: As always, a good friend of mine managed to scratch my partition table by cat'ing /dev/full into /dev/sda. I was able to push him out of the way, but at least the first 100MB are gone. I can probably live without the first

Re: [RFH] Partition table recovery

2007-07-20 Thread Al Boldi
Jan-Benedict Glaw wrote: On Fri, 2007-07-20 14:29:34 +0300, Al Boldi [EMAIL PROTECTED] wrote: But, I want something much more automated. And the partition table backup per partition entry isn't really a bad idea. That's called `gpart'. Oh, gpart is great, but if we had a backup copy
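
For the gpart route mentioned above, a minimal sketch; the device name is an example, and the guessed table should always be reviewed before writing anything back:

  # scan the disk and print the guessed partition table
  gpart /dev/sda
  # if the guess looks right, write it back to the same disk (destructive)
  gpart -W /dev/sda /dev/sda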

[RFH] Partition table recovery

2007-07-19 Thread Al Boldi
As always, a good friend of mine managed to scratch my partition table by cat'ing /dev/full into /dev/sda. I was able to push him out of the way, but at least the first 100MB are gone. I can probably live without the first partition, but there are many partitions after that, which I hope should

Re: Software RAID5 Horrible Write Speed On 3ware Controller!!

2007-07-18 Thread Al Boldi
Justin Piszcz wrote:
UltraDense-AS-3ware-R5-9-disks,16G,50676,89,96019,34,46379,9,60267,99,501098,56,248.5,0,16:10:16/64,240,3,21959,84,1109,10,286,4,22923,91,544,6
UltraDense-AS-3ware-R5-9-disks,16G,49983,88,96902,37,47951,10,59002,99,529

[RFC] VFS: data=ordered (was: [Advocacy] Re: 3ware 9650 tips)

2007-07-16 Thread Al Boldi
Matthew Wilcox wrote: On Mon, Jul 16, 2007 at 08:40:00PM +0300, Al Boldi wrote: XFS surely rocks, but it's missing one critical component: data=ordered And that's one component that's just too critical to overlook for an enterprise environment that is built on data-integrity over
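
For context, data=ordered is the ext3 journalling mode being contrasted with XFS here; a minimal example of selecting it explicitly (device and mount point are placeholders):

  # ordered mode: data blocks are written out before the metadata that references them is committed
  mount -t ext3 -o data=ordered /dev/md3 /mnt/data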

Re: [PATCH RFC 3/4] md: writeback caching policy for raid5 [experimental]

2007-04-11 Thread Al Boldi
Dan Williams wrote: In write-through mode bi_end_io is called once writes to the data disk(s) and the parity disk have completed. In write-back mode bi_end_io is called immediately after data has been copied into the stripe cache, which also causes the stripe to be marked dirty. This is not

Re: raid1 does not seem faster

2007-04-03 Thread Al Boldi
Bill Davidsen wrote: Al Boldi wrote: The problem is that raid1 doesn't do striped reads, but rather uses read-balancing per proc. Try your test with parallel reads; it should be faster. : : It would be nice if reads larger than some size were considered as candidates for multiple
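
A minimal sketch of such a parallel-read test, assuming /dev/md0 is the raid1 array; the device name, sizes, and offsets are placeholders:

  # two concurrent sequential readers on different regions of the array
  dd if=/dev/md0 of=/dev/null bs=1M count=1024 &
  dd if=/dev/md0 of=/dev/null bs=1M count=1024 skip=8192 &
  wait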

Re: raid1 does not seem faster

2007-04-01 Thread Al Boldi
Jan Engelhardt wrote: normally, I'd think that combining drives into a raid1 array would give me at least a little improvement in read speed. In my setup however, this does not seem to be the case. 14:16 opteron:/var/log # hdparm -t /dev/sda Timing buffered disk reads: 170 MB in 3.01

Re: PATA/SATA Disk Reliability paper

2007-02-26 Thread Al Boldi
Mario 'BitKoenig' Holbe wrote: Al Boldi [EMAIL PROTECTED] wrote: Interesting link. They seem to point out that SMART does not necessarily warn of pending failure. This is probably worse than not having SMART at all, as it gives you the illusion of safety. If SMART gives you the illusion

Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Al Boldi
Mark Hahn wrote: In contrast, ever since these holes appeared, drive failures became the norm. wow, great conspiracy theory! I think you misunderstand. I just meant plain old-fashioned mis-engineering. maybe the hole is plugged at the factory with a substance which evaporates at

Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Al Boldi
Mark Hahn wrote: - disks are very complicated, so their failure rates are a combination of conditional failure rates of many components. to take a fully reductionist approach would require knowing how each of ~1k parts responds to age, wear, temp, handling, etc.

Re: PATA/SATA Disk Reliability paper

2007-02-23 Thread Al Boldi
Stephen C Woods wrote: So drives do need to be ventilated, not so much from worry about exploding, but rather subtle distortion of the case as the atmospheric pressure changes. I have a '94 Caviar without any apparent holes; and as a bonus, the drive still works. In contrast, ever since these

Re: PATA/SATA Disk Reliability paper

2007-02-20 Thread Al Boldi
, there is a hole with a warning printed on its side: DO NOT COVER HOLE BELOW (followed by arrows pointing down to the hole). In contrast, older models from the last century don't have that hole. Al Boldi wrote: If there is one thing

Re: PATA/SATA Disk Reliability paper

2007-02-19 Thread Al Boldi
Richard Scobie wrote: Thought this paper may be of interest. A study done by Google on over 100,000 drives they have/had in service. http://labs.google.com/papers/disk_failures.pdf Interesting link. They seem to point out that SMART does not necessarily warn of pending failure. This is

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Al Boldi
Justin Piszcz wrote: RAID 5 TWEAKED: 1:06.41 elapsed @ 60% CPU This should be 1:14 not 1:06 (that was with a similarly sized file but not the same one); the 1:14 is the same file as used with the other benchmarks, and to get that I used 256MB read-ahead and 16384 stripe size plus 128 max_sectors_kb
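
A minimal sketch of tweaks of this kind; /dev/md3 and the component disk names are assumptions, and 16384 is read here as the stripe_cache_size value used elsewhere in these threads:

  # 256 MiB read-ahead on the md device (setra counts 512-byte sectors)
  blockdev --setra 524288 /dev/md3
  # raid5 stripe cache, in pages per device
  echo 16384 > /sys/block/md3/md/stripe_cache_size
  # cap per-request size on each component disk
  for d in sda sdb sdc sdd; do echo 128 > /sys/block/$d/queue/max_sectors_kb; done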

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Al Boldi
Justin Piszcz wrote: Btw, max sectors did improve my performance a little bit but stripe_cache+read_ahead were the main optimizations that made everything go faster by about 1.5x. I have individual bonnie++ benchmarks of [only] the max_sectors_kb tests as well; it improved the times from

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Al Boldi
Justin Piszcz wrote: On Sat, 13 Jan 2007, Al Boldi wrote: Justin Piszcz wrote: Btw, max sectors did improve my performance a little bit but stripe_cache+read_ahead were the main optimizations that made everything go faster by about 1.5x. I have individual bonnie++ benchmarks

Re: Propose of enhancement of raid1 driver

2006-10-30 Thread Al Boldi
Mario 'BitKoenig' Holbe wrote: Al Boldi [EMAIL PROTECTED] wrote: But what still isn't clear, why can't raid1 use something like the raid10 offset=2 mode? RAID1 has equal data on all mirrors, so sooner or later you have to seek somewhere - no matter how you layout the data on each mirror
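
For reference, the raid10 offset mode being asked about can be created directly with mdadm, even on two devices; a minimal sketch with example device names:

  # raid10, offset layout with 2 copies, across two devices
  mdadm --create /dev/md0 --level=10 --layout=o2 --raid-devices=2 /dev/sda1 /dev/sdb1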

Re: Propose of enhancement of raid1 driver

2006-10-30 Thread Al Boldi
Mario 'BitKoenig' Holbe wrote: Al Boldi [EMAIL PROTECTED] wrote: Don't underestimate the effects mere layout can have on multi-disk array performance, despite it being highly hw dependent. I can't see the difference between equal mirrors and somehow interleaved layout on RAID1. Since you

Re: Propose of enhancement of raid1 driver

2006-10-28 Thread Al Boldi
Mario 'BitKoenig' Holbe wrote: Neil Brown [EMAIL PROTECTED] wrote: Skipping over blocks within a track is no faster than reading blocks in the track, so you would need to make sure that your chunk size is Not even no faster but probably even slower. Surely slower, on conventional hds

Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Al Boldi
Chris Allen wrote: Francois Barre wrote: 2006/6/23, PFC [EMAIL PROTECTED]: - XFS is faster and fragments less, but make sure you have a good UPS Why a good UPS ? XFS has a good strong journal, I never had an issue with it yet... And believe me, I did have some dirty things

Re: [PATCH 009 of 11] md: Support stripe/offset mode in raid10

2006-05-02 Thread Al Boldi
Neil Brown wrote: On Tuesday May 2, [EMAIL PROTECTED] wrote: NeilBrown wrote: The industry standard DDF format allows for a stripe/offset layout where data is duplicated on different stripes. e.g.
  A B C D
  D A B C
  E F G H
  H E F G
(columns are

Re: Help needed - RAID5 recovery from Power-fail

2006-04-04 Thread Al Boldi
Neil Brown wrote: 2 devices in a raid5?? Doesn't seem a lot of point in it being raid5 rather than raid1. Wouldn't a 2-dev raid5 imply a striped block mirror (i.e. faster) rather than a raid1 duplicate block mirror (i.e. slower)? Thanks! -- Al

Re: [PATCH 2.6.15-git9a] aoe [1/1]: do not stop retransmit timer when device goes down

2006-01-27 Thread Al Boldi
Ed L. Cashin wrote: On Thu, Jan 26, 2006 at 01:04:37AM +0300, Al Boldi wrote: Ed L. Cashin wrote: This patch is a bugfix that follows and depends on the eight aoe driver patches sent January 19th. Will they also fix this? Or is this an md bug? No, this patch fixes a bug that would

Re: io performance...

2006-01-19 Thread Al Boldi
Jeff V. Merkey wrote: Jens Axboe wrote: On Mon, Jan 16 2006, Jeff V. Merkey wrote: Max Waterman wrote: I've noticed that I consistently get better (read) numbers from kernel 2.6.8 than from later kernels. To open the bottlenecks, the following works well. Jens will shoot me -#define

Re: RAID0 performance question

2005-12-19 Thread Al Boldi
JaniD++ wrote: For me, the performance bottleneck is clearly in the RAID0 layer, used exactly as a concentrator to join the 4x2TB into 1x8TB. Did you try running RAID0 over nbd directly and found it to be faster? IIRC, stacking raid modules does need a considerable amount of tuning, and even then
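
A minimal sketch of running RAID0 directly over nbd as suggested, assuming four exports and the classic host/port nbd-client syntax (hostnames, ports, and device names are placeholders):

  # attach the four network block devices
  nbd-client server1 2000 /dev/nbd0
  nbd-client server2 2000 /dev/nbd1
  nbd-client server3 2000 /dev/nbd2
  nbd-client server4 2000 /dev/nbd3
  # stripe them into a single array
  mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/nbd0 /dev/nbd1 /dev/nbd2 /dev/nbd3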

Re: Where is the performance bottleneck?

2005-09-02 Thread Al Boldi
Holger Kiehl wrote: top - 08:39:11 up 2:03, 2 users, load average: 23.01, 21.48, 15.64 Tasks: 102 total, 2 running, 100 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0% us, 17.7% sy, 0.0% ni, 0.0% id, 78.9% wa, 0.2% hi, 3.1% si Mem: 8124184k total, 8093068k used, 31116k free,
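
With iowait that high, a per-device view helps locate which disk or array is saturating; a minimal sketch using iostat from the sysstat package (the interval is arbitrary):

  # extended per-device statistics every 5 seconds; watch await and %util
  iostat -x 5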

Re: [PATCH md 006 of 6] Add write-behind support for md/raid1

2005-08-12 Thread Al Boldi
Paul Clements wrote: Al Boldi wrote: NeilBrown wrote: If a device is flagged 'WriteMostly' and the array has a bitmap, and the bitmap superblock indicates that write_behind is allowed, then write_behind is enabled for WriteMostly devices. Nice, but why is it dependent on WriteMostly
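
A minimal sketch of enabling write-behind on a write-mostly member, assuming a two-device raid1 with an internal bitmap; device names and the backlog value are examples:

  # writes to the --write-mostly member may lag by up to 256 outstanding write-behind requests
  mdadm --create /dev/md0 --level=1 --raid-devices=2 --bitmap=internal \
        --write-behind=256 /dev/sda1 --write-mostly /dev/sdb1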

RE: Multiplexed RAID-1 mode

2005-08-01 Thread Al Boldi
Neil Brown wrote: { On Sunday July 31, [EMAIL PROTECTED] wrote: Multiplexing read/write requests would certainly improve performance a la RAID-0 (-offset overhead). During reads the same RAID-0 code (+mirroring offset) could be used. During writes though, this would imply delayed mirroring.

Multiplexed RAID-1 mode

2005-07-31 Thread Al Boldi
Gordon Henderson wrote: { On Sat, 30 Jul 2005, Jeff Breidenbach wrote: I just ran a Linux software RAID-1 benchmark with some 500GB SATA drives in NCQ mode, along with a non-RAID control. Details are here for those interested. http://www.jab.org/raid-bench/ The results you get are