Re: Linux RAID Partition Offset 63 cylinders / 30% performance hit?

2007-12-20 Thread Gabor Gombas
On Wed, Dec 19, 2007 at 10:31:12AM -0500, Justin Piszcz wrote: > Some nice graphs found here: > http://sqlblog.com/blogs/linchi_shea/archive/2007/02/01/performance-impact-of-disk-misalignment.aspx Again, this is a HW RAID, and the partitioning is done _on top of_ the RAID. Gabor -- --

Re: Linux RAID Partition Offset 63 cylinders / 30% performance hit?

2007-12-20 Thread Gabor Gombas
On Wed, Dec 19, 2007 at 04:01:43PM +0100, Mattias Wadenstein wrote: > From that setup it seems simple, scrap the partition table and use the disk > device for raid. This is what we do for all data storage disks (hw raid) > and sw raid members. And _exactly_ that's when you run into the alignmen

Re: Linux RAID Partition Offset 63 cylinders / 30% performance hit?

2007-12-20 Thread Gabor Gombas
On Wed, Dec 19, 2007 at 12:55:16PM -0500, Justin Piszcz wrote: > unaligned, just fdisk /dev/sdc, mkpartition, fd raid. > aligned, fdisk, expert, start at 512 as the offset No, that won't show any difference. You need to partition _the RAID device_. If the partitioning is below the RAID level, th

Re: optimal IO scheduler choice?

2007-12-13 Thread Gabor Gombas
On Thu, Dec 13, 2007 at 06:24:15AM -0500, Justin Piszcz wrote: > Sequential: > Output of CFQ: (horrid): 311,683 KiB/s > Output of AS: 443,103 KiB/s OTOH AS had worse latencies than CFQ AFAIR (it was quite some time ago that I last experimented). So it depends on what you want to optimize for. Gab

Re: raid6 check/repair

2007-12-07 Thread Gabor Gombas
On Wed, Dec 05, 2007 at 03:31:14PM -0500, Bill Davidsen wrote: > BTW: if this can be done in a user program, mdadm, rather than by code in > the kernel, that might well make everyone happy. Okay, realistically "less > unhappy." I start to like the idea. Of course you can't repair a running arra

Re: Implementing low level timeouts within MD

2007-10-30 Thread Gabor Gombas
On Tue, Oct 30, 2007 at 12:08:07AM -0500, Alberto Alonso wrote: > > > * Internal serverworks PATA controller on a netengine server. The > > > server is off waiting to get picked up, so I can't get the important > > > details. > > > > 1 PATA failure. > > I was surprised on this one, I did hav

Re: Raid-10 mount at startup always has problem

2007-10-29 Thread Gabor Gombas
On Mon, Oct 29, 2007 at 08:41:39AM +0100, Luca Berra wrote: > consider a storage with 64 spt, an io size of 4k and partition starting > at sector 63. > first io request will require two ios from the storage (1 for sector 63, > and one for sectors 64 to 70) > the next 7 io (71-78,79-86,87-94,95-102
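
The arithmetic in the quoted example can be checked with a few lines of Python. This is only a sketch and not something posted in the thread: the function name and parameters are made up for illustration, and it assumes 512-byte sectors (so a 4k request is 8 sectors) and a storage boundary every 64 sectors, as in the quoted numbers.

```python
def count_split_requests(start_sector, io_sectors, chunk_sectors, count):
    """Count how many of `count` back-to-back requests straddle a chunk boundary.

    start_sector  -- first sector of the partition (hypothetical parameter)
    io_sectors    -- request size in sectors (4k = 8 sectors of 512 bytes)
    chunk_sectors -- distance between storage boundaries, in sectors
    """
    split = 0
    for i in range(count):
        first = start_sector + i * io_sectors
        last = first + io_sectors - 1
        # A request is split when its first and last sector fall in different chunks.
        if first // chunk_sectors != last // chunk_sectors:
            split += 1
    return split


if __name__ == "__main__":
    # Partition starting at sector 63: one request in every eight is split.
    print(count_split_requests(63, 8, 64, 16))
    # Partition starting at sector 64: no request is split.
    print(count_split_requests(64, 8, 64, 16))
```

With the partition at sector 63 the first request covers sectors 63-70 and crosses the boundary at sector 64, the following seven stay inside one chunk, and the pattern repeats. Moving the start to a multiple of 64 removes the split requests entirely.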

Re: Raid-10 mount at startup always has problem

2007-10-27 Thread Gabor Gombas
On Sat, Oct 27, 2007 at 09:50:55AM +0200, Luca Berra wrote: >> Because you didn't stripe align the partition, your bad. > :) > by default fdisk misaligns partition tables > and aligning them is more complex than just doing without. Why use fdisk then? Use parted instead. It's not the kernel's fa

Re: Time to deprecate old RAID formats?

2007-10-26 Thread Gabor Gombas
On Fri, Oct 26, 2007 at 02:52:59PM -0400, Doug Ledford wrote: > In fact, no you can't. I know, because I've created a device that had > both but wasn't a raid device. And its matching partner still existed > too. What you are talking about would have misrecognized this > situation, guaranteed.

Re: Time to deprecate old RAID formats?

2007-10-26 Thread Gabor Gombas
On Fri, Oct 26, 2007 at 02:41:56PM -0400, Doug Ledford wrote: > * When using lilo to boot from a raid device, it automatically installs > itself to the mbr, not to the partition. This can not be changed. Only > 0.90 and 1.0 superblock types are supported because lilo doesn't > understand the off

Re: Time to deprecate old RAID formats?

2007-10-26 Thread Gabor Gombas
On Fri, Oct 26, 2007 at 06:22:27PM +0200, Gabor Gombas wrote: > You got the ordering wrong. You should get userspace support ready and > accepted _first_, and then you can start the > flamew^H^H^H^H^H^Hdiscussion to make the in-kernel partitioning code > configurable. Oh wait that

Re: Raid-10 mount at startup always has problem

2007-10-26 Thread Gabor Gombas
On Fri, Oct 26, 2007 at 11:15:13AM +0200, Luca Berra wrote: > on a pc maybe, but that is 20 years old design. > partition table design is limited because it is still based on C/H/S, > which do not exist anymore. The MS-DOS format is not the only possible partition table layout. Other formats such

Re: Time to deprecate old RAID formats?

2007-10-26 Thread Gabor Gombas
On Fri, Oct 26, 2007 at 11:54:18AM +0200, Luca Berra wrote: > but the fix is easy. > remove the partition detection code from the kernel and start working on > a smart userspace replacement for device detection. we already have > vol_id from udev and blkid from ext3 which support detection of many

Re: Software RAID5 Horrible Write Speed On 3ware Controller!!

2007-07-18 Thread Gabor Gombas
On Wed, Jul 18, 2007 at 01:51:16PM +0100, Robin Hill wrote: > Just to pick up on this one (as I'm about to reformat my array as XFS) - > does this actually work with a hardware controller? Is there any > assurance that the XFS stripes align with the hardware RAID stripes? Or > could you just end

Re: Software RAID5 Horrible Write Speed On 3ware Controller!!

2007-07-18 Thread Gabor Gombas
On Wed, Jul 18, 2007 at 06:23:25AM -0400, Justin Piszcz wrote: > I recently got a chance to test SW RAID5 using 750GB disks (10) in a RAID5 > on a 3ware card, model no: 9550SXU-12 > > The bottom line is the controller is doing some weird caching with writes > on SW RAID5 which makes it not worth

Re: Customize the error emails of `mdadm --monitor`

2007-06-06 Thread Gabor Gombas
On Wed, Jun 06, 2007 at 04:24:31PM +0200, Peter Rabbitson wrote: > So I was asking if the component _number_, which is unique to a specific > device regardless of the assembly mechanism, can be reported in case of a > failure. So you need to write an event-handling script and pass it to mdadm (
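
A rough sketch of what such an event handler could look like, written in Python. mdadm's monitor mode can run a program for each event and passes it the event name, the md device and, where relevant, the component device. Everything else here is an assumption made for illustration: the function names are invented, and the "number" reported is simply the bracketed number that /proc/mdstat prints after each member, which may or may not be the component number asked about above.

```python
#!/usr/bin/env python3
import os
import re
import sys

# Matches array members as printed in /proc/mdstat, e.g. "sdb1[2]" or "sdb1[2](F)".
MEMBER_RE = re.compile(r"(\S+?)\[(\d+)\]")


def member_numbers(mdstat_text, md_name):
    """Map member name -> bracketed number taken from the array's mdstat line."""
    for line in mdstat_text.splitlines():
        if line.startswith(md_name + " :"):
            return {name: int(num) for name, num in MEMBER_RE.findall(line)}
    return {}


def describe_event(argv, mdstat_text):
    """Build a one-line message from the arguments mdadm passes to the handler."""
    event, md_device = argv[0], argv[1]
    component = argv[2] if len(argv) > 2 else None
    message = f"{event} on {md_device}"
    if component is not None:
        numbers = member_numbers(mdstat_text, os.path.basename(md_device))
        number = numbers.get(os.path.basename(component))
        message += f", component {component}"
        if number is not None:
            message += f" (number {number} in /proc/mdstat)"
    return message


if __name__ == "__main__" and len(sys.argv) >= 3:
    with open("/proc/mdstat") as f:
        print(describe_event(sys.argv[1:], f.read()))
```

The script only prints the message; mailing it or logging it is left to whatever the site already uses for notifications.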

Re: Customize the error emails of `mdadm --monitor`

2007-06-06 Thread Gabor Gombas
On Wed, Jun 06, 2007 at 02:23:31PM +0200, Peter Rabbitson wrote: > This would not work as arrays are assembled by the kernel at boot time, at > which point there is no udev or anything else for that matter other than > /dev/sdX. And I am pretty sure my OS (debian) does not support udev in > ini

Re: Fwd: Identify SATA Disks

2007-05-24 Thread Gabor Gombas
On Thu, May 24, 2007 at 09:29:04AM +1000, lewis shobbrook wrote: > I've noted that device allocation can change with the generation of > new initrd's and installation of new kernels; i.e. /dev/sdc becomes > /dev/sda depending upon what order the modules load etc. > I'm wondering if one could send

Re: RAID1, hot-swap and boot integrity

2007-03-06 Thread Gabor Gombas
On Mon, Mar 05, 2007 at 06:32:32PM -0500, Mike Accetta wrote: > Yes, we actually have a separate (smallish) boot partition at the front of > the array. This does reduce the at-risk window substantially. I'll have to > ponder whether it reduces it close enough to negligible to then ignore, but >

Re: RAID1, hot-swap and boot integrity

2007-03-02 Thread Gabor Gombas
On Fri, Mar 02, 2007 at 10:40:32AM -0500, Justin Piszcz wrote: > AFAIK mdadm/kernel raid can handle this, I had a number of occasions when > my UPS shut my machine down when I was rebuilding a RAID5 array, when the > box came back up, the rebuild picked up where it left off. _If_ the resync got

Re: RAID1, hot-swap and boot integrity

2007-03-02 Thread Gabor Gombas
On Fri, Mar 02, 2007 at 09:04:40AM -0500, Mike Accetta wrote: > Thoughts or other suggestions anyone? This is a case where a very small /boot partition is still a very good idea... 50-100MB is a good choice (some initramfs generators require quite a bit of space under /boot while generating the i

Re: Odd (slow) RAID performance

2006-12-08 Thread Gabor Gombas
On Thu, Dec 07, 2006 at 10:51:25AM -0500, Bill Davidsen wrote: > I also suspect that writes are not being combined, since writing the 2GB > test runs at one-drive speed writing 1MB blocks, but floppy speed > writing 2k blocks. And no, I'm not running out of CPU to do the > overhead, it jumps fro

Re: Swap initialised as an md?

2006-11-12 Thread Gabor Gombas
On Fri, Nov 10, 2006 at 12:55:57PM +0100, Mogens Kjaer wrote: > If one of your disks fails, and you have pages in the swapfile > on the failing disk, your machine will crash when the pages are > needed again. IMHO the machine will not crash; just the application to which the page belongs will be k

Re: Too much ECC?

2006-11-09 Thread Gabor Gombas
On Thu, Nov 09, 2006 at 03:30:55PM +0100, Dexter Filmore wrote: > 195 Hardware_ECC_Recovered 3344107 For some models that's perfectly normal. > Looking at a 5 year old 40GB Maxtor that's not been cooled too well I see "3" > as the raw value. Different technology, different vendor, different m

Re: New features?

2006-11-03 Thread Gabor Gombas
On Fri, Nov 03, 2006 at 02:39:31PM +1100, Neil Brown wrote: > mdadm could probably be changed to be able to remove the device > anyway. The only difficulty is: how do you tell it which device to > remove, given that there is no name in /dev to use. > Suggestions? Major:minor? If /sys/block stil

Re: libata hotplug and md raid?

2006-10-17 Thread Gabor Gombas
On Tue, Oct 17, 2006 at 10:07:07AM +0200, Gabor Gombas wrote: > Vanilla 2.6.18 kernel. In fact, all the /sys/block/*/holders directories > are empty here. Never mind, I just found the per-partition holders directories. Argh.

Re: libata hotplug and md raid?

2006-10-17 Thread Gabor Gombas
On Tue, Oct 17, 2006 at 11:58:03AM +1000, Neil Brown wrote: > udev can find out what needs to be done by looking at > /sys/block/whatever/holders. Are you sure? $ cat /proc/mdstat [...] md0 : active raid1 sdd1[1] sdc1[0] sdb1[2] sda1[3] 393472 blocks [4/4] [] [...] $ ls -l /sys/block/
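
For anyone wanting to repeat the check, here is a small Python sketch (not from the thread) that walks /sys/block and reports every non-empty holders directory. It looks both at whole disks and at the per-partition holders directories mentioned in the follow-up message. The function name and the `sys_block` parameter are invented for illustration; the parameter exists only so the function can be pointed at a test tree.

```python
import os


def find_holders(sys_block="/sys/block"):
    """Return {device_or_partition: [holder names]} for non-empty holders dirs."""
    result = {}
    for disk in sorted(os.listdir(sys_block)):
        disk_dir = os.path.join(sys_block, disk)
        candidates = [(disk, disk_dir)]
        # Partitions show up as subdirectories of the disk with their own holders dir.
        for entry in sorted(os.listdir(disk_dir)):
            part_dir = os.path.join(disk_dir, entry)
            if os.path.isdir(os.path.join(part_dir, "holders")):
                candidates.append((entry, part_dir))
        for name, path in candidates:
            holders_dir = os.path.join(path, "holders")
            if os.path.isdir(holders_dir):
                holders = sorted(os.listdir(holders_dir))
                if holders:
                    result[name] = holders
    return result


if __name__ == "__main__":
    for device, holders in find_holders().items():
        print(device, "is held by", ", ".join(holders))
```

When an array is built from partitions, the holder entries are expected under the partitions and the whole-disk holders directories stay empty, which matches what was observed here.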

Re: avoiding the initial resync on --create

2006-10-10 Thread Gabor Gombas
On Tue, Oct 10, 2006 at 01:47:56PM -0400, Doug Ledford wrote: > Not at all true. Every filesystem, no matter where it stores its > metadata blocks, still writes to every single metadata block it > allocates to initialize that metadata block. The same is true for > directory blocks...they are cre

Re: avoiding the initial resync on --create

2006-10-10 Thread Gabor Gombas
On Mon, Oct 09, 2006 at 12:32:00PM -0400, Doug Ledford wrote: > You don't really need to. After a clean install, the operating system > has no business reading any block it didn't write to during the install > unless you are just reading disk blocks for the fun of it. What happens if you have a

Re: Can you IMAGE Mirrored OS Drives?

2006-08-22 Thread Gabor Gombas
On Sat, Aug 19, 2006 at 09:05:39AM +0200, Luca Berra wrote: > please, can we try not to resurrect again the kernel-level autodetection > flamewar on this list. There is no need for a flame war. In some situations one is better, in other situations the other is better. Gabor -- ---

Re: remark and RFC

2006-08-18 Thread Gabor Gombas
On Thu, Aug 17, 2006 at 08:28:07AM +0200, Peter T. Breuer wrote: > 1) if the network disk device has decided to shut down wholesale > (temporarily) because of lack of contact over the net, then > retries and writes are _bound_ to fail for a while, so there > is no point in sending them no

Re: Can you IMAGE Mirrored OS Drives?

2006-08-18 Thread Gabor Gombas
On Wed, Aug 16, 2006 at 06:06:24AM -0400, andy liebman wrote: > There is absolutely NO PROBLEM making images of single disks and > restoring them to new disks (thus, creating clones). And it is very > fast. For an OS drive with about 4 GBs of data, it only takes about 5 > minutes to make the im

Re: Can you IMAGE Mirrored OS Drives?

2006-08-18 Thread Gabor Gombas
On Wed, Aug 16, 2006 at 09:38:54AM +0200, Luca Berra wrote: > The only risk is if you ever move one disk from one machine to another. > To work around this you can change the uuid by recreating the array with > mdadm, No need to re-create, --update=uuid should be enough according to the man page.

Re: second controller: what will my discs be called, and does it matter?

2006-07-07 Thread Gabor Gombas
On Thu, Jul 06, 2006 at 08:12:14PM +0200, Dexter Filmore wrote: > How can I tell if the discs on the new controller will become sd[e-h] or if > they'll be the new a-d and push the existing ones back? If they are the same type (or more precisely, if they use the same driver), then their order on

Re: IBM xSeries stop responding during RAID1 reconstruction

2006-06-20 Thread Gabor Gombas
On Tue, Jun 20, 2006 at 08:00:13AM -0700, Mr. James W. Laferriere wrote: > At least one can do an ls of the /sys/block area and then do an automated > echo cfq down the tree. Does anyone know of a method to set a default > scheduler? RTFM: Documentation/kernel-

Re: IBM xSeries stop responding during RAID1 reconstruction

2006-06-20 Thread Gabor Gombas
On Tue, Jun 20, 2006 at 03:08:59PM +0200, Niccolo Rigacci wrote: > Do you know if it is possible to switch the scheduler at runtime? echo cfq > /sys/block//queue/scheduler Gabor -- - MTA SZTAKI Computer and Automation Research I
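
The same runtime switch can be scripted. The Python sketch below is not from the thread and makes a few assumptions: the function names are invented, the device name passed in is whatever disk you mean to change, and the `sys_block` parameter exists only to make the code testable. It reads the queue/scheduler file, where the active scheduler is shown in square brackets, and refuses to write a name the kernel does not list.

```python
import os


def parse_schedulers(text):
    """Return (available, active) from the contents of a queue/scheduler file.

    The file looks like "noop anticipatory deadline [cfq]"; the bracketed
    entry is the scheduler currently in use.
    """
    available = []
    active = None
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return available, active


def set_scheduler(device, name, sys_block="/sys/block"):
    """Select I/O scheduler `name` for `device` (needs root on a real system)."""
    path = os.path.join(sys_block, device, "queue", "scheduler")
    with open(path) as f:
        available, _ = parse_schedulers(f.read())
    if name not in available:
        raise ValueError(f"{name!r} not offered for {device}: {available}")
    with open(path, "w") as f:
        f.write(name)
```

Which schedulers are offered depends on the kernel in use, so checking the list before writing avoids a confusing write error.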

Re: IBM xSeries stop responding during RAID1 reconstruction

2006-06-19 Thread Gabor Gombas
On Wed, Jun 14, 2006 at 10:46:09AM -0500, Bill Cizek wrote: > I was able to work around this by lowering > /proc/sys/dev/raid/speed_limit_max to a value > below my disk thruput value (~ 50 MB/s) as follows: IMHO a much better fix is to use the cfq I/O scheduler during the rebuild. The default an

Re: replace disk in raid5 without linux noticing?

2006-04-20 Thread Gabor Gombas
On Wed, Apr 19, 2006 at 02:16:10PM -0400, Ming Zhang wrote: > is this possible? > * stop RAID5 > * set a mirror between current disk X and a new added disk Y, and X as > primary one (which means copy X to Y to full sync, and before this ends, > only read from X); also this mirror will not have an

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-20 Thread Gabor Gombas
On Mon, Apr 17, 2006 at 09:30:32AM +1000, Neil Brown wrote: > It is arguable that for a read error on a degraded raid5, that may not > be the best thing to do, but I'm not completely convinced. My opinion would be that in the degraded case md should behave as if it was a single physical drive, an