Re: raid5: I lost a XFS file system due to a minor IDE cable problem

2007-05-28 Thread Pallai Roland
On Monday 28 May 2007 14:53:55 Pallai Roland wrote: > On Friday 25 May 2007 02:05:47 David Chinner wrote: > > "-o ro,norecovery" will allow you to mount the filesystem and get any > > uncorrupted data off it. > > > > You still may get shutdowns if you trip

Re: raid5: I lost a XFS file system due to a minor IDE cable problem

2007-05-28 Thread Pallai Roland
On Friday 25 May 2007 02:05:47 David Chinner wrote: > "-o ro,norecovery" will allow you to mount the filesystem and get any > uncorrupted data off it. > > You still may get shutdowns if you trip across corrupted metadata in > the filesystem, though. This filesystem is completely dead. hq:~# mount

Re: raid5: I lost a XFS file system due to a minor IDE cable problem

2007-05-28 Thread Pallai Roland
On Monday 28 May 2007 04:17:18 David Chinner wrote: > On Mon, May 28, 2007 at 03:50:17AM +0200, Pallai Roland wrote: > > On Monday 28 May 2007 02:30:11 David Chinner wrote: > > > On Fri, May 25, 2007 at 04:35:36PM +0200, Pallai Roland wrote: > > > > .and I've sp

Re: raid5: I lost a XFS file system due to a minor IDE cable problem

2007-05-27 Thread Pallai Roland
On Monday 28 May 2007 02:30:11 David Chinner wrote: > On Fri, May 25, 2007 at 04:35:36PM +0200, Pallai Roland wrote: > > On Friday 25 May 2007 06:55:00 David Chinner wrote: > > > Oh, did you look at your logs and find that XFS had spammed them > > > about writes tha

Re: raid5: I lost a XFS file system due to a minor IDE cable problem

2007-05-25 Thread Pallai Roland
On Friday 25 May 2007 06:55:00 David Chinner wrote: > Oh, did you look at your logs and find that XFS had spammed them > about writes that were failing? The first message after the incident: May 24 01:53:50 hq kernel: Filesystem "loop1": XFS internal error xfs_btree_check_sblock at line 336 of

Re: raid5: I lost a XFS file system due to a minor IDE cable problem

2007-05-25 Thread Pallai Roland
On Friday 25 May 2007 03:35:48 Pallai Roland wrote: > On Fri, 2007-05-25 at 10:05 +1000, David Chinner wrote: > > On Thu, May 24, 2007 at 07:20:35AM -0400, Justin Piszcz wrote: > > > On Thu, 24 May 2007, Pallai Roland wrote: > > > >It's a good question

Re: raid5: I lost a XFS file system due to a minor IDE cable problem

2007-05-24 Thread Pallai Roland
On Fri, 2007-05-25 at 10:05 +1000, David Chinner wrote: > On Thu, May 24, 2007 at 07:20:35AM -0400, Justin Piszcz wrote: > > On Thu, 24 May 2007, Pallai Roland wrote: > > >I wondering why the md raid5 does accept writes after 2 disks failed. I've > > >a

raid5: I lost a XFS file system due to a minor IDE cable problem

2007-05-24 Thread Pallai Roland
Hi, I wonder why md raid5 accepts writes after 2 disks have failed. I have an array built from 7 drives; the filesystem is XFS. Yesterday, an IDE cable failed (my friend kicked it off the box on the floor:) and 2 disks were kicked out, but my download (yafc) did not stop; it tried and cou
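The question above turns on what single-parity RAID can and cannot reconstruct. A toy Python sketch (byte-level XOR model, not md's actual stripe code) of why one lost chunk is recoverable but a second failure leaves the parity equation underdetermined:

```python
# Toy model of RAID5 single parity: the parity chunk is the XOR of
# the data chunks, so exactly one missing chunk can be rebuilt.
# This illustrates the principle only; it is not md's implementation.

def xor_chunks(chunks):
    """XOR a list of equal-length byte chunks together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)

data = [b"AAAA", b"BBBB", b"CCCC"]   # data chunks on 3 disks
parity = xor_chunks(data)            # parity chunk on a 4th disk

# One disk lost: XOR of the survivors plus parity rebuilds it.
rebuilt = xor_chunks([data[0], data[2], parity])
assert rebuilt == data[1]

# Two disks lost: one XOR equation, two unknowns -- nothing can be
# rebuilt, which is why accepting writes after a second failure can
# only scramble the array further.
```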

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Pallai Roland
On Sunday 22 April 2007 16:48:11 Justin Piszcz wrote: > Have you also optimized your stripe cache for writes? Not yet. Is it worth it? -- d
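The stripe cache mentioned above is tuned through sysfs on raid5/6 arrays, and its memory cost is easy to underestimate. A hedged sketch: the sysfs path is the real knob, but the array name, page count, and disk count below are made-up examples.

```python
# The knob is /sys/block/<md>/md/stripe_cache_size (raid5/6 only).
# The value is in pages (4 KiB) per member disk, so for example
#   echo 4096 > /sys/block/md0/md/stripe_cache_size
# on an 8-disk array costs roughly 4096 * 4 KiB * 8 = 128 MiB of RAM.

PAGE_SIZE = 4096  # bytes; the usual x86 page size

def stripe_cache_bytes(pages: int, nr_disks: int) -> int:
    """Approximate memory consumed by the raid5 stripe cache."""
    return pages * PAGE_SIZE * nr_disks

print(stripe_cache_bytes(4096, 8) // (1024 * 1024))  # 128 (MiB)
```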

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Pallai Roland
On Sunday 22 April 2007 13:42:43 Justin Piszcz wrote: > http://www.rhic.bnl.gov/hepix/talks/041019pm/schoen.pdf > Check page 13 of 20. Thanks, interesting presentation. I'm working in the same area now: big media files and many clients. I spent some days building a low-cost, high-performance se

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Pallai Roland
On Sunday 22 April 2007 12:23:12 Justin Piszcz wrote: > On Sun, 22 Apr 2007, Pallai Roland wrote: > > On Sunday 22 April 2007 10:47:59 Justin Piszcz wrote: > >> On Sun, 22 Apr 2007, Pallai Roland wrote: > >>> On Sunday 22 April 2007 02:18:09 Justin Piszcz wrote: &g

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Pallai Roland
On Sunday 22 April 2007 10:47:59 Justin Piszcz wrote: > On Sun, 22 Apr 2007, Pallai Roland wrote: > > On Sunday 22 April 2007 02:18:09 Justin Piszcz wrote: > >> > >> How did you run your read test? > >> > > > > I did run 100 parallel reader process

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-21 Thread Pallai Roland
On Sunday 22 April 2007 02:18:09 Justin Piszcz wrote: > On Sat, 21 Apr 2007, Pallai Roland wrote: > > > > RAID5, chunk size 128k: > > > > # mdadm -C -n8 -l5 -c128 -z 1200 /dev/md/0 /dev/sd[ijklmnop] > > (waiting for sync, then mount, mkfs, etc) >

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-21 Thread Pallai Roland
On Saturday 21 April 2007 07:47:49 you wrote: > On 4/21/07, Pallai Roland <[EMAIL PROTECTED]> wrote: > > I made a software RAID5 array from 8 disks on top of an HPT2320 card driven > > by hpt's driver. max_hw_sectors is 64Kb in this proprietary driver. I > > began to

major performance drop on raid5 due to context switches caused by small max_hw_sectors

2007-04-20 Thread Pallai Roland
Hi! I made a software RAID5 array from 8 disks on top of an HPT2320 card driven by hpt's driver. max_hw_sectors is 64Kb in this proprietary driver. I began to test it with a simple sequential read by 100 threads with an adjusted readahead size (2048Kb; total ram is 1Gb, I use posix_fadvise DONTNEED a
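The test pattern described above, sequential streaming reads that drop their pages with posix_fadvise(DONTNEED) so the benchmark doesn't just measure the page cache, can be sketched for a single reader as follows. posix_fadvise is Linux-specific, and the block size and demo file are illustrative choices, not values from the original test.

```python
import os
import tempfile

def stream_read(path, blocksize=2 * 1024 * 1024):
    """Sequentially read a file, advising the kernel to drop each
    block from the page cache after use. Returns total bytes read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            buf = os.read(fd, blocksize)
            if not buf:
                break
            total += len(buf)
            # Tell the kernel we won't need these pages again.
            os.posix_fadvise(fd, offset, len(buf), os.POSIX_FADV_DONTNEED)
            offset += len(buf)
    finally:
        os.close(fd)
    return total

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 100_000)
        name = f.name
    print(stream_read(name))  # 100000
    os.unlink(name)
```

The original benchmark ran 100 such readers in parallel; spawning this function from a thread or process pool reproduces that shape.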

iostat messed up with md on 2.6.16.x

2006-05-23 Thread Pallai Roland
Hi, I upgraded my kernel from 2.6.15.6 to 2.6.16.16 and now 'iostat -x 1' permanently shows 100% utilisation on each disk that is a member of an md array. I asked my friend, who is using 3 boxes with 2.6.16.2, 2.6.16.9, and 2.6.16.11 and raid1; he reported the same. Does it work for anyone? I don't think
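For context on the symptom above: iostat's %util is derived from field 10 of the per-device /proc/diskstats line (cumulative milliseconds spent doing I/O). A hedged pure-Python sketch of that calculation with synthetic samples; it shows how a kernel that keeps an md member's in-flight counter permanently non-zero would advance that field at wall-clock rate and pin the figure at 100%.

```python
# iostat -x computes %util as delta(io_ticks) / elapsed_ms * 100,
# where io_ticks is field 10 of /proc/diskstats (ms with I/O in
# flight). The sample values below are made up for illustration.

def util_percent(ticks_before, ticks_after, elapsed_ms):
    """Percentage of the interval the device had I/O in flight."""
    return min(100.0, 100.0 * (ticks_after - ticks_before) / elapsed_ms)

# A healthy, half-busy disk over a 1-second interval:
print(util_percent(10_000, 10_500, 1000))   # 50.0

# The buggy case: io_ticks advanced for the whole interval, so the
# device looks 100% utilised regardless of real load.
print(util_percent(10_000, 11_000, 1000))   # 100.0
```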

[PATCH] proactive raid5 disk replacement, 2.6.14

2005-11-09 Thread Pallai Roland
*big* thanks to all developers of bitmap-based raid5 resyncing and bad block rewriting; both of them are really great features! :) I ported my "proactive thing" to the new kernel; it can live nicely with the new features now. Some bugs were hunted down in the last month, so it's quite stable for me i

Re: [PATCH] proactive raid5 disk replacement for 2.6.11

2005-08-22 Thread Pallai Roland
On Mon, 2005-08-22 at 12:47 +0200, Molle Bestefich wrote: > Claas Hilbrecht wrote: > > Pallai Roland schrieb: > > > this is a feature patch that implements 'proactive raid5 disk > > > replacement' (http://www.arctic.org/~dean/raid-wishlist.html), > >

Re: Oops in raid1?

2005-08-20 Thread Pallai Roland
On Sat, 2005-08-20 at 18:26 +0200, [EMAIL PROTECTED] wrote: > I get this message, when high upload. (disk write) > But the GNBD generates this, in that situation, and thats why I think, this > is something else... yes, it seems to be another bug in GNBD, but the backtrace is clear in the

Re: Oops in raid1?

2005-08-20 Thread Pallai Roland
Hi, On Sat, 2005-08-20 at 11:55 +0200, [EMAIL PROTECTED] wrote: > I found this, bud don't know what is this exactly... > It is not look like the *NBD's deadlock. :-/ it's exactly a GNBD bug, imho > [...] > Aug 20 01:07:24 192.168.2.50 kernel: [42992885.04] Process md3_raid1 > (pid: 2769, th

Re: [PATCH] proactive raid5 disk replacement for 2.6.11, updated

2005-08-20 Thread Pallai Roland
The external error handler is done. Error handling is quite complex now compared to the old method, but there's a built-in handler if you don't write your own. If the exit value of the handler isn't understood, then md_error() will also be called. All disk IO is suspended during the running of the e

Re: [PATCH] proactive raid5 disk replacement for 2.6.11, updated

2005-08-19 Thread Pallai Roland
On Thu, 2005-08-18 at 15:46 +0200, Pallai Roland wrote: > On Thu, 2005-08-18 at 15:28 +1000, Neil Brown wrote: > > If we want to mirror a single drive in a raid5 array, I would really > > like to do that using the raid1 personality. > > [...] > the current hack allows

Re: [PATCH] proactive raid5 disk replacement for 2.6.11, updated

2005-08-18 Thread Pallai Roland
On Thu, 2005-08-18 at 12:24 +0200, Lars Marowsky-Bree wrote: > On 2005-08-18T15:28:41, Neil Brown <[EMAIL PROTECTED]> wrote: > > To handle read failures, I would like the first step to be to re-write > > the failed block. I believe most (all?) drives will relocate the > > block if a write cannot

Re: [PATCH] proactive raid5 disk replacement for 2.6.11, updated

2005-08-18 Thread Pallai Roland
On Thu, 2005-08-18 at 15:28 +1000, Neil Brown wrote: > However I think I would like to do it a little bit differently. thanks for your reply, interesting ideas! > If we want to mirror a single drive in a raid5 array, I would really > like to do that using the raid1 personality. > e.g. >suspe

[PATCH] proactive raid5 disk replacement for 2.6.11, updated

2005-08-17 Thread Pallai Roland
A per-device bad block cache has been implemented to speed up arrays with partially failed drives (replies are often slow from those). It also helps to identify badly damaged drives based on the number of bad blocks, and can take an action if it steps over a user-defined threshold (see /proc/sys/dev/raid/b

Re: [PATCH] proactive raid5 disk replacement for 2.6.11

2005-08-15 Thread Pallai Roland
On Mon, 2005-08-15 at 13:29 +0200, Mario 'BitKoenig' Holbe wrote: > Pallai Roland <[EMAIL PROTECTED]> wrote: > > this is a feature patch that implements 'proactive raid5 disk > > replacement' (http://www.arctic.org/~dean/raid-wishlist.html), > > th

Re: [PATCH] proactive raid5 disk replacement for 2.6.11 [fixed patch]

2005-08-14 Thread Pallai Roland
On Sun, 2005-08-14 at 22:10 +0200, Pallai Roland wrote: > this is a feature patch that implements 'proactive raid5 disk > replacement' (http://www.arctic.org/~dean/raid-wishlist.html), > [...] sorry, the previous patch doesn't work against a vanilla kernel; this one do

[PATCH] proactive raid5 disk replacement for 2.6.11

2005-08-14 Thread Pallai Roland
Hi, this is a feature patch that implements 'proactive raid5 disk replacement' (http://www.arctic.org/~dean/raid-wishlist.html), which could help a lot on large raid5 arrays built from cheap sata drives when the IO traffic is so large that a daily media scan of the disks isn't possible. linux soft