Re: block level vs. file level
This also raises another point, which is relevant for both cases: drives of the exact same nominal size can have different numbers of cylinders, so if a RAID partition is created on a larger drive it cannot be mirrored to a smaller drive. I have a RAID5 with five 250 GB drives, but some are 251 GiB (Maxtors) and some are 250.059 GiB (Seagates)... say I started with five Seagates: I could later replace one of them with a Maxtor, but not the other way around, as the Seagates are just a tiny bit smaller. cfdisk says: sdb1 250994,42 sdc1 250056,74. I suggest, when using software RAID, creating partitions that are, say, 100 megabytes or even a gigabyte smaller than the drive. You lose a bit of space, but if you ever need to change a drive, you won't feel stupid with a brand new drive that you can't use because it's a few sectors too short.
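For illustration, a minimal sketch of that suggestion; the device names, the 1 GiB safety margin and the parted/mdadm invocations are assumptions, not part of the original post:

    # Leave roughly 1 GiB unused at the end of each member disk (hypothetical
    # devices sdb..sdf), so a slightly smaller replacement drive still fits.
    for d in /dev/sd[b-f]; do
        parted -s "$d" mklabel msdos mkpart primary 1MiB -1GiB set 1 raid on
    done
    # Build the array on the slightly undersized partitions, not the raw disks.
    mdadm --create /dev/md0 --level=5 --raid-devices=5 /dev/sd[b-f]1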
Re: NVRAM support
On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote: On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote: it doesn't seem to make any sense at all to use a non-volatile external memory for swap... swap has no purpose past a power outage. No, but it is a very fast swap device. Much faster than a hard drive. Wouldn't the same amount of money be better spent on RAM then? -- http://strugglers.net/wiki/Xen_hosting -- A Xen VPS hosting hobby Encrypted mail welcome - keyid 0x604DE5DB
Re: block level vs. file level
On Mon, Feb 13, 2006 at 09:48:49AM +0100, PFC wrote: I suggest, when using software RAID, creating partitions that are, say, 100 megabytes or even a gigabyte smaller than the drive. You lose a bit of space, but if you ever need to change a drive, you won't feel stupid with a brand new drive that you can't use because it's a few sectors too short. After my previous experience, what I tend to do now is set aside about 2GB on each disk to use as components of a RAID-0 that I use for scratch space (/tmp or whatever, anything that I don't care about losing) while the machine is running. That way, if by bad luck I end up with a slightly smaller replacement drive, I can just do away with or shrink its RAID-0 component while keeping the other partitions the same, yet the space is not *totally* wasted. -- http://strugglers.net/wiki/Xen_hosting -- A Xen VPS hosting hobby Encrypted mail welcome - keyid 0x604DE5DB
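As an illustration of that scheme, a minimal sketch (the device names, partition numbers and 2 GB size are assumptions):

    # Hypothetical layout: sdb1/sdc1 are the main RAID members, sdb2/sdc2 are
    # the ~2 GB leftover partitions used only for disposable scratch space.
    mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdb2 /dev/sdc2
    mkfs.ext3 /dev/md1
    mount /dev/md1 /tmp
    # If a replacement disk turns out slightly smaller, only the scratch
    # partition has to shrink or disappear; the RAID members stay identical.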
RE: NVRAM support
Not the same amount! The NVRAM disk gives you that capacity at a fraction of what the equivalent RAM would cost. With the money saved, buy a computer for the kids. :) } -Original Message- } From: [EMAIL PROTECTED] [mailto:linux-raid- } [EMAIL PROTECTED] On Behalf Of Andy Smith } Sent: Monday, February 13, 2006 6:55 AM } To: linux-raid@vger.kernel.org } Subject: Re: NVRAM support } } On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote: } On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote: } it doesn't seem to make any sense at all to use a non-volatile } external } memory for swap... swap has no purpose past a power outage. } } No, but it is a very fast swap device. Much faster than a hard drive. } } Wouldn't the same amount of money be better spent on RAM then? } } -- } http://strugglers.net/wiki/Xen_hosting -- A Xen VPS hosting hobby } Encrypted mail welcome - keyid 0x604DE5DB
RAID 5 inaccessible - continued
All right, this weekend I was able to use dd to create an image file out of the disk. I did the following: dd conv=noerror if=/dev/hdd1 of=/mnt/hdb1/Faulty-RAIDDisk.img and then losetup /dev/loop0 /mnt/hdb1/Faulty-RAIDDisk.img. I edited mdadm.conf, replacing /dev/hdd1 with /dev/loop0. But it did not work out (yet): mdadm -E /dev/loop0 gives: mdadm: No super block found on /dev/loop0 (Expected magic a92b4efc, got ) What is the best way to continue? - mdadm -A --force /dev/md0, or - can I restore the superblock from the hdd1 disk (which is still alive), or - can I configure mdadm.conf other than this (/dev/hdc1 is a spare, probably out of date): DEVICE /dev/hdb1 /dev/hdc1 /dev/loop0 ARRAY /dev/md0 devices=/dev/hdb1,/dev/hdc1,/dev/loop0 or - some other solution? Krekna 2006/2/8, Krekna Mektek [EMAIL PROTECTED]: Hi, I found out that my storage drive was gone and I went to my server to check out what was wrong. I've got 3 400GB disks which form the array. I found out I had one spare and one faulty drive, and the RAID 5 array was not able to recover. After a reboot because of some stuff with Xen my main root disk (hda) was also failing, and the whole machine was not able to boot anymore. And there I was... After I tried to commit suicide and did not succeed, I went back to my server to try something out. I booted with Knoppix 4.02 and edited mdadm.conf as follows: DEVICE /dev/hd[bcd]1 ARRAY /dev/md0 devices=/dev/hdb1,/dev/hdc1,/dev/hdd1 I executed mdrun and the following messages appeared: Forcing event count in /dev/hdd1(2) from 81190986 upto 88231796 clearing FAULTY flag for device 2 in /dev/md0 for /dev/hdd1 /dev/md0 has been started with 2 drives (out of 3) and 1 spare. So I thought I was lucky enough to get my data back, maybe with a little loss because of the missing events. Am I right? But when I tried to mount it the next day, that did not happen either. I ended up with one faulty, one spare and one active. After stopping and starting the array a few times it was rebuilding again. I found out that the disk it needs to rebuild the array (hdd1, that is) is getting errors and falls back to faulty again. Number Major Minor RaidDevice State 0 3 65 0 active sync 1 0 0 - removed 2 22 65 2 active sync 3 22 1 1 spare rebuilding and then this: Rebuild Status : 1% complete Number Major Minor RaidDevice State 0 3 65 0 active sync 1 0 0 - removed 2 0 0 - removed 3 22 1 1 spare rebuilding 4 22 65 2 faulty And my dmesg is full of these errors coming from the faulty hdd: end_request: I/O error, dev hdd, sector 13614775 hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error } hdd: dma_intr: error=0x40 { UncorrectableError }, LBAsect=13615063, high=0, low=13615063, sector=13614783 ide: failed opcode was: unknown end_request: I/O error, dev hdd, sector 13614783 I guess this will never succeed... Is there a way to get this data back from the individual disks perhaps? FYI: [EMAIL PROTECTED] cat /proc/mdstat Personalities : [raid5] md0 : active raid5 hdb1[0] hdc1[3] hdd1[4](F) 781417472 blocks level 5, 64k chunk, algorithm 2 [3/1] [U__] [] recovery = 1.7% (6807460/390708736) finish=3626.9min speed=1764K/sec unused devices: <none> Krekna
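For what it's worth, a hedged sketch of one way to redo the imaging step and retry the assembly; the conv=noerror,sync and block-size options and the --force/--run assembly are suggestions, not something from the original post, and they are not guaranteed to recover the array:

    # Re-image the dying disk; 'sync' pads unreadable blocks with zeros so the
    # rest of the image stays at the right offsets (plain 'noerror' lets the
    # data shift, which can leave the superblock where mdadm cannot find it).
    dd if=/dev/hdd1 of=/mnt/hdb1/Faulty-RAIDDisk.img conv=noerror,sync bs=64k
    losetup /dev/loop0 /mnt/hdb1/Faulty-RAIDDisk.img
    mdadm --examine /dev/loop0     # check whether a superblock is visible now
    # Then try a forced, degraded assembly from the two best members:
    mdadm --assemble --force --run /dev/md0 /dev/hdb1 /dev/loop0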
Re: Question: array locking, possible?
Rick On HP-UX disk mirroring is done in LVM. I'm using the md driver for mirroring and LVM on top of it. Controlling access to my disks in LVM is just too late: I would have to assemble the array before I can activate VGs. If the array in question is being used on the other host, nobody can guarantee that bad things won't happen. And what I would like to prevent is two hosts accessing (writing to) an array at the same time. Thanks anyway for the hint. Regards, Chris On Thu, 9 Feb 2006 10:28:58 -0800 Stern, Rick (Serviceguard Linux) [EMAIL PROTECTED] wrote: There is more interest, just not vocal. May want to look at LVM2 and its ability to use tagging to control enablement of VGs. This way it is not HW dependent. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Osicki Sent: Thursday, February 09, 2006 2:26 AM To: linux-raid@vger.kernel.org Subject: Re: Question: array locking, possible? It looks like we are the only two md users interested in such a feature. Not enough to get Neil's attention ;-) Regards, Chris On Wed, 8 Feb 2006 21:45:33 +0100 Jure Pečar [EMAIL PROTECTED] wrote: On Wed, 8 Feb 2006 11:55:49 +0100 Chris Osicki [EMAIL PROTECTED] wrote: I was thinking about it, I have no idea how to do it on Linux, if it is possible at all. I connect over a fibre channel SAN, using QLogic QLA2312 HBAs, if it matters. Anyone any hints? I too am running a JBOD with md RAID between two machines. So far md never caused any kind of problem, although I did have situations where both machines were syncing mirrors at once. If there's a little tool to reserve a disk via SCSI, I'd like to know about it too. Even a piece of code would be enough. -- Jure Pečar http://jure.pecar.org/
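To illustrate why LVM-level control comes too late in this stack, a minimal sketch (the device and VG names are hypothetical):

    # The md array has to be assembled first...
    mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1
    # ...and only then can the volume group sitting on it be activated, so any
    # LVM-side locking cannot stop a second host from assembling the array.
    pvscan
    vgchange -ay vg_shared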
RE: Question: array locking, possible?
I understand about HP-UX mirroring/LVM. I was a little too obtuse. LVM2 has a feature (not well advertised) that allows a VG to be tagged so it will not be activated by system B if it is already tagged as being in use by system A. I was suggesting that a similar feature could be added to MD. This way an MD array could be marked as owned and, if so, mdadm would not activate it from another system. This way all of the MD control is still within mdadm. If Neil is interested, I'll try to dig up more info. Regards, Rick -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Osicki Sent: Monday, February 13, 2006 9:13 AM To: linux-raid@vger.kernel.org Subject: Re: Question: array locking, possible? Rick On HP-UX disk mirroring is done in LVM. I'm using the md driver for mirroring and LVM on top of it. Controlling access to my disks in LVM is just too late: I would have to assemble the array before I can activate VGs. If the array in question is being used on the other host, nobody can guarantee that bad things won't happen. And what I would like to prevent is two hosts accessing (writing to) an array at the same time. Thanks anyway for the hint. Regards, Chris On Thu, 9 Feb 2006 10:28:58 -0800 Stern, Rick (Serviceguard Linux) [EMAIL PROTECTED] wrote: There is more interest, just not vocal. May want to look at LVM2 and its ability to use tagging to control enablement of VGs. This way it is not HW dependent. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Osicki Sent: Thursday, February 09, 2006 2:26 AM To: linux-raid@vger.kernel.org Subject: Re: Question: array locking, possible? It looks like we are the only two md users interested in such a feature. Not enough to get Neil's attention ;-) Regards, Chris On Wed, 8 Feb 2006 21:45:33 +0100 Jure Pečar [EMAIL PROTECTED] wrote: On Wed, 8 Feb 2006 11:55:49 +0100 Chris Osicki [EMAIL PROTECTED] wrote: I was thinking about it, I have no idea how to do it on Linux, if it is possible at all. I connect over a fibre channel SAN, using QLogic QLA2312 HBAs, if it matters. Anyone any hints? I too am running a JBOD with md RAID between two machines. So far md never caused any kind of problem, although I did have situations where both machines were syncing mirrors at once. If there's a little tool to reserve a disk via SCSI, I'd like to know about it too. Even a piece of code would be enough. -- Jure Pečar http://jure.pecar.org/
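For reference, a hedged sketch of the LVM2 tagging approach Rick alludes to; the VG name, the tag and the lvm.conf fragment are assumptions, so check the LVM2 documentation for the exact configuration:

    # lvm.conf on host a (sketch):  activation { volume_list = [ "@hosta" ] }
    # i.e. only VGs carrying this host's tag may be activated here.
    vgchange --addtag hosta vg_shared   # mark the VG as owned by host a
    vgchange -ay vg_shared              # succeeds on host a, refused on hosts
                                        # whose volume_list demands another tag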
Re: Question: array locking, possible?
Luca On Thu, 9 Feb 2006 21:48:48 +0100 Luca Berra [EMAIL PROTECTED] wrote: On Thu, Feb 09, 2006 at 10:28:58AM -0800, Stern, Rick (Serviceguard Linux) wrote: There is more interest, just not vocal. May want to look at LVM2 and its ability to use tagging to control enablement of VGs. This way it is not HW dependent. I believe there is space in the md1 superblock for a cluster/exclusive flag, if not the name field could be used Great, if there is space for it there is hope. Unfortunately I don't think my programming skills are up to such a task as making proof-of-concept patches. what is missing is an interface between mdadm and cmcld so mdadm can ask cmcld permission to activate an array with the cluster/exclusive flag set. For the time being we could live without it. I'm convinced HP would make use of it once it's there. And I wouldn't say mdadm should get permission from cmcld (for those who don't know Serviceguard, the cluster software from HP: cmcld is the cluster daemon). IMHO cmcld should clear the flag on the array when initiating a fail-over in case the host which used it crashed. Once again, what I would like it for is preventing two hosts from writing to the array at the same time because I accidentally activated it. Without cmcld's awareness of the cluster/exclusive flag I would always run mdadm with the '--force' option to enable the array during package startup, because if I trust the cluster software I know the fail-over is happening because the other node crashed or it is a manual (clean) fail-over. We can discuss details of SG integration after Neil implements this flag. I can hope you have already found space for it... ;-) Regards, Chris L. -- Luca Berra -- [EMAIL PROTECTED] Communication Media Services S.r.l.
Re: Question: array locking, possible?
Rick You must have missed my first posting, or maybe I was not clear enough. We _are_ talking about the same thing. Now there are already three or four of us who think of it as a useful feature, so the pressure on Neil is dramatically increasing... ;-) Regards, Chris On Mon, 13 Feb 2006 09:21:06 -0800 Stern, Rick (Serviceguard Linux) [EMAIL PROTECTED] wrote: I understand about HP-UX mirroring/LVM. I was a little too obtuse. LVM2 has a feature (not well advertised) that allows a VG to be tagged so it will not be activated by system B if it is already tagged as being in use by system A. I was suggesting that a similar feature could be added to MD. This way an MD array could be marked as owned and, if so, mdadm would not activate it from another system. This way all of the MD control is still within mdadm. If Neil is interested, I'll try to dig up more info. Regards, Rick -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Osicki Sent: Monday, February 13, 2006 9:13 AM To: linux-raid@vger.kernel.org Subject: Re: Question: array locking, possible? Rick On HP-UX disk mirroring is done in LVM. I'm using the md driver for mirroring and LVM on top of it. Controlling access to my disks in LVM is just too late: I would have to assemble the array before I can activate VGs. If the array in question is being used on the other host, nobody can guarantee that bad things won't happen. And what I would like to prevent is two hosts accessing (writing to) an array at the same time. Thanks anyway for the hint. Regards, Chris On Thu, 9 Feb 2006 10:28:58 -0800 Stern, Rick (Serviceguard Linux) [EMAIL PROTECTED] wrote: There is more interest, just not vocal. May want to look at LVM2 and its ability to use tagging to control enablement of VGs. This way it is not HW dependent. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Osicki Sent: Thursday, February 09, 2006 2:26 AM To: linux-raid@vger.kernel.org Subject: Re: Question: array locking, possible? It looks like we are the only two md users interested in such a feature. Not enough to get Neil's attention ;-) Regards, Chris On Wed, 8 Feb 2006 21:45:33 +0100 Jure Pečar [EMAIL PROTECTED] wrote: On Wed, 8 Feb 2006 11:55:49 +0100 Chris Osicki [EMAIL PROTECTED] wrote: I was thinking about it, I have no idea how to do it on Linux, if it is possible at all. I connect over a fibre channel SAN, using QLogic QLA2312 HBAs, if it matters. Anyone any hints? I too am running a JBOD with md RAID between two machines. So far md never caused any kind of problem, although I did have situations where both machines were syncing mirrors at once. If there's a little tool to reserve a disk via SCSI, I'd like to know about it too. Even a piece of code would be enough.
-- Jure Pečar http://jure.pecar.org/
Re: Question: array locking, possible?
On Mon, Feb 13, 2006 at 06:52:47PM +0100, Chris Osicki wrote: Luca On Thu, 9 Feb 2006 21:48:48 +0100 Luca Berra [EMAIL PROTECTED] wrote: On Thu, Feb 09, 2006 at 10:28:58AM -0800, Stern, Rick (Serviceguard Linux) wrote: There is more interest, just not vocal. May want to look at LVM2 and its ability to use tagging to control enablement of VGs. This way it is not HW dependent. I believe there is space in the md1 superblock for a cluster/exclusive flag, if not the name field could be used Great, if there is space for it there is hope. Unfortunately I don't think my programming skills are up to such a task as making proof-of-concept patches. I was thinking of adding a bit in the feature_map flags to enable this kind of behaviour; the downside is that kernel-space code has to be updated to account for this flag, as it is for anything in the superblock except for name. Neil, what would you think of reserving some more space in the superblock for other data which can be used from user space? I believe playing with name is a kludge. what is missing is an interface between mdadm and cmcld so mdadm can ask cmcld permission to activate an array with the cluster/exclusive flag set. For the time being we could live without it. I'm convinced HP would make use of it once it's there. I was thinking of something like a socket-based interface between mdadm and a generic cluster daemon, not necessarily cmcld. And I wouldn't say mdadm should get permission from cmcld (for those who don't know Serviceguard, the cluster software from HP: cmcld is the cluster daemon). IMHO cmcld should clear the flag on the array when initiating a fail-over in case the host which used it crashed. No, I don't like the flag being cleared, there is too much room for a race. The flag should be permanent (unless it is forcibly removed with mdadm --grow). Once again, what I would like it for is preventing two hosts from writing to the array at the same time because I accidentally activated it. Without cmcld's awareness of the cluster/exclusive flag I would always run mdadm with the '--force' option to enable the array during package startup, because if I trust the cluster software I know the fail-over is happening because the other node crashed or it is a manual (clean) fail-over. If you only want this, it could be entirely implemented in mdadm, by just adding an exclusive flag to the ARRAY line in mdadm.conf. This is not foolproof: it will only prevent mdadm -As from assembling the device; providing the identification information on the command line, or running something like mdadm -Asc partitions, will fool it. -- Luca Berra -- [EMAIL PROTECTED] Communication Media Services S.r.l.

diff -urN mdadm-2.3.1/Assemble.c mdadm-2.3.1.exclusive/Assemble.c
--- mdadm-2.3.1/Assemble.c	2006-01-25 08:01:10.0 +0100
+++ mdadm-2.3.1.exclusive/Assemble.c	2006-02-13 22:48:04.0 +0100
@@ -34,7 +34,7 @@
 		mddev_dev_t devlist, int readonly, int runstop,
 		char *update,
-		int verbose, int force)
+		int verbose, int force, int exclusive)
 {
 	/*
 	 * The task of Assemble is to find a collection of
@@ -255,6 +255,15 @@
 			continue;
 		}

+		if (ident->exclusive != UnSet
+		    && !exclusive) {
+			if ((inargv && verbose >= 0) || verbose > 0)
+				fprintf(stderr, Name ": %s can be activated in exclusive mode only.\n",
+					devname);
+			continue;
+		}
+
+
 		/* If we are this far, then we are commited to this device.
 		 * If the super_block doesn't exist, or doesn't match others,
 		 * then we cannot continue
diff -urN mdadm-2.3.1/ReadMe.c mdadm-2.3.1.exclusive/ReadMe.c
--- mdadm-2.3.1/ReadMe.c	2006-02-06 05:09:35.0 +0100
+++ mdadm-2.3.1.exclusive/ReadMe.c	2006-02-13 22:27:26.0 +0100
@@ -147,6 +147,7 @@
     {"scan", 0, 0, 's'},
     {"force", 0, 0, 'f'},
     {"update", 1, 0, 'U'},
+    {"exclusive", 0, 0, 'x'},

     /* Management */
     {"add", 0, 0, 'a'},
diff -urN mdadm-2.3.1/config.c mdadm-2.3.1.exclusive/config.c
--- mdadm-2.3.1/config.c	2005-12-09 06:00:47.0 +0100
+++ mdadm-2.3.1.exclusive/config.c	2006-02-13 22:23:02.0 +0100
@@ -286,6 +286,7 @@
 	mis.st = NULL;
 	mis.bitmap_fd = -1;
 	mis.name[0] = 0;
+	mis.exclusive = 0;

 	for (w=dl_next(line); w!=line; w=dl_next(w)) {
 		if (w[0] == '/') {
@@ -386,6 +387,8 @@
 			fprintf(stderr, Name ": auto type of \"%s\" ignored for %s\n",
 				w+5, mis.devname?mis.devname:"unlabeled-array");
 		}
+	} else if (strncasecmp(w,
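To make the idea concrete, here is what such an mdadm.conf could look like under the proposal sketched above; the exclusive keyword does not exist in stock mdadm and the device names are made up:

    # Hypothetical configuration for the proposed behaviour: a plain
    # "mdadm -As" would refuse to assemble this array, and only an
    # explicit, exclusive-aware invocation would bring it up.
    DEVICE /dev/sdb1 /dev/sdc1
    ARRAY /dev/md0 devices=/dev/sdb1,/dev/sdc1 exclusive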
Re: Question: array locking, possible?
On Mon, Feb 13, 2006 at 10:53:43PM +0100, Luca Berra wrote: diff -urN mdadm-2.3.1/Assemble.c mdadm-2.3.1.exclusive/Assemble.c Please note that the patch was written as a proof of concept while I was composing the email; it should not be considered working (or even compiling) code. L. -- Luca Berra -- [EMAIL PROTECTED] Communication Media Services S.r.l.
Re: [RFC][PATCH 000 of 3] MD Acceleration and the ADMA interface: Introduction
On Monday February 6, [EMAIL PROTECTED] wrote: On 2/5/06, Neil Brown [EMAIL PROTECTED] wrote: I've looked through the patches - not exhaustively, but hopefully enough to get a general idea of what is happening. There are some things I'm not clear on and some things that I could suggest alternates to... I have a few questions to check that I understand your suggestions. (sorry for the delay). - Each ADMA client (e.g. a raid5 array) gets a dedicated adma thread to handle all its requests. And it handles them all in series. I wonder if this is really optimal. If there are multiple adma engines, then a single client could only make use of one of them reliably. It would seem to make more sense to have just one thread - or maybe one per processor or one per adma engine - and have any ordering between requests made explicit in the interface. Actually, as each processor could be seen as an ADMA engine, maybe you want one thread per processor AND one per engine. If there are no engines, the per-processor threads run with high priority, else with low. ...so the engine thread would handle explicit client-requested ordering constraints and then hand the operations off to per-processor worker threads in the pio case, or queue directly to hardware in the presence of such an engine. In md_thread you talk about priority inversion deadlocks, do those same concerns apply here? That comment in md.c about priority inversion deadlocks predates my involvement - making it last millennium... I don't think it is relevant any more, and possibly never was. I don't see any room for priority inversion here. I probably wouldn't even have an 'engine thread'. If I were to write 'md' today, it probably wouldn't have a dedicated thread but would use 'schedule_work' to arrange for code to be run in process context. The ADMA engine could do the same. Note: I'm not saying this is the right way to go. But I do think it is worth exploring. I'm not sure about threads for the 'pio' case. It would probably be easiest that way, but I would explore the 'schedule_work' family of services first. But yes, the ADMA engine would handle explicit client-requested ordering and arrange for work to be done somehow. - I have thought that the way md/raid5 currently does the 'copy-to-buffer' and 'xor' in two separate operations may not be the best use of the memory bus. If you could have a 3-address operation that read from A, stored into B, and xorred into C, then A would have to be read half as often. Would such an interface make sense with ADMA? I don't have sufficient knowledge of assembler to do it myself for the current 'xor' code. At the very least I can add a copy+xor command to ADMA; that way developers implementing engines can optimize for this case, if the hardware supports it, and the hand-coded assembly guys can do their thing. - Your handling of highmem doesn't seem right. You shouldn't kmap it until you have decided that you have to do the operation 'by hand' (i.e. in the cpu, not in the DMA engine). If the dma engine can be used at all, kmap isn't needed at all. I made the assumption that if CONFIG_HIGHMEM is not set then the kmap call resolves to a simple page_address() call. I think it's ok, but it does look fishy so I will revise this code. I was also looking to handle the case where the underlying hardware DMA engine does not support high memory addresses. I think the only way to handle the ADMA engine not supporting high memory is to do the operation 'polled' - i.e. in the CPU.
The alternative is to copy it to somewhere that the DMA engine can reach, and if you are going to do that, you have done most of the work already. Possibly you could still gain by using the engine for RAID6 calculations, but not for copy, compare, or xor operations. And if you are using the DMA engine, then you don't want the page_address. You want to use pci_map_page (or similar?) to get a dma_handle. For example, once it has been decided to initiate a write (there is enough data to correctly update the parity block), you need to perform a sequence of copies and xor operations, and then submit write requests. This is currently done by the copy/xor happening inline under the sh->lock spinlock, and then R5_WantWrite is set. Then, outside the spinlock, if WantWrite is set, generic_make_request is called as appropriate. I would change this so that a sequence of descriptors is assembled which describes the copies and xors. Appropriate call-backs would be set so that generic_make_request is called at the right time (after the copy, or after the last xor for the parity block). Then outside the sh->lock spinlock this sequence is passed to the ADMA manager. If there is no ADMA engine present, everything is performed
Lilo append= , A suggestion .
Hello Neil & All, I'll bet I am going to get harassed over this, but... The present form (iirc) of the lilo append statement is append=md=d0,/dev/sda,/dev/sdb I am wondering how difficult the below would be to code? It would allow a (relatively) short string to be append'd instead of the sometimes large listing of devices. append=md=d0,UUID=e9e0f605:9ed694c2:3e2002c9:0415c080 OK, I've got my asbestos britches on. Have at it ;-). TIA, JimL -- +--+ | James W. Laferriere | System Techniques | Give me VMS | | Network Engineer | 3542 Broken Yoke Dr. | Give me Linux | | [EMAIL PROTECTED] | Billings, MT. 59105 | only on AXP | | http://www.asteriskhelpdesk.com/cgi-bin/astlance/r.cgi?babydr | +--+
Re: Lilo append= , A suggestion .
On Monday February 13, [EMAIL PROTECTED] wrote: Hello Neil & All, I'll bet I am going to get harassed over this, but... The present form (iirc) of the lilo append statement is append=md=d0,/dev/sda,/dev/sdb I am wondering how difficult the below would be to code? It would allow a (relatively) short string to be append'd instead of the sometimes large listing of devices. append=md=d0,UUID=e9e0f605:9ed694c2:3e2002c9:0415c080 OK, I've got my asbestos britches on. Have at it ;-). This is just the job for an initramfs. They are *really* easy to make, and very flexible. mdadm-2.2 and later come with a little script which (tested on Debian) makes a simple initramfs which will recognise a kernel parameter (as passed by lilo's 'append') like rootuuid=97e58306:2c85fd85:2346b91e:aaca5fee and will assemble the appropriate array as /dev/md_d0 and will then mount the filesystem found there as root. If it doesn't do exactly what you want, it is fairly easy to modify. NeilBrown
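For illustration only, a hedged sketch of the kind of /init script such an initramfs could contain; this is not the script shipped with mdadm, and the device list, partition number and use of busybox's switch_root are assumptions:

    #!/bin/sh
    # Pull rootuuid=... off the kernel command line (as passed via lilo's append=).
    for arg in $(cat /proc/cmdline); do
        case "$arg" in
            rootuuid=*) uuid="${arg#rootuuid=}" ;;
        esac
    done
    # Assemble whichever of the listed components carry that UUID as a
    # partitionable array, then hand over to the real root filesystem.
    mdadm --assemble /dev/md_d0 --auto=part --run --uuid="$uuid" /dev/sd[a-z]*
    mount /dev/md_d0p1 /new-root
    exec switch_root /new-root /sbin/init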
Re: Lilo append= , A suggestion .
On Mon, Feb 13, 2006 at 09:12:42PM -0700, Mr. James W. Laferriere wrote: Hello Neil & All, I'll bet I am going to get harassed over this, but... The present form (iirc) of the lilo append statement is append=md=d0,/dev/sda,/dev/sdb I am wondering how difficult the below would be to code? It would allow a (relatively) short string to be append'd instead of the sometimes large listing of devices. append=md=d0,UUID=e9e0f605:9ed694c2:3e2002c9:0415c080 OK, I've got my asbestos britches on. Have at it ;-). TIA, JimL What about all the past threads about in-kernel autodetection? L. -- Luca Berra -- [EMAIL PROTECTED] Communication Media Services S.r.l.