I have been putting raid recovery (raid 1 and 5) through its paces by marking various partitions failed (using raidsetfaulty), then doing raidhotremove and raidhotadd to either restart the same partition or add a new one.

I came across a condition which causes a kernel oops (a call to address 0) on 3 out of 4 tries: create a raid 1 array, fail one partition, remove it, then add a partition on an IDE hard disk. (The command sequence I used is appended at the end of this message.) I could not get a failure when adding a partition on a SCSI device, nor have I seen a failure on raid 5, and there are no glaring differences between HOT_ADD on raid 1 and 5.

After examining the oops, I determined that it was always happening in md_do_sync when a call was being made to run_task_queue(&tq_disk). I struggled for some time to understand the reason for calling run_task_queue at this point and decided it must be for speed. I commented out both run_task_queue calls and recompiled; the problem seems to have gone away, with no measurable performance decrease. I see that run_task_queue has re-entrance protection, otherwise I would also suspect an interrupt servicing the task queue. Either way, I am running without the calls to run_task_queue and will continue testing for other failures in the recovery path.

One final note: after the oops I would cycle power on the computer. When starting up, raid 1 considered the array to be in sync, even though the sync never completed. I saw the area where the superblock is updated, and I will check to see whether the SB is being written correctly.

Clay
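
For reference, the reproduction sequence looked roughly like the following. The device names are only examples from my setup; substitute your own partitions (the point is that the hot-added partition lives on an IDE disk):

    # mark one member of the raid 1 array as failed
    raidsetfaulty /dev/md0 /dev/sdb1

    # remove the failed partition from the array
    raidhotremove /dev/md0 /dev/sdb1

    # hot-add a partition on an IDE disk -- this is where the oops appears
    raidhotadd /dev/md0 /dev/hda3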

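To be explicit about the change I am running with: inside md_do_sync I simply commented out the two run_task_queue(&tq_disk) calls, roughly like this (paraphrased for illustration, not a literal diff against my tree):

    /* both occurrences inside md_do_sync(): */
    /* run_task_queue(&tq_disk); */    /* disabled while testing recovery */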