Re: ata (raid) patches
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I wrote: | For those game enough to try the results of my handiwork ;-), enclosed | is a patch against the files in /usr/src/sys/dev/ata for RELENG_6 (and | possibly others) with the following objectives: | | 1) the ata-raid driver currently leaks ata_composite and ata_request | structures into neverland in a mirrored configuration. This can be | observed using sysctl -a | grep ^ata_ and noting the increasing | in-use count as time goes on. Eventually, this causes the kernel to | run out of memory. This is fixed by tracking the request counts on each | composite request. ~ [ .. ] | As usual, this patch comes with no warranty ... it works for me. If it | breaks your system, you own all the pieces. | | I recommend you back up your system before testing, This is just the composite/request leak patch on it's own, Michael -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.0 (MingW32) iD8DBQFDi9J3iJykeV6HPMURAvENAJ4v9p/HjHLQ+iJ+EH23+z9ZiTbXyACeOhNA ux0UiWyGNNKu4rrPl12GP+4= =NC/r -END PGP SIGNATURE- *** /usr/src/sys/dev/ata/ata-all.h.orig Sun Nov 27 14:17:57 2005 --- /usr/src/sys/dev/ata/ata-all.h Sun Nov 27 14:22:05 2005 *** *** 331,336 --- 331,337 u_int32_t wr_depend; /* write depends on subdisks */ u_int32_t wr_done;/* done write subdisks */ struct ata_request *request[32]; /* size must match maps above */ + long count; /* count required of this composite */ caddr_t data_1; caddr_t data_2; }; *** ata-raid.c.orig Mon Nov 28 18:02:01 2005 --- ata-raid.c Mon Nov 28 22:52:53 2005 *** *** 410,415 --- 410,416 mtx_init(composite-lock, ATA PseudoRAID rebuild lock, NULL, MTX_DEF); + composite-count = request-bytecount; composite-rd_needed |= (1 drv); composite-wr_depend |= (1 drv); composite-wr_needed |= (1 this); *** *** 468,473 --- 469,475 mtx_init(composite-lock, ATA PseudoRAID mirror lock, NULL, MTX_DEF); + composite-count = request-bytecount; composite-wr_needed |= (1 drv); composite-wr_needed |= (1 this); composite-request[drv] = request; *** *** 607,613 /* good data, update how far we've gotten */ else { bp-bio_resid -= request-donecount; ! if (bp-bio_resid == 0) { if (composite-wr_done (1 mirror)) finished = 1; } --- 609,616 /* good data, update how far we've gotten */ else { bp-bio_resid -= request-donecount; ! composite-count -= request-donecount; ! if (composite-count == 0) { if (composite-wr_done (1 mirror)) finished = 1; } *** *** 621,627 printf(DOH! rebuild failed\n); /* XXX SOS */ rdp-rebuild_lba = blk; } ! if (bp-bio_resid == 0) finished = 1; } } --- 624,630 printf(DOH! rebuild failed\n); /* XXX SOS */ rdp-rebuild_lba = blk; } ! if (composite-count == 0) finished = 1; } } *** *** 658,667 } bp-bio_resid -= composite-request[mirror]-donecount; } ! else bp-bio_resid -= request-donecount; ! if (bp-bio_resid == 0) finished = 1; } mtx_unlock(composite-lock); --- 661,674 } bp-bio_resid -= composite-request[mirror]-donecount; + composite-count -= +
ata (raid) patches
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 For those game enough to try the results of my handiwork ;-), enclosed is a patch against the files in /usr/src/sys/dev/ata for RELENG_6 (and possibly others) with the following objectives: 1) the ata-raid driver currently leaks ata_composite and ata_request structures into neverland in a mirrored configuration. This can be observed using sysctl -a | grep ^ata_ and noting the increasing in-use count as time goes on. Eventually, this causes the kernel to run out of memory. This is fixed by tracking the request counts on each composite request. 2) another part of this patch is to ata-queue where a channel lock is asserted in a (hopefully rare) rebuild process even if the dependencies flag is set (we're waiting for a read). Moving the test for a dependency outside of the lock saves waiting on it when nothing can be done. A small nit with near negligible impact but, when you're waiting for a rebuild ... 3) another part of this patch is to ata-raid where the choice of drive from which to read favours one side of a mirror even when both drives are near the block(s) we want. Because the mirror is on another channel on the Highpoint controller, it performs (marginally :-() better when we toggle between them. As usual, this patch comes with no warranty ... it works for me. If it breaks your system, you own all the pieces. I recommend you back up your system before testing, Michael -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.0 (MingW32) iD8DBQFDihFsiJykeV6HPMURAlHQAJ9UO/vD8rp/V3+zh89qBBOPGQ+cugCg5qGp 9N8FaffInZMuOIMSGICV70c= =NZTT -END PGP SIGNATURE- *** /usr/src/sys/dev/ata/ata-all.h.orig Sun Nov 27 14:17:57 2005 --- /usr/src/sys/dev/ata/ata-all.h Sun Nov 27 14:22:05 2005 *** *** 331,336 --- 331,337 u_int32_t wr_depend; /* write depends on subdisks */ u_int32_t wr_done;/* done write subdisks */ struct ata_request *request[32]; /* size must match maps above */ + long count; /* count required of this composite */ caddr_t data_1; caddr_t data_2; }; *** /usr/src/sys/dev/ata/ata-queue.c.orig Sun Nov 27 14:17:57 2005 --- /usr/src/sys/dev/ata/ata-queue.cSun Nov 27 14:40:43 2005 *** *** 182,209 } /* check we are in the right state and has no dependencies */ ! mtx_lock(ch-state_mtx); ! if (ch-state == ATA_IDLE !dependencies) { ! ATA_DEBUG_RQ(request, starting); ! TAILQ_REMOVE(ch-ata_queue, request, chain); ! ch-running = request; ! ch-state = ATA_ACTIVE; ! ! /* if we are the freezing point release it */ ! if (ch-freezepoint == request) ! ch-freezepoint = NULL; ! ! if (ch-hw.begin_transaction(request) == ATA_OP_FINISHED) { ! ch-running = NULL; ! ch-state = ATA_IDLE; ! mtx_unlock(ch-state_mtx); ! mtx_unlock(ch-queue_mtx); ! ATA_LOCKING(dev, ATA_LF_UNLOCK); ! ata_finish(request); ! return; } } - mtx_unlock(ch-state_mtx); } } mtx_unlock(ch-queue_mtx); --- 182,211 } /* check we are in the right state and has no dependencies */ ! if (!dependencies) { ! mtx_lock(ch-state_mtx); ! if (ch-state == ATA_IDLE) { ! ATA_DEBUG_RQ(request, starting); ! TAILQ_REMOVE(ch-ata_queue, request, chain); ! ch-running = request; ! ch-state = ATA_ACTIVE; ! ! /* if we are the freezing point release it */ ! if (ch-freezepoint == request) ! ch-freezepoint = NULL; ! ! if (ch-hw.begin_transaction(request) == ATA_OP_FINISHED) { ! ch-running = NULL; ! ch-state = ATA_IDLE; ! mtx_unlock(ch-state_mtx); ! mtx_unlock(ch-queue_mtx); ! ATA_LOCKING(dev, ATA_LF_UNLOCK); ! ata_finish(request); ! return; ! } } + mtx_unlock(ch-state_mtx); } } } mtx_unlock(ch-queue_mtx); *** *** 426,432 composite-wr_done |= (1 request-this); if (composite-wr_depend ! (composite-rd_done composite-wr_depend)==composite-wr_depend (composite-wr_needed (~composite-wr_done))) { index = ((composite-wr_needed (~composite-wr_done))) - 1; } --- 428,434 composite-wr_done |= (1 request-this); if
Re: ata (raid) patches
Michael Butler wrote: 1) the ata-raid driver currently leaks ata_composite and ata_request structures into neverland in a mirrored configuration. This can be observed using sysctl -a | grep ^ata_ and noting the increasing in-use count as time goes on. Eventually, this causes the kernel to run out of memory. This is fixed by tracking the request counts on each composite request. Looks pretty much on the spot, I'll look this one over and get it committed once I'm satisfied with it fixing the bug, thanks a bunch for hunting this one down ! 2) another part of this patch is to ata-queue where a channel lock is asserted in a (hopefully rare) rebuild process even if the dependencies flag is set (we're waiting for a read). Moving the test for a dependency outside of the lock saves waiting on it when nothing can be done. A small nit with near negligible impact but, when you're waiting for a rebuild ... I dont think you can measure this actually, but it doesn't hurt at least to move the lock outside. 3) another part of this patch is to ata-raid where the choice of drive from which to read favours one side of a mirror even when both drives are near the block(s) we want. Because the mirror is on another channel on the Highpoint controller, it performs (marginally :-() better when we toggle between them. Hmm, well, depends on what sort of behavior one wants to optimise for. I have a few other patches for optimisations but havn't decided what to eventually use yet. Guess its time for me to run some tests on this.. I'll look into get this integrated, again thanks for digging in and doing the hard work of finding the problem(s) !! -Søren ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: ata (raid) patches
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Søren Schmidt wrote: | 3) another part of this patch is to ata-raid where the choice of drive | from which to read favours one side of a mirror even when both drives | are near the block(s) we want. Because the mirror is on another channel | on the Highpoint controller, it performs (marginally :-() better when we | toggle between them. | | | Hmm, well, depends on what sort of behavior one wants to optimise for. I | have a few other patches for optimisations but havn't decided what to | eventually use yet. Guess its time for me to run some tests on this.. On further investigation, I'd failed to notice (doh!) that rdp-toggle isn't preserved between requests and that one unobvious side-effect of the original code is that it does slightly better than my patched version .. sigh. I'll rework this and see if I can make it do better, Michael -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.0 (MingW32) iD8DBQFDijcSiJykeV6HPMURAiRkAJ4omvpWlXhsJn+B17jZiyCwCl58DgCdFXJJ EndsDJBOOn/m3Dl5fQkMziA= =6dn0 -END PGP SIGNATURE- ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]