Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
> "Kent" == Kent Overstreet writes: Kent> On Wed, Aug 29, 2012 at 03:39:14PM +0100, Alasdair G Kergon wrote: >> It's also instructive to remember why the code is the way it is: it used >> to process bios for underlying devices immediately, but this sometimes >> meant too much recursive stack growth. If a per-device rescuer thread >> is to be made available (as well as the mempool), the option of >> reinstating recursion is there too - only punting to workqueue when the >> stack actually becomes "too big". (Also bear in mind that some dm >> targets may have dependencies on their own mempools - submission can >> block there too.) I find it helpful only to consider splitting into two >> pieces - it must always be possible to process the first piece (i.e. >> process it at the next layer down in the stack) and complete it >> independently of what happens to the second piece (which might require >> further splitting and block until the first piece has completed). Kent> I'm sure it could be made to work (and it may well simpler), but it Kent> seems problematic from a performance pov. Kent> With stacked devices you'd then have to switch stacks on _every_ bio. Kent> That could be made fast enough I'm sure, but it wouldn't be free and I Kent> don't know of any existing code in the kernel that implements what we'd Kent> need (though if you know how you'd go about doing that, I'd love to Kent> know! Would be useful for other things). Kent> The real problem is that because we'd need these extra stacks for Kent> handling all bios we couldn't get by with just one per bio_set. We'd Kent> only need one to make forward progress so the rest could be allocated Kent> on demand (i.e. what the workqueue code does) but that sounds like it's Kent> starting to get expensive. Maybe we need to limit the size of BIOs to that of the bottom-most underlying device instead? Or maybe limit BIOs to some small multiple? As you stack up DM targets one on top of each other, they should respect the limits of the underlying devices and pass those limits up the chain. Or maybe I'm speaking giberish... John -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 05:01:09PM -0400, John Stoffel wrote: > > "Kent" == Kent Overstreet writes: > > Kent> On Wed, Aug 29, 2012 at 03:39:14PM +0100, Alasdair G Kergon wrote: > >> It's also instructive to remember why the code is the way it is: it used > >> to process bios for underlying devices immediately, but this sometimes > >> meant too much recursive stack growth. If a per-device rescuer thread > >> is to be made available (as well as the mempool), the option of > >> reinstating recursion is there too - only punting to workqueue when the > >> stack actually becomes "too big". (Also bear in mind that some dm > >> targets may have dependencies on their own mempools - submission can > >> block there too.) I find it helpful only to consider splitting into two > >> pieces - it must always be possible to process the first piece (i.e. > >> process it at the next layer down in the stack) and complete it > >> independently of what happens to the second piece (which might require > >> further splitting and block until the first piece has completed). > > Kent> I'm sure it could be made to work (and it may well simpler), but it > Kent> seems problematic from a performance pov. > > Kent> With stacked devices you'd then have to switch stacks on _every_ bio. > Kent> That could be made fast enough I'm sure, but it wouldn't be free and I > Kent> don't know of any existing code in the kernel that implements what we'd > Kent> need (though if you know how you'd go about doing that, I'd love to > Kent> know! Would be useful for other things). > > Kent> The real problem is that because we'd need these extra stacks for > Kent> handling all bios we couldn't get by with just one per bio_set. We'd > Kent> only need one to make forward progress so the rest could be allocated > Kent> on demand (i.e. what the workqueue code does) but that sounds like it's > Kent> starting to get expensive. > > Maybe we need to limit the size of BIOs to that of the bottom-most > underlying device instead? Or maybe limit BIOs to some small > multiple? As you stack up DM targets one on top of each other, they > should respect the limits of the underlying devices and pass those > limits up the chain. That's the approach the block layer currently tries to take. It's brittle, tricky and inefficient, and it completely breaks down when the limits are dynamic - so things like dm and bcache are always going to have to split anyways. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 06:23:36PM +0100, Alasdair G Kergon wrote: > On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote: > > Only on allocation failure. > > Which as you already said assumes wrapping together the other mempools in the > submission path first... Yes? I'm not sure what you're getting at. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote: > Only on allocation failure. Which as you already said assumes wrapping together the other mempools in the submission path first... Alasdair -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 09:50:06AM -0700, Kent Overstreet wrote: > The other thing we could do is make the rescuer thread per block device > (which arguably makes more sense than per bio_set anyways), then > associate bio_sets with specific block devices/rescuers. For dm, it's equivalent - already one bioset required per live device: dm_alloc_md_mempools(). Alasdair -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 03:39:14PM +0100, Alasdair G Kergon wrote: > It's also instructive to remember why the code is the way it is: it used > to process bios for underlying devices immediately, but this sometimes > meant too much recursive stack growth. If a per-device rescuer thread > is to be made available (as well as the mempool), the option of > reinstating recursion is there too - only punting to workqueue when the > stack actually becomes "too big". (Also bear in mind that some dm > targets may have dependencies on their own mempools - submission can > block there too.) I find it helpful only to consider splitting into two > pieces - it must always be possible to process the first piece (i.e. > process it at the next layer down in the stack) and complete it > independently of what happens to the second piece (which might require > further splitting and block until the first piece has completed). I'm sure it could be made to work (and it may well simpler), but it seems problematic from a performance pov. With stacked devices you'd then have to switch stacks on _every_ bio. That could be made fast enough I'm sure, but it wouldn't be free and I don't know of any existing code in the kernel that implements what we'd need (though if you know how you'd go about doing that, I'd love to know! Would be useful for other things). The real problem is that because we'd need these extra stacks for handling all bios we couldn't get by with just one per bio_set. We'd only need one to make forward progress so the rest could be allocated on demand (i.e. what the workqueue code does) but that sounds like it's starting to get expensive. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 08:57:59AM -0400, Vivek Goyal wrote: > I would say keep all the bio splitting patches and any fixes w.r.t > deadlocks in a seprate series. As this is little complicated and a lot > of is just theoritical corner cases. If you limit this series to just > bio_set related cleanups, it becomes more acceptable for inclusion. Yes, please keep the splitting patches separate. The current code gets away with what it does through statistics making the deadlocks very unlikely. It's also instructive to remember why the code is the way it is: it used to process bios for underlying devices immediately, but this sometimes meant too much recursive stack growth. If a per-device rescuer thread is to be made available (as well as the mempool), the option of reinstating recursion is there too - only punting to workqueue when the stack actually becomes "too big". (Also bear in mind that some dm targets may have dependencies on their own mempools - submission can block there too.) I find it helpful only to consider splitting into two pieces - it must always be possible to process the first piece (i.e. process it at the next layer down in the stack) and complete it independently of what happens to the second piece (which might require further splitting and block until the first piece has completed). Alasdair -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 08:57:59AM -0400, Vivek Goyal wrote: I would say keep all the bio splitting patches and any fixes w.r.t deadlocks in a seprate series. As this is little complicated and a lot of is just theoritical corner cases. If you limit this series to just bio_set related cleanups, it becomes more acceptable for inclusion. Yes, please keep the splitting patches separate. The current code gets away with what it does through statistics making the deadlocks very unlikely. It's also instructive to remember why the code is the way it is: it used to process bios for underlying devices immediately, but this sometimes meant too much recursive stack growth. If a per-device rescuer thread is to be made available (as well as the mempool), the option of reinstating recursion is there too - only punting to workqueue when the stack actually becomes too big. (Also bear in mind that some dm targets may have dependencies on their own mempools - submission can block there too.) I find it helpful only to consider splitting into two pieces - it must always be possible to process the first piece (i.e. process it at the next layer down in the stack) and complete it independently of what happens to the second piece (which might require further splitting and block until the first piece has completed). Alasdair -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 03:39:14PM +0100, Alasdair G Kergon wrote: It's also instructive to remember why the code is the way it is: it used to process bios for underlying devices immediately, but this sometimes meant too much recursive stack growth. If a per-device rescuer thread is to be made available (as well as the mempool), the option of reinstating recursion is there too - only punting to workqueue when the stack actually becomes too big. (Also bear in mind that some dm targets may have dependencies on their own mempools - submission can block there too.) I find it helpful only to consider splitting into two pieces - it must always be possible to process the first piece (i.e. process it at the next layer down in the stack) and complete it independently of what happens to the second piece (which might require further splitting and block until the first piece has completed). I'm sure it could be made to work (and it may well simpler), but it seems problematic from a performance pov. With stacked devices you'd then have to switch stacks on _every_ bio. That could be made fast enough I'm sure, but it wouldn't be free and I don't know of any existing code in the kernel that implements what we'd need (though if you know how you'd go about doing that, I'd love to know! Would be useful for other things). The real problem is that because we'd need these extra stacks for handling all bios we couldn't get by with just one per bio_set. We'd only need one to make forward progress so the rest could be allocated on demand (i.e. what the workqueue code does) but that sounds like it's starting to get expensive. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 09:50:06AM -0700, Kent Overstreet wrote: The other thing we could do is make the rescuer thread per block device (which arguably makes more sense than per bio_set anyways), then associate bio_sets with specific block devices/rescuers. For dm, it's equivalent - already one bioset required per live device: dm_alloc_md_mempools(). Alasdair -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote: Only on allocation failure. Which as you already said assumes wrapping together the other mempools in the submission path first... Alasdair -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 06:23:36PM +0100, Alasdair G Kergon wrote: On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote: Only on allocation failure. Which as you already said assumes wrapping together the other mempools in the submission path first... Yes? I'm not sure what you're getting at. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
On Wed, Aug 29, 2012 at 05:01:09PM -0400, John Stoffel wrote: Kent == Kent Overstreet koverstr...@google.com writes: Kent On Wed, Aug 29, 2012 at 03:39:14PM +0100, Alasdair G Kergon wrote: It's also instructive to remember why the code is the way it is: it used to process bios for underlying devices immediately, but this sometimes meant too much recursive stack growth. If a per-device rescuer thread is to be made available (as well as the mempool), the option of reinstating recursion is there too - only punting to workqueue when the stack actually becomes too big. (Also bear in mind that some dm targets may have dependencies on their own mempools - submission can block there too.) I find it helpful only to consider splitting into two pieces - it must always be possible to process the first piece (i.e. process it at the next layer down in the stack) and complete it independently of what happens to the second piece (which might require further splitting and block until the first piece has completed). Kent I'm sure it could be made to work (and it may well simpler), but it Kent seems problematic from a performance pov. Kent With stacked devices you'd then have to switch stacks on _every_ bio. Kent That could be made fast enough I'm sure, but it wouldn't be free and I Kent don't know of any existing code in the kernel that implements what we'd Kent need (though if you know how you'd go about doing that, I'd love to Kent know! Would be useful for other things). Kent The real problem is that because we'd need these extra stacks for Kent handling all bios we couldn't get by with just one per bio_set. We'd Kent only need one to make forward progress so the rest could be allocated Kent on demand (i.e. what the workqueue code does) but that sounds like it's Kent starting to get expensive. Maybe we need to limit the size of BIOs to that of the bottom-most underlying device instead? Or maybe limit BIOs to some small multiple? As you stack up DM targets one on top of each other, they should respect the limits of the underlying devices and pass those limits up the chain. That's the approach the block layer currently tries to take. It's brittle, tricky and inefficient, and it completely breaks down when the limits are dynamic - so things like dm and bcache are always going to have to split anyways. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers
Kent == Kent Overstreet koverstr...@google.com writes: Kent On Wed, Aug 29, 2012 at 03:39:14PM +0100, Alasdair G Kergon wrote: It's also instructive to remember why the code is the way it is: it used to process bios for underlying devices immediately, but this sometimes meant too much recursive stack growth. If a per-device rescuer thread is to be made available (as well as the mempool), the option of reinstating recursion is there too - only punting to workqueue when the stack actually becomes too big. (Also bear in mind that some dm targets may have dependencies on their own mempools - submission can block there too.) I find it helpful only to consider splitting into two pieces - it must always be possible to process the first piece (i.e. process it at the next layer down in the stack) and complete it independently of what happens to the second piece (which might require further splitting and block until the first piece has completed). Kent I'm sure it could be made to work (and it may well simpler), but it Kent seems problematic from a performance pov. Kent With stacked devices you'd then have to switch stacks on _every_ bio. Kent That could be made fast enough I'm sure, but it wouldn't be free and I Kent don't know of any existing code in the kernel that implements what we'd Kent need (though if you know how you'd go about doing that, I'd love to Kent know! Would be useful for other things). Kent The real problem is that because we'd need these extra stacks for Kent handling all bios we couldn't get by with just one per bio_set. We'd Kent only need one to make forward progress so the rest could be allocated Kent on demand (i.e. what the workqueue code does) but that sounds like it's Kent starting to get expensive. Maybe we need to limit the size of BIOs to that of the bottom-most underlying device instead? Or maybe limit BIOs to some small multiple? As you stack up DM targets one on top of each other, they should respect the limits of the underlying devices and pass those limits up the chain. Or maybe I'm speaking giberish... John -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/