Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Fri, 04 Dec 2015 18:02:58 +0100, Jens Axboe wrote: > > On 12/04/2015 09:59 AM, Takashi Iwai wrote: > > On Wed, 25 Nov 2015 20:01:47 +0100, > > Hannes Reinecke wrote: > >> > >> On 11/25/2015 07:01 PM, Mike Snitzer wrote: > >>> On Wed, Nov 25 2015 at 4:04am -0500, > >>> Hannes Reinecke wrote: > >>> > On 11/20/2015 04:28 PM, Ewan Milne wrote: > > On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: > >> Can't we have a joint effort here? > >> I've been spending a _LOT_ of time trying to debug things here, but > >> none of the ideas I've come up with have been able to fix anything. > > > > Yes. I'm not the one primarily looking at it, and we don't have a > > reproducer in-house. We just have the one dump right now. > > > >> > >> I'm almost tempted to increase the count from scsi_alloc_sgtable() > >> by one and be done with ... > >> > > > > That might not fix it if it is a problem with the merge code, though. > > > And indeed, it doesn't. > >>> > >>> How did you arrive at that? Do you have a reproducer now? > >>> > >> Not a reproducer, but several dumps for analysis. > >> > Seems I finally found the culprit. > > What happens is this: > We have two paths, with these seg_boundary_masks: > > path-1:seg_boundary_mask = 65535, > path-2:seg_boundary_mask = 4294967295, > > consequently the DM request queue has this: > > md-1:seg_boundary_mask = 65535, > > What happens now is that a request is being formatted, and sent > to path 2. During submission req->nr_phys_segments is formatted > with the limits of path 2, arriving at a count of 3. > Now the request gets retried on path 1, but as the NOMERGE request > flag is set req->nr_phys_segments is never updated. > But blk_rq_map_sg() ignores all counters, and just uses the > bi_vec directly, resulting in a count of 4 -> boom. > > So the culprit here is the NOMERGE flag, > >>> > >>> NOMERGE is always set in __blk_rq_prep_clone() for cloned requests. > >>> > >> Yes. > >> > which is evaluated via > ->dm_dispatch_request() > ->blk_insert_cloned_request() > ->blk_rq_check_limits() > >>> > >>> blk_insert_cloned_request() is the only caller of blk_rq_check_limits(); > >>> anyway after reading your mail I'm still left wondering if your proposed > >>> patch is correct. > >>> > If the above assessment is correct, the following patch should > fix it: > > diff --git a/block/blk-core.c b/block/blk-core.c > index 801ced7..12cccd6 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); > */ > int blk_rq_check_limits(struct request_queue *q, struct request *rq) > { > - if (!rq_mergeable(rq)) > + if (rq->cmd_type != REQ_TYPE_FS) > return 0; > > if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, > rq->cmd_flags)) { > > > Mike? Jens? > Can you comment on it? > >>> > >>> You're not explaining the actual change in the patch very well; I think > >>> you're correct but you're leaving the justification as an exercise to > >>> the reviewer: > >>> > >>> blk_rq_check_limits() will call blk_recalc_rq_segments() after the > >>> !rq_mergeable() check but you're saying for this case in question we > >>> never get there -- due to the cloned request having NOMERGE set. > >>> > >>> So in blk_rq_check_limits() you've unrolled rq_mergeable() and > >>> open-coded the lone remaining check (rq->cmd_type != REQ_TYPE_FS) > >>> > >>> I agree that the (rq->cmd_flags & REQ_NOMERGE_FLAGS) check in > >>> the blk_insert_cloned_request() call-chain (via rq_mergeable()) makes no > >>> sense for cloned requests that always have NOMERGE set. > >>> > >>> So you're saying that by having blk_rq_check_limits() go on to call > >>> blk_recalc_rq_segments() this bug will be fixed? > >>> > >> That is the idea. > >> > >> I've already established that in all instances I have seen so far > >> req->nr_phys_segments is _less_ than req->bio->bi_phys_segments. > >> > >> As it turns out, req->nr_phys_segemnts _would_ have been updated in > >> blk_rq_check_limits(), but isn't due to the NOMERGE flag being set > >> for the cloned request. > >> So each cloned request inherits the values from the original request, > >> despite the fact that req->nr_phys_segments _has_ to be evaluated in > >> the final request_queue context, as the queue limits _might_ be > >> different from the original (merged) queue limits of the multipath > >> request queue. > >> > >>> BTW, I think blk_rq_check_limits()'s export should be removed and the > >>> function made static and renamed to blk_clone_rq_check_limits(), again: > >>> blk_insert_cloned_request() is the only caller of blk_rq_check_limits() > >>> > >> Actually, seeing Jens'
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 12/04/2015 09:59 AM, Takashi Iwai wrote: On Wed, 25 Nov 2015 20:01:47 +0100, Hannes Reinecke wrote: On 11/25/2015 07:01 PM, Mike Snitzer wrote: On Wed, Nov 25 2015 at 4:04am -0500, Hannes Reinecke wrote: On 11/20/2015 04:28 PM, Ewan Milne wrote: On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: Can't we have a joint effort here? I've been spending a _LOT_ of time trying to debug things here, but none of the ideas I've come up with have been able to fix anything. Yes. I'm not the one primarily looking at it, and we don't have a reproducer in-house. We just have the one dump right now. I'm almost tempted to increase the count from scsi_alloc_sgtable() by one and be done with ... That might not fix it if it is a problem with the merge code, though. And indeed, it doesn't. How did you arrive at that? Do you have a reproducer now? Not a reproducer, but several dumps for analysis. Seems I finally found the culprit. What happens is this: We have two paths, with these seg_boundary_masks: path-1:seg_boundary_mask = 65535, path-2:seg_boundary_mask = 4294967295, consequently the DM request queue has this: md-1:seg_boundary_mask = 65535, What happens now is that a request is being formatted, and sent to path 2. During submission req->nr_phys_segments is formatted with the limits of path 2, arriving at a count of 3. Now the request gets retried on path 1, but as the NOMERGE request flag is set req->nr_phys_segments is never updated. But blk_rq_map_sg() ignores all counters, and just uses the bi_vec directly, resulting in a count of 4 -> boom. So the culprit here is the NOMERGE flag, NOMERGE is always set in __blk_rq_prep_clone() for cloned requests. Yes. which is evaluated via ->dm_dispatch_request() ->blk_insert_cloned_request() ->blk_rq_check_limits() blk_insert_cloned_request() is the only caller of blk_rq_check_limits(); anyway after reading your mail I'm still left wondering if your proposed patch is correct. If the above assessment is correct, the following patch should fix it: diff --git a/block/blk-core.c b/block/blk-core.c index 801ced7..12cccd6 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); */ int blk_rq_check_limits(struct request_queue *q, struct request *rq) { - if (!rq_mergeable(rq)) + if (rq->cmd_type != REQ_TYPE_FS) return 0; if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) { Mike? Jens? Can you comment on it? You're not explaining the actual change in the patch very well; I think you're correct but you're leaving the justification as an exercise to the reviewer: blk_rq_check_limits() will call blk_recalc_rq_segments() after the !rq_mergeable() check but you're saying for this case in question we never get there -- due to the cloned request having NOMERGE set. So in blk_rq_check_limits() you've unrolled rq_mergeable() and open-coded the lone remaining check (rq->cmd_type != REQ_TYPE_FS) I agree that the (rq->cmd_flags & REQ_NOMERGE_FLAGS) check in the blk_insert_cloned_request() call-chain (via rq_mergeable()) makes no sense for cloned requests that always have NOMERGE set. So you're saying that by having blk_rq_check_limits() go on to call blk_recalc_rq_segments() this bug will be fixed? That is the idea. I've already established that in all instances I have seen so far req->nr_phys_segments is _less_ than req->bio->bi_phys_segments. As it turns out, req->nr_phys_segemnts _would_ have been updated in blk_rq_check_limits(), but isn't due to the NOMERGE flag being set for the cloned request. So each cloned request inherits the values from the original request, despite the fact that req->nr_phys_segments _has_ to be evaluated in the final request_queue context, as the queue limits _might_ be different from the original (merged) queue limits of the multipath request queue. BTW, I think blk_rq_check_limits()'s export should be removed and the function made static and renamed to blk_clone_rq_check_limits(), again: blk_insert_cloned_request() is the only caller of blk_rq_check_limits() Actually, seeing Jens' last comment the check for REQ_TYPE_FS is pointless, too, so we might as well remove the entire if-clause. Seems prudent to make that change now to be clear that this code is only used by cloned requests. Yeah, that would make sense. I'll be preparing a patch. With a more detailed description :-) Do we have already a fix? Right now I got (likely) this kernel BUG() on the almost latest Linus tree (commit 25364a9e54fb8296). It happened while I started a KVM right after a fresh boot. The machine paniced even before that, so I hit this twice today. Update to the tree as-of yesterday (or today) and it should work. 25364a9e54fb8296 doesn't include the latest block fixes that were sent in yesterday, that should fix it. You need commit a88d32af18b8 or newer. -- Jens Axboe -- To
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Wed, 25 Nov 2015 20:01:47 +0100, Hannes Reinecke wrote: > > On 11/25/2015 07:01 PM, Mike Snitzer wrote: > > On Wed, Nov 25 2015 at 4:04am -0500, > > Hannes Reinecke wrote: > > > >> On 11/20/2015 04:28 PM, Ewan Milne wrote: > >>> On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: > Can't we have a joint effort here? > I've been spending a _LOT_ of time trying to debug things here, but > none of the ideas I've come up with have been able to fix anything. > >>> > >>> Yes. I'm not the one primarily looking at it, and we don't have a > >>> reproducer in-house. We just have the one dump right now. > >>> > > I'm almost tempted to increase the count from scsi_alloc_sgtable() > by one and be done with ... > > >>> > >>> That might not fix it if it is a problem with the merge code, though. > >>> > >> And indeed, it doesn't. > > > > How did you arrive at that? Do you have a reproducer now? > > > Not a reproducer, but several dumps for analysis. > > >> Seems I finally found the culprit. > >> > >> What happens is this: > >> We have two paths, with these seg_boundary_masks: > >> > >> path-1:seg_boundary_mask = 65535, > >> path-2:seg_boundary_mask = 4294967295, > >> > >> consequently the DM request queue has this: > >> > >> md-1:seg_boundary_mask = 65535, > >> > >> What happens now is that a request is being formatted, and sent > >> to path 2. During submission req->nr_phys_segments is formatted > >> with the limits of path 2, arriving at a count of 3. > >> Now the request gets retried on path 1, but as the NOMERGE request > >> flag is set req->nr_phys_segments is never updated. > >> But blk_rq_map_sg() ignores all counters, and just uses the > >> bi_vec directly, resulting in a count of 4 -> boom. > >> > >> So the culprit here is the NOMERGE flag, > > > > NOMERGE is always set in __blk_rq_prep_clone() for cloned requests. > > > Yes. > > >> which is evaluated via > >> ->dm_dispatch_request() > >>->blk_insert_cloned_request() > >> ->blk_rq_check_limits() > > > > blk_insert_cloned_request() is the only caller of blk_rq_check_limits(); > > anyway after reading your mail I'm still left wondering if your proposed > > patch is correct. > > > >> If the above assessment is correct, the following patch should > >> fix it: > >> > >> diff --git a/block/blk-core.c b/block/blk-core.c > >> index 801ced7..12cccd6 100644 > >> --- a/block/blk-core.c > >> +++ b/block/blk-core.c > >> @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); > >>*/ > >> int blk_rq_check_limits(struct request_queue *q, struct request *rq) > >> { > >> - if (!rq_mergeable(rq)) > >> + if (rq->cmd_type != REQ_TYPE_FS) > >> return 0; > >> > >> if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, > >> rq->cmd_flags)) { > >> > >> > >> Mike? Jens? > >> Can you comment on it? > > > > You're not explaining the actual change in the patch very well; I think > > you're correct but you're leaving the justification as an exercise to > > the reviewer: > > > > blk_rq_check_limits() will call blk_recalc_rq_segments() after the > > !rq_mergeable() check but you're saying for this case in question we > > never get there -- due to the cloned request having NOMERGE set. > > > > So in blk_rq_check_limits() you've unrolled rq_mergeable() and > > open-coded the lone remaining check (rq->cmd_type != REQ_TYPE_FS) > > > > I agree that the (rq->cmd_flags & REQ_NOMERGE_FLAGS) check in > > the blk_insert_cloned_request() call-chain (via rq_mergeable()) makes no > > sense for cloned requests that always have NOMERGE set. > > > > So you're saying that by having blk_rq_check_limits() go on to call > > blk_recalc_rq_segments() this bug will be fixed? > > > That is the idea. > > I've already established that in all instances I have seen so far > req->nr_phys_segments is _less_ than req->bio->bi_phys_segments. > > As it turns out, req->nr_phys_segemnts _would_ have been updated in > blk_rq_check_limits(), but isn't due to the NOMERGE flag being set > for the cloned request. > So each cloned request inherits the values from the original request, > despite the fact that req->nr_phys_segments _has_ to be evaluated in > the final request_queue context, as the queue limits _might_ be > different from the original (merged) queue limits of the multipath > request queue. > > > BTW, I think blk_rq_check_limits()'s export should be removed and the > > function made static and renamed to blk_clone_rq_check_limits(), again: > > blk_insert_cloned_request() is the only caller of blk_rq_check_limits() > > > Actually, seeing Jens' last comment the check for REQ_TYPE_FS is > pointless, too, so we might as well remove the entire if-clause. > > > Seems prudent to make that change now to be clear that this code is only > > used by cloned requests. > > > Yeah, that would make sense. I'll be preparing a patch. > With a more detailed description :-) Do we have already a fix? Right now I got (likely)
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Wed, Nov 25 2015 at 3:23pm -0500, Mike Snitzer wrote: > On Wed, Nov 25 2015 at 2:24pm -0500, > Jens Axboe wrote: > > > On 11/25/2015 12:10 PM, Hannes Reinecke wrote: > > >The problem is that NOMERGE does too much, as it inhibits _any_ merging. > > > > Right, that is the point of the flag from the block layer view, > > where it was originally added for the case mentioned. > > And we really don't want _any_ merging. The merging, if any, will have > already happened in upper DM-multipath's elevator. So there should be > no need to have the underlying SCSI paths do any merging. > > > >Unfortunately, the req->nr_phys_segments value is evaluated in the final > > >_driver_ context _after_ the merging happend; cf > > >scsi_lib.c:scsi_init_sgtable(). > > >As nr_phys_segments is inherited from the original request (and never > > >recalculated with the new request queue limits) the following > > >blk_rq_map_sg() call might end up at a different calculation, especially > > >after retrying a request on another path. > > > > That all sounds pretty horrible. Why is blk_rq_check_limits() > > checking for mergeable at all? If merging is disabled on the > > request, I'm assuming that's an attempt at an optimization since we > > know it won't change. But that should be tracked separately, like > > how it's done on the bio. > > Not clear to me why it was checking for merging... Ewan pointed out that blk_rq_check_limits()'s call to rq_mergable() was introduced as part of Martin's DISCARD cleanup that prepared for WRITE_SAME, see: commit e2a60da74 ("block: Clean up special command handling logic") And prior to that, blk_rq_check_limits()'s (rq->cmd_flags & REQ_DISCARD) check was introduced by some guy crazy doppelganger name "Ike Snitzer", see commit: 3383977f ("block: update request stacking methods to support discards") So basically we probably never needed the extra check in blk_rq_check_limits() to begin with.. Ike was probably paranoid. ;) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Wed, Nov 25 2015 at 2:24pm -0500, Jens Axboe wrote: > On 11/25/2015 12:10 PM, Hannes Reinecke wrote: > >On 11/25/2015 06:56 PM, Jens Axboe wrote: > >>On 11/25/2015 02:04 AM, Hannes Reinecke wrote: > >>>On 11/20/2015 04:28 PM, Ewan Milne wrote: > On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: > >Can't we have a joint effort here? > >I've been spending a _LOT_ of time trying to debug things here, but > >none of the ideas I've come up with have been able to fix anything. > > Yes. I'm not the one primarily looking at it, and we don't have a > reproducer in-house. We just have the one dump right now. > > > > >I'm almost tempted to increase the count from scsi_alloc_sgtable() > >by one and be done with ... > > > > That might not fix it if it is a problem with the merge code, though. > > >>>And indeed, it doesn't. > >>>Seems I finally found the culprit. > >>> > >>>What happens is this: > >>>We have two paths, with these seg_boundary_masks: > >>> > >>>path-1:seg_boundary_mask = 65535, > >>>path-2:seg_boundary_mask = 4294967295, > >>> > >>>consequently the DM request queue has this: > >>> > >>>md-1:seg_boundary_mask = 65535, > >>> > >>>What happens now is that a request is being formatted, and sent > >>>to path 2. During submission req->nr_phys_segments is formatted > >>>with the limits of path 2, arriving at a count of 3. > >>>Now the request gets retried on path 1, but as the NOMERGE request > >>>flag is set req->nr_phys_segments is never updated. > >>>But blk_rq_map_sg() ignores all counters, and just uses the > >>>bi_vec directly, resulting in a count of 4 -> boom. > >>> > >>>So the culprit here is the NOMERGE flag, which is evaluated > >>>via > >>>->dm_dispatch_request() > >>> ->blk_insert_cloned_request() > >>> ->blk_rq_check_limits() > >>> > >>>If the above assessment is correct, the following patch should > >>>fix it: > >>> > >>>diff --git a/block/blk-core.c b/block/blk-core.c > >>>index 801ced7..12cccd6 100644 > >>>--- a/block/blk-core.c > >>>+++ b/block/blk-core.c > >>>@@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); > >>> */ > >>> int blk_rq_check_limits(struct request_queue *q, struct request *rq) > >>> { > >>>- if (!rq_mergeable(rq)) > >>>+ if (rq->cmd_type != REQ_TYPE_FS) > >>> return 0; > >>> > >>> if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, > >>>rq->cmd_flags)) { > >>> > >>> > >>>Mike? Jens? > >>>Can you comment on it? > >> > >>We only support merging on REQ_TYPE_FS already, so how is the above > >>making it any different? In general, NOMERGE being set or not should not > >>make a difference. It's only a hint that we need not check further if we > >>should be merging on this request, since we already tried it once, found > >>we'd exceed various limits, then set NOMERGE to reflect that. > >> > >The problem is that NOMERGE does too much, as it inhibits _any_ merging. > > Right, that is the point of the flag from the block layer view, > where it was originally added for the case mentioned. And we really don't want _any_ merging. The merging, if any, will have already happened in upper DM-multipath's elevator. So there should be no need to have the underlying SCSI paths do any merging. > >Unfortunately, the req->nr_phys_segments value is evaluated in the final > >_driver_ context _after_ the merging happend; cf > >scsi_lib.c:scsi_init_sgtable(). > >As nr_phys_segments is inherited from the original request (and never > >recalculated with the new request queue limits) the following > >blk_rq_map_sg() call might end up at a different calculation, especially > >after retrying a request on another path. > > That all sounds pretty horrible. Why is blk_rq_check_limits() > checking for mergeable at all? If merging is disabled on the > request, I'm assuming that's an attempt at an optimization since we > know it won't change. But that should be tracked separately, like > how it's done on the bio. Not clear to me why it was checking for merging... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/25/2015 12:10 PM, Hannes Reinecke wrote: On 11/25/2015 06:56 PM, Jens Axboe wrote: On 11/25/2015 02:04 AM, Hannes Reinecke wrote: On 11/20/2015 04:28 PM, Ewan Milne wrote: On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: Can't we have a joint effort here? I've been spending a _LOT_ of time trying to debug things here, but none of the ideas I've come up with have been able to fix anything. Yes. I'm not the one primarily looking at it, and we don't have a reproducer in-house. We just have the one dump right now. I'm almost tempted to increase the count from scsi_alloc_sgtable() by one and be done with ... That might not fix it if it is a problem with the merge code, though. And indeed, it doesn't. Seems I finally found the culprit. What happens is this: We have two paths, with these seg_boundary_masks: path-1:seg_boundary_mask = 65535, path-2:seg_boundary_mask = 4294967295, consequently the DM request queue has this: md-1:seg_boundary_mask = 65535, What happens now is that a request is being formatted, and sent to path 2. During submission req->nr_phys_segments is formatted with the limits of path 2, arriving at a count of 3. Now the request gets retried on path 1, but as the NOMERGE request flag is set req->nr_phys_segments is never updated. But blk_rq_map_sg() ignores all counters, and just uses the bi_vec directly, resulting in a count of 4 -> boom. So the culprit here is the NOMERGE flag, which is evaluated via ->dm_dispatch_request() ->blk_insert_cloned_request() ->blk_rq_check_limits() If the above assessment is correct, the following patch should fix it: diff --git a/block/blk-core.c b/block/blk-core.c index 801ced7..12cccd6 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); */ int blk_rq_check_limits(struct request_queue *q, struct request *rq) { - if (!rq_mergeable(rq)) + if (rq->cmd_type != REQ_TYPE_FS) return 0; if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) { Mike? Jens? Can you comment on it? We only support merging on REQ_TYPE_FS already, so how is the above making it any different? In general, NOMERGE being set or not should not make a difference. It's only a hint that we need not check further if we should be merging on this request, since we already tried it once, found we'd exceed various limits, then set NOMERGE to reflect that. The problem is that NOMERGE does too much, as it inhibits _any_ merging. Right, that is the point of the flag from the block layer view, where it was originally added for the case mentioned. Unfortunately, the req->nr_phys_segments value is evaluated in the final _driver_ context _after_ the merging happend; cf scsi_lib.c:scsi_init_sgtable(). As nr_phys_segments is inherited from the original request (and never recalculated with the new request queue limits) the following blk_rq_map_sg() call might end up at a different calculation, especially after retrying a request on another path. That all sounds pretty horrible. Why is blk_rq_check_limits() checking for mergeable at all? If merging is disabled on the request, I'm assuming that's an attempt at an optimization since we know it won't change. But that should be tracked separately, like how it's done on the bio. -- Jens Axboe -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/25/2015 06:56 PM, Jens Axboe wrote: On 11/25/2015 02:04 AM, Hannes Reinecke wrote: On 11/20/2015 04:28 PM, Ewan Milne wrote: On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: Can't we have a joint effort here? I've been spending a _LOT_ of time trying to debug things here, but none of the ideas I've come up with have been able to fix anything. Yes. I'm not the one primarily looking at it, and we don't have a reproducer in-house. We just have the one dump right now. I'm almost tempted to increase the count from scsi_alloc_sgtable() by one and be done with ... That might not fix it if it is a problem with the merge code, though. And indeed, it doesn't. Seems I finally found the culprit. What happens is this: We have two paths, with these seg_boundary_masks: path-1:seg_boundary_mask = 65535, path-2:seg_boundary_mask = 4294967295, consequently the DM request queue has this: md-1:seg_boundary_mask = 65535, What happens now is that a request is being formatted, and sent to path 2. During submission req->nr_phys_segments is formatted with the limits of path 2, arriving at a count of 3. Now the request gets retried on path 1, but as the NOMERGE request flag is set req->nr_phys_segments is never updated. But blk_rq_map_sg() ignores all counters, and just uses the bi_vec directly, resulting in a count of 4 -> boom. So the culprit here is the NOMERGE flag, which is evaluated via ->dm_dispatch_request() ->blk_insert_cloned_request() ->blk_rq_check_limits() If the above assessment is correct, the following patch should fix it: diff --git a/block/blk-core.c b/block/blk-core.c index 801ced7..12cccd6 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); */ int blk_rq_check_limits(struct request_queue *q, struct request *rq) { - if (!rq_mergeable(rq)) + if (rq->cmd_type != REQ_TYPE_FS) return 0; if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) { Mike? Jens? Can you comment on it? We only support merging on REQ_TYPE_FS already, so how is the above making it any different? In general, NOMERGE being set or not should not make a difference. It's only a hint that we need not check further if we should be merging on this request, since we already tried it once, found we'd exceed various limits, then set NOMERGE to reflect that. The problem is that NOMERGE does too much, as it inhibits _any_ merging. Unfortunately, the req->nr_phys_segments value is evaluated in the final _driver_ context _after_ the merging happend; cf scsi_lib.c:scsi_init_sgtable(). As nr_phys_segments is inherited from the original request (and never recalculated with the new request queue limits) the following blk_rq_map_sg() call might end up at a different calculation, especially after retrying a request on another path. Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/25/2015 07:01 PM, Mike Snitzer wrote: On Wed, Nov 25 2015 at 4:04am -0500, Hannes Reinecke wrote: On 11/20/2015 04:28 PM, Ewan Milne wrote: On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: Can't we have a joint effort here? I've been spending a _LOT_ of time trying to debug things here, but none of the ideas I've come up with have been able to fix anything. Yes. I'm not the one primarily looking at it, and we don't have a reproducer in-house. We just have the one dump right now. I'm almost tempted to increase the count from scsi_alloc_sgtable() by one and be done with ... That might not fix it if it is a problem with the merge code, though. And indeed, it doesn't. How did you arrive at that? Do you have a reproducer now? Not a reproducer, but several dumps for analysis. Seems I finally found the culprit. What happens is this: We have two paths, with these seg_boundary_masks: path-1:seg_boundary_mask = 65535, path-2:seg_boundary_mask = 4294967295, consequently the DM request queue has this: md-1:seg_boundary_mask = 65535, What happens now is that a request is being formatted, and sent to path 2. During submission req->nr_phys_segments is formatted with the limits of path 2, arriving at a count of 3. Now the request gets retried on path 1, but as the NOMERGE request flag is set req->nr_phys_segments is never updated. But blk_rq_map_sg() ignores all counters, and just uses the bi_vec directly, resulting in a count of 4 -> boom. So the culprit here is the NOMERGE flag, NOMERGE is always set in __blk_rq_prep_clone() for cloned requests. Yes. which is evaluated via ->dm_dispatch_request() ->blk_insert_cloned_request() ->blk_rq_check_limits() blk_insert_cloned_request() is the only caller of blk_rq_check_limits(); anyway after reading your mail I'm still left wondering if your proposed patch is correct. If the above assessment is correct, the following patch should fix it: diff --git a/block/blk-core.c b/block/blk-core.c index 801ced7..12cccd6 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); */ int blk_rq_check_limits(struct request_queue *q, struct request *rq) { - if (!rq_mergeable(rq)) + if (rq->cmd_type != REQ_TYPE_FS) return 0; if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) { Mike? Jens? Can you comment on it? You're not explaining the actual change in the patch very well; I think you're correct but you're leaving the justification as an exercise to the reviewer: blk_rq_check_limits() will call blk_recalc_rq_segments() after the !rq_mergeable() check but you're saying for this case in question we never get there -- due to the cloned request having NOMERGE set. So in blk_rq_check_limits() you've unrolled rq_mergeable() and open-coded the lone remaining check (rq->cmd_type != REQ_TYPE_FS) I agree that the (rq->cmd_flags & REQ_NOMERGE_FLAGS) check in the blk_insert_cloned_request() call-chain (via rq_mergeable()) makes no sense for cloned requests that always have NOMERGE set. So you're saying that by having blk_rq_check_limits() go on to call blk_recalc_rq_segments() this bug will be fixed? That is the idea. I've already established that in all instances I have seen so far req->nr_phys_segments is _less_ than req->bio->bi_phys_segments. As it turns out, req->nr_phys_segemnts _would_ have been updated in blk_rq_check_limits(), but isn't due to the NOMERGE flag being set for the cloned request. So each cloned request inherits the values from the original request, despite the fact that req->nr_phys_segments _has_ to be evaluated in the final request_queue context, as the queue limits _might_ be different from the original (merged) queue limits of the multipath request queue. BTW, I think blk_rq_check_limits()'s export should be removed and the function made static and renamed to blk_clone_rq_check_limits(), again: blk_insert_cloned_request() is the only caller of blk_rq_check_limits() Actually, seeing Jens' last comment the check for REQ_TYPE_FS is pointless, too, so we might as well remove the entire if-clause. Seems prudent to make that change now to be clear that this code is only used by cloned requests. Yeah, that would make sense. I'll be preparing a patch. With a more detailed description :-) Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Wed, Nov 25 2015 at 4:04am -0500, Hannes Reinecke wrote: > On 11/20/2015 04:28 PM, Ewan Milne wrote: > > On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: > >> Can't we have a joint effort here? > >> I've been spending a _LOT_ of time trying to debug things here, but > >> none of the ideas I've come up with have been able to fix anything. > > > > Yes. I'm not the one primarily looking at it, and we don't have a > > reproducer in-house. We just have the one dump right now. > > > >> > >> I'm almost tempted to increase the count from scsi_alloc_sgtable() > >> by one and be done with ... > >> > > > > That might not fix it if it is a problem with the merge code, though. > > > And indeed, it doesn't. How did you arrive at that? Do you have a reproducer now? > Seems I finally found the culprit. > > What happens is this: > We have two paths, with these seg_boundary_masks: > > path-1:seg_boundary_mask = 65535, > path-2:seg_boundary_mask = 4294967295, > > consequently the DM request queue has this: > > md-1:seg_boundary_mask = 65535, > > What happens now is that a request is being formatted, and sent > to path 2. During submission req->nr_phys_segments is formatted > with the limits of path 2, arriving at a count of 3. > Now the request gets retried on path 1, but as the NOMERGE request > flag is set req->nr_phys_segments is never updated. > But blk_rq_map_sg() ignores all counters, and just uses the > bi_vec directly, resulting in a count of 4 -> boom. > > So the culprit here is the NOMERGE flag, NOMERGE is always set in __blk_rq_prep_clone() for cloned requests. > which is evaluated via > ->dm_dispatch_request() > ->blk_insert_cloned_request() > ->blk_rq_check_limits() blk_insert_cloned_request() is the only caller of blk_rq_check_limits(); anyway after reading your mail I'm still left wondering if your proposed patch is correct. > If the above assessment is correct, the following patch should > fix it: > > diff --git a/block/blk-core.c b/block/blk-core.c > index 801ced7..12cccd6 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); > */ > int blk_rq_check_limits(struct request_queue *q, struct request *rq) > { > - if (!rq_mergeable(rq)) > + if (rq->cmd_type != REQ_TYPE_FS) > return 0; > > if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, > rq->cmd_flags)) { > > > Mike? Jens? > Can you comment on it? You're not explaining the actual change in the patch very well; I think you're correct but you're leaving the justification as an exercise to the reviewer: blk_rq_check_limits() will call blk_recalc_rq_segments() after the !rq_mergeable() check but you're saying for this case in question we never get there -- due to the cloned request having NOMERGE set. So in blk_rq_check_limits() you've unrolled rq_mergeable() and open-coded the lone remaining check (rq->cmd_type != REQ_TYPE_FS) I agree that the (rq->cmd_flags & REQ_NOMERGE_FLAGS) check in the blk_insert_cloned_request() call-chain (via rq_mergeable()) makes no sense for cloned requests that always have NOMERGE set. So you're saying that by having blk_rq_check_limits() go on to call blk_recalc_rq_segments() this bug will be fixed? BTW, I think blk_rq_check_limits()'s export should be removed and the function made static and renamed to blk_clone_rq_check_limits(), again: blk_insert_cloned_request() is the only caller of blk_rq_check_limits() Seems prudent to make that change now to be clear that this code is only used by cloned requests. Thanks, Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/25/2015 02:04 AM, Hannes Reinecke wrote: On 11/20/2015 04:28 PM, Ewan Milne wrote: On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: Can't we have a joint effort here? I've been spending a _LOT_ of time trying to debug things here, but none of the ideas I've come up with have been able to fix anything. Yes. I'm not the one primarily looking at it, and we don't have a reproducer in-house. We just have the one dump right now. I'm almost tempted to increase the count from scsi_alloc_sgtable() by one and be done with ... That might not fix it if it is a problem with the merge code, though. And indeed, it doesn't. Seems I finally found the culprit. What happens is this: We have two paths, with these seg_boundary_masks: path-1:seg_boundary_mask = 65535, path-2:seg_boundary_mask = 4294967295, consequently the DM request queue has this: md-1:seg_boundary_mask = 65535, What happens now is that a request is being formatted, and sent to path 2. During submission req->nr_phys_segments is formatted with the limits of path 2, arriving at a count of 3. Now the request gets retried on path 1, but as the NOMERGE request flag is set req->nr_phys_segments is never updated. But blk_rq_map_sg() ignores all counters, and just uses the bi_vec directly, resulting in a count of 4 -> boom. So the culprit here is the NOMERGE flag, which is evaluated via ->dm_dispatch_request() ->blk_insert_cloned_request() ->blk_rq_check_limits() If the above assessment is correct, the following patch should fix it: diff --git a/block/blk-core.c b/block/blk-core.c index 801ced7..12cccd6 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); */ int blk_rq_check_limits(struct request_queue *q, struct request *rq) { - if (!rq_mergeable(rq)) + if (rq->cmd_type != REQ_TYPE_FS) return 0; if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) { Mike? Jens? Can you comment on it? We only support merging on REQ_TYPE_FS already, so how is the above making it any different? In general, NOMERGE being set or not should not make a difference. It's only a hint that we need not check further if we should be merging on this request, since we already tried it once, found we'd exceed various limits, then set NOMERGE to reflect that. -- Jens Axboe -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/20/2015 04:28 PM, Ewan Milne wrote: > On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: >> Can't we have a joint effort here? >> I've been spending a _LOT_ of time trying to debug things here, but >> none of the ideas I've come up with have been able to fix anything. > > Yes. I'm not the one primarily looking at it, and we don't have a > reproducer in-house. We just have the one dump right now. > >> >> I'm almost tempted to increase the count from scsi_alloc_sgtable() >> by one and be done with ... >> > > That might not fix it if it is a problem with the merge code, though. > And indeed, it doesn't. Seems I finally found the culprit. What happens is this: We have two paths, with these seg_boundary_masks: path-1:seg_boundary_mask = 65535, path-2:seg_boundary_mask = 4294967295, consequently the DM request queue has this: md-1:seg_boundary_mask = 65535, What happens now is that a request is being formatted, and sent to path 2. During submission req->nr_phys_segments is formatted with the limits of path 2, arriving at a count of 3. Now the request gets retried on path 1, but as the NOMERGE request flag is set req->nr_phys_segments is never updated. But blk_rq_map_sg() ignores all counters, and just uses the bi_vec directly, resulting in a count of 4 -> boom. So the culprit here is the NOMERGE flag, which is evaluated via ->dm_dispatch_request() ->blk_insert_cloned_request() ->blk_rq_check_limits() If the above assessment is correct, the following patch should fix it: diff --git a/block/blk-core.c b/block/blk-core.c index 801ced7..12cccd6 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(submit_bio); */ int blk_rq_check_limits(struct request_queue *q, struct request *rq) { - if (!rq_mergeable(rq)) + if (rq->cmd_type != REQ_TYPE_FS) return 0; if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) { Mike? Jens? Can you comment on it? Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/23/2015 10:21 AM, Ming Lei wrote: On Mon, 23 Nov 2015 10:46:20 +0800 Ming Lei wrote: Hi Mark, On Mon, Nov 23, 2015 at 9:50 AM, Mark Salter wrote: On Mon, 2015-11-23 at 08:36 +0800, Ming Lei wrote: On Mon, Nov 23, 2015 at 7:20 AM, Mark Salter wrote: On Sun, 2015-11-22 at 00:56 +0800, Ming Lei wrote: On Sat, 21 Nov 2015 12:30:14 +0100 Laurent Dufour wrote: On 20/11/2015 13:10, Michael Ellerman wrote: On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: It's pretty much guaranteed a block layer bug, most likely in the merge bios to request infrastucture where we don't obey the merging limits properly. Does either of you have a known good and first known bad kernel? Not me, I've only hit it one or two times. All I can say is I have hit it in 4.4-rc1. Laurent, can you narrow it down at all? It seems that the panic is triggered by the commit bdced438acd8 ("block: setup bi_phys_segments after splitting") which has been pulled by the merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of git://git.kernel.dk/linux-block"). My system is panicing promptly when running a kernel built at d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours without panicing. This being said, I can't explain what's going wrong. May Ming shed some light here ? Laurent, looks there is one bug in blk_bio_segment_split(), would you mind testing the following patch to see if it fixes your issue? --- From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Sun, 22 Nov 2015 00:47:13 +0800 Subject: [PATCH] block: fix segment split Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') always points to the iterator local variable, which is obviously wrong, so fix it by pointing to the local variable of 'bvprv'. Signed-off-by: Ming Lei --- block/blk-merge.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index de5716d8..f2efe8a 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q, seg_size += bv.bv_len; bvprv = bv; - bvprvp = &bv; + bvprvp = &bvprv; sectors += bv.bv_len >> 9; continue; } @@ -108,7 +108,7 @@ new_segment: nsegs++; bvprv = bv; - bvprvp = &bv; + bvprvp = &bvprv; seg_size = bv.bv_len; sectors += bv.bv_len >> 9; } I'm still hitting the BUG even with this patch applied on top of 4.4-rc1. OK, looks there are still other bugs, care to share us how to reproduce it on arm64? thanks, Ming Unfortunately, the best reproducer I have is to boot the platform. I have seen the BUG a few times post-boot, but I don't have a consistant reproducer. I am using upstream 4.4-rc1 with this config: http://people.redhat.com/msalter/fh_defconfig With 4.4-rc1 on an APM Mustang platform, I see the BUG about once every 6-7 boots. On an AMD Seattle platform, about every 9 boots. Thanks for the input, and I will try to reproduce the issue on mustang with your kernel config. I can reproduce the issue on mustang, and looks I may understand the story now. When 64K page size is used on arm64, and the default segment size of block is 65536, then one segment should only include one page at most. Commit bdced438acd83a(block: setup bi_phys_segments after splitting) does not compute bio->bi_seg_front_size and bio->bi_seg_back_size, then one less segment may be obtained because blk_phys_contig_segment() thought the last bvec in 1st bio and the 1st bvec in the 2nd bio is in one physical segment, so cause the regression. Looks the following patch can fix the issue by figuring bio->bi_seg_front_size and bio->bi_seg_back_size in blk_bio_segment_split(). Mark, thanks again for providing the reproduction steps, and could you run your test to see if it can fix your issue? --- From 86b5f33d48715c1150fdcfd9a76e495e7aa913aa Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Mon, 23 Nov 2015 20:27:23 +0800 Subject: [PATCH 2/2] blk-merge: fix blk_bio_segment_split Commit bdced438acd83a(block: setup bi_phys_segments after splitting) introduces function of computing bio->bi_phys_segments during bio splitting. Unfortunately both bio->bi_seg_front_size and bio->bi_seg_back_size arn't computed, so too many physical segments may be obtained for one request since both the two are used to check if one segment across two bios can be possible. This patch fixes the issue by computing the two variables in blk_bio_segment_split(). Reported-by: Michael Ellerman Reported-by: Mark Salter Fixes: bdced438acd83a(block: setup bi_phys_segments after splitting) Signed-off-by: Ming Lei --- block/blk-merge.c | 22 +++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Mon, 2015-11-23 at 23:27 +0800, Ming Lei wrote: > On Mon, Nov 23, 2015 at 11:20 PM, Laurent Dufour > wrote: > > > > > > Reverting above commit on top if 4.4-rc1 seems to fix the problem for me. > > > > That's what I mentioned earlier ;) > > > > Now Ming send an additional patch with seems to fix the bug introduced > > through the commit bdced438acd8. When testing with this new patch I > > can't get the panic anymore, but Mark reported he is still hitting it. > > Laurent, thanks for your test on the 1st patch, and looks there are at > least two problems, and my 2nd patch sent just now should address > Mark's issue which is caused by bdced438acd83a. > > Once the 2nd one is tested OK, I will send out the two together. > > Thanks, > Ming Thanks Ming. With both patches applied, I have been unable to reproduce the problem. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 23/11/2015 16:27, Ming Lei wrote: > On Mon, Nov 23, 2015 at 11:20 PM, Laurent Dufour > wrote: >>> >>> Reverting above commit on top if 4.4-rc1 seems to fix the problem for me. >> >> That's what I mentioned earlier ;) >> >> Now Ming send an additional patch with seems to fix the bug introduced >> through the commit bdced438acd8. When testing with this new patch I >> can't get the panic anymore, but Mark reported he is still hitting it. > > Laurent, thanks for your test on the 1st patch, and looks there are at > least two problems, and my 2nd patch sent just now should address > Mark's issue which is caused by bdced438acd83a. > > Once the 2nd one is tested OK, I will send out the two together. FWIW, I applied the 2nd patch, and my system is still running like a charm. Cheers, Laurent. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Mon, Nov 23, 2015 at 11:20 PM, Laurent Dufour wrote: >> >> Reverting above commit on top if 4.4-rc1 seems to fix the problem for me. > > That's what I mentioned earlier ;) > > Now Ming send an additional patch with seems to fix the bug introduced > through the commit bdced438acd8. When testing with this new patch I > can't get the panic anymore, but Mark reported he is still hitting it. Laurent, thanks for your test on the 1st patch, and looks there are at least two problems, and my 2nd patch sent just now should address Mark's issue which is caused by bdced438acd83a. Once the 2nd one is tested OK, I will send out the two together. Thanks, Ming -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Mon, 23 Nov 2015 10:46:20 +0800 Ming Lei wrote: > Hi Mark, > > On Mon, Nov 23, 2015 at 9:50 AM, Mark Salter wrote: > > On Mon, 2015-11-23 at 08:36 +0800, Ming Lei wrote: > >> On Mon, Nov 23, 2015 at 7:20 AM, Mark Salter wrote: > >> > On Sun, 2015-11-22 at 00:56 +0800, Ming Lei wrote: > >> > > On Sat, 21 Nov 2015 12:30:14 +0100 > >> > > Laurent Dufour wrote: > >> > > > >> > > > On 20/11/2015 13:10, Michael Ellerman wrote: > >> > > > > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > >> > > > > > >> > > > > > It's pretty much guaranteed a block layer bug, most likely in the > >> > > > > > merge bios to request infrastucture where we don't obey the > >> > > > > > merging > >> > > > > > limits properly. > >> > > > > > > >> > > > > > Does either of you have a known good and first known bad kernel? > >> > > > > > >> > > > > Not me, I've only hit it one or two times. All I can say is I have > >> > > > > hit it in > >> > > > > 4.4-rc1. > >> > > > > > >> > > > > Laurent, can you narrow it down at all? > >> > > > > >> > > > It seems that the panic is triggered by the commit bdced438acd8 > >> > > > ("block: > >> > > > setup bi_phys_segments after splitting") which has been pulled by the > >> > > > merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of > >> > > > git://git.kernel.dk/linux-block"). > >> > > > > >> > > > My system is panicing promptly when running a kernel built at > >> > > > d9734e0d1ccf, while reverting the commit bdced438acd8, it can run > >> > > > hours > >> > > > without panicing. > >> > > > > >> > > > This being said, I can't explain what's going wrong. > >> > > > > >> > > > May Ming shed some light here ? > >> > > > >> > > Laurent, looks there is one bug in blk_bio_segment_split(), would you > >> > > mind testing the following patch to see if it fixes your issue? > >> > > > >> > > --- > >> > > From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 > >> > > From: Ming Lei > >> > > Date: Sun, 22 Nov 2015 00:47:13 +0800 > >> > > Subject: [PATCH] block: fix segment split > >> > > > >> > > Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') > >> > > always points to the iterator local variable, which is obviously > >> > > wrong, so fix it by pointing to the local variable of 'bvprv'. > >> > > > >> > > Signed-off-by: Ming Lei > >> > > --- > >> > > block/blk-merge.c | 4 ++-- > >> > > 1 file changed, 2 insertions(+), 2 deletions(-) > >> > > > >> > > diff --git a/block/blk-merge.c b/block/blk-merge.c > >> > > index de5716d8..f2efe8a 100644 > >> > > --- a/block/blk-merge.c > >> > > +++ b/block/blk-merge.c > >> > > @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct > >> > > request_queue *q, > >> > > > >> > > seg_size += bv.bv_len; > >> > > bvprv = bv; > >> > > - bvprvp = &bv; > >> > > + bvprvp = &bvprv; > >> > > sectors += bv.bv_len >> 9; > >> > > continue; > >> > > } > >> > > @@ -108,7 +108,7 @@ new_segment: > >> > > > >> > > nsegs++; > >> > > bvprv = bv; > >> > > - bvprvp = &bv; > >> > > + bvprvp = &bvprv; > >> > > seg_size = bv.bv_len; > >> > > sectors += bv.bv_len >> 9; > >> > > } > >> > > >> > I'm still hitting the BUG even with this patch applied on top of 4.4-rc1. > >> > >> OK, looks there are still other bugs, care to share us how to reproduce > >> it on arm64? > >> > >> thanks, > >> Ming > > > > Unfortunately, the best reproducer I have is to boot the platform. I have > > seen the > > BUG a few times post-boot, but I don't have a consistant reproducer. I am > > using > > upstream 4.4-rc1 with this config: > > > > http://people.redhat.com/msalter/fh_defconfig > > > > With 4.4-rc1 on an APM Mustang platform, I see the BUG about once every 6-7 > > boots. > > On an AMD Seattle platform, about every 9 boots. > > Thanks for the input, and I will try to reproduce the issue on mustang with > your kernel config. I can reproduce the issue on mustang, and looks I may understand the story now. When 64K page size is used on arm64, and the default segment size of block is 65536, then one segment should only include one page at most. Commit bdced438acd83a(block: setup bi_phys_segments after splitting) does not compute bio->bi_seg_front_size and bio->bi_seg_back_size, then one less segment may be obtained because blk_phys_contig_segment() thought the last bvec in 1st bio and the 1st bvec in the 2nd bio is in one physical segment, so cause the regression. Looks the following patch can fix the issue by figuring bio->bi_seg_front_size and bio->bi_seg_back_size in blk_bio_segment_split(). Mark, thanks again for providing the reproduction steps, and could you run your test to see if it can fix your issue? --- >From 86b5f33d48715c1150fdcfd9a76e495e7aa913aa Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Mon, 23 Nov
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 23/11/2015 16:13, Pratyush Anand wrote: > On 23/11/2015:02:57:19 PM, Laurent Dufour wrote: >> On 23/11/2015 00:20, Mark Salter wrote: >>> On Sun, 2015-11-22 at 00:56 +0800, Ming Lei wrote: On Sat, 21 Nov 2015 12:30:14 +0100 Laurent Dufour wrote: > On 20/11/2015 13:10, Michael Ellerman wrote: >> On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: >> >>> It's pretty much guaranteed a block layer bug, most likely in the >>> merge bios to request infrastucture where we don't obey the merging >>> limits properly. >>> >>> Does either of you have a known good and first known bad kernel? >> >> Not me, I've only hit it one or two times. All I can say is I have hit >> it in >> 4.4-rc1. >> >> Laurent, can you narrow it down at all? > > It seems that the panic is triggered by the commit bdced438acd8 ("block: > setup bi_phys_segments after splitting") which has been pulled by the > merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of > git://git.kernel.dk/linux-block"). > > My system is panicing promptly when running a kernel built at > d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours > without panicing. > > This being said, I can't explain what's going wrong. > > May Ming shed some light here ? Laurent, looks there is one bug in blk_bio_segment_split(), would you mind testing the following patch to see if it fixes your issue? --- From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Sun, 22 Nov 2015 00:47:13 +0800 Subject: [PATCH] block: fix segment split Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') always points to the iterator local variable, which is obviously wrong, so fix it by pointing to the local variable of 'b > ~Pratyush > vprv'. Signed-off-by: Ming Lei --- block/blk-merge.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index de5716d8..f2efe8a 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q, seg_size += bv.bv_len;bdced438acd8 bvprv = bv; - bvprvp = &bv; + bvprvp = &bvprv; sectors += bv.bv_len >> 9; continue; } @@ -108,7 +108,7 @@ new_segment: nsegs++; bvprv = bv; - bvprvp = &bv; + bvprvp = &bvprv; seg_size = bv.bv_len; sectors += bv.bv_len >> 9; } >>> >>> I'm still hitting the BUG even with this patch applied on top of 4.4-rc1. >> >> On my side, with the patch applied on top of 4.4-rc1, I can't get the >> panic anymore. > > git bisect shows: > > bdced438acd83ad83a6c6fc7f50099b820245ddb is the first bad commit > commit bdced438acd83ad83a6c6fc7f50099b820245ddb > Author: Ming Lei > Date: Tue Oct 20 23:13:52 2015 +0800 > > block: setup bi_phys_segments after splitting > > Reverting above commit on top if 4.4-rc1 seems to fix the problem for me. That's what I mentioned earlier ;) Now Ming send an additional patch with seems to fix the bug introduced through the commit bdced438acd8. When testing with this new patch I can't get the panic anymore, but Mark reported he is still hitting it. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 23/11/2015:02:57:19 PM, Laurent Dufour wrote: > On 23/11/2015 00:20, Mark Salter wrote: > > On Sun, 2015-11-22 at 00:56 +0800, Ming Lei wrote: > >> On Sat, 21 Nov 2015 12:30:14 +0100 > >> Laurent Dufour wrote: > >> > >>> On 20/11/2015 13:10, Michael Ellerman wrote: > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > > > It's pretty much guaranteed a block layer bug, most likely in the > > merge bios to request infrastucture where we don't obey the merging > > limits properly. > > > > Does either of you have a known good and first known bad kernel? > > Not me, I've only hit it one or two times. All I can say is I have hit > it in > 4.4-rc1. > > Laurent, can you narrow it down at all? > >>> > >>> It seems that the panic is triggered by the commit bdced438acd8 ("block: > >>> setup bi_phys_segments after splitting") which has been pulled by the > >>> merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of > >>> git://git.kernel.dk/linux-block"). > >>> > >>> My system is panicing promptly when running a kernel built at > >>> d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours > >>> without panicing. > >>> > >>> This being said, I can't explain what's going wrong. > >>> > >>> May Ming shed some light here ? > >> > >> Laurent, looks there is one bug in blk_bio_segment_split(), would you > >> mind testing the following patch to see if it fixes your issue? > >> > >> --- > >> From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 > >> From: Ming Lei > >> Date: Sun, 22 Nov 2015 00:47:13 +0800 > >> Subject: [PATCH] block: fix segment split > >> > >> Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') > >> always points to the iterator local variable, which is obviously > >> wrong, so fix it by pointing to the local variable of 'bvprv'. > >> > >> Signed-off-by: Ming Lei > >> --- > >> block/blk-merge.c | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/block/blk-merge.c b/block/blk-merge.c > >> index de5716d8..f2efe8a 100644 > >> --- a/block/blk-merge.c > >> +++ b/block/blk-merge.c > >> @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct > >> request_queue *q, > >> > >>seg_size += bv.bv_len; > >>bvprv = bv; > >> - bvprvp = &bv; > >> + bvprvp = &bvprv; > >>sectors += bv.bv_len >> 9; > >>continue; > >>} > >> @@ -108,7 +108,7 @@ new_segment: > >> > >>nsegs++; > >>bvprv = bv; > >> - bvprvp = &bv; > >> + bvprvp = &bvprv; > >>seg_size = bv.bv_len; > >>sectors += bv.bv_len >> 9; > >>} > > > > I'm still hitting the BUG even with this patch applied on top of 4.4-rc1. > > On my side, with the patch applied on top of 4.4-rc1, I can't get the > panic anymore. git bisect shows: bdced438acd83ad83a6c6fc7f50099b820245ddb is the first bad commit commit bdced438acd83ad83a6c6fc7f50099b820245ddb Author: Ming Lei Date: Tue Oct 20 23:13:52 2015 +0800 block: setup bi_phys_segments after splitting Reverting above commit on top if 4.4-rc1 seems to fix the problem for me. ~Pratyush -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 23/11/2015 00:20, Mark Salter wrote: > On Sun, 2015-11-22 at 00:56 +0800, Ming Lei wrote: >> On Sat, 21 Nov 2015 12:30:14 +0100 >> Laurent Dufour wrote: >> >>> On 20/11/2015 13:10, Michael Ellerman wrote: On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > It's pretty much guaranteed a block layer bug, most likely in the > merge bios to request infrastucture where we don't obey the merging > limits properly. > > Does either of you have a known good and first known bad kernel? Not me, I've only hit it one or two times. All I can say is I have hit it in 4.4-rc1. Laurent, can you narrow it down at all? >>> >>> It seems that the panic is triggered by the commit bdced438acd8 ("block: >>> setup bi_phys_segments after splitting") which has been pulled by the >>> merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of >>> git://git.kernel.dk/linux-block"). >>> >>> My system is panicing promptly when running a kernel built at >>> d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours >>> without panicing. >>> >>> This being said, I can't explain what's going wrong. >>> >>> May Ming shed some light here ? >> >> Laurent, looks there is one bug in blk_bio_segment_split(), would you >> mind testing the following patch to see if it fixes your issue? >> >> --- >> From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 >> From: Ming Lei >> Date: Sun, 22 Nov 2015 00:47:13 +0800 >> Subject: [PATCH] block: fix segment split >> >> Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') >> always points to the iterator local variable, which is obviously >> wrong, so fix it by pointing to the local variable of 'bvprv'. >> >> Signed-off-by: Ming Lei >> --- >> block/blk-merge.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/block/blk-merge.c b/block/blk-merge.c >> index de5716d8..f2efe8a 100644 >> --- a/block/blk-merge.c >> +++ b/block/blk-merge.c >> @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct >> request_queue *q, >> >> seg_size += bv.bv_len; >> bvprv = bv; >> -bvprvp = &bv; >> +bvprvp = &bvprv; >> sectors += bv.bv_len >> 9; >> continue; >> } >> @@ -108,7 +108,7 @@ new_segment: >> >> nsegs++; >> bvprv = bv; >> -bvprvp = &bv; >> +bvprvp = &bvprv; >> seg_size = bv.bv_len; >> sectors += bv.bv_len >> 9; >> } > > I'm still hitting the BUG even with this patch applied on top of 4.4-rc1. On my side, with the patch applied on top of 4.4-rc1, I can't get the panic anymore. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/20/2015 04:28 PM, Ewan Milne wrote: On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: Can't we have a joint effort here? I've been spending a _LOT_ of time trying to debug things here, but none of the ideas I've come up with have been able to fix anything. Yes. I'm not the one primarily looking at it, and we don't have a reproducer in-house. We just have the one dump right now. Oh, I got plenty of them :-( I'm almost tempted to increase the count from scsi_alloc_sgtable() by one and be done with ... That might not fix it if it is a problem with the merge code, though. Of course not. But it'll be a band-aid to keep the customer happy. Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
Hi Mark, On Mon, Nov 23, 2015 at 9:50 AM, Mark Salter wrote: > On Mon, 2015-11-23 at 08:36 +0800, Ming Lei wrote: >> On Mon, Nov 23, 2015 at 7:20 AM, Mark Salter wrote: >> > On Sun, 2015-11-22 at 00:56 +0800, Ming Lei wrote: >> > > On Sat, 21 Nov 2015 12:30:14 +0100 >> > > Laurent Dufour wrote: >> > > >> > > > On 20/11/2015 13:10, Michael Ellerman wrote: >> > > > > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: >> > > > > >> > > > > > It's pretty much guaranteed a block layer bug, most likely in the >> > > > > > merge bios to request infrastucture where we don't obey the merging >> > > > > > limits properly. >> > > > > > >> > > > > > Does either of you have a known good and first known bad kernel? >> > > > > >> > > > > Not me, I've only hit it one or two times. All I can say is I have >> > > > > hit it in >> > > > > 4.4-rc1. >> > > > > >> > > > > Laurent, can you narrow it down at all? >> > > > >> > > > It seems that the panic is triggered by the commit bdced438acd8 >> > > > ("block: >> > > > setup bi_phys_segments after splitting") which has been pulled by the >> > > > merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of >> > > > git://git.kernel.dk/linux-block"). >> > > > >> > > > My system is panicing promptly when running a kernel built at >> > > > d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours >> > > > without panicing. >> > > > >> > > > This being said, I can't explain what's going wrong. >> > > > >> > > > May Ming shed some light here ? >> > > >> > > Laurent, looks there is one bug in blk_bio_segment_split(), would you >> > > mind testing the following patch to see if it fixes your issue? >> > > >> > > --- >> > > From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 >> > > From: Ming Lei >> > > Date: Sun, 22 Nov 2015 00:47:13 +0800 >> > > Subject: [PATCH] block: fix segment split >> > > >> > > Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') >> > > always points to the iterator local variable, which is obviously >> > > wrong, so fix it by pointing to the local variable of 'bvprv'. >> > > >> > > Signed-off-by: Ming Lei >> > > --- >> > > block/blk-merge.c | 4 ++-- >> > > 1 file changed, 2 insertions(+), 2 deletions(-) >> > > >> > > diff --git a/block/blk-merge.c b/block/blk-merge.c >> > > index de5716d8..f2efe8a 100644 >> > > --- a/block/blk-merge.c >> > > +++ b/block/blk-merge.c >> > > @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct >> > > request_queue *q, >> > > >> > > seg_size += bv.bv_len; >> > > bvprv = bv; >> > > - bvprvp = &bv; >> > > + bvprvp = &bvprv; >> > > sectors += bv.bv_len >> 9; >> > > continue; >> > > } >> > > @@ -108,7 +108,7 @@ new_segment: >> > > >> > > nsegs++; >> > > bvprv = bv; >> > > - bvprvp = &bv; >> > > + bvprvp = &bvprv; >> > > seg_size = bv.bv_len; >> > > sectors += bv.bv_len >> 9; >> > > } >> > >> > I'm still hitting the BUG even with this patch applied on top of 4.4-rc1. >> >> OK, looks there are still other bugs, care to share us how to reproduce >> it on arm64? >> >> thanks, >> Ming > > Unfortunately, the best reproducer I have is to boot the platform. I have > seen the > BUG a few times post-boot, but I don't have a consistant reproducer. I am > using > upstream 4.4-rc1 with this config: > > http://people.redhat.com/msalter/fh_defconfig > > With 4.4-rc1 on an APM Mustang platform, I see the BUG about once every 6-7 > boots. > On an AMD Seattle platform, about every 9 boots. Thanks for the input, and I will try to reproduce the issue on mustang with your kernel config. > > I have a script that loops through an ssh command to reboot the platform > under test. > I manually install test kernels and then run the script and wait for failure. > While > debugging, I have tried more minimal configs with which I have been unable to > reproduce the problem even after several hours of reboots. With the above > mentioned > fh_defconfig, I have been able to get a failure within 20 or so boots with > most > kernel builds but at certain kernel commits, the failure has taken a longer > time to > reproduce. > > From my POV, I can't say which commit causes the problem. So far, I have not > been > able to reproduce at all before commit d9734e0d1ccf but I am currently trying > to > reproduce with commit 0d51ce9ca1116 (one merge earlier than d9734e0d1ccf). The patch for fixing 'bvprvp' is better to be included for test, because that issue may have a big effect on computing physical seg count. Also I appreciate if you, Laurent or anyone may provide debug log which can be captured with the attached debug patch when this issue is trigered . Thanks, diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index dd8ad2a..a0db1fe 100644 --- a/drivers/s
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Mon, 2015-11-23 at 08:36 +0800, Ming Lei wrote: > On Mon, Nov 23, 2015 at 7:20 AM, Mark Salter wrote: > > On Sun, 2015-11-22 at 00:56 +0800, Ming Lei wrote: > > > On Sat, 21 Nov 2015 12:30:14 +0100 > > > Laurent Dufour wrote: > > > > > > > On 20/11/2015 13:10, Michael Ellerman wrote: > > > > > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > > > > > > > > > > > It's pretty much guaranteed a block layer bug, most likely in the > > > > > > merge bios to request infrastucture where we don't obey the merging > > > > > > limits properly. > > > > > > > > > > > > Does either of you have a known good and first known bad kernel? > > > > > > > > > > Not me, I've only hit it one or two times. All I can say is I have > > > > > hit it in > > > > > 4.4-rc1. > > > > > > > > > > Laurent, can you narrow it down at all? > > > > > > > > It seems that the panic is triggered by the commit bdced438acd8 ("block: > > > > setup bi_phys_segments after splitting") which has been pulled by the > > > > merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of > > > > git://git.kernel.dk/linux-block"). > > > > > > > > My system is panicing promptly when running a kernel built at > > > > d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours > > > > without panicing. > > > > > > > > This being said, I can't explain what's going wrong. > > > > > > > > May Ming shed some light here ? > > > > > > Laurent, looks there is one bug in blk_bio_segment_split(), would you > > > mind testing the following patch to see if it fixes your issue? > > > > > > --- > > > From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 > > > From: Ming Lei > > > Date: Sun, 22 Nov 2015 00:47:13 +0800 > > > Subject: [PATCH] block: fix segment split > > > > > > Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') > > > always points to the iterator local variable, which is obviously > > > wrong, so fix it by pointing to the local variable of 'bvprv'. > > > > > > Signed-off-by: Ming Lei > > > --- > > > block/blk-merge.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/block/blk-merge.c b/block/blk-merge.c > > > index de5716d8..f2efe8a 100644 > > > --- a/block/blk-merge.c > > > +++ b/block/blk-merge.c > > > @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct > > > request_queue *q, > > > > > > seg_size += bv.bv_len; > > > bvprv = bv; > > > - bvprvp = &bv; > > > + bvprvp = &bvprv; > > > sectors += bv.bv_len >> 9; > > > continue; > > > } > > > @@ -108,7 +108,7 @@ new_segment: > > > > > > nsegs++; > > > bvprv = bv; > > > - bvprvp = &bv; > > > + bvprvp = &bvprv; > > > seg_size = bv.bv_len; > > > sectors += bv.bv_len >> 9; > > > } > > > > I'm still hitting the BUG even with this patch applied on top of 4.4-rc1. > > OK, looks there are still other bugs, care to share us how to reproduce > it on arm64? > > thanks, > Ming Unfortunately, the best reproducer I have is to boot the platform. I have seen the BUG a few times post-boot, but I don't have a consistant reproducer. I am using upstream 4.4-rc1 with this config: http://people.redhat.com/msalter/fh_defconfig With 4.4-rc1 on an APM Mustang platform, I see the BUG about once every 6-7 boots. On an AMD Seattle platform, about every 9 boots. I have a script that loops through an ssh command to reboot the platform under test. I manually install test kernels and then run the script and wait for failure. While debugging, I have tried more minimal configs with which I have been unable to reproduce the problem even after several hours of reboots. With the above mentioned fh_defconfig, I have been able to get a failure within 20 or so boots with most kernel builds but at certain kernel commits, the failure has taken a longer time to reproduce. >From my POV, I can't say which commit causes the problem. So far, I have not >been able to reproduce at all before commit d9734e0d1ccf but I am currently trying to reproduce with commit 0d51ce9ca1116 (one merge earlier than d9734e0d1ccf). -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Mon, Nov 23, 2015 at 7:20 AM, Mark Salter wrote: > On Sun, 2015-11-22 at 00:56 +0800, Ming Lei wrote: >> On Sat, 21 Nov 2015 12:30:14 +0100 >> Laurent Dufour wrote: >> >> > On 20/11/2015 13:10, Michael Ellerman wrote: >> > > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: >> > > >> > > > It's pretty much guaranteed a block layer bug, most likely in the >> > > > merge bios to request infrastucture where we don't obey the merging >> > > > limits properly. >> > > > >> > > > Does either of you have a known good and first known bad kernel? >> > > >> > > Not me, I've only hit it one or two times. All I can say is I have hit >> > > it in >> > > 4.4-rc1. >> > > >> > > Laurent, can you narrow it down at all? >> > >> > It seems that the panic is triggered by the commit bdced438acd8 ("block: >> > setup bi_phys_segments after splitting") which has been pulled by the >> > merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of >> > git://git.kernel.dk/linux-block"). >> > >> > My system is panicing promptly when running a kernel built at >> > d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours >> > without panicing. >> > >> > This being said, I can't explain what's going wrong. >> > >> > May Ming shed some light here ? >> >> Laurent, looks there is one bug in blk_bio_segment_split(), would you >> mind testing the following patch to see if it fixes your issue? >> >> --- >> From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 >> From: Ming Lei >> Date: Sun, 22 Nov 2015 00:47:13 +0800 >> Subject: [PATCH] block: fix segment split >> >> Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') >> always points to the iterator local variable, which is obviously >> wrong, so fix it by pointing to the local variable of 'bvprv'. >> >> Signed-off-by: Ming Lei >> --- >> block/blk-merge.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/block/blk-merge.c b/block/blk-merge.c >> index de5716d8..f2efe8a 100644 >> --- a/block/blk-merge.c >> +++ b/block/blk-merge.c >> @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct >> request_queue *q, >> >> seg_size += bv.bv_len; >> bvprv = bv; >> - bvprvp = &bv; >> + bvprvp = &bvprv; >> sectors += bv.bv_len >> 9; >> continue; >> } >> @@ -108,7 +108,7 @@ new_segment: >> >> nsegs++; >> bvprv = bv; >> - bvprvp = &bv; >> + bvprvp = &bvprv; >> seg_size = bv.bv_len; >> sectors += bv.bv_len >> 9; >> } > > I'm still hitting the BUG even with this patch applied on top of 4.4-rc1. OK, looks there are still other bugs, care to share us how to reproduce it on arm64? thanks, Ming -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Sun, 2015-11-22 at 00:56 +0800, Ming Lei wrote: > On Sat, 21 Nov 2015 12:30:14 +0100 > Laurent Dufour wrote: > > > On 20/11/2015 13:10, Michael Ellerman wrote: > > > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > > > > > > > It's pretty much guaranteed a block layer bug, most likely in the > > > > merge bios to request infrastucture where we don't obey the merging > > > > limits properly. > > > > > > > > Does either of you have a known good and first known bad kernel? > > > > > > Not me, I've only hit it one or two times. All I can say is I have hit it > > > in > > > 4.4-rc1. > > > > > > Laurent, can you narrow it down at all? > > > > It seems that the panic is triggered by the commit bdced438acd8 ("block: > > setup bi_phys_segments after splitting") which has been pulled by the > > merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of > > git://git.kernel.dk/linux-block"). > > > > My system is panicing promptly when running a kernel built at > > d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours > > without panicing. > > > > This being said, I can't explain what's going wrong. > > > > May Ming shed some light here ? > > Laurent, looks there is one bug in blk_bio_segment_split(), would you > mind testing the following patch to see if it fixes your issue? > > --- > From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 > From: Ming Lei > Date: Sun, 22 Nov 2015 00:47:13 +0800 > Subject: [PATCH] block: fix segment split > > Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') > always points to the iterator local variable, which is obviously > wrong, so fix it by pointing to the local variable of 'bvprv'. > > Signed-off-by: Ming Lei > --- > block/blk-merge.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/block/blk-merge.c b/block/blk-merge.c > index de5716d8..f2efe8a 100644 > --- a/block/blk-merge.c > +++ b/block/blk-merge.c > @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct > request_queue *q, > > seg_size += bv.bv_len; > bvprv = bv; > - bvprvp = &bv; > + bvprvp = &bvprv; > sectors += bv.bv_len >> 9; > continue; > } > @@ -108,7 +108,7 @@ new_segment: > > nsegs++; > bvprv = bv; > - bvprvp = &bv; > + bvprvp = &bvprv; > seg_size = bv.bv_len; > sectors += bv.bv_len >> 9; > } I'm still hitting the BUG even with this patch applied on top of 4.4-rc1. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Sat, 21 Nov 2015 12:30:14 +0100 Laurent Dufour wrote: > On 20/11/2015 13:10, Michael Ellerman wrote: > > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > > > >> It's pretty much guaranteed a block layer bug, most likely in the > >> merge bios to request infrastucture where we don't obey the merging > >> limits properly. > >> > >> Does either of you have a known good and first known bad kernel? > > > > Not me, I've only hit it one or two times. All I can say is I have hit it in > > 4.4-rc1. > > > > Laurent, can you narrow it down at all? > > It seems that the panic is triggered by the commit bdced438acd8 ("block: > setup bi_phys_segments after splitting") which has been pulled by the > merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of > git://git.kernel.dk/linux-block"). > > My system is panicing promptly when running a kernel built at > d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours > without panicing. > > This being said, I can't explain what's going wrong. > > May Ming shed some light here ? Laurent, looks there is one bug in blk_bio_segment_split(), would you mind testing the following patch to see if it fixes your issue? --- >From 6fc701231dcc000bc8bc4b9105583380d9aa31f4 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Sun, 22 Nov 2015 00:47:13 +0800 Subject: [PATCH] block: fix segment split Inside blk_bio_segment_split(), previous bvec pointer('bvprvp') always points to the iterator local variable, which is obviously wrong, so fix it by pointing to the local variable of 'bvprv'. Signed-off-by: Ming Lei --- block/blk-merge.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index de5716d8..f2efe8a 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -98,7 +98,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q, seg_size += bv.bv_len; bvprv = bv; - bvprvp = &bv; + bvprvp = &bvprv; sectors += bv.bv_len >> 9; continue; } @@ -108,7 +108,7 @@ new_segment: nsegs++; bvprv = bv; - bvprvp = &bv; + bvprvp = &bvprv; seg_size = bv.bv_len; sectors += bv.bv_len >> 9; } -- 1.9.1 Thanks, Ming -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 20/11/2015 13:10, Michael Ellerman wrote: > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > >> It's pretty much guaranteed a block layer bug, most likely in the >> merge bios to request infrastucture where we don't obey the merging >> limits properly. >> >> Does either of you have a known good and first known bad kernel? > > Not me, I've only hit it one or two times. All I can say is I have hit it in > 4.4-rc1. > > Laurent, can you narrow it down at all? It seems that the panic is triggered by the commit bdced438acd8 ("block: setup bi_phys_segments after splitting") which has been pulled by the merge d9734e0d1ccf ("Merge branch 'for-4.4/core' of git://git.kernel.dk/linux-block"). My system is panicing promptly when running a kernel built at d9734e0d1ccf, while reverting the commit bdced438acd8, it can run hours without panicing. This being said, I can't explain what's going wrong. May Ming shed some light here ? Cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Fri, 2015-11-20 at 15:55 +0100, Hannes Reinecke wrote: > Can't we have a joint effort here? > I've been spending a _LOT_ of time trying to debug things here, but > none of the ideas I've come up with have been able to fix anything. Yes. I'm not the one primarily looking at it, and we don't have a reproducer in-house. We just have the one dump right now. > > I'm almost tempted to increase the count from scsi_alloc_sgtable() > by one and be done with ... > That might not fix it if it is a problem with the merge code, though. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/20/2015 03:38 PM, Ewan Milne wrote: > On Thu, 2015-11-19 at 16:35 +0100, Hannes Reinecke wrote: >> On 11/19/2015 09:23 AM, Christoph Hellwig wrote: >>> It's pretty much guaranteed a block layer bug, most likely in the >>> merge bios to request infrastucture where we don't obey the merging >>> limits properly. >>> >>> Does either of you have a known good and first known bad kernel? >> >> Well, I have been fighting a similar issue for several months now, >> albeit with multipath enabled. Haven't had much progress with this, >> sadly. >> Seeing that this is our distro kernel it might or might not be >> related; however, as the symptoms are identical there still is a >> chance that this is actually a generic block-layer problem. >> >> Cheers, >> >> Hannes > > We have seen this also. (e.g. req->nr_phys_segments was 3, but > blk_rq_map_sg() returned 4.) I was suspicious of the patch: > > bio: modify __bio_add_page() to accept pages that don't start a new segment > > But we put some debugging code in and didn't hit it. We haven't > found the problem yet, either, though. We're still looking. > Can't we have a joint effort here? I've been spending a _LOT_ of time trying to debug things here, but none of the ideas I've come up with have been able to fix anything. I'm almost tempted to increase the count from scsi_alloc_sgtable() by one and be done with ... > As Christoph said, it would seem to be a problem with the block layer > merging. > > The API for this seems defective, in that blk_rq_map_sg() should > never be returning a value indicating that it overwrote past the > end of the supplied SG array and depend on the caller to check it. > (We could get data corruption on another I/O if it used adjacent > memory for a different SG list, for example.) > Yeah, the API is bloody useless. By the time you hit the BUG_ON you've already caused a memory corruption, so no way you can recover there. At the very least we should be passing in the sg list count into blk_map_rq_sg(), but that's a core blocklayer API and changes here would require changes by quite a set of drivers. Plus it wouldn't help me for a distribution kernel ... Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Thu, 2015-11-19 at 16:35 +0100, Hannes Reinecke wrote: > On 11/19/2015 09:23 AM, Christoph Hellwig wrote: > > It's pretty much guaranteed a block layer bug, most likely in the > > merge bios to request infrastucture where we don't obey the merging > > limits properly. > > > > Does either of you have a known good and first known bad kernel? > > Well, I have been fighting a similar issue for several months now, > albeit with multipath enabled. Haven't had much progress with this, > sadly. > Seeing that this is our distro kernel it might or might not be > related; however, as the symptoms are identical there still is a > chance that this is actually a generic block-layer problem. > > Cheers, > > Hannes We have seen this also. (e.g. req->nr_phys_segments was 3, but blk_rq_map_sg() returned 4.) I was suspicious of the patch: bio: modify __bio_add_page() to accept pages that don't start a new segment But we put some debugging code in and didn't hit it. We haven't found the problem yet, either, though. We're still looking. As Christoph said, it would seem to be a problem with the block layer merging. The API for this seems defective, in that blk_rq_map_sg() should never be returning a value indicating that it overwrote past the end of the supplied SG array and depend on the caller to check it. (We could get data corruption on another I/O if it used adjacent memory for a different SG list, for example.) -Ewan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Fri, 2015-11-20 at 13:56 +0100, Laurent Dufour wrote: > On 20/11/2015 13:10, Michael Ellerman wrote: > > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > > > > > It's pretty much guaranteed a block layer bug, most likely in the > > > merge bios to request infrastucture where we don't obey the merging > > > limits properly. > > > > > > Does either of you have a known good and first known bad kernel? > > > > Not me, I've only hit it one or two times. All I can say is I have hit it in > > 4.4-rc1. > > > > Laurent, can you narrow it down at all? > > I'm trying bisect it but it's very time consuming, and I think I missed > some bad kernel because I didn't let enough time to the system to panic. > > But I'm still on... > I usually hit it at boot time. Maybe one out of five times. I'll take a stab at bisecting today... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 20/11/2015 13:10, Michael Ellerman wrote: > On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > >> It's pretty much guaranteed a block layer bug, most likely in the >> merge bios to request infrastucture where we don't obey the merging >> limits properly. >> >> Does either of you have a known good and first known bad kernel? > > Not me, I've only hit it one or two times. All I can say is I have hit it in > 4.4-rc1. > > Laurent, can you narrow it down at all? I'm trying bisect it but it's very time consuming, and I think I missed some bad kernel because I didn't let enough time to the system to panic. But I'm still on... Cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Thu, 2015-11-19 at 00:23 -0800, Christoph Hellwig wrote: > It's pretty much guaranteed a block layer bug, most likely in the > merge bios to request infrastucture where we don't obey the merging > limits properly. > > Does either of you have a known good and first known bad kernel? Not me, I've only hit it one or two times. All I can say is I have hit it in 4.4-rc1. Laurent, can you narrow it down at all? cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On 11/19/2015 09:23 AM, Christoph Hellwig wrote: > It's pretty much guaranteed a block layer bug, most likely in the > merge bios to request infrastucture where we don't obey the merging > limits properly. > > Does either of you have a known good and first known bad kernel? Well, I have been fighting a similar issue for several months now, albeit with multipath enabled. Haven't had much progress with this, sadly. Seeing that this is our distro kernel it might or might not be related; however, as the symptoms are identical there still is a chance that this is actually a generic block-layer problem. Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
It's pretty much guaranteed a block layer bug, most likely in the merge bios to request infrastucture where we don't obey the merging limits properly. Does either of you have a known good and first known bad kernel? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Wed, 2015-11-18 at 09:03 -0500, Mark Salter wrote: > On Wed, 2015-11-18 at 20:18 +1100, Michael Ellerman wrote: > > Hi folks, > > > > I'm intermittently seeing the following oops on at least one powerpc box. > > > > The BUG_ON() is from: > > > > static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer > > *sdb) > > { > > ... > > count = blk_rq_map_sg(req->q, req, sdb->table.sgl); > > BUG_ON(count > sdb->table.nents); > > > > Looking at the dump it looks like count was 2, I can't work out what nents > > was. > > > > The machine's just a fairly boring bare metal setup, with a single IPR > > adapter: > > > > 0001:08:00.0 RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) (rev 02) > > Subsystem: IBM PCIe3 x8 SAS RAID Internal Adapter 6Gb (57D7) > > Flags: bus master, fast devsel, latency 0 > > Kernel driver in use: ipr > > > > > > Anyone seen it before or have any ideas? > > I'm also seeing it on arm64 in 4.4-rc1 Ah thanks, that's a good data point. I was assuming it was a driver bug, but I assume you're not using IPR :) > [6.859003] Call trace: > > [6.861439] [] scsi_init_sgtable+0x84/0x88 > > [6.867072] [] scsi_init_io+0x4c/0x1ac > > [6.872358] [] sd_setup_read_write_cmnd+0x44/0x844 > > [6.878682] [] sd_init_command+0x38/0xb0 > > [6.884141] [] scsi_setup_cmnd+0xd8/0x13c > > [6.889686] [] scsi_prep_fn+0xc0/0x140 > > [6.894973] [] blk_peek_request+0x148/0x24c > > [6.900692] [] scsi_request_fn+0x58/0x648 > > [6.906237] [] __blk_run_queue+0x40/0x58 > > [6.911696] [] blk_run_queue+0x30/0x48 > > [6.916983] [] scsi_run_queue+0x204/0x294 > > [6.922528] [] scsi_end_request+0x13c/0x1a0 > > [6.928247] [] scsi_io_completion+0xf0/0x564 > > [6.934052] [] scsi_finish_command+0xe4/0x144 > > [6.939943] [] scsi_softirq_done+0x148/0x178 > > [6.945748] [] blk_done_softirq+0x7c/0x94 > > [6.951295] [] __do_softirq+0x114/0x2a0 > > [6.956667] [] irq_exit+0x8c/0xe4 > > [6.961522] [] handle_IPI+0x170/0x228 > > [6.966721] [] gic_handle_irq+0xa0/0xb8 > > [6.972093] Exception stack(0xfe03dc143de0 to 0xfe03dc143f00) > cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
On Wed, 2015-11-18 at 20:18 +1100, Michael Ellerman wrote: > Hi folks, > > I'm intermittently seeing the following oops on at least one powerpc box. > > The BUG_ON() is from: > > static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer > *sdb) > { > ... > count = blk_rq_map_sg(req->q, req, sdb->table.sgl); > BUG_ON(count > sdb->table.nents); > > Looking at the dump it looks like count was 2, I can't work out what nents > was. > > The machine's just a fairly boring bare metal setup, with a single IPR > adapter: > > 0001:08:00.0 RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) (rev 02) > Subsystem: IBM PCIe3 x8 SAS RAID Internal Adapter 6Gb (57D7) > Flags: bus master, fast devsel, latency 0 > Kernel driver in use: ipr > > > Anyone seen it before or have any ideas? I'm also seeing it on arm64 in 4.4-rc1 [6.859003] Call trace: [6.861439] [] scsi_init_sgtable+0x84/0x88 [6.867072] [] scsi_init_io+0x4c/0x1ac [6.872358] [] sd_setup_read_write_cmnd+0x44/0x844 [6.878682] [] sd_init_command+0x38/0xb0 [6.884141] [] scsi_setup_cmnd+0xd8/0x13c [6.889686] [] scsi_prep_fn+0xc0/0x140 [6.894973] [] blk_peek_request+0x148/0x24c [6.900692] [] scsi_request_fn+0x58/0x648 [6.906237] [] __blk_run_queue+0x40/0x58 [6.911696] [] blk_run_queue+0x30/0x48 [6.916983] [] scsi_run_queue+0x204/0x294 [6.922528] [] scsi_end_request+0x13c/0x1a0 [6.928247] [] scsi_io_completion+0xf0/0x564 [6.934052] [] scsi_finish_command+0xe4/0x144 [6.939943] [] scsi_softirq_done+0x148/0x178 [6.945748] [] blk_done_softirq+0x7c/0x94 [6.951295] [] __do_softirq+0x114/0x2a0 [6.956667] [] irq_exit+0x8c/0xe4 [6.961522] [] handle_IPI+0x170/0x228 [6.966721] [] gic_handle_irq+0xa0/0xb8 [6.972093] Exception stack(0xfe03dc143de0 to 0xfe03dc143f00) > > cheers > > > systemd[1]: Starting Uncomplicated firewall... > Starting Uncomplicated firewall... > [ cut here ] > kernel BUG at drivers/scsi/scsi_lib.c:1096! > Oops: Exception in kernel mode, sig: 5 [#1] > SMP NR_CPUS=2048 NUMA PowerNV > Modules linked in: > CPU: 132 PID: 2699 Comm: kworker/132:1H Not tainted > 4.4.0-rc1-54939-ge22a248-dirty #77 > Workqueue: kblockd cfq_kick_queue > task: c00fef147400 ti: c00feb384000 task.ti: c00feb384000 > NIP: c05ab4a8 LR: c05ab490 CTR: > REGS: c00feb387620 TRAP: 0700 Not tainted > (4.4.0-rc1-54939-ge22a248-dirty) > MSR: 900100029033 CR: 24002228 XER: > CFAR: c0464950 SOFTE: 0 > GPR00: c05ab490 c00feb3878a0 c0d77d00 0002 > GPR04: c00ff2030158 c00ff47a0c00 1000 > GPR08: 0001 fff7 > GPR12: 2200 cfde5200 c00c8098 c0b39858 > GPR16: c0ae12c8 c0b39948 > GPR20: c05ab9c0 c017f5800144 c01e551a6850 > GPR24: c017f5800140 c007efb9c800 c01e551a6800 > GPR28: c00ff2030158 c00feb420240 > NIP [c05ab4a8] scsi_init_sgtable+0xa8/0x180 > LR [c05ab490] scsi_init_sgtable+0x90/0x180 > Call Trace: > [c00feb3878a0] [c05ab490] scsi_init_sgtable+0x90/0x180 > (unreliable) > [c00feb3878e0] [c05ab5d4] scsi_init_io+0x54/0x160 > [c00feb387930] [c05fb43c] sd_init_command+0x6c/0xb00 > [c00feb3879f0] [c05ac2f8] scsi_setup_cmnd+0x108/0x1a0 > [c00feb387a30] [c05ac594] scsi_prep_fn+0x104/0x1c0 > [c00feb387a70] [c045d85c] blk_peek_request+0x20c/0x390 > [c00feb387af0] [c05ad6a8] scsi_request_fn+0xb8/0x7f0 > [c00feb387bf0] [c04583b4] __blk_run_queue+0x54/0x80 > [c00feb387c20] [c047e178] cfq_kick_queue+0x38/0xc0 > [c00feb387c50] [c00c00cc] process_one_work+0x2ac/0x560 > [c00feb387ce0] [c00c0510] worker_thread+0x190/0x660 > [c00feb387d80] [c00c8198] kthread+0x108/0x130 > [c00feb387e30] [c