Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
Hi, On Wed, May 28, 2014 at 11:16:17PM +0800, Jet Chen wrote: > > Sorry for late respond. Dongsu has sent a patch for this issue. > message-id > <1401289778-9840-1-git-send-email-dongsu.p...@profitbricks.com> > Do you still need me to test the following patch ? No, don't test my patch, it is not needed anymore, please test Dongsu's patch instead so we are sure the regression is fixed. Thanks, Maurizio Lombardi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
On 05/27/2014 07:24 PM, Maurizio Lombardi wrote: > On Tue, May 27, 2014 at 10:43:59AM +0200, Maurizio Lombardi wrote: >> >> But now I'm suspicious of this part of commit 3979ef4dcf: >> >> failed: >> bvec->bv_page = NULL; >> bvec->bv_len = 0; >> bvec->bv_offset = 0; >> bio->bi_vcnt--; < >> blk_recount_segments(q, bio); >> return 0; >> >> Is decreasing bi_vcnt sufficient to guarantee that blk_recount_segments() >> recalculates the correct number of physical segments? >> Looking at the __blk_recalc_rq_segments() it appears it may not be the case. >> >> The question is how can we restore the correct number of physical segments >> in case >> of failure without breaking anything... >> > > If my hypothesis is correct, the following patch should trigger a kernel > panic, > Jet Chen, can you try it and let me know whether the BUG_ON is hit or not? Sorry for late respond. Dongsu has sent a patch for this issue. message-id <1401289778-9840-1-git-send-email-dongsu.p...@profitbricks.com> Do you still need me to test the following patch ? > > diff --git a/block/bio.c b/block/bio.c > index 0443694..763868f 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -701,6 +701,7 @@ static int __bio_add_page(struct request_queue *q, struct > bio *bio, struct page > unsigned int max_sectors) > { > int retried_segments = 0; > + unsigned int phys_segments_orig; > struct bio_vec *bvec; > > /* > @@ -751,6 +752,9 @@ static int __bio_add_page(struct request_queue *q, struct > bio *bio, struct page > if (bio->bi_vcnt >= bio->bi_max_vecs) > return 0; > > + blk_recount_segments(q, bio); > + phys_segments_orig = bio->bi_phys_segments; > + > /* >* setup the new entry, we might clear it again later if we >* cannot add the page > @@ -811,6 +815,7 @@ static int __bio_add_page(struct request_queue *q, struct > bio *bio, struct page > bvec->bv_offset = 0; > bio->bi_vcnt--; > blk_recount_segments(q, bio); > + BUG_ON(phys_segments_orig != bio->bi_phys_segments); > return 0; > } > > > Regards, > Maurizio Lombardi > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
Hi, On 27.05.2014 13:24, Maurizio Lombardi wrote: > On Tue, May 27, 2014 at 10:43:59AM +0200, Maurizio Lombardi wrote: > > > > But now I'm suspicious of this part of commit 3979ef4dcf: > > > > failed: > > bvec->bv_page = NULL; > > bvec->bv_len = 0; > > bvec->bv_offset = 0; > > bio->bi_vcnt--; < > > blk_recount_segments(q, bio); > > return 0; > > > > Is decreasing bi_vcnt sufficient to guarantee that blk_recount_segments() > > recalculates the correct number of physical segments? > > Looking at the __blk_recalc_rq_segments() it appears it may not be the case. > > > > The question is how can we restore the correct number of physical segments > > in case > > of failure without breaking anything... > > If my hypothesis is correct, the following patch should trigger a kernel > panic, > Jet Chen, can you try it and let me know whether the BUG_ON is hit or not? I was also able to reproduce this bug just as reported by Jet Chen, and now I think I've found out a solution. I'll send out a patch. Regards, Dongsu > diff --git a/block/bio.c b/block/bio.c > index 0443694..763868f 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -701,6 +701,7 @@ static int __bio_add_page(struct request_queue *q, struct > bio *bio, struct page > unsigned int max_sectors) > { > int retried_segments = 0; > + unsigned int phys_segments_orig; > struct bio_vec *bvec; > > /* > @@ -751,6 +752,9 @@ static int __bio_add_page(struct request_queue *q, struct > bio *bio, struct page > if (bio->bi_vcnt >= bio->bi_max_vecs) > return 0; > > + blk_recount_segments(q, bio); > + phys_segments_orig = bio->bi_phys_segments; > + > /* >* setup the new entry, we might clear it again later if we >* cannot add the page > @@ -811,6 +815,7 @@ static int __bio_add_page(struct request_queue *q, struct > bio *bio, struct page > bvec->bv_offset = 0; > bio->bi_vcnt--; > blk_recount_segments(q, bio); > + BUG_ON(phys_segments_orig != bio->bi_phys_segments); > return 0; > } > > > Regards, > Maurizio Lombardi > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
Hi, On 27.05.2014 13:24, Maurizio Lombardi wrote: On Tue, May 27, 2014 at 10:43:59AM +0200, Maurizio Lombardi wrote: But now I'm suspicious of this part of commit 3979ef4dcf: failed: bvec-bv_page = NULL; bvec-bv_len = 0; bvec-bv_offset = 0; bio-bi_vcnt--; blk_recount_segments(q, bio); return 0; Is decreasing bi_vcnt sufficient to guarantee that blk_recount_segments() recalculates the correct number of physical segments? Looking at the __blk_recalc_rq_segments() it appears it may not be the case. The question is how can we restore the correct number of physical segments in case of failure without breaking anything... If my hypothesis is correct, the following patch should trigger a kernel panic, Jet Chen, can you try it and let me know whether the BUG_ON is hit or not? I was also able to reproduce this bug just as reported by Jet Chen, and now I think I've found out a solution. I'll send out a patch. Regards, Dongsu diff --git a/block/bio.c b/block/bio.c index 0443694..763868f 100644 --- a/block/bio.c +++ b/block/bio.c @@ -701,6 +701,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page unsigned int max_sectors) { int retried_segments = 0; + unsigned int phys_segments_orig; struct bio_vec *bvec; /* @@ -751,6 +752,9 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page if (bio-bi_vcnt = bio-bi_max_vecs) return 0; + blk_recount_segments(q, bio); + phys_segments_orig = bio-bi_phys_segments; + /* * setup the new entry, we might clear it again later if we * cannot add the page @@ -811,6 +815,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page bvec-bv_offset = 0; bio-bi_vcnt--; blk_recount_segments(q, bio); + BUG_ON(phys_segments_orig != bio-bi_phys_segments); return 0; } Regards, Maurizio Lombardi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
On 05/27/2014 07:24 PM, Maurizio Lombardi wrote: On Tue, May 27, 2014 at 10:43:59AM +0200, Maurizio Lombardi wrote: But now I'm suspicious of this part of commit 3979ef4dcf: failed: bvec-bv_page = NULL; bvec-bv_len = 0; bvec-bv_offset = 0; bio-bi_vcnt--; blk_recount_segments(q, bio); return 0; Is decreasing bi_vcnt sufficient to guarantee that blk_recount_segments() recalculates the correct number of physical segments? Looking at the __blk_recalc_rq_segments() it appears it may not be the case. The question is how can we restore the correct number of physical segments in case of failure without breaking anything... If my hypothesis is correct, the following patch should trigger a kernel panic, Jet Chen, can you try it and let me know whether the BUG_ON is hit or not? Sorry for late respond. Dongsu has sent a patch for this issue. message-id 1401289778-9840-1-git-send-email-dongsu.p...@profitbricks.com Do you still need me to test the following patch ? diff --git a/block/bio.c b/block/bio.c index 0443694..763868f 100644 --- a/block/bio.c +++ b/block/bio.c @@ -701,6 +701,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page unsigned int max_sectors) { int retried_segments = 0; + unsigned int phys_segments_orig; struct bio_vec *bvec; /* @@ -751,6 +752,9 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page if (bio-bi_vcnt = bio-bi_max_vecs) return 0; + blk_recount_segments(q, bio); + phys_segments_orig = bio-bi_phys_segments; + /* * setup the new entry, we might clear it again later if we * cannot add the page @@ -811,6 +815,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page bvec-bv_offset = 0; bio-bi_vcnt--; blk_recount_segments(q, bio); + BUG_ON(phys_segments_orig != bio-bi_phys_segments); return 0; } Regards, Maurizio Lombardi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
Hi, On Wed, May 28, 2014 at 11:16:17PM +0800, Jet Chen wrote: Sorry for late respond. Dongsu has sent a patch for this issue. message-id 1401289778-9840-1-git-send-email-dongsu.p...@profitbricks.com Do you still need me to test the following patch ? No, don't test my patch, it is not needed anymore, please test Dongsu's patch instead so we are sure the regression is fixed. Thanks, Maurizio Lombardi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
On Tue, May 27, 2014 at 10:43:59AM +0200, Maurizio Lombardi wrote: > > But now I'm suspicious of this part of commit 3979ef4dcf: > > failed: > bvec->bv_page = NULL; > bvec->bv_len = 0; > bvec->bv_offset = 0; > bio->bi_vcnt--; < > blk_recount_segments(q, bio); > return 0; > > Is decreasing bi_vcnt sufficient to guarantee that blk_recount_segments() > recalculates the correct number of physical segments? > Looking at the __blk_recalc_rq_segments() it appears it may not be the case. > > The question is how can we restore the correct number of physical segments in > case > of failure without breaking anything... > If my hypothesis is correct, the following patch should trigger a kernel panic, Jet Chen, can you try it and let me know whether the BUG_ON is hit or not? diff --git a/block/bio.c b/block/bio.c index 0443694..763868f 100644 --- a/block/bio.c +++ b/block/bio.c @@ -701,6 +701,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page unsigned int max_sectors) { int retried_segments = 0; + unsigned int phys_segments_orig; struct bio_vec *bvec; /* @@ -751,6 +752,9 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page if (bio->bi_vcnt >= bio->bi_max_vecs) return 0; + blk_recount_segments(q, bio); + phys_segments_orig = bio->bi_phys_segments; + /* * setup the new entry, we might clear it again later if we * cannot add the page @@ -811,6 +815,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page bvec->bv_offset = 0; bio->bi_vcnt--; blk_recount_segments(q, bio); + BUG_ON(phys_segments_orig != bio->bi_phys_segments); return 0; } Regards, Maurizio Lombardi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
On Tue, May 27, 2014 at 12:03:29PM +0800, Ming Lei wrote: > On Tue, May 27, 2014 at 3:43 AM, Maurizio Lombardi > wrote: > > Hi Jens, > > > > looks like that commit 3979ef4dcf3d1de55a560a3a4016c30a835df44d > > ("bio-modify-__bio_add_page-to-accept-pages-that-dont-start-a-new-segment-v3") > > introduces a regression, as reported by Jet Chan. > > > > Do you have any idea about the possible problem with this patch? > > > > it is the one that performs a recount of the segments in case of failure in > > __bio_add_page() > > > > http://www.spinics.net/lists/mm-commits/msg103684.html > > > > I would not be surprised if the bug was introduced by fceb38f36f, because it > > contained a mystake that commit 3979ef4dcf supposedly fixed. > > But learning that commit 3979ef4dcf is introducing a regression leaves > > me quite puzzled. > > From code of __blk_recalc_rq_segments(), looks it > won't check if recounted physical segment number is > bigger than queue_max_segments(), so wondering if > blk_recount_segments() can always decrease > physical segment number. > This is what __bio_add_page() did before both fceb38f36f and 3979ef4dcf at line 757 while (bio->bi_phys_segments >= queue_max_segments(q)) { if (retried_segments) return 0; retried_segments = 1; blk_recount_segments(q, bio); } so it is possible, in case of error, to return from the function even if the recounted physical segments are bigger than queue_max_segments(q). - But now I'm suspicious of this part of commit 3979ef4dcf: failed: bvec->bv_page = NULL; bvec->bv_len = 0; bvec->bv_offset = 0; bio->bi_vcnt--; < blk_recount_segments(q, bio); return 0; Is decreasing bi_vcnt sufficient to guarantee that blk_recount_segments() recalculates the correct number of physical segments? Looking at the __blk_recalc_rq_segments() it appears it may not be the case. The question is how can we restore the correct number of physical segments in case of failure without breaking anything... Regards, Maurizio Lombardi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
On Tue, May 27, 2014 at 12:03:29PM +0800, Ming Lei wrote: On Tue, May 27, 2014 at 3:43 AM, Maurizio Lombardi mlomb...@redhat.com wrote: Hi Jens, looks like that commit 3979ef4dcf3d1de55a560a3a4016c30a835df44d (bio-modify-__bio_add_page-to-accept-pages-that-dont-start-a-new-segment-v3) introduces a regression, as reported by Jet Chan. Do you have any idea about the possible problem with this patch? it is the one that performs a recount of the segments in case of failure in __bio_add_page() http://www.spinics.net/lists/mm-commits/msg103684.html I would not be surprised if the bug was introduced by fceb38f36f, because it contained a mystake that commit 3979ef4dcf supposedly fixed. But learning that commit 3979ef4dcf is introducing a regression leaves me quite puzzled. From code of __blk_recalc_rq_segments(), looks it won't check if recounted physical segment number is bigger than queue_max_segments(), so wondering if blk_recount_segments() can always decrease physical segment number. This is what __bio_add_page() did before both fceb38f36f and 3979ef4dcf at line 757 while (bio-bi_phys_segments = queue_max_segments(q)) { if (retried_segments) return 0; retried_segments = 1; blk_recount_segments(q, bio); } so it is possible, in case of error, to return from the function even if the recounted physical segments are bigger than queue_max_segments(q). - But now I'm suspicious of this part of commit 3979ef4dcf: failed: bvec-bv_page = NULL; bvec-bv_len = 0; bvec-bv_offset = 0; bio-bi_vcnt--; blk_recount_segments(q, bio); return 0; Is decreasing bi_vcnt sufficient to guarantee that blk_recount_segments() recalculates the correct number of physical segments? Looking at the __blk_recalc_rq_segments() it appears it may not be the case. The question is how can we restore the correct number of physical segments in case of failure without breaking anything... Regards, Maurizio Lombardi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
On Tue, May 27, 2014 at 10:43:59AM +0200, Maurizio Lombardi wrote: But now I'm suspicious of this part of commit 3979ef4dcf: failed: bvec-bv_page = NULL; bvec-bv_len = 0; bvec-bv_offset = 0; bio-bi_vcnt--; blk_recount_segments(q, bio); return 0; Is decreasing bi_vcnt sufficient to guarantee that blk_recount_segments() recalculates the correct number of physical segments? Looking at the __blk_recalc_rq_segments() it appears it may not be the case. The question is how can we restore the correct number of physical segments in case of failure without breaking anything... If my hypothesis is correct, the following patch should trigger a kernel panic, Jet Chen, can you try it and let me know whether the BUG_ON is hit or not? diff --git a/block/bio.c b/block/bio.c index 0443694..763868f 100644 --- a/block/bio.c +++ b/block/bio.c @@ -701,6 +701,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page unsigned int max_sectors) { int retried_segments = 0; + unsigned int phys_segments_orig; struct bio_vec *bvec; /* @@ -751,6 +752,9 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page if (bio-bi_vcnt = bio-bi_max_vecs) return 0; + blk_recount_segments(q, bio); + phys_segments_orig = bio-bi_phys_segments; + /* * setup the new entry, we might clear it again later if we * cannot add the page @@ -811,6 +815,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page bvec-bv_offset = 0; bio-bi_vcnt--; blk_recount_segments(q, bio); + BUG_ON(phys_segments_orig != bio-bi_phys_segments); return 0; } Regards, Maurizio Lombardi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
On Tue, May 27, 2014 at 3:43 AM, Maurizio Lombardi wrote: > Hi Jens, > > looks like that commit 3979ef4dcf3d1de55a560a3a4016c30a835df44d > ("bio-modify-__bio_add_page-to-accept-pages-that-dont-start-a-new-segment-v3") > introduces a regression, as reported by Jet Chan. > > Do you have any idea about the possible problem with this patch? > > it is the one that performs a recount of the segments in case of failure in > __bio_add_page() > > http://www.spinics.net/lists/mm-commits/msg103684.html > > I would not be surprised if the bug was introduced by fceb38f36f, because it > contained a mystake that commit 3979ef4dcf supposedly fixed. > But learning that commit 3979ef4dcf is introducing a regression leaves > me quite puzzled. >From code of __blk_recalc_rq_segments(), looks it won't check if recounted physical segment number is bigger than queue_max_segments(), so wondering if blk_recount_segments() can always decrease physical segment number. Thanks, -- Ming Lei -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
On Tue, May 27, 2014 at 3:43 AM, Maurizio Lombardi mlomb...@redhat.com wrote: Hi Jens, looks like that commit 3979ef4dcf3d1de55a560a3a4016c30a835df44d (bio-modify-__bio_add_page-to-accept-pages-that-dont-start-a-new-segment-v3) introduces a regression, as reported by Jet Chan. Do you have any idea about the possible problem with this patch? it is the one that performs a recount of the segments in case of failure in __bio_add_page() http://www.spinics.net/lists/mm-commits/msg103684.html I would not be surprised if the bug was introduced by fceb38f36f, because it contained a mystake that commit 3979ef4dcf supposedly fixed. But learning that commit 3979ef4dcf is introducing a regression leaves me quite puzzled. From code of __blk_recalc_rq_segments(), looks it won't check if recounted physical segment number is bigger than queue_max_segments(), so wondering if blk_recount_segments() can always decrease physical segment number. Thanks, -- Ming Lei -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/