On Tue, Oct 23 2007, Jens Axboe wrote: > On Mon, Oct 22 2007, David Miller wrote: > > > > I'm debugging a blk_rq_map_sg() crash that i'm getting on sparc64 as > > root is mounted over IDE. I think I know what is happening now. > > > > The IDE sg table is allocated and initialized like this in > > drivers/ide/ide-probe.c: > > > > x = kmalloc(sizeof(struct scatterlist) * nents, GFP_XXX); > > sg_init_table(x, nents); > > > > So far, so good. > > > > Now, ide_map_sg() passes requests down to blk_rq_map_sg() like this in > > drivers/block/ide-io.c: > > > > hwif->sg_nents = blk_rq_map_sg(drive->queue, rq, sg); > > > > Ok, so what does blk_rq_map_sg() do? > > > > sg = NULL; > > rq_for_each_segment(bvec, rq, iter) { > > ... > > if (bvprv && cluster) { > > ... > > } else { > > new_segment: > > if (!sg) > > sg = sglist; > > else > > sg = sg_next(sg); > > ... > > } > > bvprv = bvec; > > } /* segments in rq */ > > > > if (sg) > > __sg_mark_end(sg); > > > > So let's say the first request comes in and needs 2 segs. > > This will mark sg[1].page_link with 0x2 > > > > If the next request from IDE needs 4 segs, we'll OOPS because > > sg_next() on &sg[1] will see page_link bit 0x2 is set and > > therefore return NULL. > > > > A quick look shows that if you're testing on SCSI (or something > > layered on top of it like SATA or PATA) you won't see this seemingly > > guarenteed crash because the SCSI mid-layer allocates a fresh sglist > > via mempool_alloc() and runs sg_init_table() on it for every I/O > > request. > > We should never see the end pointer in blk_rq_map_sg(), or that's a bug > in the driver. So it should be OK to just clear the end pointer always > in there, even if it's not the prettiest solution... > > This just needs to be wrapped up in some scatterlist.h macro/function. > > diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c > index 61c2e39..a3bda2f 100644 > --- a/block/ll_rw_blk.c > +++ b/block/ll_rw_blk.c > @@ -1354,6 +1354,12 @@ new_segment: > else > sg = sg_next(sg); > > + /* > + * Clear end-of-table pointer, we'll mark a new one > + * at the end > + */ > + sg->page_link &= ~0x2; > + > sg_dma_len(sg) = 0; > sg_dma_address(sg) = 0; > sg_set_page(sg, bvec->bv_page);
Eh this wont work, it's the wrong entry... Here's a temporary work-around. diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c index c89f0d3..108202b 100644 --- a/drivers/ide/ide-io.c +++ b/drivers/ide/ide-io.c @@ -822,6 +822,7 @@ void ide_map_sg(ide_drive_t *drive, struct request *rq) return; if (rq->cmd_type != REQ_TYPE_ATA_TASKFILE) { + sg_init_table(hwif->sg_table, hwif->sg_max_nents); hwif->sg_nents = blk_rq_map_sg(drive->queue, rq, sg); } else { sg_init_one(sg, rq->buffer, rq->nr_sectors * SECTOR_SIZE); diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c index ec55a17..cbfb113 100644 --- a/drivers/ide/ide-probe.c +++ b/drivers/ide/ide-probe.c @@ -1324,8 +1324,6 @@ static int hwif_init(ide_hwif_t *hwif) goto out; } - sg_init_table(hwif->sg_table, hwif->sg_max_nents); - if (init_irq(hwif) == 0) goto done; -- Jens Axboe - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/