On Tue, Dec 18, 2012 at 3:42 AM, Jens Axboe <ax...@kernel.dk> wrote:
>
> Bah. Does the below fix it up for you?

Grr. This is still bullshit. Doing this:

    alignment = sector << 9;

is fundamentally crap, because 'sector_t' may well be 32-bit (non-large-block device case). And we're supposed (surprise surprise) to be able to handle devices larger than 4GB in size.

So doing *any* of these calculations in bytes is pure and utter crap. You need to do them in sectors. That's what "sector_t" means, and that's damn well how everything should work. Anything that works in bytes is simply pure crap. And don't talk to me about 64-bit math and doing it in "u64" or "loff_t", that's just utterly moronic too.

Besides, "sector_div()" is only sensible when you're looking for the remainder of a sector number. That's true in the first case (sector really is a sector number - it's the starting sector of the partition), but the source of alignment and granularity are actually just "unsigned int" (and that's in bytes, not sectors), so using sector_t afterwards is crazy too. You should have used just '%'.

Looking around, there are other places where this idiocy happens too (blkdev_issue_discard() seems to think the granularity/alignments are sector_t's too, for example).

Anyway, here's a patch to fix the crazy types and the bogus second "sector_div()". It's whitespace-damaged, because not only have I not tested it, I also think somebody needs to look at things in general.

The whole "discard_alignment" handling is extremely odd. I don't think it should be called "alignment" at all - because it isn't. It's an alignment *offset*. Look at the normal (non-discard) case, where it's called "alignment_offset" like it should be.

So the math is confused, the types are confused, and the naming is confused. Please, somebody check this out, because now *I* am confused.

And btw, that whole commit happened too f*cking late too. When I get a pull request, it should damn well have been tested already, and it should have been developed *before* the merge window started. Not the day before the pull request.

I'm grumpy, because all of this code is UTTER SH*T, and it was sent to me. Why?

                 Linus

---
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index acb4f7bbbd32..c23cae25a0c0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1188,14 +1188,25 @@ static inline int queue_discard_alignment(struct request_queue *q)
 
 static inline int queue_limit_discard_alignment(struct queue_limits *lim, sector_t sector)
 {
-	sector_t alignment = sector << 9;
-	alignment = sector_div(alignment, lim->discard_granularity);
+	/* Why are these in bytes, not sectors? */
+	unsigned int alignment, granularity, offset;
 
 	if (!lim->max_discard_sectors)
 		return 0;
 
-	alignment = lim->discard_granularity + lim->discard_alignment - alignment;
-	return sector_div(alignment, lim->discard_granularity);
+	alignment = lim->discard_alignment >> 9;
+	granularity = lim->discard_granularity >> 9;
+	if (!alignment || !granularity)
+		return 0;
+
+	/* Offset of the partition start in 'granularity' sectors */
+	offset = sector_div(sector, granularity);
+
+	/* And why do we do this modulus *again* in blkdev_issue_discard()? */
+	offset = (granularity + alignment - offset) % granularity;
+
+	/* Turn it back into bytes, gaah */
+	return offset << 9;
 }
 
 static inline int bdev_discard_alignment(struct block_device *bdev)
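
[Illustrative sketch, not part of the original message or patch.] The two arithmetic points made above can be tried out in user space. The snippet below is only an approximation under stated assumptions: `sector32_t` is a hypothetical stand-in for a 32-bit `sector_t`, plain `%` stands in for `sector_div()`, the `max_discard_sectors` check is left out, and the function names are made up for the example. It is not kernel code.

```c
#include <stdint.h>

/* Assumption: models sector_t on a non-large-block-device (32-bit) build. */
typedef uint32_t sector32_t;

/*
 * The old approach: turn the start sector into bytes first.
 * With a 32-bit type the result wraps once the offset passes 4GB.
 */
static uint32_t start_bytes_32(sector32_t sector)
{
	return sector << 9;
}

/*
 * Sector-unit version of the calculation, following the patch:
 * the byte-valued limits are converted to sectors once, the
 * remainder is taken in sectors, and only the final result is
 * converted back to bytes.
 */
static unsigned int discard_alignment_bytes(unsigned int discard_alignment,
					    unsigned int discard_granularity,
					    sector32_t sector)
{
	unsigned int alignment = discard_alignment >> 9;
	unsigned int granularity = discard_granularity >> 9;
	unsigned int offset;

	if (!alignment || !granularity)
		return 0;

	/* Offset of the partition start within one granularity unit */
	offset = sector % granularity;

	/* Distance from there to the next aligned position, in sectors */
	offset = (granularity + alignment - offset) % granularity;

	return offset << 9;
}
```

For a start sector 63 sectors past the 4GB mark (8388608 + 63), `start_bytes_32()` returns 32256 rather than the real byte offset, because the high bit is shifted out. With a 4096-byte alignment offset and a 65536-byte granularity, `discard_alignment_bytes()` for the same sector works out to (128 + 8 - 63) % 128 = 73 sectors, i.e. 37376 bytes, without ever forming the full byte offset.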