Garrett D'Amore wrote:
> ????? ???????????? wrote:
>> Why is m_blksize unsigned 32bit?
>> Do you want to repeat the story of '640KB are enough'?
>> Please make m_blksize an uint64 for the sake of future users who want
>> to address more than 4GB as block.
>>
> The block size is the minimum addressable unit of data. It is *not*
> the maximum address. In fact, with the way this is specified, you
> could in theory describe a device capable of storing 2^92 bytes of
> data. Although to achieve that, you'd have to use 4GB blocks.

Sorry, that's in error... it should be 2^96. (Apparently I had trouble
adding 64 and 32.)

For further details, assuming only a 512 byte addressable block, today you
can address 2^73 bytes worth of data. I think the ZFS folks have done some
analysis about the amount of data that really relates to. It's way, way
bigger than we'll see in my lifetime, I think.

    - Garrett

>
> The largest block size in any common use today is 64K, which is far
> far short of 4GB. If you had a 4GB block size, then you would *have*
> to transfer full 4GB just to do a minimum read. I don't think any
> such devices are likely within the next decade or three. (Maybe not
> ever.)
>
> - Garrett
>
>> On Wed, Dec 2, 2009 at 5:34 PM, Garrett D'Amore <gdamore at sun.com> wrote:
>>
>>> Okay, I'm going to propose solving this properly. The solution is that the
>>> following structure:
>>>
>>> struct bd_media {
>>>     uint64_t m_nblks;
>>>     boolean_t m_readonly;
>>> };
>>>
>>> Will grow a new member expressing the block size:
>>>
>>> struct bd_media {
>>>     uint64_t m_nblks;
>>>     uint32_t m_blksize;
>>>     boolean_t m_readonly;
>>> };
>>>
>>> *However*, in order to keep things simple and efficient in the code, the
>>> rule is that the blksize *must* be a power of two. If the block size is not
>>> a power of two, then the driver will have to do the RMW thing to fake one
>>> up. (With the associated performance penalties.)
>>>
>>> This should allow non-spinning media to work fine with blkdev, even with
>>> larger block sizes.
>>>
>>> Note that I can only test 512 byte blocks, so architecturally the solution
>>> will be complete, but the resulting code *may* have bugs that I won't know
>>> about until we have media with larger block size requirements.
>>>
>>> - Garrett
>>>
>>> Darren J Moffat wrote:
>>>
>>>> Garrett D'Amore wrote:
>>>>
>>>>> Darren J Moffat wrote:
>>>>>
>>>>>> With the updated name I have only one possible small issue.
>>>>>>
>>>>>> You said the driver is hardcoded to 512 byte blocks, yet the industry is
>>>>>> moving towards 4k blocks. Is that likely to be an issue for devices driven
>>>>>> by 'blkdev'?
>>>>>>
>>>>> I don't think so. If the application (or other code) issues smaller
>>>>> I/Os, the underlying driver will have to implement them as RMW. If the
>>>>> filesystem or app code uses 4K aligned and sized buffers, then there won't
>>>>> be a need to do RMW.
>>>>>
>>>> I understood how that side would work.
>>>>
>>>>> We might need to change the code slightly to express the 4K size in the
>>>>> various ioctls, but that can happen later when I run into such a device.
>>>>> (The current devices supported by this device driver all use 512 byte
>>>>> blocks.)
>>>>>
>>>> My concern is when there are devices that use 4K size blocks instead of
>>>> 512 byte blocks. Is there anything in the architecture of bd that means
>>>> they can't easily be supported by this bd driver?
>>>>
>>> _______________________________________________
>>> opensolaris-arc mailing list
>>> opensolaris-arc at opensolaris.org
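
For anyone following the arithmetic in this thread, here is a rough sketch of why the power-of-two rule keeps the code simple, and where the 2^73 figure comes from. The helper names (blksize_is_pow2, blksize_shift, align_range) are made up for illustration and are not part of blkdev. The sketch assumes only the proposed fields, a 64-bit block count and a 32-bit block size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when blksize is a nonzero power of two. */
static bool
blksize_is_pow2(uint32_t blksize)
{
	return (blksize != 0 && (blksize & (blksize - 1)) == 0);
}

/*
 * Hypothetical helper: log2 of a power-of-two block size.  With a 64-bit
 * block count, the addressable capacity is then 2^(64 + shift) bytes,
 * e.g. 64 + 9 = 73 for 512 byte blocks.
 */
static unsigned
blksize_shift(uint32_t blksize)
{
	unsigned shift = 0;

	assert(blksize_is_pow2(blksize));
	while ((blksize >> shift) > 1)
		shift++;
	return (shift);
}

/*
 * Hypothetical helper: widen the byte range [off, off + len) to whole
 * blocks using only masks and shifts.  Returns true when the range was
 * already block aligned, i.e. no read-modify-write would be needed.
 * Assumes off + len + blksize does not overflow 64 bits.
 */
static bool
align_range(uint64_t off, uint64_t len, uint32_t blksize,
    uint64_t *first_blk, uint64_t *nblks)
{
	uint64_t mask = (uint64_t)blksize - 1;
	unsigned shift = blksize_shift(blksize);
	uint64_t start = off & ~mask;			/* round down */
	uint64_t end = (off + len + mask) & ~mask;	/* round up */

	*first_blk = start >> shift;
	*nblks = (end - start) >> shift;
	return ((off & mask) == 0 && (len & mask) == 0);
}
```

With these, a 512 byte write at offset 512 on a 4K device, align_range(512, 512, 4096, ...), reports one block starting at block 0 and returns false, so the driver would have to do RMW. A 4K write at offset 8192 maps to exactly block 2 and returns true. One caveat: the largest power of two that fits in a uint32_t is 2^31, so under the power-of-two rule the capacity exponent tops out at 64 + 31 = 95.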