Re: CVS commit: [tls-maxphys] src/sys/dev
On Wed, Oct 10, 2012 at 11:34:48AM +0200, Manuel Bouyer wrote:
> I'm not sure what the best way to handle this would be. If we assume that
> maxphys is a power of 2, we could use a maxphys-derived mask here.
> Otherwise, maybe we should compute and cache the largest power-of-2 value
> below maxphys in v_mount, as is done in vfs_vnops.c:vn_ra_allocctx()
> (actually, this would remove ra_iochunk as we could use the mount point's
> value)

I originally did that.  There were some negative performance consequences.
With two of the SCSI (RAID, actually) controllers I have on hand, the DMA
descriptor format effectively imposes a restriction of 192K on transfer
size.  But these are exactly the sort of devices that really like to see
large writes.  The next lower power of two is 128K...

I think a lot of the clever math with shifts and masks should perhaps go
away.  Any efficiency improvements it yields are, I think, swamped by the
5,000 function calls we make to do a single I/O.

Thor
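[Editorial aside: a minimal userland sketch of the round-down-to-a-power-of-two
approach discussed above.  The helper name is made up and this is not the
vn_ra_allocctx() code; it only shows why a 192K transfer limit degrades to
128K once it is rounded down.]

	/* pow2_floor.c -- illustration only; pow2_floor() is a hypothetical name. */
	#include <stdio.h>

	static unsigned long
	pow2_floor(unsigned long x)
	{
		unsigned long p = 1;

		/* largest power of two that is <= x */
		while ((p << 1) != 0 && (p << 1) <= x)
			p <<= 1;
		return p;
	}

	int
	main(void)
	{
		/* A controller whose descriptors allow 192K transfers loses a
		 * third of that once the size is rounded down to a power of two. */
		printf("%lu\n", pow2_floor(192 * 1024));	/* 131072, i.e. 128K */
		return 0;
	}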
Re: CVS commit: [tls-maxphys] src/sys/dev
On Wed, Oct 10, 2012 at 08:17:05AM -0400, Thor Lancelot Simon wrote:
> On Wed, Oct 10, 2012 at 11:34:48AM +0200, Manuel Bouyer wrote:
> > I'm not sure what the best way to handle this would be. If we assume that
> > maxphys is a power of 2, we could use a maxphys-derived mask here.
> > Otherwise, maybe we should compute and cache the largest power-of-2 value
> > below maxphys in v_mount, as is done in vfs_vnops.c:vn_ra_allocctx()
> > (actually, this would remove ra_iochunk as we could use the mount point's
> > value)
>
> I originally did that.  There were some negative performance consequences.
> With two of the SCSI (RAID, actually) controllers I have on hand, the DMA
> descriptor format effectively imposes a restriction of 192K on transfer
> size.  But these are exactly the sort of devices that really like to see
> large writes.  The next lower power of two is 128K...
>
> I think a lot of the clever math with shifts and masks should perhaps go
> away.  Any efficiency improvements it yields are, I think, swamped by the
> 5,000 function calls we make to do a single I/O.

I suspect UVM wants to do batches of reads or writes in power-of-2 sizes,
aligned on power-of-2 boundaries.  At least this is my feeling after playing
with the readahead code, and I wouldn't be surprised if the write code had
the same requirement.

--
Manuel Bouyer bou...@antioche.eu.org
     NetBSD: 26 ans d'experience feront toujours la difference
--
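[Editorial aside: if that guess about UVM is right, the cost can be sketched
in userland.  This is an assumption about the constraint, not UVM code, and
next_chunk() is a made-up helper: a 192K transfer at offset 0 would go out as
128K + 64K rather than as one I/O.]

	#include <stdio.h>

	/* Chunks must be a power of two in size and aligned to their own size. */
	static unsigned long
	next_chunk(unsigned long off, unsigned long resid)
	{
		unsigned long len = 1;

		/* largest power of two <= resid ... */
		while ((len << 1) != 0 && (len << 1) <= resid)
			len <<= 1;
		/* ... that also keeps the chunk aligned to its own size */
		while (len > 1 && (off & (len - 1)) != 0)
			len >>= 1;
		return len;
	}

	int
	main(void)
	{
		unsigned long off = 0, resid = 192 * 1024;

		while (resid > 0) {
			unsigned long len = next_chunk(off, resid);

			printf("offset %lu, length %luK\n", off, len / 1024);
			off += len;
			resid -= len;
		}
		/* prints: offset 0, length 128K; offset 131072, length 64K */
		return 0;
	}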
Re: CVS commit: [tls-maxphys] src/sys/dev
On Wed, Oct 10, 2012 at 11:34:48AM +0200, Manuel Bouyer wrote:
> On Tue, Oct 09, 2012 at 05:59:06PM -0700, Chuck Silvers wrote:
> [...]
> > > with a 'cat big_file > /dev/null' writes are still limited to 64k ...
> >
> > I would hope that cat'ing a file to /dev/null wouldn't result in any
> > writes.  :-)
> > I assume you meant 'cat big_file > other_file' ?
>
> I use: dd if=/dev/zero of=bigfile bs=1g count=7
>
> > if so, then the reason for the 64k writes would be this block of code
> > in ffs_write():
> >
> > 	if (!async && oldoff >> 16 != uio->uio_offset >> 16) {
> > 		mutex_enter(vp->v_interlock);
> > 		error = VOP_PUTPAGES(vp, (oldoff >> 16) << 16,
> > 		    (uio->uio_offset >> 16) << 16,
> > 		    PGO_CLEANIT | PGO_JOURNALLOCKED | PGO_LAZY);
> > 		if (error)
> > 			break;
> > 	}
>
> that's it.  I did s/16/32/g in the code above and now I get 128k writes.

32?  Not 17?

Thor
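[Editorial aside: the exchange turns on what the shift count means.  As an
illustration only (not the kernel code), the code flushes when a write
crosses a (1 << shift)-byte boundary, so the candidate values imply these
granularities:]

	#include <stdio.h>

	int
	main(void)
	{
		printf("1 << 16 = %llu (64K)\n",  1ULL << 16);
		printf("1 << 17 = %llu (128K)\n", 1ULL << 17);
		printf("1 << 32 = %llu (4G)\n",   1ULL << 32);
		return 0;
	}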
Re: CVS commit: [tls-maxphys] src/sys/dev
On Tue, Oct 09, 2012 at 05:59:06PM -0700, Chuck Silvers wrote:
> if so, then the reason for the 64k writes would be this block of code
> in ffs_write():
>
> 	if (!async && oldoff >> 16 != uio->uio_offset >> 16) {
> 		mutex_enter(vp->v_interlock);
> 		error = VOP_PUTPAGES(vp, (oldoff >> 16) << 16,
> 		    (uio->uio_offset >> 16) << 16,
> 		    PGO_CLEANIT | PGO_JOURNALLOCKED | PGO_LAZY);
> 		if (error)
> 			break;
> 	}

Upon reflection, I do not understand how this works.  Consider a write() of
131072 starting at file offset 0.  oldoff >> 16 will be 0; uio->uio_offset
will be 131072 after the call to ubc_uiomove() (unless this is the source of
my misunderstanding; otherwise, I don't see how uio->uio_offset gets updated
at all).

Now, we VOP_PUTPAGES(vp, 0, (131072 >> 16) << 16), which is of course
VOP_PUTPAGES(vp, 0, 131072).  So why does this only push out 64K?

Thor
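[Editorial aside: a quick sanity check of that arithmetic, in userland and
for illustration only.  (x >> 16) << 16 rounds x down to a 64K boundary, so
an exact 128K offset passes through unchanged.]

	#include <assert.h>

	int
	main(void)
	{
		/* (x >> 16) << 16 rounds x down to a multiple of 64K. */
		assert(((131072 >> 16) << 16) == 131072);	/* exact 128K: unchanged */
		assert((((65536 + 100) >> 16) << 16) == 65536);	/* rounds down to 64K */
		return 0;
	}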
Re: CVS commit: [tls-maxphys] src/sys/dev
On Wed, Oct 10, 2012 at 11:48:37AM -0400, Thor Lancelot Simon wrote:
> On Tue, Oct 09, 2012 at 05:59:06PM -0700, Chuck Silvers wrote:
> > if so, then the reason for the 64k writes would be this block of code
> > in ffs_write():
> >
> > 	if (!async && oldoff >> 16 != uio->uio_offset >> 16) {
> > 		mutex_enter(vp->v_interlock);
> > 		error = VOP_PUTPAGES(vp, (oldoff >> 16) << 16,
> > 		    (uio->uio_offset >> 16) << 16,
> > 		    PGO_CLEANIT | PGO_JOURNALLOCKED | PGO_LAZY);
> > 		if (error)
> > 			break;
> > 	}
>
> Upon reflection, I do not understand how this works.  Consider a write() of
> 131072 starting at file offset 0.  oldoff >> 16 will be 0; uio->uio_offset
> will be 131072 after the call to ubc_uiomove() (unless this is the source
> of my misunderstanding; otherwise, I don't see how uio->uio_offset gets
> updated at all).
>
> Now, we VOP_PUTPAGES(vp, 0, (131072 >> 16) << 16), which is of course
> VOP_PUTPAGES(vp, 0, 131072).  So why does this only push out 64K?

I don't think you can have oldoff at 0 and uio->uio_offset at 131072,
because the write is split in smaller chunks by the
while (uio->uio_resid > 0) { } loop.  The chunk size will be:

	bytelen = MIN(fs->fs_bsize - blkoffset, uio->uio_resid);

so we get data from userland in at most fs->fs_bsize chunks at a time, and
when we cross a 64k boundary we start a write.

--
Manuel Bouyer bou...@antioche.eu.org
     NetBSD: 26 ans d'experience feront toujours la difference
--
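[Editorial aside: a small userland sketch of the loop Manuel describes.  The
fs_bsize value is an assumption and the blkoffset/async details are omitted;
this is not the kernel code.  Because the uio is consumed in block-sized
chunks, oldoff and uio_offset are never more than one block apart, so each
flush covers exactly one 64K window.]

	#include <stdio.h>

	int
	main(void)
	{
		const long fs_bsize = 32 * 1024;	/* assumed filesystem block size */
		long off = 0, resid = 131072;		/* a single 128K write() */

		while (resid > 0) {
			long oldoff = off;
			long bytelen = resid < fs_bsize ? resid : fs_bsize;

			off += bytelen;		/* ubc_uiomove() advances uio_offset */
			resid -= bytelen;
			if ((oldoff >> 16) != (off >> 16))
				printf("VOP_PUTPAGES %ld..%ld (%ldK)\n",
				    (oldoff >> 16) << 16, (off >> 16) << 16,
				    (((off >> 16) << 16) - ((oldoff >> 16) << 16)) / 1024);
		}
		/* prints two flushes, each covering exactly 64K */
		return 0;
	}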