Re: CVS commit: [tls-maxphys] src/sys/dev

2012-10-10 Thread Thor Lancelot Simon
On Wed, Oct 10, 2012 at 11:34:48AM +0200, Manuel Bouyer wrote:
 
 I'm not sure what the best way to handle this would be.
 If we assume that maxphys is a power of 2, we could use a maxphys-derived
 mask here. Otherwise, maybe we should compute and cache the largest power-of-2
 value below maxphys in v_mount, as is done in vfs_vnops.c:vn_ra_allocctx()
 (actually, this would remove ra_iochunk as we could use the mount point's
 value)

I originally did that.  There were some negative performance consequences.

With two of the SCSI (RAID, actually) controllers I have on hand, the DMA
descriptor format effectively imposes a restriction of 192K on transfer
size.  But these are exactly the sort of devices that really like to see
large writes.  The next lower power of two is 128k...

I think a lot of the clever math with shifts and masks should perhaps go
away.  Any efficiency improvements it yields are, I think, swamped by the
5,000 function calls we make to do a single I/O.

Thor
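
For reference, a minimal userland sketch of the round-down computation
Manuel describes above; rounddown_pow2() is a hypothetical name, not an
existing kernel helper. Fed the 192K controller limit, it returns the
128K figure:

	#include <stdio.h>

	/* Largest power of two not exceeding n (assumes n >= 1). */
	static unsigned long
	rounddown_pow2(unsigned long n)
	{
		unsigned long p = 1;

		while (p <= n / 2)
			p <<= 1;
		return p;
	}

	int
	main(void)
	{
		/* A controller limited to 192K transfers would be capped
		 * at 128K, giving up a third of each transfer. */
		printf("%lu\n", rounddown_pow2(192 * 1024));	/* 131072 */
		return 0;
	}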


Re: CVS commit: [tls-maxphys] src/sys/dev

2012-10-10 Thread Manuel Bouyer
On Wed, Oct 10, 2012 at 08:17:05AM -0400, Thor Lancelot Simon wrote:
 On Wed, Oct 10, 2012 at 11:34:48AM +0200, Manuel Bouyer wrote:
  
  I'm not sure what the best way to handle this would be.
  If we assume that maxphys is a power of 2, we could use a maxphys-derived
 mask here. Otherwise, maybe we should compute and cache the largest
 power-of-2 value below maxphys in v_mount, as is done in
 vfs_vnops.c:vn_ra_allocctx()
  (actually, this would remove ra_iochunk as we could use the mount point's
  value)
 
 I originally did that.  There were some negative performance consequences.
 
 With two of the SCSI (RAID, actually) controllers I have on hand, the DMA
 descriptor format effectively imposes a restriction of 192K on transfer
 size.  But these are exactly the sort of devices that really like to see
 large writes.  The next lower power of two is 128k...
 
 I think a lot of the clever math with shifts and masks should perhaps go
 away.  Any efficiency improvements it yields are, I think, swamped by the
 5,000 function calls we make to do a single I/O.

I suspect UVM wants to do batches of reads or writes in power-of-2 sizes,
aligned on power-of-2 boundaries. At least that is my feeling after playing
with the readahead code, and I wouldn't be surprised if the write
code had the same requirement.
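
What "aligned on power-of-2 boundaries" buys is that the alignment check
is a single mask test; an illustrative snippet, not code taken from UVM:

	#include <assert.h>
	#include <stdbool.h>

	/* For a power-of-2 chunk size, offset alignment is one AND. */
	static bool
	is_aligned(unsigned long long off, unsigned long long chunk)
	{
		assert((chunk & (chunk - 1)) == 0);	/* power of 2 */
		return (off & (chunk - 1)) == 0;
	}

	int
	main(void)
	{
		assert(is_aligned(131072, 65536));	/* 128k on a 64k boundary */
		assert(!is_aligned(131072 + 4096, 65536));
		return 0;
	}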

-- 
Manuel Bouyer bou...@antioche.eu.org
 NetBSD: 26 years of experience will always make the difference
--


Re: CVS commit: [tls-maxphys] src/sys/dev

2012-10-10 Thread Thor Lancelot Simon
On Wed, Oct 10, 2012 at 11:34:48AM +0200, Manuel Bouyer wrote:
 On Tue, Oct 09, 2012 at 05:59:06PM -0700, Chuck Silvers wrote:
   [...]
  with a 'cat big_file > /dev/null'
   writes are still limited to 64k ...
  
  I would hope that cat'ing a file to /dev/null wouldn't result in any 
  writes.  :-)
 I assume you meant 'cat big_file > other_file' ?
 
 I use: dd if=/dev/zero of=bigfile bs=1g count=7
 
  
  if so, then the reason for the 64k writes would be this block of code in 
  ffs_write():
  
  if (!async && oldoff >> 16 != uio->uio_offset >> 16) {
  	mutex_enter(vp->v_interlock);
  	error = VOP_PUTPAGES(vp, (oldoff >> 16) << 16,
  	    (uio->uio_offset >> 16) << 16,
  	    PGO_CLEANIT | PGO_JOURNALLOCKED | PGO_LAZY);
  	if (error)
  		break;
  }
  
 
 that's it. I did s/16/32/g in the code above and now I get 128k writes.

32?  Not 17?

Thor
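
Thor's question reflects the arithmetic behind the shift: the flush
boundary is 1 << shift, so moving from 64k to 128k flushes would be
s/16/17/, while a shift of 32 puts the boundary at 4GB. Plain arithmetic,
not code from the tree:

	#include <stdio.h>

	int
	main(void)
	{
		/* The flush fires when a write crosses a (1 << shift)
		 * boundary; these are the boundaries in question. */
		printf("1 << 16   = %d\n", 1 << 16);		/* 65536 (64k) */
		printf("1 << 17   = %d\n", 1 << 17);		/* 131072 (128k) */
		printf("1LL << 32 = %lld\n", 1LL << 32);	/* 4294967296 (4GB) */
		return 0;
	}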


Re: CVS commit: [tls-maxphys] src/sys/dev

2012-10-10 Thread Thor Lancelot Simon
On Tue, Oct 09, 2012 at 05:59:06PM -0700, Chuck Silvers wrote:
 
 if so, then the reason for the 64k writes would be this block of code in 
 ffs_write():
 
 	if (!async && oldoff >> 16 != uio->uio_offset >> 16) {
 		mutex_enter(vp->v_interlock);
 		error = VOP_PUTPAGES(vp, (oldoff >> 16) << 16,
 		    (uio->uio_offset >> 16) << 16,
 		    PGO_CLEANIT | PGO_JOURNALLOCKED | PGO_LAZY);
 		if (error)
 			break;
 	}

Upon reflection, I do not understand how this works.

Consider a write() of 131072 starting at file offset 0.

oldoff >> 16 will be 0; uio->uio_offset will be 131072 after the call to
ubc_uiomove(), unless that is the source of my misunderstanding.
Otherwise, I don't see how uio->uio_offset gets updated at all.

Now we VOP_PUTPAGES(vp, 0, (131072 >> 16) << 16), which is of course
VOP_PUTPAGES(vp, 0, 131072).  So why does this only push out 64K?

Thor


Re: CVS commit: [tls-maxphys] src/sys/dev

2012-10-10 Thread Manuel Bouyer
On Wed, Oct 10, 2012 at 11:48:37AM -0400, Thor Lancelot Simon wrote:
 On Tue, Oct 09, 2012 at 05:59:06PM -0700, Chuck Silvers wrote:
  
  if so, then the reason for the 64k writes would be this block of code in 
  ffs_write():
  
  	if (!async && oldoff >> 16 != uio->uio_offset >> 16) {
  		mutex_enter(vp->v_interlock);
  		error = VOP_PUTPAGES(vp, (oldoff >> 16) << 16,
  		    (uio->uio_offset >> 16) << 16,
  		    PGO_CLEANIT | PGO_JOURNALLOCKED | PGO_LAZY);
  		if (error)
  			break;
  	}
 
 Upon reflection, I do not understand how this works.
 
 Consider a write() of 131072 starting at file offset 0.
 
 oldoff >> 16 will be 0; uio->uio_offset will be 131072 after the call to
 ubc_uiomove(), unless that is the source of my misunderstanding.
 Otherwise, I don't see how uio->uio_offset gets updated at all.
 
 Now we VOP_PUTPAGES(vp, 0, (131072 >> 16) << 16), which is of course
 VOP_PUTPAGES(vp, 0, 131072).  So why does this only push out 64K?

I don't think you can have oldoff at 0 and uio->uio_offset at 131072,
because the write is split into smaller chunks by the
while (uio->uio_resid > 0) { ... } loop. The chunk size will be:
	bytelen = MIN(fs->fs_bsize - blkoffset, uio->uio_resid);

so we get data from userland in chunks of at most fs->fs_bsize at a time,
and when we cross a 64k boundary we start a write.
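
A minimal userland sketch of that chunking; the 32K fs_bsize and the 128k
write are assumed values, and plain offset arithmetic stands in for
ubc_uiomove():

	#include <stdio.h>

	#define MIN(a, b)	((a) < (b) ? (a) : (b))

	int
	main(void)
	{
		long long fs_bsize = 32 * 1024;		/* assumed block size */
		long long uio_offset = 0;		/* write starts at offset 0 */
		long long uio_resid = 128 * 1024;	/* one 128k write() */

		while (uio_resid > 0) {
			long long oldoff = uio_offset;
			long long blkoffset = uio_offset % fs_bsize;
			long long bytelen = MIN(fs_bsize - blkoffset, uio_resid);

			/* stands in for ubc_uiomove() copying one chunk */
			uio_offset += bytelen;
			uio_resid -= bytelen;

			/* the 64k-crossing test from ffs_write() */
			if (oldoff >> 16 != uio_offset >> 16)
				printf("flush [%lld, %lld)\n",
				    (oldoff >> 16) << 16,
				    (uio_offset >> 16) << 16);
		}
		return 0;
	}

With these numbers it prints flush [0, 65536) and flush [65536, 131072):
the single 128k write() goes out as two 64k page-outs, matching the 64k
writes observed.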

-- 
Manuel Bouyer bou...@antioche.eu.org
 NetBSD: 26 years of experience will always make the difference
--