On 6/26/06, Neil Perrin <[EMAIL PROTECTED]> wrote:


Robert Milkowski wrote on 06/25/06 04:12:
> Hello Neil,
>
> Saturday, June 24, 2006, 3:46:34 PM, you wrote:
>
> NP> Chris,
>
> NP> The data will be written twice on ZFS using NFS. This is because NFS,
> NP> on closing the file, internally uses fsync to cause the writes to be
> NP> committed. This causes the ZIL to immediately write the data to the
> NP> intent log. Later the data is also written as part of the pool's
> NP> transaction group commit, at which point the intent log blocks are
> NP> freed.
>
> NP> It does seem inefficient to doubly write the data. In fact, for
> NP> blocks larger than zfs_immediate_write_sz (was 64K, but now 32K since
> NP> 6440499 was fixed) we write the data block once and also an intent
> NP> log record containing the block pointer. During txg commit we link
> NP> this block into the pool tree. By experimentation we found 32K to be
> NP> the (current) cutoff point. As the nfsds write at most 32K, they do
> NP> not benefit from this.
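
To make that cutoff concrete: as I read the above, the logging path
chooses between copying the data into the intent log record itself and
writing the block once while logging only a pointer to it. A sketch of
that decision only -- the names are made up, this is not the actual
OpenSolaris code:

    #include <stdio.h>

    #define ZFS_IMMEDIATE_WRITE_SZ (32 * 1024)  /* 32K cutoff after 6440499 */

    /* Illustrative only: immediate vs. indirect log record choice. */
    static const char *log_strategy(size_t len)
    {
        if (len > ZFS_IMMEDIATE_WRITE_SZ) {
            /* Block written once; the log record holds just a block
             * pointer, and txg commit later links the block into the
             * pool tree. */
            return "indirect: written once, pointer logged";
        }
        /* Data copied into the intent log record, then written again
         * at txg commit -- the double write discussed above. */
        return "immediate: logged now, rewritten at txg commit";
    }

    int main(void)
    {
        size_t sizes[] = { 8 * 1024, 32 * 1024, 60 * 1024 };
        for (int i = 0; i < 3; i++)
            printf("%zuK -> %s\n", sizes[i] / 1024, log_strategy(sizes[i]));
        return 0;
    }
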
>
> Is 32KB easily tuned (mdb?)?

I'm not sure. NFS folk?

I think he is referring to the zfs_immediate_write_sz variable, but NFS
will support larger block sizes as well.
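
If it is just a global kernel variable, it ought at least to be visible
to mdb. A sketch, assuming zfs_immediate_write_sz is 64 bits wide and is
consulted on each write rather than captured once at module load:

    (read the current value, as 8-byte hex)
    # echo 'zfs_immediate_write_sz/J' | mdb -k

    (write a new value, e.g. 60k = 0xf000)
    # echo 'zfs_immediate_write_sz/Z 0xf000' | mdb -kw

I have not verified that the ZIL actually honors a live change, though.
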
Unfortunately, since the maximum IP datagram size is 64k, the largest
useful value once headers are taken into account is 60k. If that is to
be laid out as an indirect write, will it be written as 32k+16k+8k+4k
blocks (see the sketch below)? If so, this seems like it would be quite
inefficient for RAID-Z, and writes would be best left at 32k.
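
Here is the split I have in mind, assuming a greedy
largest-power-of-two layout -- my guess, not confirmed against the ZFS
code:

    #include <stdio.h>

    /* Greedy largest-power-of-two split of a 60k write; the layout is
     * an assumption, not confirmed ZFS behavior. */
    int main(void)
    {
        size_t remaining = 60 * 1024;
        size_t max_blk = 32 * 1024;      /* hypothetical largest block */

        while (remaining > 0) {
            size_t blk = max_blk;
            while (blk > remaining)
                blk /= 2;                /* drop to next power of two */
            printf("%zuK ", blk / 1024);
            remaining -= blk;
        }
        printf("\n");                    /* prints: 32K 16K 8K 4K */
        return 0;
    }

Each of those blocks would carry its own RAID-Z parity, which is where
I would expect the inefficiency to show up.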

Chris
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
