Robert Milkowski wrote on 06/25/06 04:12:
Hello Neil,

Saturday, June 24, 2006, 3:46:34 PM, you wrote:

NP> Chris,

NP> The data will be written twice on ZFS using NFS. This is because NFS,
NP> on closing the file, internally uses fsync to cause the writes to be
NP> committed. This causes the ZIL to immediately write the data to the
NP> intent log. Later the data is also committed as part of the pool's
NP> transaction group commit, at which point the intent log blocks are freed.

NP> It does seem inefficient to doubly write the data. In fact for blocks
NP> larger than zfs_immediate_write_sz (was 64K but now 32K after 6440499
NP> was fixed) we write the data block and also an intent log record with
NP> the block pointer. During txg commit we link this block into the pool
NP> tree. By experimentation we found 32K to be the (current) cutoff point.
NP> As nfsd writes are at most 32K, they do not benefit from this.

Is 32KB easily tuned (mdb?)?

I'm not sure. NFS folk?

I guess not but perhaps.

And why only for blocks larger than zfs_immediate_write_sz?

When data is large enough (currently >32K) it's more efficient to directly
write the block, and additionally save the block pointer in a ZIL record.
Otherwise it's more efficient to copy the data into a large log block
potentially along with other writes.
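
A rough sketch of that decision, written in Python rather than taken from
the ZFS source. The names IMMEDIATE_WRITE_SZ, log_synchronous_write,
log_block and pool_blocks are made up for illustration, and the lists only
stand in for on-disk structures:

```python
# Illustrative model only; this is not the ZFS implementation.
# IMMEDIATE_WRITE_SZ stands in for zfs_immediate_write_sz (32K per this thread).
IMMEDIATE_WRITE_SZ = 32 * 1024


def log_synchronous_write(data, log_block, pool_blocks):
    """Record one synchronous write and return the intent log record.

    log_block   -- list modelling a large log block shared by many writes
    pool_blocks -- list modelling data blocks written directly to the pool
    """
    if len(data) > IMMEDIATE_WRITE_SZ:
        # Large write: write the data block once, then log only a pointer
        # to it. At txg commit the block is linked into the pool tree.
        pool_blocks.append(bytes(data))
        record = ("pointer", len(pool_blocks) - 1)
    else:
        # Small write: copy the data into the log block, potentially
        # alongside other writes. The data is written again at txg commit.
        record = ("copy", bytes(data))
    log_block.append(record)
    return record


if __name__ == "__main__":
    log, pool = [], []
    # A 32K write, the most nfsd sends at once, takes the copy path.
    print(log_synchronous_write(b"x" * (32 * 1024), log, pool)[0])
    # Anything larger than the cutoff takes the pointer path.
    print(log_synchronous_write(b"x" * (32 * 1024 + 1), log, pool)[0])
```

Note the comparison is strictly greater than, matching ">32K" above, so a
write of exactly 32K is still copied into the log.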

--

Neil
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
