On 16/01/2010 00:09, Jeffry Molanus wrote:
-----Original Message-----
From: neil.per...@sun.com [mailto:neil.per...@sun.com]

I think you misunderstand the function of the ZIL. It's not a journal,
and doesn't get transferred to the pool as part of a txg. It's only ever
written, except after a crash, when it's read to do replay. See:

http://blogs.sun.com/perrin/entry/the_lumberjack

I also read another blog[1]; the part of interest here is this:

The ZIL behaves differently for writes of different sizes. For
small writes, the data is stored as part of the log record. For writes
greater than zfs_immediate_write_sz (64KB), the ZIL does not store a copy of
the write, but rather syncs the write to disk, and only a pointer to the synced
data is stored in the log record.

If I understand this right, writes smaller than 64KB get stored on the SSD devices.


If an application requests a synchronous write, it is committed to the ZIL immediately, and once that is done the I/O is acknowledged to the application. But the data written to the ZIL is still in memory as part of the currently open txg, and it will be committed to the pool with no need to read anything from the ZIL. Then there is the optimization you described above, so that the data blocks do not necessarily need to be written to the log, just pointers to them.
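
To make the size-based optimization concrete, here is a rough sketch in Python. It is only a toy model and not the actual ZFS code: the LogRecord type, the make_log_record function and the block_ptr field are names made up for illustration, and the 64KB threshold is simply the figure from the blog text quoted above.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold as given in the quoted blog text (assumption: the real
# tunable's default may differ between releases).
IMMEDIATE_WRITE_SZ = 64 * 1024


@dataclass
class LogRecord:
    """Hypothetical log record for one synchronous write."""
    offset: int
    length: int
    data: Optional[bytes]     # payload copied into the record (small writes)
    block_ptr: Optional[int]  # made-up pointer to data already on disk (large writes)


def make_log_record(offset: int, payload: bytes, block_ptr: int) -> LogRecord:
    """Build a log record following the rule described in the quoted text."""
    if len(payload) > IMMEDIATE_WRITE_SZ:
        # Large write: the data itself is synced to disk elsewhere,
        # and the record carries only a pointer to it.
        return LogRecord(offset, len(payload), None, block_ptr)
    # Small write: the data travels inside the log record.
    return LogRecord(offset, len(payload), bytes(payload), None)
```

With this model, a 4KB write yields a record that carries its own data, while a 128KB write yields a record that carries only the pointer.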

Now it is slightly more complicated, as you need to take into account the logbias property and the possibility that a dedicated ZIL device is present.

As Neil wrote, ZFS will read from the ZIL only if, while importing a pool, it detects that there is data in the ZIL which hasn't been committed to the pool yet, which can happen after a system reset, power loss, or devices suddenly disappearing.
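
The whole life cycle can be sketched as a toy model, again in Python and again with made-up names (ToyPool, sync_write, commit_txg, crash, import_pool are not real ZFS interfaces). It ignores logbias and dedicated log devices, and it replays records straight into the pool, which is a simplification. The point is only to show that the log is written on every synchronous write but read solely on import after an unclean stop.

```python
class ToyPool:
    """Toy model of sync writes, txg commit, and log replay (not real ZFS)."""

    def __init__(self):
        self.pool = {}      # data committed to the pool (survives a crash)
        self.open_txg = {}  # in-memory changes of the currently open txg
        self.zil = []       # log records on stable storage (survive a crash)
        self.zil_reads = 0  # how many times the log was actually read

    def sync_write(self, key, value):
        self.zil.append((key, value))  # committed to the log first
        self.open_txg[key] = value     # the same data is still in memory
        return True                    # only now is the I/O acknowledged

    def commit_txg(self):
        # The txg is written from memory; the log is never read here.
        self.pool.update(self.open_txg)
        self.open_txg.clear()
        self.zil.clear()  # records covered by the txg are no longer needed

    def crash(self):
        self.open_txg.clear()  # memory is lost; pool and log remain

    def import_pool(self):
        """Replay leftover log records, if any; return how many were replayed."""
        replayed = 0
        if self.zil:
            self.zil_reads += 1
            for key, value in self.zil:
                self.pool[key] = value
                replayed += 1
            self.zil.clear()
        return replayed
```

In normal operation (sync_write followed by commit_txg) zil_reads stays at zero. Only a crash between the two leaves records behind for import_pool to replay.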

--
Robert Milkowski
http://milek.blogspot.com


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
