Re: [zfs-discuss] Is write(2) made durable atomically?

2009-12-01 Thread Robert Milkowski

Neil Perrin wrote:




> > Under the hood in ZFS, writes are committed using either shadow paging or
> > logging, as I understand it. So I believe that I mean to ask whether a
> > write(2), pushed to ZPL, and pushed on down the stack, can be split into
> > multiple transactions? Or, instead, is it guaranteed to be committed in a
> > single transaction, and so committed atomically?
>
> A write made through the ZPL (zfs_write()) will be broken into transactions
> that contain at most 128KB user data. So a large write could well be split
> across transaction groups, and thus committed separately.


So what happens if an application does a synchronous write of, let's say,
512KB? The write will be split into at least 4 separate transactions, and the
write will be confirmed to the application only after all 512KB has been
written. But is it possible that the system crashes after the first
transaction has been committed, so that 128KB of the write is committed to
disk even though the write was never confirmed to the application? Or will it
be rolled back? Basically, for synchronous writes of more than 128KB, is it
guaranteed that either all of the data in a given write is committed or none
of it is?
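
To make the scenario concrete, here is a toy model in Python. It is only a
sketch: durable_states() is a hypothetical helper, and it assumes that chunks
of at most 128KB become durable one at a time, in order, and are never rolled
back. Whether ZFS actually behaves that way is exactly the question asked
above; the model just enumerates what a reader could find after a crash if
it did.

```python
def durable_states(data, limit=128 * 1024):
    """List the file contents possible after a crash in this toy model.

    Assumption (not confirmed in this thread): each chunk of at most
    `limit` bytes becomes durable in order and is never rolled back.
    """
    # Byte counts at which a crash could leave the file: after 0 chunks,
    # after 1 chunk, and so on, up to the complete write.
    boundaries = list(range(0, len(data), limit)) + [len(data)]
    return [data[:b] for b in boundaries]


if __name__ == "__main__":
    payload = b"a" * (512 * 1024)
    for state in durable_states(payload):
        print(len(state) // 1024, "KB durable")
```

Under these assumptions a 512KB write has five possible outcomes, and only
the first (nothing) and the last (everything) are all-or-nothing.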


--
Robert Milkowski
http://milek.blogspot.com


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is write(2) made durable atomically?

2009-11-30 Thread Chris Frost
On Mon, Nov 30, 2009 at 10:23:06PM -0800, Chris Frost wrote:
> On Mon, Nov 30, 2009 at 11:03:07PM -0700, Neil Perrin wrote:
> > A write made through the ZPL (zfs_write()) will be broken into transactions
> > that contain at most 128KB user data. So a large write could well be split
> > across transaction groups, and thus committed separately. 
> 
> That answers my exact question; thanks!

For my PhD thesis I am working on file systems that build on shadow paging
and am interested in the design choices behind ZFS. Off the top of your
head, how fundamental would you say it is for the system to split each
zfs_write() into transactions <=128KB in size? That is, could the system
support far larger transactions easily and efficiently? Could the system
be made to support transactions that are bounded in size only by free
pool space?

thanks again,
-- 
Chris Frost
http://www.frostnet.net/chris/


Re: [zfs-discuss] Is write(2) made durable atomically?

2009-11-30 Thread Chris Frost
On Mon, Nov 30, 2009 at 11:03:07PM -0700, Neil Perrin wrote:
> A write made through the ZPL (zfs_write()) will be broken into transactions
> that contain at most 128KB user data. So a large write could well be split
> across transaction groups, and thus committed separately. 

That answers my exact question; thanks!

And Richard, thanks, too. Sorry that my question wasn't stated clearly
enough to avoid causing confusion about whether I asked about the timing
of durability vs. the atomicity of writes with respect to failures.

-- 
Chris Frost
http://www.frostnet.net/chris/


Re: [zfs-discuss] Is write(2) made durable atomically?

2009-11-30 Thread Neil Perrin





> Under the hood in ZFS, writes are committed using either shadow paging or
> logging, as I understand it. So I believe that I mean to ask whether a
> write(2), pushed to ZPL, and pushed on down the stack, can be split into
> multiple transactions? Or, instead, is it guaranteed to be committed in a
> single transaction, and so committed atomically?


A write made through the ZPL (zfs_write()) will be broken into transactions
that contain at most 128KB user data. So a large write could well be split
across transaction groups, and thus committed separately. 
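
As a rough illustration of the splitting described above, the following
Python sketch cuts one write into chunks of at most 128KB. It is a model, not
the zfs_write() code: split_write() and its parameters are hypothetical, and
the optional rule that chunks end on 128KB-aligned file offsets is an
assumption of this sketch, not something stated in this thread.

```python
MAX_TX_DATA = 128 * 1024  # per-transaction user-data limit quoted above


def split_write(offset, length, limit=MAX_TX_DATA, align=True):
    """Return (offset, nbytes) chunks, each carrying at most `limit` bytes.

    With align=True, every chunk ends at a multiple of `limit`. That
    alignment rule is an assumption of this model.
    """
    chunks = []
    end = offset + length
    while offset < end:
        # Room left before the next aligned boundary, or a full chunk.
        room = limit - (offset % limit) if align else limit
        nbytes = min(room, end - offset)
        chunks.append((offset, nbytes))
        offset += nbytes
    return chunks


if __name__ == "__main__":
    print(len(split_write(0, 512 * 1024)))     # aligned 512KB write
    print(len(split_write(4096, 512 * 1024)))  # same size, unaligned start
```

In this model an aligned 512KB write yields 4 chunks, while the same write
starting at offset 4096 yields 5 when chunks are cut at aligned boundaries.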


Neil.


Re: [zfs-discuss] Is write(2) made durable atomically?

2009-11-30 Thread Richard Elling

On Nov 30, 2009, at 8:30 PM, Chris Frost wrote:


> Will a write(2) to a ZFS file be made durable atomically?


Yes or no, as specified by the options set at open(2).

Note: it is worthwhile to know if your hardware honors cache flush
requests, otherwise all bets are off.
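
For reference, a minimal sketch of what "options set at open(2)" can look like
from an application, written in Python on top of the os module. The helper
names write_sync() and write_then_fsync() are made up for this example, and
O_DSYNC is not defined on every platform, so the code falls back to O_SYNC or
to no flag at all. These options control when a write is reported as durable;
they say nothing about whether a large write is all-or-nothing after a crash.

```python
import os

# O_DSYNC asks that write(2) return only once the data is on stable storage.
# Fall back to O_SYNC, or to 0 where neither flag exists.
SYNC_FLAG = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))


def _write_all(fd, data):
    """Loop because os.write() may write fewer bytes than requested."""
    written = 0
    while written < len(data):
        written += os.write(fd, data[written:])
    return written


def write_sync(path, data):
    """Write through a descriptor opened with a synchronous flag."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | SYNC_FLAG
    fd = os.open(path, flags, 0o600)
    try:
        return _write_all(fd, data)
    finally:
        os.close(fd)


def write_then_fsync(path, data):
    """Write without a sync flag, then request durability with fsync(2)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = _write_all(fd, data)
        os.fsync(fd)
        return written
    finally:
        os.close(fd)
```

Either helper returns only after the operating system reports the data as
written, which still depends on the hardware honoring cache flush requests.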

> Under the hood in ZFS, writes are committed using either shadow paging or
> logging, as I understand it. So I believe that I mean to ask whether a
> write(2), pushed to ZPL, and pushed on down the stack, can be split into
> multiple transactions? Or, instead, is it guaranteed to be committed in a
> single transaction, and so committed atomically?


ZPL is the ZFS POSIX Layer.  I believe you meant to say the ZFS intent
log (ZIL) instead.  There is a lot of material online about the ZIL and how
it works. Neil's blog is often cited:
http://blogs.sun.com/perrin/entry/the_lumberjack

 -- richard
