Re: [zfs-discuss] zfs set readonly=on does not entirely go into read-only mode

2010-08-27 Thread Ian Collins

On 08/28/10 11:13 AM, Robert Milkowski wrote:

Hi,

When I set readonly=on on a dataset then no new files are allowed to 
be created.

However writes to already opened files are allowed.

This is rather counterintuitive - if I set a filesystem as read-only 
I would expect it not to allow any modifications to it.


I think it shouldn't behave this way and it should be considered as a 
bug.


What do you think?


No.

Think of this from the perspective of an application. How would write 
failure be reported?  open(2) returns EACCES if the file can not be 
written but there isn't a corresponding return from write(2).  Any open 
file descriptors would have to be updated to reflect the change of 
access and the application would end up with an unexpected error return 
(EBADF?).


If the application has been given permission to open a file for writing 
and this permission is unexpectedly revoked, strange things may happen.  
The file being written would be in an inconsistent state.


I think it is better to let the write operation complete and leave the file 
in a consistent state.
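
The point that access is checked by open(2) and not again by write(2) can be seen with ordinary file permissions, independent of ZFS. What follows is a minimal Python sketch under stated assumptions: the helper name write_after_revoke and the temporary file are purely illustrative, and chmod stands in for the permission change; it does not exercise the readonly property itself.

```python
import os
import stat
import tempfile


def write_after_revoke(payload: bytes) -> bytes:
    """Open a file for writing, drop its write permission, then write anyway.

    Returns the file contents afterwards. This only illustrates that
    permissions are checked when the file is opened; it does not touch ZFS.
    """
    # mkstemp returns a descriptor that is already open for reading and writing.
    fd, path = tempfile.mkstemp()
    try:
        # Revoke write permission on the path after the descriptor exists.
        os.chmod(path, stat.S_IRUSR)
        # The existing descriptor is unaffected, so the write still succeeds.
        os.write(fd, payload)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(write_after_revoke(b"still writable\n"))
```

A fresh open for writing by a non-root user would be refused after the chmod, while the descriptor obtained earlier keeps working, which mirrors the behaviour reported for readonly=on.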


--

Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs set readonly=on does not entirely go into read-only mode

2010-08-27 Thread Ian Collins

On 08/28/10 12:05 PM, Ian Collins wrote:

On 08/28/10 11:13 AM, Robert Milkowski wrote:

Hi,

When I set readonly=on on a dataset then no new files are allowed to 
be created.

However writes to already opened files are allowed.

This is rather counterintuitive - if I set a filesystem as read-only 
I would expect it not to allow any modifications to it.


I think it shouldn't behave this way and it should be considered as a 
bug.


What do you think?


No.

Think of this from the perspective of an application. How would write 
failure be reported?  open(2) returns EACCES if the file can not be 
written but there isn't a corresponding return from write(2).  Any 
open file descriptors would have to be updated to reflect the change 
of access and the application would end up with an unexpected error 
return (EBADF?).


If the application has been given permission to open a file for 
writing and this permission is unexpectedly revoked, strange things may 
happen.  The file being written would be in an inconsistent state.


I think it is better to let the write operation complete and leave the 
file in a consistent state.


Following on from my own reply, I think that if there is a bug, it is 
letting the change occur when there are open files.  Setting the 
filesystem read-only is effectively remounting it (the equivalent on 
other filesystems) so it should behave in the same way as an unmount in 
the presence of open files.


--
Ian.



Re: [zfs-discuss] zfs set readonly=on does not entirely go into read-only mode

2010-08-27 Thread Nicolas Williams
On Sat, Aug 28, 2010 at 12:05:53PM +1200, Ian Collins wrote:
 Think of this from the perspective of an application. How would
 write failure be reported?  open(2) returns EACCES if the file can
 not be written but there isn't a corresponding return from write(2).
 Any open file descriptors would have to be updated to reflect the
 change of access and the application would end up with an unexpected
 error return (EBADF?).

EROFS.  But write(2) isn't supposed to return EROFS.  NFSv3's and v4's
write ops are allowed to return the NFS equivalent of EROFS, and so
typically NFS clients do cause write(2) to return EROFS in such cases
(but then, NFS isn't fully POSIX).

write(2) can return EIO though, and, IIRC, the BSD revoke(2) syscall
arranges for just that to be returned by write(2) calls on revoked
fildes.

IMO EROFS and EIO would both be OK.  It might be a good idea to require
a force option to make a change that would cause non-POSIX behavior.
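
From the application's side, whichever errno is chosen shows up as a failed write(2) that has to be handled. A minimal Python sketch of such handling follows; the name checked_write and the WRITE_ERRORS table are illustrative assumptions, and the only error the demo actually provokes is EBADF, by writing to a descriptor opened read-only.

```python
import errno
import os
import tempfile

# errno values mentioned in this thread, each with a short description.
WRITE_ERRORS = {
    errno.EROFS: "filesystem is read-only",
    errno.EIO: "I/O error",
    errno.ENOSPC: "no space left on device",
    errno.EBADF: "descriptor is not open for writing",
}


def checked_write(fd: int, data: bytes) -> str:
    """Write data to fd and describe the outcome instead of raising."""
    try:
        written = os.write(fd, data)
    except OSError as exc:
        return WRITE_ERRORS.get(exc.errno, f"unexpected errno {exc.errno}")
    if written != len(data):
        return f"short write: {written} of {len(data)} bytes"
    return "ok"


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    print(checked_write(fd, b"hello"))         # normal, writable descriptor
    os.close(fd)
    ro_fd = os.open(path, os.O_RDONLY)
    print(checked_write(ro_fd, b"hello"))      # descriptor opened read-only
    os.close(ro_fd)
    os.unlink(path)
```

An application written this way would cope with EROFS or EIO appearing on a previously writable descriptor, at least to the extent of reporting it cleanly.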

I think there are many possible ways to handle this:

a) disallow setting readonly=on on mounted datasets that are
   readonly=off;

b) disallow ... but only if there are any fildes open for write (doesn't
   matter if shared with NFS as NFS writes are allowed to return EROFS);

c) allow the change but make it take effect on next mount;

d) force umount the dataset, make the change, mount again;

e) have write(2), to fildes open for write before the change to
   readonly=on, return EROFS after the change;

f) same as (d) but only if you force the prop change;

g) have write(2), to fildes open for write before the change to
   readonly=on, return EIO after the change;

h) allow write(2)s to fildes open for write before the change to
   readonly=on;

(h) is the current behavior.  (a) and (b) would be reasonable, but if the
change fails with EBUSY, the user may not be able to change the property
without drastic steps (such as rebooting, if there are lots of datasets
below).  (c) would be confusing, and not that useful.  (d) would be
unreasonable (plus what if there are datasets below this one?!).  (e)...
may be reasonable if you
think that we're well outside POSIX the moment you change the readonly
prop to on.  (f) is reasonable (by forcing the change you'd be saying
that you're happy to leave POSIX land).  (h) is reasonable.

 If the application has been given permission to open a file for
 writing and this permission is unexpectedly revoked, strange things
 may happen.  The file being written would be in an inconsistent
 state.

Well, there's always the BSD revoke(2) system call.  Use it and 

 I think it is better to let write operation complete and leave the
 file in a consistent state.

There is that too.  But you could, too, just power off...  The
application should use fsync(2) (or fdatasync()) carefully to ensure
that failed write(2)s and power failures don't leave the application in
an unrecoverable state.
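
One common way to use fsync(2) for this purpose is to write a new copy of the file, sync it, and then rename it over the old one. The Python sketch below shows that pattern under stated assumptions: the function name durable_replace is hypothetical, and this is one widely used technique, not necessarily the one meant above.

```python
import os
import tempfile


def durable_replace(path: str, data: bytes) -> None:
    """Replace path with data so a crash leaves either the old or new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Push the file data to stable storage before exposing it.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # A failed write or sync leaves the original file untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself is durable (POSIX systems).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "state")
    durable_replace(target, b"version 1\n")
    durable_replace(target, b"version 2\n")
    with open(target, "rb") as f:
        print(f.read())
```

If a write(2) fails partway through, whether with EROFS, EIO or ENOSPC, the previous version of the file is still intact under its original name.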

Nico


Re: [zfs-discuss] zfs set readonly=on does not entirely go into read-only mode

2010-08-27 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Ian Collins
 
  However writes to already opened files are allowed.
 
 Think of this from the perspective of an application. How would write
 failure be reported?  

Both very good points.  But I agree with Robert.  

write() has a known failure mode when the disk is full.  I agree bad things
can happen to applications that attempt write() when the disk is full ...
however ... only a user with root privs is able to set the readonly property.
I expect the root user is doing this for a reason, and is willing, able, and
aware enough to take responsibility for the consequences.

The intuitive (generally expected) thing, when you're root and you make a
filesystem readonly, is that it becomes readonly.

If that is not the behavior ... Well, I can think of at least one really
specific, important example problem.

Suppose an application writes to a file indefinitely and fills up the
filesystem.  This is a known bad thing for ZFS, sometimes causing
unrecoverable infinite IO and forcing a power-cycle (I don't have a bug # but
see here: http://opensolaris.org/jive/thread.jspa?threadID=132383&tstart=0 )
...

If you find yourself in the infinite IO, would-be-forced to power cycle
situation, the workaround is to reduce some reservation to free up space.
Then you should be able to rm, destroy, and stop scrub.  But if the
application is still infinitely writing to the open file handle that it
already owns ... then any space you can free up will just get consumed again
immediately by the bad application.

Another specific example ...

Suppose you zfs send from a primary server to a backup server.  You want
the filesystems to be readonly on the backup fileserver, in order to receive
incrementals.  If you make a mistake, and start writing to the backup server
filesystem, you want to be able to correct your mistake.  Make it readonly,
stop anything from writing to it, rollback to the unmodified snapshot, so
you're able to receive incrementals again.

If setting readonly doesn't stop open filehandles from writing ... What can
you do?  You either have to flex your brain muscle to figure out some
technique to find which application is performing writes (not always easy to
do) or you basically have to unmount and remount the filesystem to force
writes to stop, which might not be easy to do, because filehandles are in
use.  You might feel the need to simply reboot, instead of figuring out a
way to do all this.  You just complain to your colleagues and say "yeah, the
stupid thing made me reboot in order to make the filesystem readonly."



Re: [zfs-discuss] zfs set readonly=on does not entirely go into read-only mode

2010-08-27 Thread Ian Collins

On 08/28/10 12:45 PM, Edward Ned Harvey wrote:

Another specific example ...

Suppose you zfs send from a primary server to a backup server.  You want
the filesystems to be readonly on the backup fileserver, in order to receive
incrementals.  If you make a mistake, and start writing to the backup server
filesystem, you want to be able to correct your mistake.  Make it readonly,
stop anything from writing to it, rollback to the unmodified snapshot, so
you're able to receive incrementals again.
   


I think you have lost a "not" in there somewhere!

I always set all the backup filesystems on our staging server read-only 
(and atime=off, if that makes any difference to a read-only 
filesystem).  You can still receive to a read-only filesystem and 
there's -F to force roll-backs.  The exception is when adding a new 
nested filesystem; the mount will fail unless the parent is read/write.


--
Ian.



Re: [zfs-discuss] zfs set readonly=on does not entirely go into read-only mode

2010-08-27 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Ian Collins
 
 so it should behave in the same way as an unmount in
 the presence of open files.

+1

You can unmount lazily or by force; by default, the unmount fails in the
presence of open files.  (I think.)  So to keep everybody happy, let people
do whatever they want.  ;-)

Setting the readonly property should fail in the presence of open files, or
you can force it, which would truly pull the rug out from under the writing
processes.  And if the developer(s) are feeling ambitious, implement lazy
too.  ;-)



Re: [zfs-discuss] zfs set readonly=on does not entirely go into read-only mode

2010-08-27 Thread Edward Ned Harvey
 From: Ian Collins [mailto:i...@ianshome.com]
 
 On 08/28/10 12:45 PM, Edward Ned Harvey wrote:
  Another specific example ...
 
  Suppose you zfs send from a primary server to a backup server.  You want
  the filesystems to be readonly on the backup fileserver, in order to
  receive incrementals.  If you make a mistake, and start writing to the
  backup server filesystem, you want to be able to correct your mistake.
  Make it readonly, stop anything from writing to it, rollback to the
  unmodified snapshot, so you're able to receive incrementals again.
 
 I think you have lost a not in there somewhere!

I didn't miss any "not", but I may not have written it clearly.

If you *intended* to set the destination filesystem readonly before, and you
only discovered later that it's not readonly, as evidenced by the fact that
something wrote to it and now you can't receive incremental zfs snapshots...
Then you want to correct your mistake.  Whatever was writing to the backup
fileserver, it shouldn't have been.  So set the filesystem readonly,
roll back to the latest snapshot that corresponds to the primary server, so
you can again start receiving incrementals.
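
The recovery described above comes down to two zfs invocations. The Python sketch below only builds the command lines and does not run them; the function name recovery_commands, the dataset backup/data and the snapshot name are all hypothetical, chosen purely for illustration.

```python
from typing import List


def recovery_commands(dataset: str, snapshot: str) -> List[List[str]]:
    """Return the zfs invocations for the recovery, without running them."""
    return [
        # Refuse new opens for writing on the backup dataset. As discussed in
        # this thread, descriptors that are already open can still write.
        ["zfs", "set", "readonly=on", dataset],
        # Discard everything written since the snapshot shared with the
        # primary. -r also destroys any snapshots newer than the target.
        ["zfs", "rollback", "-r", f"{dataset}@{snapshot}"],
    ]


if __name__ == "__main__":
    # Hypothetical dataset and snapshot names.
    for argv in recovery_commands("backup/data", "last-common"):
        print(" ".join(argv))
```

The alternative mentioned earlier in the thread is to skip the explicit rollback and let zfs receive -F roll the destination back as part of the next incremental receive.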
