After reviewing the many comments on this case and after further discussion
with the zfs engineers, we are making the following changes to the PSARC
case:
1. The sync property will be inherited.
2. sync=default is replaced by sync=standard
Note, sync=disabled will remain.
3. Additional text has been added to warn about setting sync=disabled
on root or /var file systems.
Here's the updated proposal:
------------------------------------
Template Version: @(#)sac_nextcase 1.70 03/30/10 SMI
This information is Copyright (c) 2010, Oracle and/or its affiliates.
All rights reserved.
1. Introduction
1.1. Project/Component Working Name:
zil synchronicity
1.2. Name of Document Author/Supplier:
Author: Neil Perrin
1.3 Date of This Document:
06 April, 2010
4. Technical Description
I am sponsoring the following fast-track for Robert Milkowski and Neil
Perrin. It introduces a new dataset property for controlling
synchronous behavior. The case requests micro/patch binding.
1. Summary
Provide administrators of zfs with the ability to control
the behavior of synchronous requests (e.g. fsync, O_DSYNC).
In particular, new capabilities are proposed to 1) delay executing the
synchronous request and 2) force all requests to be synchronous.
The current POSIX behavior of ensuring all synchronous requests are
written to stable storage would remain the default.
2. Background
Currently ZFS has no official control over synchronous behaviour.
Unfortunately, a zfs module switch (zil_disable) that disables the ZIL,
the code which enforces synchronous requests, has been fairly well
publicized. This is a global kernel variable and so affects
all file systems within all pools.
It should be noted that ZFS pools are always consistent. That is, the
intent log is not required for pool integrity. This is due to the
transactional behavior of ZFS and in particular its transaction
group commit (txg) model. What the ZIL does is simply to ensure that
synchronous write requests are committed to stable storage prior to
returning from the system call.
There are reasonable cases where the administrator understands the
consequences of disabling synchronous behavior. For example, if the
system crashes they simply start the work again from scratch, which can
be quicker overall than running with synchronous behavior enabled.
There are also reasons an administrator may want to enable synchronous
behavior for all writes. This might help debug an issue where
synchronous writes are needed.
3. Proposal
The following options and semantics are proposed for a new zfs dataset
property, 'sync':
sync=standard
This is the default option. Synchronous file system transactions
(fsync, O_DSYNC, O_SYNC, etc) are written out to the intent log,
and then all devices written to are flushed to ensure
the data is stable (not cached by device controllers).
sync=always
For the ultra-cautious, every file system transaction would
be written and flushed to stable storage before the system call
returns. This carries a significant performance penalty.
sync=disabled
Synchronous requests are disabled. File system transactions
only commit to stable storage on the next DMU transaction group
commit, which can be many seconds later. This option will give the
highest performance. However, it is very dangerous as ZFS
would be ignoring the synchronous transaction demands of
applications such as databases or NFS.
Setting sync=disabled on the currently active root or /var
filesystem may result in out-of-spec behavior, application data
loss, and increased vulnerability to replay attacks.
Administrators should only use this when these risks are understood.
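Given the risks above, an administrator may want a quick way to see which
datasets have synchronous requests disabled. The following is a minimal
sketch, not part of the proposal: the helper name list_sync_disabled is
hypothetical, and it assumes 'zfs get' reports the new 'sync' property
like any other property (the -H, -r and -o options are existing 'zfs get'
options).

```shell
# Hypothetical helper: print every dataset at or below $1 whose
# sync property is currently "disabled".
list_sync_disabled() {
    # -H: no header, tab-separated output; -r: recurse into children;
    # -o name,value: print only the dataset name and the property value.
    zfs get -H -r -o name,value sync "$1" |
        awk -F'\t' '$2 == "disabled" { print $1 }'
}

# Example (pool name borrowed from the examples in this proposal):
#   list_sync_disabled whirlpool
```

An empty result would mean no dataset under the given pool has sync=disabled.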
The property can be set when the dataset is created, or dynamically,
and will take effect immediately. To change the property, an
administrator can use the standard 'zfs' command. For example:
# zfs create -o sync=disabled whirlpool/milek
# zfs set sync=always whirlpool/perrin
The current value of 'sync' can be retrieved in the usual manner
with 'zfs get sync' or 'zfs list -o sync'. The sync property is
inherited from parent datasets.
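As an illustration of the inheritance behavior, the sketch below wraps two
existing 'zfs' subcommands. It assumes 'sync' behaves like other inheritable
properties once this case integrates; the helper names show_sync_source and
reset_sync are hypothetical.

```shell
# Hypothetical helper: show the sync value of $1 and all its descendants,
# along with the source of each value (local, default, or inherited).
show_sync_source() {
    zfs get -r -o name,value,source sync "$1"
}

# Hypothetical helper: clear a locally set sync value on $1 so the
# dataset picks up its parent's setting again.
reset_sync() {
    zfs inherit sync "$1"
}

# Example (dataset names borrowed from the examples in this proposal):
#   show_sync_source whirlpool
#   reset_sync whirlpool/perrin
```

After a reset, the source column for the dataset should no longer read
"local".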
6. Resources and Schedule
6.4. Steering Committee requested information
6.4.1. Consolidation C-team Name:
ON
6.5. ARC review type: FastTrack
6.6. ARC Exposure: open