Peter Cudhea wrote:
> Your point is well taken that ZFS should not duplicate functionality 
> that is already or should be available at the device driver level.    In 
> this case, I think it misses the point of what ZFS should be doing that 
> it is not.
> 
> ZFS does its own periodic commits to the disk, and it knows if those 
> commit points have reached the disk or not, or whether they are getting 
> errors.    In this particular case, those commits to disk are presumably 
> failing, because one of the disks they depend on has been removed from 
> the system.   (If the writes are not being marked as failures, that 
> would definitely be an error in the device driver, as you say.)  In this 
> case, however, the ZIL log has stopped being updated, but ZFS does 
> nothing to announce that this has happened, or to indicate that a remedy 
> is required.

I think you have some misconceptions about how the ZIL works.
It doesn't provide journalling like UFS. The following might help:

http://blogs.sun.com/perrin/entry/the_lumberjack

The ZIL isn't used at all unless there's fsync/O_DSYNC activity.
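
For illustration, here is a rough sketch of the two application-side patterns that would generate that kind of activity: an explicit fsync() after writing, or opening the file with O_DSYNC so each write is synchronous. It is Python rather than anything from ZFS itself, the helper name write_durably is made up for this example, and it assumes a POSIX-like system (O_DSYNC is not defined on every platform, so the code falls back to fsync() where it is missing).

```python
import os


def write_durably(path, data, use_dsync=False):
    """Write bytes to path; return only after the OS reports them stable.

    Hypothetical helper for illustration. Any failure surfaces as OSError.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # O_DSYNC may be absent on some platforms; 0 means "not available".
    dsync_flag = getattr(os, "O_DSYNC", 0) if use_dsync else 0
    fd = os.open(path, flags | dsync_flag, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        if not dsync_flag:
            # Without O_DSYNC, request the flush to stable storage explicitly.
            os.fsync(fd)
    finally:
        os.close(fd)
```

A plain write() with neither of these only lands in memory and goes out with a later periodic commit, which is why such writes never touch the ZIL.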

> 
> At the very least, it would be extremely helpful if  ZFS had a status to 
> report that indicates that the ZIL log is out of date, or that there are 
> troubles writing to the ZIL log, or something like that.

If the ZIL cannot be written then we force a transaction group (txg)
commit. That is the only recourse to force data to stable storage before
returning to the application. 

> 
> An additional feature would be to have user-selectable behavior when the 
> ZIL log is significantly out of date.    For example, if the ZIL log is 
> more than X seconds out of date, then new writes to the system should 
> pause, or give errors or continue to silently succeed.

Again, this doesn't make sense given how the ZIL works.

> 
> In an earlier phase of my career when I worked for a database company, I 
> was responsible for a similar bug.   It caused a major customer to lose 
> a major amount of data when a system rebooted when not all good data had 
> been successfully committed to disk.    The resulting stink caused us to 
> add a feature to detect the cases when the writing-to-disk process had 
> fallen too far behind, and to pause new writes to the database until the 
> situation was resolved.
> 
> Peter
> 
> Bob Friesenhahn wrote:
>> While I do believe that device drivers, or the fault system, should 
>> notify ZFS when a device fails (and ZFS should appropriately react), I 
>> don't think that ZFS should be responsible for fault monitoring.  ZFS 
>> is in a rather poor position for device fault monitoring, and if it 
>> attempts to do so then it will be slow and may misbehave in other 
>> ways.  The software which communicates with the device (i.e. the 
>> device driver) is in the best position to monitor the device.
>>
>> The primary goal of ZFS is to be able to correctly read data which was 
>> successfully committed to disk.  There are programming interfaces 
>> (e.g. fsync(), msync()) which may be used to ensure that data is 
>> committed to disk, and which should return an error if there is a 
>> problem.  If you were performing your tests over an NFS mount then the 
>> results should be considerably different since NFS requests that its 
>> data be committed to disk.
>>
>> Bob
>> ======================================
>> Bob Friesenhahn
>> [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
>> GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
>>
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>>   
