Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-24 Thread Richard Elling

> On Mar 23, 2016, at 6:37 PM, Bob Friesenhahn wrote:
> 
> On Wed, 23 Mar 2016, Richard Elling wrote:
> 
>> 
>>> On Mar 23, 2016, at 7:49 AM, Richard Jahnel wrote:
>>> 
>>> It should be noted that using a 512e disk as a 512n disk subjects you to a
>>> significant risk of silent corruption in the event of power loss. Because
>>> a 512e disk does a read>modify>write operation to modify a 512-byte chunk
>>> of a 4k sector, zfs won't know about the other 7 corrupted 512e sectors in
>>> the event of a power loss during a write operation. So when it discards
>>> the incomplete txg on reboot, it won't do anything about the other 7 512e
>>> sectors it doesn't know were affected.
>> 
>> Disagree. The risk is no greater than HDDs today with their volatile write 
>> caches.
> 
> If the data unrelated to the current transaction group is read and then 
> partially modified (possibly with data corruption due to loss of power during 
> write), this would seem to be worse than loss due to a volatile write cache 
> (assuming drives which observe cache sync requests) since data unrelated to 
> the current transaction group may have been modified.  The end result would 
> be checksum errors during a scrub.

The old data is not modified. This is not read-destroy-modify-write.
 -- richard
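
The "other 7" sectors mentioned in the quoted text come from simple address arithmetic: a 512e drive exposes 512-byte logical sectors on top of 4096-byte physical sectors, so eight logical sectors share each physical one. A minimal Python sketch of that mapping follows. The function names are made up for illustration, and the sketch only shows which logical sectors are neighbours; it says nothing about what a particular drive's firmware does to them on power loss, which is the point under dispute above.

```python
LOGICAL = 512      # bytes per logical sector reported by a 512e drive
PHYSICAL = 4096    # bytes per physical sector on the media
PER_PHYSICAL = PHYSICAL // LOGICAL  # 8 logical sectors per physical sector


def physical_sector(lba):
    """Index of the physical 4K sector that holds logical sector `lba`."""
    return lba // PER_PHYSICAL


def neighbours(lba):
    """The other logical sectors that share a physical sector with `lba`."""
    first = physical_sector(lba) * PER_PHYSICAL
    return [n for n in range(first, first + PER_PHYSICAL) if n != lba]


if __name__ == "__main__":
    lba = 1003  # arbitrary example address
    print("physical sector:", physical_sector(lba))
    print("shares it with :", neighbours(lba))
```

A single 512-byte write to one logical sector therefore lands in a physical sector that also holds seven other logical sectors.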

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] 4kn or 512e with ashift=12

2016-03-24 Thread Fred Liu


> -----Original Message-----
> From: Chris Siebenmann [mailto:c...@cs.toronto.edu]
> Sent: Wednesday, March 23, 2016 23:33
> To: Richard Jahnel
> Cc: Chris Siebenmann; Fred Liu; omnios-discuss@lists.omniti.com
> Subject: Re: [OmniOS-discuss] 4kn or 512e with ashift=12
> 
> > It should be noted that using a 512e disk as a 512n disk subjects you
> > to a significant risk of silent corruption in the event of power loss.
> > Because a 512e disk does a read>modify>write operation to modify a
> > 512-byte chunk of a 4k sector, zfs won't know about the other
> > 7 corrupted 512e sectors in the event of a power loss during a write
> > operation. So when it discards the incomplete txg on reboot, it won't do
> > anything about the other 7 512e sectors it doesn't know were affected.
> 
>  This is true; under normal circumstances you do not want to use a 512e drive
> in an ashift=9 vdev. However, if you have a dead 512n drive and you have no
> remaining 512n spares, your choices are to run without redundancy, to wedge
> in a 512e drive and accept the potential problems on power failure (problems
> that can likely be fixed by scrubbing the pool afterwards), or obtain enough
> additional drives (and perhaps server(s)) to entirely rebuild the pool on
> 512e drives with ashift=12.
> 
>  In this situation, running with a 512e drive and accepting the performance
> issues and potential exposure to power failures is basically the lesser evil.
> (I wish ZFS was willing to accept this, but it isn't.)
> 
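
The difference between ashift=9 and ashift=12 in the quoted text comes down to alignment. ashift is the base-2 logarithm of the smallest block ZFS allocates on a vdev, so ashift=9 allows 512-byte writes while ashift=12 keeps every write a multiple of 4096 bytes. The Python sketch below checks whether a write covers whole physical sectors. The helper names are hypothetical, and it assumes the vdev itself starts on a 4K boundary of the disk (for example, an aligned partition).

```python
def min_block(ashift):
    """Smallest allocation on a vdev with the given ashift, in bytes."""
    return 1 << ashift


def needs_rmw(offset, length, physical=4096):
    """True if the write does not cover whole physical sectors, so a
    512e drive would have to read-modify-write at least one of them."""
    return offset % physical != 0 or length % physical != 0


def smallest_write_needs_rmw(ashift, physical=4096):
    """True if some block-aligned, minimum-size write at this ashift
    would only partially cover a physical sector."""
    block = min_block(ashift)
    # Try every block-aligned offset within one physical sector.
    span = max(physical, block)
    return any(needs_rmw(off, block, physical) for off in range(0, span, block))


if __name__ == "__main__":
    for ashift in (9, 12):
        print("ashift=%d min block=%d partial-sector writes possible=%s"
              % (ashift, min_block(ashift), smallest_write_needs_rmw(ashift)))
```

Under these assumptions an ashift=9 vdev can issue writes that touch only part of a 4K physical sector, while an ashift=12 vdev cannot.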
[Fred Liu]: I have a similar test here:

[root@00-25-90-74-f5-04 ~]# zpool status
  pool: tank
 state: ONLINE
  scan: resilvered 187G in 21h9m with 0 errors on Thu Jan 15 08:05:16 2015
config:

        NAME                     STATE     READ WRITE CKSUM
        tank                     ONLINE       0     0     0
          raidz2-0               ONLINE       0     0     0
            c2t45d0              ONLINE       0     0     0
            c2t46d0              ONLINE       0     0     0
            c2t47d0              ONLINE       0     0     0
            c2t48d0              ONLINE       0     0     0
            c2t49d0              ONLINE       0     0     0
            c2t52d0              ONLINE       0     0     0
            c2t53d0              ONLINE       0     0     0
            c2t44d0              ONLINE       0     0     0
        spares
          c0t5000CCA6A0C791CBd0  AVAIL

errors: No known data errors

  pool: zones
 state: ONLINE
  scan: scrub repaired 0 in 2h45m with 0 errors on Tue Aug 12 20:24:30 2014
config:

        NAME                       STATE     READ WRITE CKSUM
        zones                      ONLINE       0     0     0
          raidz2-0                 ONLINE       0     0     0
            c0t5000C500584AC07Bd0  ONLINE       0     0     0
            c0t5000C500584AC557d0  ONLINE       0     0     0
            c0t5000C500584ACB1Fd0  ONLINE       0     0     0
            c0t5000C500584AD7B3d0  ONLINE       0     0     0
            c0t5000C500584C30DBd0  ONLINE       0     0     0
            c0t5000C500586E54A3d0  ONLINE       0     0     0
            c0t5000C500586EF0CBd0  ONLINE       0     0     0
            c0t5000C50058426A0Fd0  ONLINE       0     0     0
        logs
          c4t0d0                   ONLINE       0     0     0
          c4t1d0                   ONLINE       0     0     0
        cache
          c0t55CD2E404BE9CB7Ed0    ONLINE       0     0     0

errors: No known data errors

[root@00-25-90-74-f5-04 ~]# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
   0. c0t55CD2E404BE9CB7Ed0 
  /scsi_vhci/disk@g55cd2e404be9cb7e
   1. c0t5000C500584AC07Bd0 
  /scsi_vhci/disk@g5000c500584ac07b
   2. c0t5000C500584AC557d0 
  /scsi_vhci/disk@g5000c500584ac557
   3. c0t5000C500584ACB1Fd0 
  /scsi_vhci/disk@g5000c500584acb1f
   4. c0t5000C500584AD7B3d0 
  /scsi_vhci/disk@g5000c500584ad7b3
   5. c0t5000C500584C30DBd0 
  /scsi_vhci/disk@g5000c500584c30db
   6. c0t5000C500586E54A3d0 
  /scsi_vhci/disk@g5000c500586e54a3
   7. c0t5000C500586EF0CBd0 
  /scsi_vhci/disk@g5000c500586ef0cb
   8. c0t5000C50058426A0Fd0 
  /scsi_vhci/disk@g5000c50058426a0f
   9. c0t5000CCA6A0C791CBd0 
  /scsi_vhci/disk@g5000cca6a0c791cb
  10. c0t5F0056425331d0 
  /scsi_vhci/disk@g5f0056425331
  11. c2t44d0 
  /pci@0,0/pci8086,1c10@1c/pci1000,3140@0/sd@2c,0
  12. c2t45d0 
  /pci@0,0/pci8086,1c10@1c/pci1000,3140@0/sd@2d,0
  13. c2t46d0 
  /pci@0,0/pci8086,1c10@1c/pci1000,3140@0/sd@2e,0
  14. c2t47d0 
  /pci@0,0/pci8086,1c10@1c/pci1000,3140@0/sd@2f,0
  15. c2t48d0 
  /pci@0,0/pci8086,1c10@1c/pci1000,3140@0/sd@30,0
  16. c2t49d0 
  /pci@0,0/pci8086,1c10@1c/pci1000,3140@0/sd@31,0
  17. c2t52d0 
  /pci@0,0/pci8086,1c10@1c/pci1000,3140@0/sd@34,0
  18. c2t53d0