Re: [zfs-macos] abort zpool replace

2014-10-29 Thread ilovezfs
zpool attach makes a non-mirror into a mirror. zpool detach makes a mirror 
into a non-mirror.

I believe you are looking for zpool remove.
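
For illustration, a rough sketch (the pool and device names below are made up, not from this thread):

  zpool attach tank disk3s2 disk4s2   # attach: disk3s2 becomes a two-way mirror with disk4s2
  zpool detach tank disk4s2           # detach: the mirror goes back to a single device
  zpool remove tank disk9s2           # remove: only for things like inactive hot spares,
                                      # cache or log devices, not raidz members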

On Wednesday, October 29, 2014 12:54:48 AM UTC-7, Busty wrote:

 Wow, thanks Bjoern for that, now I really know what was going on. I 
 really appreciate the time you took to explain all that. 

 The problem I'm facing is that I can't detach the drive. A zpool 
 detach pool diskx gives me the error: 
 cannot detach diskx: only applicable to mirror and replacing vdevs. 

 I managed to format the disk as HFS+, zero the drive completely and 
 then format as zfs, but zfs still considers this disk part of the pool. 

 What can I do to get the drive out of the pool? 

Re: [zfs-macos] abort zpool replace

2014-10-29 Thread 'Busty' via zfs-macos
thanks for the input but:

only inactive hot spares can be removed, whereas I need to
remove/detach/whatever one disk of a raidz1 pool, no mirrors, no duplicates.

I get the impression there is no way to do that, so I might have to
build the pool from scratch again, am I right?



On 29.10.14 09:49, ilove...@icloud.com wrote:
 zpool attach makes a non-mirror into a mirror. zpool detach makes a mirror 
 into a non-mirror.
 
 I believe you are looking for zpool remove.
 
Re: [zfs-macos] abort zpool replace

2014-10-29 Thread Jason Belec
If I understand what I'm reading here, you have a disk that is in your pool and 
the pool is raidz, so you must always have the same number of devices attached 
to the pool; this is a raidz law. You can replace a damaged one with a new one, 
but you cannot remove the damaged one until the replace/resilver is complete. 
You cannot stop a resilver once it has begun; you're going to have to be patient. 
Once done, you can proceed with rectifying the issue. The issues you are 
running into are due to not reading up and testing before committing, and it 
seems to happen a lot. ZFS seems frustrating to you right now because it is 
doing everything possible to protect the data you're messing with. ;)


--
Jason Belec
Sent from my iPad

 On Oct 29, 2014, at 6:50 AM, 'Busty' via zfs-macos 
 zfs-macos@googlegroups.com wrote:
 
 thanks for the input but:
 
 only inactive hot spares can be removed, whereas I need to
 remove/detach/whatever one disk of a raidz1 pool, no mirrors, no duplicates.
 
 I get the impression there is no way to do that, so I might have to
 build the pool from scratch again, am I right?
 
 
Re: [zfs-macos] abort zpool replace

2014-10-29 Thread ilovezfs
Yeah, zpool remove won't work on a device in a raidz vdev, nor will zpool 
detach.

What does your current zpool status look like?

On Wednesday, October 29, 2014 4:19:31 AM UTC-7, jasonbelec wrote:

 If I understand what I'm reading here, you have a disk that is in your 
 pool and the pool is raidz, so you must always have the same number of 
 devices attached to the pool; this is a raidz law. You can replace a damaged 
 one with a new one, but you cannot remove the damaged one until the 
 replace/resilver is complete. You cannot stop a resilver once it has begun; 
 you're going to have to be patient. Once done, you can proceed with 
 rectifying the issue. The issues you are running into are due to not 
 reading up and testing before committing, and it seems to happen a lot. ZFS 
 seems frustrating to you right now because it is doing everything possible 
 to protect the data you're messing with. ;) 


 -- 
 Jason Belec 
 Sent from my iPad 

Re: [zfs-macos] abort zpool replace

2014-10-29 Thread 'Busty' via zfs-macos
Hey Jason,

not really that frustrated, as I feel I'm working my way towards the
solution with the help of you maczfs guys.

I clearly didn't think that out when telling zfs that it is ok to use
the whole disk instead of the s2 slice.

The issue seems to be that I can't tell zfs that I want to start from
scratch with that disk; zfs always recognizes the disk as already being
part of the pool. As a whole.

So, the options I see:

- I can either physically replace the disk with a new one, this time
formatting it as zfs before telling zfs to replace it

- I can build the pool from scratch

(I would go for building the pool from scratch, as the disk in question
is working when installed properly. Additionally, I don't have to buy
another disk and wait for it.)

What do you guys think: is there another option?


zpool status gives me:

Server:~ busty$ zpool status
  pool: Collection
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

NAME         STATE     READ WRITE CKSUM
Collection   ONLINE       0     0     0
  raidz1     ONLINE       0     0     0
    disk5s2  ONLINE       0     0     0
    disk4s2  ONLINE       0     0     0
    disk7s2  ONLINE       0     0     0
    disk3s2  ONLINE       0     0     0
    disk2    ONLINE       0     0     5
    disk1s2  ONLINE       0     0     0
    disk6s2  ONLINE       0     0     0

errors: No known data errors

But I bet I get a pocketful (big pocket) of errors on the disk2 when
doing a scrub, since I zeroed the disk completely.




Re: [zfs-macos] abort zpool replace

2014-10-29 Thread ilovezfs
OpenZFS on OS X has a command called zpool labelclear to handle this 
situation, but it rarely comes up because if you give OpenZFS on OS X a 
whole device, it will automatically partition it for you.

Since MacZFS does not have the zpool labelclear command, you can achieve 
the same effect by zeroing out the disk.

1) zpool offline the device
2) zero it out
3) partition it
4) zpool online the device
5) zpool replace the device with itself

You can use Disk Utility.app's Erase tab to complete step 2. Be sure to 
select writing a single pass of zeros in the Security Options.

In reality you only need to zero out the labels, but zeroing the whole 
device also works; a rough command-line sketch of the steps follows below.
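
Roughly, on the command line (the pool and device names here are the ones from 
this thread; double-check yours before zeroing anything, and the exact 
partitioning invocation is the one from the MacZFS getting-started guide):

  sudo zpool offline Collection disk2
  sudo dd if=/dev/zero of=/dev/rdisk2 bs=1m       # or Disk Utility > Erase with single-pass zeros
  # repartition the disk (GPT plus a ZFS slice) as in the MacZFS getting-started guide
  sudo zpool online Collection disk2
  sudo zpool replace Collection disk2 disk2s2     # replace the old whole-disk member with the new slice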

Re: [zfs-macos] abort zpool replace

2014-10-26 Thread Bjoern Kahl


 (this is going to be a bit longer, but since it is a recurring topic
  I'd like to provide some background information on what happens
  behind the scenes)


On 26.10.14 at 12:09, 'Busty' via zfs-macos wrote:
 This generated a follow up question:
 
 I did the zpool replace with an unformatted disk as described in
 the Oracle documentation. After that, zpool status showed the disk
 as part of the pool, but as disk2, not as disk2s2. Accordingly,
 OSX wanted to initialize the disk every time upon booting.
 
 So I formatted the disk as described in the getting started guide
 on MacZFS, which resolves the problem of OSX wanting to initialize
 the disk, but still it shows as disk2 (without the s2) with zpool
 status. I was prepared to resilver the disk again after that, but
 it was still part of the pool.
 
 I started a scrub, had 6 checksum errors on that disk right at the 
 beginning, but otherwise the scrub seems to consider the data as
 good. It is at 7 percent right now.
 
 Should I be worried that the data is not intact?

 Yes, you should.

 You basically did the following:

 1)

 Gave a whole disk to ZFS, telling it that it is OK to use the whole space
 from the first to the last block of the disk.

 ZFS did so and started writing data:

 a) its vdev labels 0 and 1, from block 0 to 1023 (assuming 512-byte blocks)

 b) its vdev labels 2 and 3, from block N-1024 to N-1 (assuming N blocks
    on the disk)

 c) your pool data in between, following its somewhat complex
    allocation scheme
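
 If your ZFS build ships the zdb debugging tool, you can dump those four
 labels directly and see which ones are still readable (device name is a
 placeholder):

   sudo zdb -l /dev/disk2     # prints vdev labels 0-3; damaged ones fail to unpack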


 2)

 Told OS X to write a disk label (aka GPT) on the disk.

 OS X did so and started writing data:

 a) A protective MBR in block 0 - no damage, ZFS anticipates
    that, leaving blocks 0 to 32 (16k) of its label alone.

 b) The primary GPT structures, starting from block 1 (byte position
    512) to the end of block 33 (byte position 17408).
    This trashed part of the configuration dictionary in vdev label 0.

 c) The secondary GPT structures, in the last 17408 bytes of the disk,
    overwriting part of the uberblock array in vdev label 3.

 d) The Mac OS X EFI area, usually around block 40 to 409600 (byte
    positions up to 200 MB). This is /dev/diskXs1.

 e) The main partition /dev/diskXs2, roughly starting at block 409640
    and extending to some blocks before the secondary GPT structures.
    This is only created, not written to, if noformat has been used.
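
 To see where those structures ended up on a given disk, something like
 this should show them (disk2 is a placeholder):

   diskutil list disk2        # slices: s1 = EFI area, s2 = main partition
   sudo gpt -r show disk2     # raw view of the MBR, primary and secondary GPT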



 What does this mean?
 


 It depends on how ZFS sees the disk.  Most likely it will continue to
 use diskX (no slice).  In that case:

 The pool keeps functioning, since vdev labels 1 and 2 are undamaged
 (0 and 3 are overwritten, see above).

 ZFS will almost instantly fix its labels, completely overwriting the
 secondary GPT.  Mac OS X doesn't care: it writes the secondary GPT and
 never looks there again.

 The situation at the start of the disk is more complex.

 ZFS will also almost instantly fix its label 0. However, this writes
 only from block 32 on (byte position 16384 onwards), since it
 completely ignores the first 16 blocks (supposed to hold disk
 identifier) and doesn't touch the next 16 in normal operation, since
 they are supposed to hold ZFS boot code and are unused in current
 implementations.

 So the rewritten vdev label 0 trashes the last 512 bytes of the primary
 GPT.  This does concern Mac OS X, and you should see a warning about an
 invalid GPT CRC in the system log after boot.
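
 One way to check for it, assuming the default log location on OS X of
 that era:

   grep -i gpt /var/log/system.log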


 So much for the administrative data structures.  What about your data?

 ZFS' data area starts after vdev label 1, i.e. at block 1024
 (byte position 512 kB).  This is somewhere inside the EFI area,
 overwriting whatever Mac OS X placed there (this depends on the version;
 older Mac OS X versions didn't place anything there, and I don't know
 about newer versions).  In any case, Mac OS X does not access the EFI
 area in normal operation, and so won't notice the damage.

 On the other hand, Mac OS X initializes the EFI area when
 initializing a disk, placing an empty FAT file system there.

 This FAT file system overwrote part of the ZFS pool data and caused the
 checksum errors.


 What to do now?
 ---

 I would detach the disk in question, zap the first and last several MB
 of disk space (i.e. of diskX itself, not of the diskXs2 slice) by
 writing zero bytes to the disk, for example using dd, reformat with
 diskutil, and reattach it as /dev/diskXs2.

 Another approach for zapping the disk content is to format it as HFS+
 with diskutil and then select clear/erase free disk space (or
 whatever the English button label says).
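
 A rough sketch of the dd variant (the disk number and size are placeholders;
 be very sure you have the right device before running this):

   DEV=rdisk2          # the whole raw disk, NOT the s2 slice
   SIZE_MB=1907729     # placeholder: the disk's size in MB, e.g. from 'diskutil info disk2'
   sudo dd if=/dev/zero of=/dev/$DEV bs=1m count=10                 # first 10 MB: vdev labels 0+1, primary GPT
   sudo dd if=/dev/zero of=/dev/$DEV bs=1m seek=$((SIZE_MB - 10))   # last 10 MB: vdev labels 2+3, backup GPT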


 Best regards

Björn

 On 23.10.14 14:01, 'Busty' via zfs-macos wrote:
 
 This was in fact easier than I thought. What did the trick was
 to physically swap the faulty disk with a new one and then zpool
 detach (faulty disk).
 
 After that a zpool replace went like a charm.
 
 Problem solved.
 
Re: [zfs-macos] abort zpool replace

2014-10-26 Thread BelecMartin
Well that sure is detailed. And should be in the wiki as it is very useful and 
a great overall explanation. ;)

Jason Belec
Sent from my It's an iPod, a Phone, and an Internet Device...

Re: [zfs-macos] abort zpool replace

2014-10-15 Thread Bjoern Kahl


 Hi 'Busty',

On 15.10.14 at 20:32, 'Busty' via zfs-macos wrote:
 In my pool, I had a disk that got a smart error (bad block), so I
 pulled it out, installed a new one and made a zpool replace
 disk5s2 806745480046791602. (That number was shown when typing
 zpool status as the missing device.)
 
 The resilver process started, but it seems that the new disk is
 faulty, because it disappears from the device list infrequently,
 but still at least every 6 hours (I have Temperature Monitor
 running which shows me all disks by serial number).
 
 So I want to change it. But zpool detach poolname dev/disk5s2
 gives the error no such device in pool.
 
 How can I abort the resilvering process? Or is there another way to
 restart the resilvering with a new disk?

 Usually, in this situation, I would do exactly what you described:
 detach the disk and attach a new one.

 zpool detach is supposed to detach any disk that can logically be
 detached (i.e. does not remove data that is stored only on that disk).

 To diagnose further, you would need to show us zpool status -v.


 The original disk with the bad block is already on its way to
 Western Digital (it was still in warranty).


 Generally, it is wiser to do the replace with the faulty disk
 still present.  In case of trouble with another disk, it still holds
 most of the data and can provide good blocks if needed by the resilver
 process.
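
 A sketch of that safer sequence, with the pool name from this thread and
 made-up device names:

   sudo zpool replace Collection disk5s2 disk8s2   # resilver onto the new disk while the old one
                                                   # can still supply good blocks
   zpool status -v                                 # watch progress; remove the old disk only when done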


 Best regards

Björn

-- 
| Bjoern Kahl   +++   Siegburg   +++   Germany          |
| googlelogin@-my-domain-   +++   www.bjoern-kahl.de    |
| Languages: German, English, Ancient Latin (a bit :-)) |

Re: [zfs-macos] abort zpool replace

2014-10-15 Thread 'Busty' via zfs-macos
zpool status -v shows:

Server:~ busty$ zpool status -v
  pool: Collection
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool
will continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0,98% done, 26h14m to go
config:

NAME                      STATE     READ WRITE CKSUM
Collection                DEGRADED     0     0     0
  raidz1                  DEGRADED     0     0     0
    disk3s2               ONLINE       0     0     0
    disk5s2               ONLINE       0     0     0
    disk7s2               ONLINE       0     0     0
    disk1s2               ONLINE       0     0     0
    replacing             DEGRADED     0     0     0
      806745480046791602  FAULTED      0     0     0  was /dev/disk5s2
      disk4               ONLINE       0     0     0
    disk2s2               ONLINE       0     0     0
    disk6s2               ONLINE       0     0     0

errors: No known data errors


Good info about letting the disk to be replaced in place until it's
done. My time was running up to send the disk away and it's somewhat
easier to just swap the disks, but I do have a spare SATA-port, so I
could do it the safer way next time.

Meanwhile, what to do with the no such device in pool?

Thanks


-- 

--- 
You received this message because you are subscribed to the Google Groups 
zfs-macos group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to zfs-macos+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.