Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Dick Davies
On 04/10/2007, Nathan Kroenert [EMAIL PROTECTED] wrote:

 Client A
  - import pool & make couple-o-changes

 Client B
   - import pool -f  (heh)

 Oct  4 15:03:12 fozzie panic[cpu0]/thread=ff0002b51c80:
 Oct  4 15:03:12 fozzie genunix: [ID 603766 kern.notice] assertion
 failed: dmu_read(os, smo->smo_object, offset, size, entry_map) == 0 (0x5
 == 0x0)
 , file: ../../common/fs/zfs/space_map.c, line: 339
 Oct  4 15:03:12 fozzie unix: [ID 10 kern.notice]
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51160
 genunix:assfail3+b9 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51200
 zfs:space_map_load+2ef ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51240
 zfs:metaslab_activate+66 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51300
 zfs:metaslab_group_alloc+24e ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b513d0
 zfs:metaslab_alloc_dva+192 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51470
 zfs:metaslab_alloc+82 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514c0
 zfs:zio_dva_allocate+68 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514e0
 zfs:zio_next_stage+b3 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51510
 zfs:zio_checksum_generate+6e ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51530
 zfs:zio_next_stage+b3 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515a0
 zfs:zio_write_compress+239 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515c0
 zfs:zio_next_stage+b3 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51610
 zfs:zio_wait_for_children+5d ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51630
 zfs:zio_wait_children_ready+20 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51650
 zfs:zio_next_stage_async+bb ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51670
 zfs:zio_nowait+11 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51960
 zfs:dbuf_sync_leaf+1ac ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b519a0
 zfs:dbuf_sync_list+51 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a10
 zfs:dnode_sync+23b ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a50
 zfs:dmu_objset_sync_dnodes+55 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51ad0
 zfs:dmu_objset_sync+13d ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51b40
 zfs:dsl_pool_sync+199 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51bd0
 zfs:spa_sync+1c5 ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c60
 zfs:txg_sync_thread+19a ()
 Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c70
 unix:thread_start+8 ()
 Oct  4 15:03:12 fozzie unix: [ID 10 kern.notice]

 Is this a known issue, already fixed in a later build, or should I bug it?

It shouldn't panic the machine, no. I'd raise a bug.

After spending a little time playing with iSCSI, I have to say it's
 almost inevitable that someone is going to do this by accident and panic
 a big box for what I see as no good reason. (though I'm happy to be
 educated... ;)

You use ACLs and TPGT groups to ensure 2 hosts can't simultaneously
access the same LUN by accident. You'd have the same problem with
Fibre Channel SANs.
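
For the iSCSI case, a rough sketch of that fencing with the Solaris
target's iscsitadm(1M) - the IP address, IQN, initiator alias and
target name below are invented for illustration, and option spellings
can vary between builds, so check the man page first:

  # export the target only through TPGT 1 (a specific portal/IP)
  iscsitadm create tpgt 1
  iscsitadm modify tpgt -i 192.168.10.5 1
  iscsitadm modify target -p 1 mytarget

  # allow only Client A's initiator IQN to log in
  iscsitadm create initiator --iqn iqn.1986-03.com.sun:01:clienta clienta
  iscsitadm modify target --acl clienta mytarget

With an ACL like that in place the target refuses logins from any
other IQN, so a second host never even sees the LUN to import.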
-- 
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Ben Rockwood
Dick Davies wrote:
 On 04/10/2007, Nathan Kroenert [EMAIL PROTECTED] wrote:

   
 Client A
   - import pool & make couple-o-changes

 Client B
   - import pool -f  (heh)
 

   
 [panic stack trace snipped]
 

   
 Is this a known issue, already fixed in a later build, or should I bug it?
 

 It shouldn't panic the machine, no. I'd raise a bug.

   
 After spending a little time playing with iSCSI, I have to say it's
 almost inevitable that someone is going to do this by accident and panic
 a big box for what I see as no good reason. (though I'm happy to be
 educated... ;)
 

 You use ACLs and TPGT groups to ensure 2 hosts can't simultaneously
 access the same LUN by accident. You'd have the same problem with
 Fibre Channel SANs.
   
I ran into similar problems when replicating via AVS.

benr.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Victor Engle
Wouldn't this be the known feature where a write error to ZFS forces a panic?

Vic



On 10/4/07, Ben Rockwood [EMAIL PROTECTED] wrote:
 Dick Davies wrote:
  On 04/10/2007, Nathan Kroenert [EMAIL PROTECTED] wrote:
 
 
  Client A
    - import pool & make couple-o-changes
 
  Client B
- import pool -f  (heh)
 
 
 
  [panic stack trace snipped]
 
 
 
  Is this a known issue, already fixed in a later build, or should I bug it?
 
 
  It shouldn't panic the machine, no. I'd raise a bug.
 
 
  After spending a little time playing with iSCSI, I have to say it's
  almost inevitable that someone is going to do this by accident and panic
  a big box for what I see as no good reason. (though I'm happy to be
  educated... ;)
 
 
  You use ACLs and TPGT groups to ensure 2 hosts can't simultaneously
  access the same LUN by accident. You'd have the same problem with
  Fibre Channel SANs.
 
 I ran into similar problems when replicating via AVS.

 benr.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Nathan Kroenert
I think it's a little more sinister than that...

I'm only just trying to import the pool, not even doing any I/O to it yet...

Perhaps it's the same cause, I don't know...

But I'm certainly not convinced that I'd be happy with a 25K, for
example, panicking just because I tried to import a dud pool...

I'm ok(ish) with the panic on a failed write to non-redundant storage.
I expect it by now...

Cheers!

Nathan.

Victor Engle wrote:
 Wouldn't this be the known feature where a write error to ZFS forces a panic?
 
 Vic
 
 
 
 On 10/4/07, Ben Rockwood [EMAIL PROTECTED] wrote:
 Dick Davies wrote:
 On 04/10/2007, Nathan Kroenert [EMAIL PROTECTED] wrote:


 Client A
   - import pool & make couple-o-changes

 Client B
   - import pool -f  (heh)


 [panic stack trace snipped]


 Is this a known issue, already fixed in a later build, or should I bug it?

 It shouldn't panic the machine, no. I'd raise a bug.


 After spending a little time playing with iSCSI, I have to say it's
 almost inevitable that someone is going to do this by accident and panic
 a big box for what I see as no good reason. (though I'm happy to be
 educated... ;)

 You use ACLs and TPGT groups to ensure 2 hosts can't simultaneously
 access the same LUN by accident. You'd have the same problem with
 Fibre Channel SANs.

 I ran into similar problems when replicating via AVS.

 benr.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Victor Engle
 Perhaps it's the same cause, I don't know...

 But I'm certainly not convinced that I'd be happy with a 25K, for
 example, panicking just because I tried to import a dud pool...

 I'm ok(ish) with the panic on a failed write to non-redundant storage.
 I expect it by now...


I agree, forcing a panic seems pretty severe and may cause as
much grief as it prevents. Why not just stop allowing I/O to the pool
so the sysadmin can gracefully shut down the system? Applications
would be disrupted, but no more so than they would be during a panic.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread eric kustarz

 Client A
   - import pool & make couple-o-changes

 Client B
   - import pool -f  (heh)

 Client A + B - With both mounting the same pool, touched a couple of
 files, and removed a couple of files from each client

 Client A + B - zpool export

 Client A - Attempted import and dropped the panic.
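
As a plain shell transcript (pool name invented), those steps are roughly:

  clientA# zpool import tank       # then create/change a few files
  clientB# zpool import -f tank    # force it while A still has the pool
  (touch and rm a couple of files from each client)
  clientA# zpool export tank
  clientB# zpool export tank
  clientA# zpool import tank       # panics here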


ZFS is not a clustered file system.  It cannot handle multiple  
readers (or multiple writers).  By importing the pool on multiple  
machines, you have corrupted the pool.

You purposely did that by adding the '-f' option to 'zpool import'.
Without the '-f' option, ZFS would have told you that it's already
imported on another machine (A).
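
That warning looks roughly like this (pool and host names are invented
here, and the exact wording varies by build):

  clientB# zpool import tank
  cannot import 'tank': pool may be in use from another system,
  it was last accessed by clientA (hostid: 0x1a2b3c4d)
  use '-f' to import anyway

Forcing past it while the other machine is still writing is exactly
what shreds the pool.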

There is no bug here (besides admin error :)  ).

eric
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread A Darren Dunham
On Thu, Oct 04, 2007 at 08:36:10AM -0600, eric kustarz wrote:
  Client A
  - import pool & make couple-o-changes
 
  Client B
- import pool -f  (heh)
 
  Client A + B - With both mounting the same pool, touched a couple of
  files, and removed a couple of files from each client
 
  Client A + B - zpool export
 
  Client A - Attempted import and dropped the panic.
 
 
 ZFS is not a clustered file system.  It cannot handle multiple  
 readers (or multiple writers).  By importing the pool on multiple  
 machines, you have corrupted the pool.

Yes.

 You purposely did that by adding the '-f' option to 'zpool import'.
 Without the '-f' option, ZFS would have told you that it's already
 imported on another machine (A).
 
 There is no bug here (besides admin error :)  ).

My reading is that the complaint is not about corrupting the pool.  The
complaint is that once a pool has become corrupted, it shouldn't cause a
panic on import.  It seems reasonable to detect this and fail the import
instead.

-- 
Darren Dunham   [EMAIL PROTECTED]
Senior Technical Consultant TAOShttp://www.taos.com/
Got some Dr Pepper?   San Francisco, CA bay area
  This line left intentionally blank to confuse you. 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Nathan Kroenert
Erik -

Thanks for that, but I know the pool is corrupted - that was kind of the
point of the exercise.

The bug (at least to me) is ZFS panicking Solaris just by trying to import
the dud pool.

But, maybe I'm missing your point?

Nathan.




eric kustarz wrote:

 Client A
   - import pool & make couple-o-changes

 Client B
   - import pool -f  (heh)

 Client A + B - With both mounting the same pool, touched a couple of
 files, and removed a couple of files from each client

 Client A + B - zpool export

 Client A - Attempted import and dropped the panic.

 
 ZFS is not a clustered file system.  It cannot handle multiple readers 
 (or multiple writers).  By importing the pool on multiple machines, you 
 have corrupted the pool.
 
 You purposely did that by adding the '-f' option to 'zpool import'.
 Without the '-f' option, ZFS would have told you that it's already
 imported on another machine (A).
 
 There is no bug here (besides admin error :)  ).
 
 eric
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Eric Schrock
On Fri, Oct 05, 2007 at 08:20:13AM +1000, Nathan Kroenert wrote:
 Erik -
 
 Thanks for that, but I know the pool is corrupted - that was kind of the
 point of the exercise.
 
 The bug (at least to me) is ZFS panicking Solaris just by trying to import
 the dud pool.
 
 But, maybe I'm missing your point?
 
 Nathan.

This is a variation on the "read error while writing" problem.  It is a
known issue, and a generic solution (to handle any kind of non-replicated
write failing) is in the works (see PSARC 2007/567).
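
If that work lands as a per-pool property, usage might look something
like the sketch below - the property name and values here are a guess
at the proposed interface, not a shipped feature:

  # choose what the pool does when non-replicated writes fail,
  # e.g. block I/O or return errors instead of panicking the box
  zpool set failmode=continue tank
  zpool get failmode tank

with something like wait / continue / panic as the candidate values.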

- Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Nathan Kroenert
Awesome.

Thanks, Eric. :)

This type of feature / fix is quite important to a number of the guys in
our local OSUG. In particular, they are adamant that they cannot use
ZFS in production until it stops panicking the whole box for isolated
filesystem / zpool failures.

This will be a big step. :)

Cheers.

Nathan.

Eric Schrock wrote:
 On Fri, Oct 05, 2007 at 08:20:13AM +1000, Nathan Kroenert wrote:
 Erik -

 Thanks for that, but I know the pool is corrupted - that was kind of the
 point of the exercise.

 The bug (at least to me) is ZFS panicking Solaris just by trying to import
 the dud pool.

 But, maybe I'm missing your point?

 Nathan.
 
 This is a variation on the "read error while writing" problem.  It is a
 known issue, and a generic solution (to handle any kind of non-replicated
 write failing) is in the works (see PSARC 2007/567).
 
 - Eric
 
 --
 Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss