Re: [zfs-discuss] Strange send failure

2012-02-09 Thread Cindy Swearingen

Hi Ian,

This looks like CR 7097870.

To resolve this problem, apply the latest s11 SRU to both systems.

Thanks,

Cindy

On 02/08/12 17:55, Ian Collins wrote:

Hello,

I'm attempting to dry run the send the root data set of a zone from one
Solaris 11 host to another:

sudo zfs send -r rpool/zoneRoot/zone@to_send | sudo ssh remote zfs
receive -ven fileserver/zones

But I'm seeing

cannot receive: stream has unsupported feature, feature flags = 24

The source pool version is 31, the remote pool version is 33. Both the
source filesystem and parent on the remote box are version 5.

I've never seen this before, any clues?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool fails with panic in zio_ddt_free()

2012-02-09 Thread Jim Klimov

2012-02-04 18:27, Jim Klimov wrote:

panicstr = BAD TRAP: type=e (#pf Page fault) rp=ff0010a5e920
addr=30 occurred in module zfs due to a NULL pointer dereference
panicstack = unix:die+dd () | unix:trap+1799 () | unix:cmntrap+e6
() | zfs:ddt_phys_decref+c () | zfs:zio_ddt_free+5c () |
zfs:zio_execute+8d () | genunix:taskq_thread+285 () |
unix:thread_start+8 () | crashtime = 1328314064
panic-time = February 4, 2012 04:07:44 AM MSK MSK
(end fault-list[0])


I've finally looked into the code, and the problem seems
weird indeed. With properly wrong on-disk data the code
can take such a path that it tries to dereference a NULL
pointer. This is a bug, methinks:

http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/zio.c#zio_ddt_free

   2100 static int
   2101 zio_ddt_free(zio_t *zio)
   2102 {
   2103 spa_t *spa = zio-io_spa;
   2104 blkptr_t *bp = zio-io_bp;
   2105 ddt_t *ddt = ddt_select(spa, bp);
   2106 ddt_entry_t *dde;
   2107 ddt_phys_t *ddp;
   2108
   2109 ASSERT(BP_GET_DEDUP(bp));
   2110 ASSERT(zio-io_child_type == ZIO_CHILD_LOGICAL);
   2111
   2112 ddt_enter(ddt);
   2113 freedde = dde = ddt_lookup(ddt, bp, B_TRUE);
   2114 ddp = ddt_phys_select(dde, bp);
   2115 ddt_phys_decref(ddp);
   2116 ddt_exit(ddt);
   2117
   2118 return (ZIO_PIPELINE_CONTINUE);
   2119 }

The way I see it, this function:
2109) asserts that the dedup bit is set on the blockpointer,

2113) selects a DDT entry matching the BP with ddt_lookup()
  Strangely for me, it requests creation of a new DDT
  entry if a matching one doesn't exist (B_TRUE) -
  even though this is a DDT-free routine.
  I doubt this autocreated entry gets a DVA assigned,
  but I did not trace the code that far to be certain.

2114) then it tries to find the ddp pointer to the DDT
  entry with ddt_phys_select(), which can validly return
  NULL if no DDT entries, with same DVA[0] and same
  phys_birth as the bp, exist. I speculated above that
  an autocreated DDT entry might have no DVA yet, maybe.
  I also wonder about cases where phys_birth TXGs would
  differ for some reason, even if an otherwise matching
  DDT entry exists (BP rewrite in future, some lag in
  committing-to-disk today)?
http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/ddt.c#ddt_phys_select

2115) ddt_phys_decref() just tries to use ddp as a pointer
  to structure without checking if it is NULL itself:
http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/ddt.c#ddt_phys_decref

309 void
310 ddt_phys_decref(ddt_phys_t *ddp)
311 {
312 ASSERT((int64_t)ddp-ddp_refcnt  0);
313 ddp-ddp_refcnt--;
314 }
315

According to my stacktrace in the kernel panic message,
this should be where the system panics and fails.

Seemingly this should work if on-disk data is consistent.

Since I've had some strangeness regarding corrupt data
blocks pointed to by DDT, with some hickups and reboots
during or soon-after repair attempts, I can't vouch that
my on-disk data is consistent, or that the whole block
tree is readable and accessible to the OS now.

But the code allows for failure is something goes wrong.
I'd add a lot more NULL checks for values (especially
pointers) returned by functions called from zio_ddt_free()
and/or sanity checks for values fed into such functions.

Actually, I'd rewrite this part of zio_ddt_free() like
this:

-   freedde = dde = ddt_lookup(ddt, bp, B_TRUE);
+   freedde = dde = ddt_lookup(ddt, bp, B_FALSE);
+   if (dde) {
ddp = ddt_phys_select(dde, bp);
+   if (ddp) {
ddt_phys_decref(ddp);
+   }
+   }
ddt_exit(ddt);

If ddt_lookup() finds no matching DDE, fine - we're freed.
If ddt_phys_select() finds no DDP, not so fine, but that's
not a good reason to kernel-panic. Perhaps log a message
and maybe leave a leaked block? Or relax the phys_birth check?
Am I mistaken anywhere? :)

Thanks,
//Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss