Re: ZFS panic in -CURRENT

2014-04-15 Thread R. Tyler Croy

(follow up below)

On 04/01/2014 06:57, R. Tyler Croy wrote:

On Tue, 01 Apr 2014 09:41:45 +0300
Andriy Gapon a...@freebsd.org wrote:


on 01/04/2014 02:22 R. Tyler Croy said the following:

Bumping this with more details

On Fri, 28 Mar 2014 09:53:32 -0700
R Tyler Croy ty...@monkeypox.org wrote:


Apologies for the rough format here, I had to take a picture of
this failure because I didn't know what else to do.

http://www.flickr.com/photos/agentdero/13469355463/

I'm building off of the GitHub freebsd.git mirror here, and the
latest commit in the tree is neel@'s Add an ioctl to suspend..

My dmesg/pciconf are here:
https://gist.github.com/rtyler/1faa854dff7c4396d9e8

As linked before, the dmesg and `pciconf -lv` output can be found
here: https://gist.github.com/rtyler/1faa854dff7c4396d9e8

Also in addition to the photo from before of the panic, here's
another reproduction photo:
https://www.flickr.com/photos/agentdero/13472248423/

Are you or have you ever been running with any ZFS-related kernel
patches?

Negative, I've never run any specific ZFS patches on this machine (or
any machine for that matter!)

One other unique clue might be that I'm running with an encrypted
zpool, other than that, nothing fancy here.



I've upgraded my machine to r264387 and I still experience the issue.
Here's the latest pretty picture of my panicked laptop :)
https://secure.flickr.com/photos/agentdero/13880032704/


The issue still seems to stem from a failed assertion in 
zap_leaf_lookup_closest() 
(http://svnweb.freebsd.org/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c?revision=249195&view=markup#l446)
but I'm not sure which assertion might be failing.


This is somewhat problematic because I cannot perform *any* FS 
operations with the tainted directory tree, not even a `du -hcs *` to 
find out how much space I can never access again :P



I can reproduce this consistently. If anybody has the time to get onto
IRC (rtyler on Freenode and EFNet) and debug this, I can certainly act
as remote hands with kdb to help gather more information about the panic.



Cheers


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: ZFS panic in -CURRENT

2014-04-02 Thread Andriy Gapon
on 01/04/2014 16:57 R. Tyler Croy said the following:
 On Tue, 01 Apr 2014 09:41:45 +0300
 Andriy Gapon a...@freebsd.org wrote:
 
 on 01/04/2014 02:22 R. Tyler Croy said the following:
...
 Also in addition to the photo from before of the panic, here's
 another reproduction photo:
 https://www.flickr.com/photos/agentdero/13472248423/

 Are you or have you ever been running with any ZFS-related kernel
 patches?
 
 
 Negative, I've never run any specific ZFS patches on this machine (or
 any machine for that matter!)
 
 One other unique clue might be that I'm running with an encrypted
 zpool, other than that, nothing fancy here.

Your problem looks like a corruption of on-disk data.
I can not say how it came to be or how to fix it now.

-- 
Andriy Gapon


Re: ZFS panic in -CURRENT

2014-04-02 Thread R. Tyler Croy
On Wed, 02 Apr 2014 09:58:37 +0300
Andriy Gapon a...@freebsd.org wrote:

 on 01/04/2014 16:57 R. Tyler Croy said the following:
  On Tue, 01 Apr 2014 09:41:45 +0300
  Andriy Gapon a...@freebsd.org wrote:
  
  on 01/04/2014 02:22 R. Tyler Croy said the following:
 ...
  Also in addition to the photo from before of the panic, here's
  another reproduction photo:
  https://www.flickr.com/photos/agentdero/13472248423/
 
  Are you or have you ever been running with any ZFS-related kernel
  patches?
  
  
  Negative, I've never run any specific ZFS patches on this machine
  (or any machine for that matter!)
  
  One other unique clue might be that I'm running with an encrypted
  zpool, other than that, nothing fancy here.
 
 Your problem looks like a corruption of on-disk data.
 I can not say how it came to be or how to fix it now.
 


This is concerning to me: I'm using an Intel 128GB SSD which is less
than 6 months old. If there is an actual disk-level corruption,
shouldn't that manifest itself as a zpool error?


:/

-- 

- R. Tyler Croy

--
 Code: https://github.com/rtyler
  Chatter: https://twitter.com/agentdero

  % gpg --keyserver keys.gnupg.net --recv-key 3F51E16F
--


Re: ZFS panic in -CURRENT

2014-04-02 Thread Andriy Gapon
on 02/04/2014 19:48 R. Tyler Croy said the following:
 On Wed, 02 Apr 2014 09:58:37 +0300
 Andriy Gapon a...@freebsd.org wrote:
 
 on 01/04/2014 16:57 R. Tyler Croy said the following:
 On Tue, 01 Apr 2014 09:41:45 +0300
 Andriy Gapon a...@freebsd.org wrote:

 on 01/04/2014 02:22 R. Tyler Croy said the following:
 ...
 Also in addition to the photo from before of the panic, here's
 another reproduction photo:
 https://www.flickr.com/photos/agentdero/13472248423/

 Are you or have you ever been running with any ZFS-related kernel
 patches?


 Negative, I've never run any specific ZFS patches on this machine
 (or any machine for that matter!)

 One other unique clue might be that I'm running with an encrypted
 zpool, other than that, nothing fancy here.

 Your problem looks like a corruption of on-disk data.
 I can not say how it came to be or how to fix it now.

 
 
 This is concerning to me: I'm using an Intel 128GB SSD which is less
 than 6 months old. If there is an actual disk-level corruption,
 shouldn't that manifest itself as a zpool error?

I am afraid that this is a different kind of corruption.  Either a bug (possibly
old, already fixed) in ZFS or a corruption that happened in RAM before a buffer
was sent to a disk.
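
To illustrate why corruption of this kind would not show up in `zpool status`,
here is a toy sketch. It is not ZFS code: SHA-256 stands in for whatever
checksum the pool actually uses, and write_block(), scrub() and the sample
data are hypothetical names made up for the example. It only shows that a
checksum computed over an already-damaged buffer will verify cleanly later,
while damage that occurs after the write is caught.

```python
import hashlib


def write_block(buf):
    """Simulate a write: checksum whatever is in the buffer, then store both."""
    return buf, hashlib.sha256(buf).digest()


def scrub(stored, checksum):
    """Simulate a scrub: re-read the block and compare it with its checksum."""
    return hashlib.sha256(stored).digest() == checksum


# Made-up block contents, purely for illustration.
good = b"directory entry: inbox -> object 42"

# Case 1: the buffer is damaged in RAM *before* it is checksummed and written.
damaged_in_ram = good.replace(b"42", b"4\x00")
stored, cksum = write_block(damaged_in_ram)
ram_case_passes_scrub = scrub(stored, cksum)

# Case 2: the block is written intact and damaged afterwards on the disk.
stored, cksum = write_block(good)
damaged_on_disk = stored.replace(b"42", b"4\x00")
disk_case_passes_scrub = scrub(damaged_on_disk, cksum)

print(ram_case_passes_scrub, disk_case_passes_scrub)
```

In the first case the scrub passes even though the contents are wrong, which
would be consistent with a pool that reports no errors but still trips an
assertion when the damaged structure is read.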


-- 
Andriy Gapon


Re: ZFS panic in -CURRENT

2014-04-01 Thread Andriy Gapon
on 01/04/2014 02:22 R. Tyler Croy said the following:
 Bumping this with more details
 
 On Fri, 28 Mar 2014 09:53:32 -0700
 R Tyler Croy ty...@monkeypox.org wrote:
 
 Apologies for the rough format here, I had to take a picture of this
 failure because I didn't know what else to do.

 http://www.flickr.com/photos/agentdero/13469355463/

 I'm building off of the GitHub freebsd.git mirror here, and the
 latest commit in the tree is neel@'s Add an ioctl to suspend..

 My dmesg/pciconf are here:
 https://gist.github.com/rtyler/1faa854dff7c4396d9e8
 
 
 As linked before, the dmesg and `pciconf -lv` output can be found here:
 https://gist.github.com/rtyler/1faa854dff7c4396d9e8
 
 Also in addition to the photo from before of the panic, here's another
 reproduction photo:
 https://www.flickr.com/photos/agentdero/13472248423/

Are you or have you ever been running with any ZFS-related kernel patches?

 I'm running -CURRENT as of r263881 right now, with a custom kernel
 which is built on top of the VT kernel
 (https://github.com/rtyler/freebsd/blob/5e324960f1f2b7079de369204fe228db4a2ec99d/sys/amd64/conf/KIWI)
 
 I'm able to get this panic *consistently* whenever a process accesses
 my maildir folder which I sync with the mbsync program (isync package),
 such as `mbsync personal` or when I back up the maildir with duplicity.
 The commonality seems to be listing or accessing portions of this file
 tree. Curiously enough it only seems to be isolated to that single
 portion of the filesystem tree.
 
 The zpool is also clean as far as errors go:
 
 [16:11:03] tyler:freebsd git:(master*) $ zpool status zroot
   pool: zroot
  state: ONLINE
 status: Some supported features are not enabled on the pool. The pool
         can still be used, but some features are unavailable.
 action: Enable all features using 'zpool upgrade'. Once this is done,
         the pool may no longer be accessible by software that does not
         support the features. See zpool-features(7) for details.
   scan: scrub repaired 0 in 0h18m with 0 errors on Fri Mar 28 11:55:03 2014
 config:

         NAME          STATE     READ WRITE CKSUM
         zroot         ONLINE       0     0     0
           ada0p3.eli  ONLINE       0     0     0

 errors: No known data errors
 [16:19:57] tyler:freebsd git:(master*) $
 
 
 I'm not sure what other data would be useful here, I can consistently
 see the panic, but this data is highly personal, so I'm not sure how
 much of a repro case I can give folks. :(
 
 Cheers
 


-- 
Andriy Gapon


Re: ZFS panic in -CURRENT

2014-04-01 Thread R. Tyler Croy
On Tue, 01 Apr 2014 09:41:45 +0300
Andriy Gapon a...@freebsd.org wrote:

 on 01/04/2014 02:22 R. Tyler Croy said the following:
  Bumping this with more details
  
  On Fri, 28 Mar 2014 09:53:32 -0700
  R Tyler Croy ty...@monkeypox.org wrote:
  
  Apologies for the rough format here, I had to take a picture of
  this failure because I didn't know what else to do.
 
  http://www.flickr.com/photos/agentdero/13469355463/
 
  I'm building off of the GitHub freebsd.git mirror here, and the
  latest commit in the tree is neel@'s Add an ioctl to suspend..
 
  My dmesg/pciconf are here:
  https://gist.github.com/rtyler/1faa854dff7c4396d9e8
  
  
  As linked before, the dmesg and `pciconf -lv` output can be found
  here: https://gist.github.com/rtyler/1faa854dff7c4396d9e8
  
  Also in addition to the photo from before of the panic, here's
  another reproduction photo:
  https://www.flickr.com/photos/agentdero/13472248423/
 
 Are you or have you ever been running with any ZFS-related kernel
 patches?


Negative, I've never run any specific ZFS patches on this machine (or
any machine for that matter!)

One other unique clue might be that I'm running with an encrypted
zpool, other than that, nothing fancy here.



- R. Tyler Croy

--
 Code: https://github.com/rtyler
  Chatter: https://twitter.com/agentdero

  % gpg --keyserver keys.gnupg.net --recv-key 3F51E16F
--


Re: ZFS panic in -CURRENT

2014-03-31 Thread R. Tyler Croy
Bumping this with more details

On Fri, 28 Mar 2014 09:53:32 -0700
R Tyler Croy ty...@monkeypox.org wrote:

 Apologies for the rough format here, I had to take a picture of this
 failure because I didn't know what else to do.
 
 http://www.flickr.com/photos/agentdero/13469355463/
 
 I'm building off of the GitHub freebsd.git mirror here, and the
 latest commit in the tree is neel@'s Add an ioctl to suspend..
 
 My dmesg/pciconf are here:
 https://gist.github.com/rtyler/1faa854dff7c4396d9e8


As linked before, the dmesg and `pciconf -lv` output can be found here:
https://gist.github.com/rtyler/1faa854dff7c4396d9e8

Also in addition to the photo from before of the panic, here's another
reproduction photo:
https://www.flickr.com/photos/agentdero/13472248423/

I'm running -CURRENT as of r263881 right now, with a custom kernel
which is built on top of the VT kernel
(https://github.com/rtyler/freebsd/blob/5e324960f1f2b7079de369204fe228db4a2ec99d/sys/amd64/conf/KIWI)

I'm able to get this panic *consistently* whenever a process accesses
my maildir folder which I sync with the mbsync program (isync package),
such as `mbsync personal` or when I back up the maildir with duplicity.
The commonality seems to be listing or accessing portions of this file
tree. Curiously enough it only seems to be isolated to that single
portion of the filesystem tree.

The zpool is also clean as far as errors go:

 [16:11:03] tyler:freebsd git:(master*) $ zpool status zroot
   pool: zroot
  state: ONLINE
 status: Some supported features are not enabled on the pool. The pool
         can still be used, but some features are unavailable.
 action: Enable all features using 'zpool upgrade'. Once this is done,
         the pool may no longer be accessible by software that does not
         support the features. See zpool-features(7) for details.
   scan: scrub repaired 0 in 0h18m with 0 errors on Fri Mar 28 11:55:03 2014
 config:

         NAME          STATE     READ WRITE CKSUM
         zroot         ONLINE       0     0     0
           ada0p3.eli  ONLINE       0     0     0

 errors: No known data errors
 [16:19:57] tyler:freebsd git:(master*) $


I'm not sure what other data would be useful here, I can consistently
see the panic, but this data is highly personal, so I'm not sure how
much of a repro case I can give folks. :(

Cheers
-- 

- R. Tyler Croy

--
 Code: https://github.com/rtyler
  Chatter: https://twitter.com/agentdero

  % gpg --keyserver keys.gnupg.net --recv-key 3F51E16F
--