Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

FYI, I missed that this fix has already gone into illumos:

commit 5253393b09789ec67bec153b866d7285a1cf1645
Author: Matthew Ahrens mahr...@delphix.com
Date: Fri Aug 30 02:19:35 2013

    4082 zfs receive gets EFBIG from dmu_tx_hold_free()
    Reviewed by: Eric Schrock eric.schr...@delphix.com
    Reviewed by: Christopher Siden christopher.si...@delphix.com
    Reviewed by: George Wilson george.wil...@delphix.com
    Approved by: Richard Lowe richl...@richlowe.net

--matt

On Wed, Sep 11, 2013 at 7:43 AM, Jeremie Le Hen j...@freebsd.org wrote:
> On Mon, Sep 09, 2013 at 09:32:26AM +0200, Jeremie Le Hen wrote:
> > > Indeed, probably a bad key combo in vi :). I'm reverting r253821
> > > and r254753 (the second one was supposedly fixing the first one)
> > > and recompiling my kernel. I will let you know.
> >
> > So far so good, I've been able to synchronize my datasets beyond
> > the point where it crashed last time.
> >
> > Matthew, do you have any idea of a fix I could try on top of
> > FreeBSD's r253821 and r254753?
>
> There has been some debugging on the ZFS mailing-list and I have
> tested a working fix. See:
> http://www.listbox.com/member/archive/182191/2013/09/sort/time_rev/page/1/entry/1:38/20130909182626:D79EC5B8-199E-11E3-8BF5-CB08091A731B/
>
> --
> Jeremie Le Hen
> Scientists say the world is made up of Protons, Neutrons and Electrons.
> They forgot to mention Morons.

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

On Mon, Sep 09, 2013 at 09:32:26AM +0200, Jeremie Le Hen wrote:
> > Indeed, probably a bad key combo in vi :). I'm reverting r253821
> > and r254753 (the second one was supposedly fixing the first one)
> > and recompiling my kernel. I will let you know.
>
> So far so good, I've been able to synchronize my datasets beyond the
> point where it crashed last time.
>
> Matthew, do you have any idea of a fix I could try on top of
> FreeBSD's r253821 and r254753?

There has been some debugging on the ZFS mailing-list and I have
tested a working fix. See:
http://www.listbox.com/member/archive/182191/2013/09/sort/time_rev/page/1/entry/1:38/20130909182626:D79EC5B8-199E-11E3-8BF5-CB08091A731B/

--
Jeremie Le Hen
Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

On Sun, Sep 08, 2013 at 11:14:44PM +0200, Jeremie Le Hen wrote:
> On Sun, Sep 08, 2013 at 10:05:55PM +0100, Steven Hartland wrote:
> > [...]
> > Errm, was there meant to be some content in your reply Jeremie, as
> > it seems to be missing?
>
> Indeed, probably a bad key combo in vi :). I'm reverting r253821 and
> r254753 (the second one was supposedly fixing the first one) and
> recompiling my kernel. I will let you know.

So far so good, I've been able to synchronize my datasets beyond the
point where it crashed last time.

Matthew, do you have any idea of a fix I could try on top of FreeBSD's
r253821 and r254753?

--
Jeremie Le Hen
Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

On Sat, Sep 07, 2013 at 02:35:45PM +0200, Jeremie Le Hen wrote:
> Hi,
>
> I have the following panic every time I do a zfs receive on a given
> dataset.
> [...]
> panic: solaris assert: dn->dn_datablkshift != 0, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c, line: 638
> [...]

I rolled back my kernel to an arbitrary point in the past
(2013/08/01). The panic doesn't happen any more. I will try to narrow
this down by bisection, but that will be more efficient if someone has
a rough idea of where the problem was introduced.

--
Jeremie Le Hen
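Narrowing a regression down between a known-good and a known-bad revision could be sketched roughly as below. This is only an illustration: `first_bad_rev` and `is_bad` are hypothetical names, not anything used in the thread. `is_bad` stands for whatever manual procedure checks out a revision (for example with `svn update -r`), rebuilds and boots the kernel, and reports whether the panic reproduces.

```sh
# Binary search for the first bad revision.
# $1 = known-good revision, $2 = known-bad revision.
# is_bad is a hypothetical predicate supplied by the caller; it must
# return 0 when the given revision shows the panic and non-zero otherwise.
# The function body runs in a subshell so its variables stay private.
first_bad_rev() (
    lo=$1   # invariant: lo is known good
    hi=$2   # invariant: hi is known bad
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if is_bad "$mid"; then
            hi=$mid
        else
            lo=$mid
        fi
    done
    echo "$hi"
)
```

One caveat, going by the remark elsewhere in this thread that ZFS ASSERTs were not active in HEAD before r253996: if that holds, a plain bisection may point at the revision that enabled the assertions rather than at the one that added this particular ASSERT.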
Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

I believe this was added by this change set:
http://svnweb.freebsd.org/base?view=revision&revision=253821

Might want to try back out that change and see if everything works
after that?

Regards
Steve

----- Original Message -----
From: Jeremie Le Hen j...@freebsd.org
To: freebsd-current@FreeBSD.org
Sent: Sunday, September 08, 2013 9:54 AM
Subject: Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

> On Sat, Sep 07, 2013 at 02:35:45PM +0200, Jeremie Le Hen wrote:
> > [...]
>
> I rolled back my kernel to an arbitrary point in the past
> (2013/08/01). The panic doesn't happen any more. I will try to narrow
> this down by bisection, but that will be more efficient if someone
> has a rough idea of where the problem was introduced.

This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmas...@multiplay.co.uk.
Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

On Sun, Sep 08, 2013 at 03:17:09PM +0100, Steven Hartland wrote:
> I believe this was added by this change set:
> http://svnweb.freebsd.org/base?view=revision&revision=253821
>
> Might want to try back out that change and see if everything works
> after that?

Actually, I already rolled back my kernel to August 1st:

    # svn info .
    Path: .
    Working Copy Root Path: /usr/src
    URL: http://svn0.us-west.freebsd.org/base/head/sys
    Repository Root: http://svn0.us-west.freebsd.org/base
    Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
    Revision: 253847
    Node Kind: directory
    Schedule: normal
    Last Changed Author: ian
    Last Changed Rev: 253847
    Last Changed Date: 2013-07-31 21:14:00 +0200 (Wed, 31 Jul 2013)

And the problem seems to have gone away. I could perform a full zfs
send/receive whereas it would trigger a panic 100% of the time with a
recent kernel.

--
Jeremie Le Hen
Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

----- Original Message -----
From: Jeremie Le Hen j...@freebsd.org

> On Sun, Sep 08, 2013 at 03:17:09PM +0100, Steven Hartland wrote:
> > I believe this was added by this change set:
> > http://svnweb.freebsd.org/base?view=revision&revision=253821
> > [...]
>
> Actually, I already rolled back my kernel to August 1st:
> [...]
> And the problem seems to have gone away. I could perform a full zfs
> send/receive whereas it would trigger a panic 100% of the time with a
> recent kernel.

I still think r253821 is the cause, the reason being that prior to
r253996 ASSERTs in ZFS were not actually active in HEAD.

So if you could roll forward but then back out r253821 and confirm
this is indeed the cause, that would be a good starting point.

If this is indeed the cause, it would be worth engaging Matthew Ahrens
(cc'ed) to find out the reasoning behind the new ASSERT and why you may
be hitting it.

Regards
Steve
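As background on what the ASSERT checks: in the ZFS sources, `dn_datablkshift` holds the log2 of an object's data block size and is documented as zero when the block size is not a power of two. The sketch below mimics that rule with shell arithmetic. It is a userland illustration only, not the kernel code, and the function name `datablkshift` is made up for this example.

```sh
# Userland illustration only (not kernel code): print the shift for a
# data block size in bytes, or 0 when the size is not a power of two.
datablkshift() {
    size=$1
    # A power of two has exactly one bit set, so size & (size - 1) is 0.
    if [ "$size" -le 0 ] || [ $(( size & (size - 1) )) -ne 0 ]; then
        echo 0
        return
    fi
    shift_bits=0
    while [ "$size" -gt 1 ]; do
        size=$(( size >> 1 ))
        shift_bits=$(( shift_bits + 1 ))
    done
    echo "$shift_bits"
}
```

Under this rule a 131072-byte block gives a shift of 17, while a 1536-byte block gives 0, which is the value the assertion at dmu_tx.c line 638 rejects. Why the receive path reached that code with such an object is not established by this sketch.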
Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

On Sun, Sep 08, 2013 at 09:02:48PM +0100, Steven Hartland wrote:
> > [...]
> > And the problem seems to have gone away. I could perform a full
> > zfs send/receive whereas it would trigger a panic 100% of the time
> > with a recent kernel.
>
> I still think r253821 is the cause, the reason being that prior to
> r253996 ASSERTs in ZFS were not actually active in HEAD.
>
> So if you could roll forward but then back out r253821 and confirm
> this is indeed the cause, that would be a good starting point.
>
> If this is indeed the cause, it would be worth engaging Matthew
> Ahrens (cc'ed) to find out the reasoning behind the new ASSERT and
> why you may be hitting it.

--
Jeremie Le Hen
Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

----- Original Message -----
From: Jeremie Le Hen j...@freebsd.org

> On Sun, Sep 08, 2013 at 09:02:48PM +0100, Steven Hartland wrote:
> > [...]
> > I still think r253821 is the cause, the reason being that prior to
> > r253996 ASSERTs in ZFS were not actually active in HEAD.
> > [...]

Errm, was there meant to be some content in your reply Jeremie, as it
seems to be missing?

Regards
Steve
Re: Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

On Sun, Sep 08, 2013 at 10:05:55PM +0100, Steven Hartland wrote:
> [...]
> Errm, was there meant to be some content in your reply Jeremie, as it
> seems to be missing?

Indeed, probably a bad key combo in vi :). I'm reverting r253821 and
r254753 (the second one was supposedly fixing the first one) and
recompiling my kernel. I will let you know.

--
Jeremie Le Hen
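Backing out the two revisions and rebuilding could look roughly like the sketch below. `svn merge -c -REV .` reverse-merges a committed revision into the working copy, and `buildkernel` / `installkernel` are the usual FreeBSD source-tree targets. The function name `backout_cmds`, the dry-run approach of printing the commands, and the GENERIC kernel configuration are assumptions, since the thread names none of them.

```sh
# Print (rather than run) the commands to reverse-merge the given
# revisions in /usr/src and rebuild the kernel.
# KERNCONF defaults to GENERIC, which is an assumption.
backout_cmds() {
    kernconf=${KERNCONF:-GENERIC}
    echo "cd /usr/src"
    # Pass the newest revision first so the follow-up fix is undone
    # before the change it was fixing.
    for rev in "$@"; do
        echo "svn merge -c -$rev ."
    done
    echo "make buildkernel KERNCONF=$kernconf"
    echo "make installkernel KERNCONF=$kernconf"
}
```

Calling `backout_cmds 254753 253821` prints the command list for review. Printing first is a deliberate choice here, because a reverse merge can conflict with later changes and is better inspected before the kernel is rebuilt.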
Panic in ZFS: solaris assert: dn->dn_datablkshift != 0

Hi,

I have the following panic every time I do a zfs receive on a given
dataset.

For the background, I synchronize a zfs dataset every couple of
minutes using zfs send/receive. I think I recently got a panic (I'm
running mav@'s geom direct dispatch patch) which probably happened at a
bad time and left the snapshot/data in an inconsistent state. Now,
whenever my cron job runs, it triggers the panic.

The process that triggers the panic is:

    zfs receive -F data/jail/caravan

Probably relevant, on boot, I have the following message:

    Solaris: WARNING: can't open objset for data/jail/caravan/%recv

I have a core around if needed to debug. I will not try to repair the
snapshot/dataset during this weekend, to get a chance to test a patch.
Afterward I will have to start this job again.

    panic: solaris assert: dn->dn_datablkshift != 0, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c, line: 638
    cpuid = 1
    KDB: stack backtrace:
    db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00e62401a0
    kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe00e6240250
    vpanic() at vpanic+0x126/frame 0xfe00e6240290
    panic() at panic+0x43/frame 0xfe00e62402f0
    assfail() at assfail+0x22/frame 0xfe00e6240300
    dmu_tx_hold_free() at dmu_tx_hold_free+0x167/frame 0xfe00e62403e0
    dmu_free_long_range() at dmu_free_long_range+0x1f5/frame 0xfe00e6240450
    dmu_free_long_object() at dmu_free_long_object+0x1f/frame 0xfe00e6240480
    dmu_recv_stream() at dmu_recv_stream+0x86e/frame 0xfe00e62406b0
    zfs_ioc_recv() at zfs_ioc_recv+0x96c/frame 0xfe00e6240920
    zfsdev_ioctl() at zfsdev_ioctl+0x54a/frame 0xfe00e62409c0
    devfs_ioctl_f() at devfs_ioctl_f+0xf0/frame 0xfe00e6240a20
    kern_ioctl() at kern_ioctl+0x2ca/frame 0xfe00e6240a90
    sys_ioctl() at sys_ioctl+0x11f/frame 0xfe00e6240ae0
    amd64_syscall() at amd64_syscall+0x265/frame 0xfe00e6240bf0
    Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe00e6240bf0
    --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8019ddf1a, rsp = 0x7fff5c08, rbp = 0x7fff5c90 ---

--
Jeremie Le Hen
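For readers unfamiliar with this kind of setup, a periodic synchronization like the one described in the report is commonly built from an incremental `zfs send` piped into `zfs receive`. The sketch below only prints such a pipeline. The function name `sync_cmd`, the source dataset, and the snapshot names are hypothetical; only the receiving side (`zfs receive -F data/jail/caravan`) is taken from the report, and the actual cron job is not shown in the thread.

```sh
# Print the pipeline for one incremental sync step.
# The source dataset and snapshot names are hypothetical; only the
# receive side (zfs receive -F <dataset>) comes from the report.
sync_cmd() {
    src=$1    # source dataset (hypothetical, e.g. tank/jail/caravan)
    prev=$2   # snapshot already present on both sides
    cur=$3    # new snapshot to transfer
    dst=$4    # receiving dataset
    echo "zfs send -i $src@$prev $src@$cur | zfs receive -F $dst"
}
```

For example, `sync_cmd tank/jail/caravan s1 s2 data/jail/caravan` prints the send/receive pipeline for the step from snapshot s1 to s2. The `%recv` name in the boot warning is the temporary dataset that zfs receive works on while a stream is being applied, which would be consistent with a receive that was interrupted by the earlier panic.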