Re: [zfs-discuss] Race condition yields to kernel panic (u3, u4) or hanging zfs commands (u5)
Hello Matt,

you wrote about the panic in u3/u4:

> These stack traces look like 6569719 (fixed in s10u5).

Then I suppose it's also fixed by 127127-11, because that patch mentions 6569719. According to my zfs-hardness-test script this is true: instead of crashing with a panic, with 127127-11 these servers now show hanging zfs commands, just as on update 5. Please try my test script on a test server, or see below.

> For update 5, you could start with the kernel stack of the hung commands
> (use ::pgrep and ::findstack). We might also need the sync thread's stack
> (something like ::walk spa | ::print spa_t spa_dsl_pool->dp_tx.tx_sync_thread | ::findstack)

Okay, I'll give it a try.

$ uname -a
SunOS qacult10 5.10 Generic_137111-08 sun4u sparc SUNW,Ultra-5_10
$ head -1 /etc/release
Solaris 10 5/08 s10s_u5wos_10 SPARC
$ ps -ef | grep zfs
    root 23795 23466   0 11:02:45 pts/1   0:00 ssh localhost zfs receive hardness-test/received
    root 23782 23779   0 11:02:45 ?       0:01 zfs receive hardness-test/received
    root 23807 23804   0 11:02:52 ?       0:00 zfs receive hardness-test/received
    root 23466 23145   0 11:00:35 pts/1   0:00 /usr/bin/bash ./zfs-hardness-test.sh
    root 23793 23466   0 11:02:45 pts/1   0:00 /usr/bin/bash ./zfs-hardness-test.sh
    root 23804 23797   0 11:02:52 ?       0:00 sh -c zfs receive hardness-test/received
    root 23779     1   0 11:02:45 ?       0:00 sh -c zfs receive hardness-test/received

It seems that a receiving process (pid 23782), although already killed, has not yet finished. After killing it and aborting the data transmission, the script retries the send-receive pipe (with the same arguments), with pid 23807 on the receiving end. There must be a deadlock/race condition.
$ mdb -k
Loading modules: [ unix krtld genunix specfs dtrace ufs pcipsy ip hook neti sctp arp usba fcp fctl zfs random nfs audiosup md lofs logindmux sd ptm fcip crypto ipc ]
> ::pgrep zfs$
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R  23782  23779  23779  23779      0 0x4a004000     03000171cc90 zfs
R  23807  23804  23804  23804      0 0x4a004000     030001728058 zfs
> ::pgrep zfs$ | ::walk thread | ::findstack -v
stack pointer for thread 3d24480: 2a1007fc8c1
[ 02a1007fc8c1 cv_wait+0x38() ]
  02a1007fc971 delay+0x90(1, 183f000, 17cdef7, 17cdef8, 1, 18c0578)
  02a1007fca21 dnode_special_close+0x20(300221e0a58, 7, 1, 300221e0c68, 7, 300221e0a58)
  02a1007fcad1 dmu_objset_evict+0xb8(30003a8dc40, 300027cf500, 7b652000, 70407538, 7b652000, 70407400)
  02a1007fcb91 dsl_dataset_evict+0x34(30003a8dc40, 30003a8dc40, 0, 300027cf500, 3000418c2c0, 30022366200)
  02a1007fcc41 dbuf_evict_user+0x48(7b6140b0, 30022366200, 30003a8dc48, 0, 0, 30022355e20)
  02a1007fccf1 dbuf_rele+0x8c(30022355e78, 30022355e20, 70400400, 3, 3, 3)
  02a1007fcda1 dmu_recvbackup+0x94c(300017c7400, 300017c7d80, 300017c7c28, 300017c7416, 16, 1)
  02a1007fcf71 zfs_ioc_recvbackup+0x74(300017c7000, 0, 30004320150, 0, 0, 300017c7400)
  02a1007fd031 zfsdev_ioctl+0x15c(70401400, 57, ffbfee20, 1d, 74, ef0)
  02a1007fd0e1 fop_ioctl+0x20(30001d7a0c0, 5a1d, ffbfee20, 13, 300027da0c0, 12247f8)
  02a1007fd191 ioctl+0x184(3, 300043216f8, ffbfee20, 0, 1ec08, 5a1d)
  02a1007fd2e1 syscall_trap32+0xcc(3, 5a1d, ffbfee20, 0, 1ec08, ff34774c)
stack pointer for thread 30003d12e00: 2a1009dca41
[ 02a1009dca41 turnstile_block+0x600() ]
  02a1009dcaf1 mutex_vector_enter+0x3f0(0, 0, 30022355e78, 3d24480, 3d24480, 0)
  02a1009dcba1 dbuf_read+0x6c(30022355e20, 0, 1, 1, 0, 300220f1cf8)
  02a1009dcc61 dmu_bonus_hold+0xec(0, 15, 30022355e20, 2a1009dd5d8, 8, 0)
  02a1009dcd21 dsl_dataset_open_obj+0x2c(3000418c2c0, 15, 0, 9, 300043ebe88, 2a1009dd6a8)
  02a1009dcde1 dsl_dataset_open_spa+0x140(0, 7b64d000, 3000418c488, 300043ebe88, 2a1009dd768, 9)
  02a1009dceb1 dmu_objset_open+0x20(30003ca9000, 5, 9, 2a1009dd828, 1, 300043ebe88)
  02a1009dcf71 zfs_ioc_objset_stats+0x18(30003ca9000, 0, 0, 0, 70401400, 39)
  02a1009dd031 zfsdev_ioctl+0x15c(70401400, 39, ffbfc468, 13, 4c, ef0)
  02a1009dd0e1 fop_ioctl+0x20(30001d7a0c0, 5a13, ffbfc468, 13, 300027da010, 12247f8)
  02a1009dd191 ioctl+0x184(3, 300043208f8, ffbfc468, 0, 1010101, 5a13)
  02a1009dd2e1 syscall_trap32+0xcc(3, 5a13, ffbfc468, 0, 1010101, 7cb88)
> ::walk spa | ::print spa_t
{
    spa_name = 0x30022613108 hardness-test
    spa_avl = {
        avl_child = [ 0, 0 ]
        avl_pcb = 0x1
    }
    spa_config = 0x3002244abd0
    spa_config_syncing = 0
    spa_config_txg = 0x4
    spa_config_cache_lock = {
        _opaque = [ 0 ]
    }
    spa_sync_pass = 0x1
    spa_state = 0
    spa_inject_ref = 0
    spa_traverse_wanted = 0
    spa_sync_on = 0x1
    spa_load_state = 0 (SPA_LOAD_NONE)
    spa_zio_issue_taskq = [ 0x300225e5528, 0x300225e56d8, 0x300225e5888, 0x300225e5a38, 0x300225e5be8, 0x300225e5d98 ]
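For reference, the dcmds used above can also be run non-interactively by feeding them to mdb on stdin, e.g. from a data-collection script. A hedged sketch, not the exact commands used here: the spa_dsl_pool->dp_tx.tx_sync_thread member path is an assumption taken from the OpenSolaris sources, and on a box without mdb the script merely prints what it would run.

```shell
#!/bin/sh
# Sketch: collect the stacks requested for the u5 hang by piping dcmds
# into mdb -k. Falls back to printing the command line on systems without
# mdb, so this doubles as a dry run. The sync-thread member path is an
# assumption taken from the OpenSolaris sources, not from this thread.
run_mdb() {
    if command -v mdb >/dev/null 2>&1; then
        # Real Solaris box: execute the dcmd pipeline against the live kernel.
        echo "$1" | mdb -k
    else
        # No mdb available: just show what would be run.
        echo "mdb -k <<< \"$1\""
    fi
}

# Kernel stacks of the hung zfs processes.
run_mdb '::pgrep zfs$ | ::walk thread | ::findstack -v'
# Stack of the txg sync thread.
run_mdb '::walk spa | ::print spa_t spa_dsl_pool->dp_tx.tx_sync_thread | ::findstack'
```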
Re: [zfs-discuss] Race condition yields to kernel panic (u3, u4) or hanging zfs commands (u5)
> Could you provide the panic message and stack trace, plus the stack traces
> of when it's hung? --matt

Hello Matt,

here is the info and stack trace of a server running Update 3:

$ uname -a
SunOS qacpp03 5.10 Generic_127111-05 sun4us sparc FJSV,GPUSC-M
$ head -1 /etc/release
Solaris 10 11/06 s10s_u3wos_10 SPARC
$ grep panic: /var/adm/messages.0
Nov  4 16:04:42 qacpp03 savecore: [ID 570001 auth.error] reboot after panic: dangling dbufs (dn=600ac879410, dbuf=600475682c0)
$ mdb -k unix.4 vmcore.4
Loading modules: [ unix krtld genunix specfs dtrace ufs sd pcipsy ssd fcp fctl md emlxs ip hook neti sctp arp usba nca random zfs nfs crypto sppp ipc logindmux ptm cpc fcip lofs ]
> $C
02a103108811 vpanic(7b2e1df0, 600ac879410, 600475682c0, 600ac879638, 60047568350, 90)
02a1031088c1 dnode_evict_dbufs+0x19c(600ac879410, 1, 7b2e1c00, 6000b0be500, 1, 2a103109170)
02a103108a31 dmu_objset_evict_dbufs+0xd8(17, 0, 0, 7b2db400, 6000b0be380, 6000b0be380)
02a103108ae1 dmu_objset_evict+0xb4(600054f9480, 6000b0be380, 7b2e1800, 7095b4f0, 7b2e1800, 7095b400)
02a103108ba1 dsl_dataset_evict+0x34(600054f9480, 600054f9480, deadbeef, 6000b0be380, 60076b3a700, 6009b489200)
02a103108c51 dbuf_evict_user+0x48(7b2a3f00, 6009b489200, 600054f9488, 0, 0, 6009a99b5c0)
02a103108d01 dbuf_rele+0x8c(6009a99b618, 6009a99b5c0, 70954400, 3, 3, 3)
02a103108db1 dmu_recvbackup+0x94c(60046a32a80, 60046a33400, 60046a332a8, 60046a32a96, 16, 1)
02a103108f81 zfs_ioc_recvbackup+0x74(60046a32680, 0, 6003b454130, 0, 0, 60046a32a80)
02a103109041 zfsdev_ioctl+0x15c(70955400, 57, ffbfee28, 1d, 74, ef0)
02a1031090f1 fop_ioctl+0x20(6000a020b80, 5a1d, ffbfee28, 13, 6003b28c020, 121e728)
02a1031091a1 ioctl+0x184(3, 6003b159680, ffbfee28, 0, 1ec08, 5a1d)
02a1031092f1 syscall_trap32+0xcc(3, 5a1d, ffbfee28, 0, 1ec08, ff347734)

Panic of a server running Update 4:

$ uname -a
SunOS vs104is1 5.10 Generic_127112-02 i86pc i386 i86pc
$ head -1 /etc/release
Solaris 10 8/07 s10x_u4wos_12b X86
$ grep panic: /var/adm/messages.0
Nov  4 14:37:15 vs104is1 savecore: [ID 570001 auth.error] reboot after panic: dangling dbufs (dn=d5662518, dbuf=ded05420)
$ mdb -k unix.0 vmcore.0
Loading modules: [ unix krtld genunix specfs dtrace uppc pcplusmp ufs ip hook neti sctp arp usba fctl nca lofs zfs random nfs sppp ptm ]
> $C
e5f73b68 vpanic(f9cdf0e4, d5662518, ded05420)
e5f73c08 dnode_evict_dbufs+0x143(d5662518, 0)
e5f73c28 dmu_objset_evict_dbufs+0xe3(e5f73c40, 0)
e5f73c4c dmu_objset_evict+0x72(eab53e40, d56f8940)
e5f73c68 dsl_dataset_evict+0x2f(f16fc240, eab53e40)
e5f73c7c dbuf_evict_user+0x29(f16fc240)
e5f73c9c dbuf_rele+0x7a(f16fc240, f9cd9519)
e5f73cbc dsl_dataset_close+0x57(eab53e40, 3, f9cd9519)
e5f73e0c dmu_recvbackup+0x88f(dec54400, dec54d80, dec54c28, 0, d6d48180, 186)
e5f73e58 zfs_ioc_recvbackup+0x85(dec54000)
e5f73e78 zfsdev_ioctl+0xf8()
e5f73e98 cdev_ioctl+0x2b(2d4, 5a1d, 804680c, 13, ee43fb60, e5f73f78)
e5f73ebc spec_ioctl+0x62(d689d900, 5a1d, 804680c, 13, ee43fb60, e5f73f78)
e5f73eec fop_ioctl+0x24(d689d900, 5a1d, 804680c, 13, ee43fb60, e5f73f78)
e5f73f84 ioctl+0x199()
e5f73fac sys_sysenter+0x101()

Hope this helps. Unfortunately I have no idea which kind of stack trace you need from the server running Update 5. A stack trace of the running kernel is useless, since it will not show any useful information; it's the zfs commands (zfs receive and zfs list) in user space which are hanging. What kind of information do you need for analysis, and how can I provide it? Or did I miss something?

- Andreas
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Race condition yields to kernel panic (u3, u4) or hanging zfs commands (u5)
Matthew Ahrens wrote:
> Andreas Koppenhoefer wrote:
>> Hello, occasionally we get some Solaris 10 servers to panic in zfs code
>> while doing "zfs send -i [EMAIL PROTECTED] [EMAIL PROTECTED] | ssh remote
>> zfs receive poolname". The race condition(s) get triggered by a broken
>> data transmission or by killing the sending zfs or ssh command. The panic
>> or hanging zfs commands occur on the receiving end.
>
> I believe this should be fixed in OpenSolaris, but we'll verify it. I
> recently changed the error handling code paths for zfs send/recv.

What's the bugid please?

-- Andrew
Re: [zfs-discuss] Race condition yields to kernel panic (u3, u4) or hanging zfs commands (u5)
Andreas Koppenhoefer wrote:
> Hello, occasionally we get some Solaris 10 servers to panic in zfs code
> while doing "zfs send -i [EMAIL PROTECTED] [EMAIL PROTECTED] | ssh remote
> zfs receive poolname". The race condition(s) get triggered by a broken
> data transmission or by killing the sending zfs or ssh command. The panic
> or hanging zfs commands occur on the receiving end.

Could you provide the panic message and stack trace, plus the stack traces of when it's hung?

--matt
Re: [zfs-discuss] Race condition yields to kernel panic (u3, u4) or hanging zfs commands (u5)
Matthew Ahrens wrote:
> Andreas Koppenhoefer wrote:
>> Hello, occasionally we get some Solaris 10 servers to panic in zfs code
>> while doing "zfs send -i [EMAIL PROTECTED] [EMAIL PROTECTED] | ssh remote
>> zfs receive poolname". The race condition(s) get triggered by a broken
>> data transmission or by killing the sending zfs or ssh command. The panic
>> or hanging zfs commands occur on the receiving end.
>
> I believe this should be fixed in OpenSolaris, but we'll verify it. I
> recently changed the error handling code paths for zfs send/recv.

Matt,

Do you know if this is targeted for a Solaris 10 patch? If this is the same panic I managed to hit (see the thread "Possible ZFS panic on Solaris 10 Update 6"), it is a real show stopper. Not only did it panic the system under normal use and require manual intervention to bring it back, it also locked up ZFS commands on the peer system, requiring a reboot to clear. I don't dare risk replicating our users' home directories with zfs send/receive until this is addressed.

Thanks,

-- Ian.
[zfs-discuss] Race condition yields to kernel panic (u3, u4) or hanging zfs commands (u5)
Hello,

occasionally we get some Solaris 10 servers to panic in zfs code while doing "zfs send -i [EMAIL PROTECTED] [EMAIL PROTECTED] | ssh remote zfs receive poolname". The race condition(s) get triggered by a broken data transmission or by killing the sending zfs or ssh command. The panic or hanging zfs commands occur on the receiving end.

I've tried to reproduce the conditions with a small (quick & dirty) script; you'll find it as an attached file. After starting the enclosed script with "./zfs-hardness-test.sh /var/tmp" it does the following:
- Creates a zpool on files in /var/tmp/...
- Copies some files to that zpool and creates snapshot A.
- Copies more files into the zpool and creates another snapshot B.
- zfs send snapA > fileA
- zfs send -i snapA snapB > fileAB
- Destroys the zpool and creates a new/empty zpool.
- zfs receive of snapA from fileA via ssh localhost.

At this point the script enters an endless loop of:
- zfs receive of incremental snapAB from fileAB via ssh localhost
- zfs destroy snapB (if it exists)
- zfs list
- restart loop

The key point is that receiving the incremental snapAB will be broken off at some random point (except for the first iteration, where a full receive is done). To break things, the script sends a SIGINT to the zfs and ssh commands; this simulates a broken data transmission. I suppose that on a busy (or slow) server you'll have a better chance of triggering the problem.

On all tested systems, sparc and x86, Solaris will panic or hang in "zfs receive" or "zfs list". Is this one (or two?) known bug(s)? Are there patches available?
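The sequence above can be sketched as a small shell outline. This is a hypothetical dry run, not the attached script: it only echoes the zfs commands it would run, and the pool, dataset, and file names (hardness-test/fs, /var/tmp/fileA, ...) are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical dry-run outline of the test loop (NOT the attached script).
# RUN=echo makes every zfs/ssh invocation print instead of execute; the
# pool, dataset, and snapshot names here are illustrative assumptions.
RUN="echo"
POOL="hardness-test"
SNAP_A="$POOL/fs@snapA"
SNAP_B="$POOL/fs@snapB"

hardness_test() {
    # One-time setup: save a full and an incremental stream to files.
    $RUN zfs send "$SNAP_A"                # real script: > /var/tmp/fileA
    $RUN zfs send -i "$SNAP_A" "$SNAP_B"   # real script: > /var/tmp/fileAB
    # (destroy and re-create the pool, then receive the full stream)
    $RUN ssh localhost zfs receive "$POOL/received"   # < fileA

    # Endless loop in the real script; bounded here so the sketch terminates.
    for i in 1 2 3; do
        $RUN ssh localhost zfs receive "$POOL/received"   # < fileAB
        # Real script: sleep a random interval, then kill -INT the ssh/zfs
        # pair to simulate a broken transmission.
        $RUN zfs destroy "$SNAP_B"         # if it exists
        $RUN zfs list
    done
}

hardness_test
```

The SIGINT in the middle of the receive is the whole point: it leaves the partially received dataset behind, which the retried "zfs receive" and the following "zfs list" then trip over.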
This is the output of "uname -a; head -1 /etc/release" on some of the systems tested:

SunOS vs104is1 5.10 Generic_127112-02 i86pc i386 i86pc
Solaris 10 8/07 s10x_u4wos_12b X86

SunOS qachpi10 5.10 Generic_137112-02 i86pc i386 i86pc
Solaris 10 5/08 s10x_u5wos_10 X86

SunOS qacult10 5.10 Generic_137111-08 sun4u sparc SUNW,Ultra-5_10
Solaris 10 5/08 s10s_u5wos_10 SPARC

SunOS qacpp03 5.10 Generic_127111-05 sun4us sparc FJSV,GPUSC-M
Solaris 10 11/06 s10s_u3wos_10 SPARC

SunOS qacult31 5.10 Generic_127127-11 sun4u sparc SUNW,Ultra-30
Solaris 10 5/08 s10s_u5wos_10 SPARC

Disclaimer: DO NOT USE THE ATTACHED SCRIPT ON A PRODUCTION SERVER. USE IT AT YOUR OWN RISK. USE IT ON A TEST SERVER ONLY. IT'S LIKELY THAT YOU WILL DAMAGE YOUR SERVER OR YOUR DATA BY RUNNING THIS SCRIPT.

- Andreas

zfs-hardness-test.sh
Description: Binary data