Re: panic: ia64 r255811: deadlkres: possible deadlock detected for 0xe000000012d07b00, blocked for 902743 ticks
From davide.itali...@gmail.com Fri Oct 11 15:39:49 2013:

> If you're not able to get a full dump, a textdump would be enough. In
> your DDB scripts just remove the 'call doadump' step and you should be
> done, unless I'm missing something. Please adjust the script as well to
> include all the information requested as mentioned in my previous link;
> e.g. 'show lockedvnods' is not mentioned in the example section of the
> textdump(4) manpage but it could be useful to ease debugging.

It seems 'call doadump' is always needed. At least it is included in the textdump(4) examples. Also, this tutorial includes it: http://www.etinc.com/122/Using-FreeBSD-Text-Dumps

I think I still haven't got textdump right, because instead of

	savecore: writing core to textdump.tar.0

I see:

	savecore: writing core to /var/crash/vmcore.9

Thanks

Anton
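For reference, textdump(4) is driven by DDB scripts plus the usual crash-dump knobs, and 'call doadump' is still the step that writes the dump once 'textdump set' has selected the textdump format. Below is a minimal sketch of such a setup, assuming the stock /etc/ddb.conf and /etc/rc.conf locations rather than Anton's exact configuration:

	# /etc/ddb.conf -- loaded at boot by the ddb rc.d script
	script lockinfo=show locks; show alllocks; show lockedvnods
	script kdb.enter.panic=textdump set; capture on; run lockinfo; bt; ps; capture off; call doadump; reset

	# /etc/rc.conf
	ddb_enable="YES"	# install the scripts above at boot
	dumpdev="AUTO"		# dump device used by doadump
	dumpdir="/var/crash"	# where savecore writes textdump.tar.N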
panic: wrong page state m 0xe00000027a9adb40 + savecore deadlock
From davide.itali...@gmail.com Mon Oct 14 12:50:44 2013:

> This is fair enough -- If you're still at the ddb prompt, please print
> the whole panic message (or at least the address of the lock reported
> as deadlocked by DEADLKRES), so that we can at least have a candidate.

Here's another one, followed by a savecore deadlock. ia64 r255488:

panic: wrong page state m 0xe0027a9adb40
cpuid = 0
KDB: stack backtrace:
db_trace_self(0x9ffc00158380) at db_trace_self+0x40
db_trace_self_wrapper(0x9ffc00607370) at db_trace_self_wrapper+0x70
kdb_backtrace(0x9ffc00ed0e10, 0x9ffc0058e660, 0x40c, 0x9ffc010a44a0) at kdb_backtrace+0xc0
vpanic(0x9ffc00dd3fe0, 0xa0009de61118, 0x9ffc00ef9670, 0x9ffc00ed0bc0) at vpanic+0x260
kassert_panic(0x9ffc00dd3fe0, 0xe0027a9adb40, 0x81f, 0xe002013cf400, 0x9ffc006a0220, 0x2c60, 0xe002013cf400, 0xe002013cf418) at kassert_panic+0x120
vn_sendfile(0x8df, 0xd, 0x0, 0x0, 0x0, 0x8df, 0x7fffdfe0, 0x0) at vn_sendfile+0x15d0
sys_sendfile(0xe00012aef200, 0xa0009de614e8, 0x10, 0xa0009de61360) at sys_sendfile+0x2b0
syscall(0xe000154f2940, 0xd, 0x0, 0xe00012aef200, 0x0, 0x0, 0x9ffc00ab7280, 0x8) at syscall+0x5e0
epc_syscall_return() at epc_syscall_return
KDB: enter: panic
[ thread pid 5989 tid 100111 ]
Stopped at	kdb_enter+0x92:	[I2]	addl r14=0xffe2c990,gp ;;

db> scripts
lockinfo=show locks; show alllocks; show lockedvnods
zzz=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset
db> run zzz

I get to:

db:0:alltrace
capture off
db:0:off
call doadump
Dumping 10220 MB (25 chunks)
  chunk 0: 1 pages ... ok
  chunk 1: 159 pages ... ok
  chunk 2: 256 pages ... ok
  chunk 3: 7680 pages ... ok
  chunk 4: 8192 pages ... ok
  chunk 5: 239734 pages ... ok
  chunk 6: 748 pages ... ok
  chunk 7: 533 pages ... ok
  chunk 8: 21 pages ... ok
  chunk 9: 1572862 pages ... ok
  chunk 10: 781683 pages ... ok
  chunk 11: 512 pages ... ok
  chunk 12: 139 pages ... ok
  chunk 13: 484 pages ... ok
  chunk 14: 1565 pages ... ok
  chunk 15: 1 pages ... ok
  chunk 16: 506 pages ... ok
  chunk 17: 1 pages ... ok
  chunk 18: 3 pages ... ok
  chunk 19: 566 pages ... ok
  chunk 20: 66 pages ... ok
  chunk 21: 1 pages ... ok
  chunk 22: 285 pages ... ok
  chunk 23: 6 pages ... ok
  chunk 24: 354 pages ... ok
Dump complete
= 0
db:0:doadump
reset

So far, so good. On reboot I get:

Starting ddb.
ddb: sysctl: debug.ddb.scripting.scripts: Invalid argument
/etc/rc: WARNING: failed to start ddb

This probably already indicates some problem? Eventually I get to:

savecore: reboot after panic: wrong page state m 0xe0027a9adb40
Oct 15 09:05:50 mech-as28 savecore: reboot after panic: wrong page state m 0xe0027a9adb40
savecore: writing core to /var/crash/vmcore.9

So here I'm confused. I think I set up textdump as in the man page, so the core should not be written; instead I was expecting ddb.txt, config.txt, etc., as in textdump(4).
Anyway, savecore eventually deadlocks:

panic: deadlkres: possible deadlock detected for 0xe000127b7b00, blocked for 901401 ticks
cpuid = 0
KDB: stack backtrace:
db_trace_self(0x9ffc00158380) at db_trace_self+0x40
db_trace_self_wrapper(0x9ffc00607370) at db_trace_self_wrapper+0x70
kdb_backtrace(0x9ffc00ed0e10, 0x9ffc0058e660, 0x40c, 0x9ffc010a44a0) at kdb_backtrace+0xc0
vpanic(0x9ffc00db8a18, 0xa0009dca7518) at vpanic+0x260
panic(0x9ffc00db8a18, 0x9ffc00db8c70, 0xe000127b7b00, 0xdc119) at panic+0x80
deadlkres(0xdc119, 0xe000127b7b00, 0x9ffc00dbb648, 0x9ffc00db89a8) at deadlkres+0x420
fork_exit(0x9ffc00e0fca0, 0x0, 0xa0009dca7550) at fork_exit+0x120
enter_userland() at enter_userland
KDB: enter: panic
[ thread pid 0 tid 100053 ]
Stopped at	kdb_enter+0x92:	[I2]	addl r14=0xffe2c990,gp ;;

db> scripts
lockinfo=show locks; show alllocks; show lockedvnods
db> run lockinfo
db:0:lockinfo
show locks
db:0:locks
show alllocks
db:0:alllocks
show lockedvnods
Locked vnodes

0xe000127cbba8: tag devfs, type VCHR
    usecount 1, writecount 0, refcount 19 mountedhere 0xe000126ab200
    flags (VI_ACTIVE)
    v_object 0xe000127c2b00 ref 0 pages 422
     lock type devfs: EXCL by thread 0xe0001269 (pid 21, syncer, tid 100062)
	dev da3p1

0xe000127f4ec0: tag ufs, type VREG
    usecount 1, writecount 1, refcount 32934 mountedhere 0
    flags (VI_ACTIVE)
    v_object 0xe000127f7200 ref 0 pages 1242850
     lock type ufs: EXCL by thread 0xe000127b7b00 (pid 805, savecore, tid 100079)
	ino 6500740, on dev da3p1

db> ps
  pid  ppid  pgrp  uid  state  wmesg     wchan           cmd
  805   803    24    0  L+     *vm page  0xe00012402fc0  savecore
  803    24    24    0  DL+    vm map (  0xe0001285fa88  sh
  801     1   801    0  Ss     select    0xe00010c296c0  syslogd
  792     1   792    0  Ss
Re: RFC: support for first boot rc.d scripts
>>> Yes, it's hard to store state on diskless systems... but I figured
>>> that anyone building a diskless system would know to not create a
>>> "run firstboot scripts" marker.
>>
>> And not all embedded systems are diskless... The embedded systems we
>> create at $work have readonly root and mfs /var, but we do have
>> writable storage on another filesystem. It would work for us (not that
>> we need this feature right now) if there were an rcvar that pointed to
>> the marker file. Of course to make it work, something would have to
>> get the alternate filesystem mounted early enough to be useful (that
>> is something we do already with a custom rc script).
>
> Indeed... the way my patch currently does things, it looks for the
> firstboot sentinel at the start of /etc/rc, which means it *has* to be
> on /. Making the path an rcvar is a good idea (updated patch attached)
> but we still need some way to re-probe for that file after mounting
> extra filesystems.

In many cases a simple

	test -f /firstboot && bla_enable='YES' || bla_enable='NO'
	rm -f /firstboot

in your specific rc.d script would suffice. Or for installing packages:

	for pkg in $PKGS; do
		if ! pkg_info $pkg-'[0-9]*' >/dev/null 2>&1; then
			pkg_add /some/dir/$pkg.txz
		fi
	done

I am not quite sure why we need /firstboot handling in /etc/rc.

Perhaps it is a better idea to make this more generic: move the rc.d script containing a 'runonce' keyword to a subdirectory as the last step in rc (or make that an rc.d script in itself!). That way you could consider moving it back if you need to re-run it. Or have an rc.d script set up something like a database after installing a package by creating an rc.d runonce script. Default dir could be ./run-once relative to the rc.d dir it is in, configurable through runonce_directory.

Note: The move would need to be done at the very end of rc.d to prevent rcorder returning a different ordering and skipping scripts because of that.

Nick
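To make the sentinel-check pattern Nick describes concrete, here is a rough sketch of a complete rc.d script built around it; the script name, rcvar, package list, and package directory are all hypothetical:

	#!/bin/sh
	#
	# PROVIDE: firstboot_pkgs		(hypothetical name)
	# REQUIRE: NETWORKING mountcritremote

	. /etc/rc.subr

	name="firstboot_pkgs"
	rcvar="firstboot_pkgs_enable"
	start_cmd="firstboot_pkgs_start"

	firstboot_pkgs_start()
	{
		# Act only on the very first boot, then remove the marker.
		[ -f /firstboot ] || return 0
		for pkg in $PKGS; do	# $PKGS is a placeholder list
			pkg_info "$pkg-[0-9]*" >/dev/null 2>&1 ||
			    pkg_add "/some/dir/$pkg.txz"
		done
		rm -f /firstboot
	}

	load_rc_config $name
	: ${firstboot_pkgs_enable:="NO"}
	run_rc_command "$1"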
Re: panic: wrong page state m 0xe00000027a9adb40 + savecore deadlock
On Tue, Oct 15, 2013 at 10:43 AM, Anton Shterenlikht <me...@bris.ac.uk> wrote:
> Anyway, savecore eventually deadlocks:
>
> panic: deadlkres: possible deadlock detected for 0xe000127b7b00,
> blocked for 901401 ticks

[trim]

> Tracing command savecore pid 805 tid 100079 td 0xe000127b7b00
> cpu_switch(0xe000127b7b00, 0xe00011178900, 0xe00012402fc0, 0x9ffc005e7e80) at cpu_switch+0xd0
> sched_switch(0xe000127b7b00, 0xe00011178900, 0x9ffc00f15698, 0x9ffc00f15680) at sched_switch+0x890
> mi_switch(0x103, 0x0, 0xe000127b7b00, 0x9ffc0062d1f0) at mi_switch+0x3f0
> turnstile_wait(0xe00012402fc0, 0xe00012400480, 0x0, 0x9ffc00dcb698) at turnstile_wait+0x960
> __mtx_lock_sleep(0x9ffc010f9998, 0xe000127b7b00, 0xe00012402fc0, 0x9ffc00dc0558, 0x742) at __mtx_lock_sleep+0x2f0
> __mtx_lock_flags(0x9ffc010f9980, 0x0, 0x9ffc00dd4a90, 0x742) at __mtx_lock_flags+0x1e0
> vfs_vmio_release(0xa0009ebe72f0, 0xe0027ed2ab70, 0x3, 0xa0009ebe736c, 0xa0009ebe7498, 0xa0009ebe72f8, 0x9ffc00dd4a90, 0x9ffc010f9680) at vfs_vmio_release+0x290
> getnewbuf(0xe000127f4ec0, 0x0, 0x0, 0x8000, 0xa0009ebe99a8, 0x0, 0x9ffc010f0798, 0xa0009ebe72f0) at getnewbuf+0x7e0
> getblk(0xe000127f4ec0, 0x4cbaa, 0x8000, 0x0, 0x0, 0x0, 0x0, 0x0) at getblk+0xee0
> ffs_balloc_ufs2(0xe000127f4ec0, 0x4cbaa, 0xa000c60ba000, 0xe00011165a00, 0x7f05, 0xa0009dd79160) at ffs_balloc_ufs2+0x2950
> ffs_write(0xa0009dd79248, 0x3000, 0x265d5) at ffs_write+0x5c0
> VOP_WRITE_APV(0x9ffc00e94ac0, 0xa0009dd79248, 0x0, 0x0) at VOP_WRITE_APV+0x330
> vn_write(0xe000129ae820, 0xa0009dd79360, 0xe00011165a00, 0x0, 0xe000129ae830, 0xe000127f4ec0) at vn_write+0x450
> vn_io_fault(0xe000129ae820, 0xa0009dd79360, 0xe00011165a00, 0x0, 0xe000127b7b00) at vn_io_fault+0x330
> dofilewrite(0xe000127b7b00, 0x7, 0xe000129ae820, 0xa0009dd79360, 0x, 0x0) at dofilewrite+0x180
> kern_writev(0xe000127b7b00, 0x7, 0xa0009dd79360) at kern_writev+0xa0
> sys_write(0xe000127b7b00, 0xa0009dd794e8, 0x9ffc00abac80, 0x48d) at sys_write+0x100
> syscall(0xe000129d04a0, 0x140857000, 0x8000, 0xe000127b7b00, 0x0, 0x0, 0x9ffc00ab7280, 0x8) at syscall+0x5e0
> --More--

I'm not commenting on the first panic you got -- but on the deadlock reported by DEADLKRES. I think that's the vm_page lock. You can run

	kgdb /boot/${KERNEL}/kernel

where ${KERNEL} is the incriminated one, then

	l *vfs_vmio_release+0x290

to get the exact point where it fails.

I'm unsure here because 'show alllocks' and 'show locks' outputs are empty -- are you building your kernel with WITNESS etc..?

Thanks,

-- 
Davide

"There are no solved problems; there are only problems that are more or less solved" -- Henri Poincare
Re: vmstat -z: zfs related failures on r255173
Please, any idea, thought, help! Anything on what information could be useful for digging... The system I'm talking about has a huge problem: performance degradation within a short time period (a day or two). I don't know whether we can somehow relate these vmstat failures to the degradation.

Hi all

On CURRENT r255173 we have some interesting values from vmstat -z: REQ = FAIL

[server]# vmstat -z
ITEM              SIZE  LIMIT      USED    FREE        REQ       FAIL SLEEP
... skipped ...
NCLNODE:           528,     0,        0,      0,          0,          0,   0
space_seg_cache:    64,     0,   289198, 299554,   25932081,   25932081,   0
zio_cache:         944,     0,    37512,  50124, 1638254119, 1638254119,   0
zio_link_cache:     48,     0,    50955,  38104, 1306418638, 1306418638,   0
sa_cache:           80,     0,    63694,     56,     198643,     198643,   0
dnode_t:           864,     0,   128813,      3,     184863,     184863,   0
dmu_buf_impl_t:    224,     0,  1610024, 314631,  157119686,  157119686,   0
arc_buf_hdr_t:     216,     0, 82949975,  56107,  156352659,  156352659,   0
arc_buf_t:          72,     0,  1586866, 314374,  158076670,  158076670,   0
zil_lwb_cache:     192,     0,     6354,   7526,    2486242,    2486242,   0
zfs_znode_cache:   368,     0,    63694,     16,     198643,     198643,   0
... skipped ...

Can anybody explain these strange failures in the zfs-related parameters in vmstat? Can we do something about this, and is it a really bad signal?

Thanks!
Re: panic: wrong page state m 0xe00000027a9adb40 + savecore deadlock
From davide.itali...@gmail.com Tue Oct 15 11:30:07 2013:

> On Tue, Oct 15, 2013 at 10:43 AM, Anton Shterenlikht <me...@bris.ac.uk> wrote:
>> Anyway, savecore eventually deadlocks:
>>
>> panic: deadlkres: possible deadlock detected for 0xe000127b7b00,
>> blocked for 901401 ticks
>
> [trim]
>
>> Tracing command savecore pid 805 tid 100079 td 0xe000127b7b00
>> cpu_switch(0xe000127b7b00, 0xe00011178900, 0xe00012402fc0, 0x9ffc005e7e80) at cpu_switch+0xd0
>> sched_switch(0xe000127b7b00, 0xe00011178900, 0x9ffc00f15698, 0x9ffc00f15680) at sched_switch+0x890
>> mi_switch(0x103, 0x0, 0xe000127b7b00, 0x9ffc0062d1f0) at mi_switch+0x3f0
>> turnstile_wait(0xe00012402fc0, 0xe00012400480, 0x0, 0x9ffc00dcb698) at turnstile_wait+0x960
>> __mtx_lock_sleep(0x9ffc010f9998, 0xe000127b7b00, 0xe00012402fc0, 0x9ffc00dc0558, 0x742) at __mtx_lock_sleep+0x2f0
>> __mtx_lock_flags(0x9ffc010f9980, 0x0, 0x9ffc00dd4a90, 0x742) at __mtx_lock_flags+0x1e0
>> vfs_vmio_release(0xa0009ebe72f0, 0xe0027ed2ab70, 0x3, 0xa0009ebe736c, 0xa0009ebe7498, 0xa0009ebe72f8, 0x9ffc00dd4a90, 0x9ffc010f9680) at vfs_vmio_release+0x290
>> getnewbuf(0xe000127f4ec0, 0x0, 0x0, 0x8000, 0xa0009ebe99a8, 0x0, 0x9ffc010f0798, 0xa0009ebe72f0) at getnewbuf+0x7e0
>> getblk(0xe000127f4ec0, 0x4cbaa, 0x8000, 0x0, 0x0, 0x0, 0x0, 0x0) at getblk+0xee0
>> ffs_balloc_ufs2(0xe000127f4ec0, 0x4cbaa, 0xa000c60ba000, 0xe00011165a00, 0x7f05, 0xa0009dd79160) at ffs_balloc_ufs2+0x2950
>> ffs_write(0xa0009dd79248, 0x3000, 0x265d5) at ffs_write+0x5c0
>> VOP_WRITE_APV(0x9ffc00e94ac0, 0xa0009dd79248, 0x0, 0x0) at VOP_WRITE_APV+0x330
>> vn_write(0xe000129ae820, 0xa0009dd79360, 0xe00011165a00, 0x0, 0xe000129ae830, 0xe000127f4ec0) at vn_write+0x450
>> vn_io_fault(0xe000129ae820, 0xa0009dd79360, 0xe00011165a00, 0x0, 0xe000127b7b00) at vn_io_fault+0x330
>> dofilewrite(0xe000127b7b00, 0x7, 0xe000129ae820, 0xa0009dd79360, 0x, 0x0) at dofilewrite+0x180
>> kern_writev(0xe000127b7b00, 0x7, 0xa0009dd79360) at kern_writev+0xa0
>> sys_write(0xe000127b7b00, 0xa0009dd794e8, 0x9ffc00abac80, 0x48d) at sys_write+0x100
>> syscall(0xe000129d04a0, 0x140857000, 0x8000, 0xe000127b7b00, 0x0, 0x0, 0x9ffc00ab7280, 0x8) at syscall+0x5e0
>> --More--
>
> I'm not commenting on the first panic you got -- but on the deadlock
> reported by DEADLKRES. I think that's the vm_page lock. You can run
>
>	kgdb /boot/${KERNEL}/kernel
>
> where ${KERNEL} is the incriminated one, then
>
>	l *vfs_vmio_release+0x290
>
> to get the exact point where it fails.

Like this?

# kgdb /boot/kernel/kernel
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "ia64-marcel-freebsd"...
(kgdb) l *vfs_vmio_release+0x290
0x9ffc006b8830 is in vfs_vmio_release (/usr/src/sys/kern/vfs_bio.c:1859).
1854			/*
1855			 * In order to keep page LRU ordering consistent, put
1856			 * everything on the inactive queue.
1857			 */
1858			vm_page_lock(m);
1859			vm_page_unwire(m, 0);
1860
1861			/*
1862			 * Might as well free the page if we can and it has
1863			 * no valid data.  We also free the page if the
(kgdb)

> I'm unsure here because 'show alllocks' and 'show locks' outputs are
> empty -- are you building your kernel with WITNESS etc..?

I think so:

# Debugging support.  Always need this:
options 	KDB			# Enable kernel debugger support.
options 	KDB_TRACE		# Print a stack trace for a panic.

# For full debugger support use (turn off in stable branch):
options 	DDB			# Support DDB
options 	GDB			# Support remote GDB
options 	DEADLKRES		# Enable the deadlock resolver
options 	INVARIANTS		# Enable calls of extra sanity checking
options 	INVARIANT_SUPPORT	# required by INVARIANTS
options 	WITNESS			# Enable checks to detect deadlocks and cycles
options 	WITNESS_SKIPSPIN	# Don't run witness on spinlocks for speed
options 	MALLOC_DEBUG_MAXZONES=8	# Separate malloc(9) zones

# textdump(4)
options 	TEXTDUMP_PREFERRED
options 	TEXTDUMP_VERBOSE

# http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-deadlocks.html
options 	DEBUG_LOCKS
options 	DEBUG_VFS_LOCKS
options 	DIAGNOSTIC

Also, does this look right:

$ sysctl -a | grep kdb
debug.ddb.scripting.scripts: kdb.enter.panic=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call
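One thing worth checking when /etc/rc prints "ddb: sysctl: debug.ddb.scripting.scripts: Invalid argument" is whether the scripts can be installed by hand with the ddb(8) utility; a sketch, reusing the script bodies from the message above (shell quoting may need adjusting):

	# list the scripts currently installed in the kernel
	ddb scripts
	# install one manually, using the same syntax as an /etc/ddb.conf line
	ddb script 'kdb.enter.panic=textdump set; capture on; run lockinfo; bt; ps; capture off; call doadump; reset'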
Re: claws-mail deadlocking in iconv
Fabian Keil <freebsd-lis...@fabiankeil.de> wrote:

> After the iconv import claws-mail started to deadlock in iconv every
> now and then on my system, which prevented claws-mail from rendering
> windows or reacting to input.
[...]
> Did anyone else run into this or can comment on the patch or the
> backtraces?

Thanks for the feedback, everyone. This is now bin/182994:
http://www.freebsd.org/cgi/query-pr.cgi?pr=182994

Fabian
Re: vmstat -z: zfs related failures on r255173
On 2013-10-15 07:53, Dmitriy Makarov wrote:
> Please, any idea, thought, help! Anything on what information could be
> useful for digging... The system I'm talking about has a huge problem:
> performance degradation within a short time period (a day or two). I
> don't know whether we can somehow relate these vmstat failures to the
> degradation.
>
> Hi all
>
> On CURRENT r255173 we have some interesting values from vmstat -z: REQ = FAIL
>
> [server]# vmstat -z
> ITEM              SIZE  LIMIT      USED    FREE        REQ       FAIL SLEEP
> ... skipped ...
> NCLNODE:           528,     0,        0,      0,          0,          0,   0
> space_seg_cache:    64,     0,   289198, 299554,   25932081,   25932081,   0
> zio_cache:         944,     0,    37512,  50124, 1638254119, 1638254119,   0
> zio_link_cache:     48,     0,    50955,  38104, 1306418638, 1306418638,   0
> sa_cache:           80,     0,    63694,     56,     198643,     198643,   0
> dnode_t:           864,     0,   128813,      3,     184863,     184863,   0
> dmu_buf_impl_t:    224,     0,  1610024, 314631,  157119686,  157119686,   0
> arc_buf_hdr_t:     216,     0, 82949975,  56107,  156352659,  156352659,   0
> arc_buf_t:          72,     0,  1586866, 314374,  158076670,  158076670,   0
> zil_lwb_cache:     192,     0,     6354,   7526,    2486242,    2486242,   0
> zfs_znode_cache:   368,     0,    63694,     16,     198643,     198643,   0
> ... skipped ...
>
> Can anybody explain these strange failures in the zfs-related
> parameters in vmstat? Can we do something about this, and is it a
> really bad signal?
>
> Thanks!

I am guessing those 'failures' are failures to allocate memory. I'd recommend you install sysutils/zfs-stats and send the list the output of 'zfs-stats -a'.

-- 
Allan Jude
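For anyone following along, the suggested steps look like this (a sketch; the package name is assumed to match the port's):

	# cd /usr/ports/sysutils/zfs-stats && make install clean
	# zfs-stats -a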
Re[3]: vmstat -z: zfs related failures on r255173
[:~]# zfs-stats -a

ZFS Subsystem Report				Tue Oct 15 16:48:43 2013

System Information:
	Kernel Version:				151 (osreldate)
	Hardware Platform:			amd64
	Processor Architecture:			amd64
	ZFS Storage pool Version:		5000
	ZFS Filesystem Version:			5

FreeBSD 10.0-CURRENT #3 r255173: Fri Oct 11 17:15:50 EEST 2013 root
16:48  up 16:27, 1 user, load averages: 12,58 12,51 14,44

System Memory:
	15.05%	18.76	GiB Active,	0.05%	61.38	MiB Inact
	83.42%	103.98	GiB Wired,	0.55%	702.44	MiB Cache
	0.92%	1.14	GiB Free,	0.01%	16.93	MiB Gap
	Real Installed:				128.00	GiB
	Real Available:			99.96%	127.95	GiB
	Real Managed:			97.41%	124.65	GiB
	Logical Total:				128.00	GiB
	Logical Used:			98.52%	126.11	GiB
	Logical Free:			1.48%	1.89	GiB

Kernel Memory:					91.00	GiB
	Data:				99.99%	90.99	GiB
	Text:				0.01%	13.06	MiB

Kernel Memory Map:				124.65	GiB
	Size:				69.88%	87.11	GiB
	Free:				30.12%	37.54	GiB

ARC Summary: (HEALTHY)
	Memory Throttle Count:			0

ARC Misc:
	Deleted:				30.38m
	Recycle Misses:				25.16m
	Mutex Misses:				7.45m
	Evict Skips:				444.42m

ARC Size:				100.00%	90.00	GiB
	Target Size: (Adaptive)		100.00%	90.00	GiB
	Min Size (Hard Limit):		44.44%	40.00	GiB
	Max Size (High Water):		2:1	90.00	GiB

ARC Size Breakdown:
	Recently Used Cache Size:	92.69%	83.42	GiB
	Frequently Used Cache Size:	7.31%	6.58	GiB

ARC Hash Breakdown:
	Elements Max:				14.59m
	Elements Current:		99.70%	14.54m
	Collisions:				71.31m
	Chain Max:				25
	Chains:					2.08m

ARC Efficiency:					1.11b
	Cache Hit Ratio:		93.89%	1.04b
	Cache Miss Ratio:		6.11%	67.70m
	Actual Hit Ratio:		91.73%	1.02b
	Data Demand Efficiency:		90.56%	294.97m
	Data Prefetch Efficiency:	9.64%	7.07m

	CACHE HITS BY CACHE LIST:
	  Most Recently Used:		8.80%	91.66m
	  Most Frequently Used:		88.89%	925.41m
	  Most Recently Used Ghost:	0.50%	5.16m
	  Most Frequently Used Ghost:	2.97%	30.95m

	CACHE HITS BY DATA TYPE:
	  Demand Data:			25.66%	267.11m
	  Prefetch Data:		0.07%	681.36k
	  Demand Metadata:		72.04%	749.94m
	  Prefetch Metadata:		2.24%	23.31m

	CACHE MISSES BY DATA TYPE:
	  Demand Data:			41.15%	27.86m
	  Prefetch Data:		9.43%	6.38m
	  Demand Metadata:		48.71%	32.98m
	  Prefetch Metadata:		0.71%	478.11k

L2 ARC Summary: (HEALTHY)
	Passed Headroom:			1.38m
	Tried Lock Failures:			403.24m
	IO In Progress:				1.19k
	Low Memory Aborts:			6
	Free on Write:				1.69m
	Writes While Full:			3.48k
	R/W Clashes:				608.58k
	Bad Checksums:				0
	IO Errors:				0
	SPA Mismatch:				321.48m

L2 ARC Size: (Adaptive)				268.26	GiB
	Header Size:			0.85%	2.27	GiB

L2 ARC Breakdown:				67.70m
	Hit Ratio:			54.97%	37.21m
	Miss Ratio:			45.03%	30.48m
	Feeds:					62.45k

L2 ARC Buffer:
	Bytes Scanned:				531.83	TiB
	Buffer Iterations:
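Those numbers show the ARC sitting at its 90 GiB target on a 128 GiB machine while UMA allocations fail, so one common mitigation to try is capping the ARC lower; a sketch, with the value purely illustrative:

	# /boot/loader.conf
	vfs.zfs.arc_max="68719476736"	# 64 GiB cap (illustrative value)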
WITNESS: unable to allocate a new witness object
I'm trying to set up textdump(4). On boot I see:

	WITNESS: unable to allocate a new witness object

also

	Expensive timeout(9) function: 0x9ffc00e222e0(0xa09ed320) 0.002434387 s kickstart.

Does the first indicate I haven't set up WITNESS correctly? What does the second tell me?

Thanks

Anton
Re: WITNESS: unable to allocate a new witness object
On Tue, Oct 15, 2013 at 4:17 PM, Anton Shterenlikht <me...@bris.ac.uk> wrote:
> I'm trying to set up textdump(4). On boot I see:
>
>	WITNESS: unable to allocate a new witness object
>
> also

It means that you've run out of WITNESS objects on the free list.

>	Expensive timeout(9) function: 0x9ffc00e222e0(0xa09ed320) 0.002434387 s kickstart.

It's output from DIAGNOSTIC; it's triggered when the callout handler execution time exceeds a given threshold. You can safely ignore it.

Also, please stop spamming the mailing lists with new posts. They more or less all refer to the same problem. Posting repeatedly doesn't encourage people to look at it, nor does it help to get it solved more quickly.

Thanks,

-- 
Davide

"There are no solved problems; there are only problems that are more or less solved" -- Henri Poincare
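The witness free list is sized at compile time, so a kernel tracking many lock types can exhaust it; a sketch of a kernel config change that enlarges it (the value is illustrative, and whether WITNESS_COUNT is exposed as an option depends on the kernel version -- check sys/kern/subr_witness.c and sys/conf/options for your tree):

	options 	WITNESS
	options 	WITNESS_COUNT=2048	# illustrative bump over the default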
Re: Buildworld with ccache fails
On Tue, Oct 15, 2013 at 09:52:44AM +0400, Sevan / Venture37 wrote:
> Hi,
> I noticed that back in April changes had been committed to allow ccache
> to build -HEAD world, so I gave it a try again on r256380. The build
> process now fails at building libc.so.7 with the error:
>
>	/usr/bin/ld: this linker was not configured to use sysroots
>	cc: error: linker command failed with exit code 1 (use -v to see invocation)

This is a known failure. I haven't had a chance to look into what the reason is, but ccache simply doesn't work with clang right now anyway. Despite the CCACHE_CPP2 change I committed to devel/ccache, there are still some other issues as well.

> Sevan / Venture37
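For those experimenting anyway, CCACHE_CPP2 is an environment knob that makes ccache hand clang the original source rather than preprocessed output, which avoids one class of clang/ccache problems; a sketch of a build invocation (Bourne-shell syntax assumed):

	# run the build with ccache's second-preprocessor mode enabled
	env CCACHE_CPP2=yes make buildworld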
Re: ia64: panic: wrong page state m 0xe00000027fcc1900
On Tue, Oct 15, 2013 at 04:10:04PM +0100, Anton Shterenlikht wrote:
> This panic is always reproducible by starting nginx, and directing the
> browser to poudriere logs/bulk/ia64-default/latest/.

From ddb, do 'show pginfo <address of the page from the panic message>'.
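For the page in the subject line, that would look like this at the ddb prompt:

	db> show pginfo 0xe00000027fcc1900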
Re: [rfc] small bioq patch
On Fri, Oct 11, 2013 at 5:14 PM, John-Mark Gurney <j...@funkthat.com> wrote:
> Maksim Yevmenkin wrote this message on Fri, Oct 11, 2013 at 15:39 -0700:
>> On Oct 11, 2013, at 2:52 PM, John-Mark Gurney <j...@funkthat.com> wrote:
>>> Maksim Yevmenkin wrote this message on Fri, Oct 11, 2013 at 11:17 -0700:
>>>> i would like to submit the attached bioq patch for review and
>>>> comments. this is proof of concept. it helps with smoothing disk read
>>>> service times and appears to eliminate outliers. please see attached
>>>> pictures (about a week worth of data)
>>>> - c034 control unmodified system
>>>> - c044 patched system
>>>
>>> Can you describe how you got this data? Were you using the gstat code
>>> or some other code?
>>
>> Yes, it's basically gstat data.
>
> The reason I ask this is that I don't think the data you are getting
> from gstat is what you think you are... It accumulates time for a set
> of operations and then divides by the count... So I'm not sure if the
> stat improvements you are seeing are as meaningful as you might think
> they are...

yes, i'm aware of it. however, i'm not aware of better tools. we also use dtrace and PCM/PMC. ktrace is not particularly useable for us because it does not really work well when we push the system above 5 Gbps. in order to actually see any issues we need to push the system to the 10 Gbps range at least.

>> graphs show max/avg disk read service times for both systems across 36
>> spinning drives. both systems are relatively busy serving production
>> traffic (about 10 Gbps at peak). grey shaded areas on the graphs
>> represent time when systems are refreshing their content, i.e. disks
>> are both reading and writing at the same time.
>>
>>> Can you describe why you think this change makes an improvement?
>>> Unless you're running 10k or 15k RPM drives, 128 seems like a large
>>> number.. as that's about half the number of IOPs that a normal HD
>>> handles in a second..
>>
>> Our (Netflix) load is basically random disk io. We have tweaked the
>> system to ensure that our io path is wide enough, i.e. we read 1mb per
>> disk io for the majority of the requests. However the offsets we read
>> from are all over the place. It appears that we are getting into a
>> situation where larger offsets are getting delayed because smaller
>> offsets are jumping ahead of them. Forcing a bioq insert tail
>> operation, and effectively moving the insertion point, seems to help
>> us avoid getting into this situation. And, no. We don't use 10k or 15k
>> drives. Just regular enterprise 7200 sata drives.
>
> I assume that the 1mb reads are then further broken up into 8 128kb
> reads? so it's more like every 16 reads in your work load that you
> insert the ordered io...

i'm not sure where 128kb comes from. are you referring to MAXPHYS/DFLTPHYS? if so, then, no, we have increased *PHYS to 1MB.

> I want to make sure that we choose the right value for this number..
> What number of IOPs are you seeing?

generally we see 100 IOPs per disk on a system pushing 10+ Gbps. i've experimented with different numbers on our system and i did not see much of a difference on our workload. i'm up to a value of 1024 now. higher numbers seem to produce a slightly bigger difference between average and max time, but i do not think it's statistically meaningful. the general shape of the curve remains smooth for all values tried so far.

[...]

>>> Also, do you see a similar throughput of the system?
>>
>> Yes. We do see almost identical throughput from both systems. I have
>> not pushed the system to its limit yet, but having much smoother disk
>> read service time is important for us because we use it as one of the
>> components of our system health metrics. We also need to ensure that a
>> disk io request is actually dispatched to the disk in a timely manner.
>
> Per above, have you measured at the application layer that you are
> getting better latency times on your reads? Maybe by doing a ktrace of
> the io, and calculating times between read and return or something like
> that...

ktrace is not particularly useful. i can see if i can come up with a dtrace probe or something. our application (or rather its clients) are _very_ sensitive to latency. having read service time outliers is not very good for us.

> Have you looked at the geom disk schedulers work that Luigi did a few
> years back? There have been known issues w/ our io scheduler for a long
> time... If you search the mailing lists, you'll see lots of reports of
> some processes starving out others, probably due to a similar issue...
> I've seen similar unfair behavior between processes, but haven't spent
> the time tracking it down...

yes, we have looked at it. it makes things worse for us, unfortunately.

> It does look like a good improvement though... Thanks for the work!

ok :) i'm interested to hear from people who have a different workload profile, for example lots of iops, i.e. very small file reads or something like that.

thanks,
max
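As a starting point for the dtrace probe mentioned above, the io provider can time each bio from dispatch to completion; a rough sketch, not tuned for 10 Gbps loads:

	# quantize per-device I/O latency in microseconds (run as root)
	dtrace -n '
	    io:::start { ts[arg0] = timestamp; }
	    io:::done /ts[arg0]/ {
	        @lat[args[1]->dev_statname] =
	            quantize((timestamp - ts[arg0]) / 1000);
	        ts[arg0] = 0;
	    }'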
Re: RFC: support for first boot rc.d scripts
Wonderful! This capability is long overdue.

On Oct 13, 2013, at 3:58 PM, Colin Percival <cperc...@freebsd.org> wrote:
> As examples of what such scripts could do:

More examples:

I've been experimenting with putting gpart resize and growfs into rc.d scripts to construct images that can be dd'ed onto some medium and then automatically grow to fill the medium.

When cross-installing ports, there are certain operations (e.g., updating the 'info' database) that can really only be done after the system next boots.

> I'd like to get this into HEAD in the near future in the hope that I
> can convince re@ that this is a simple enough (and safe enough) change
> to merge before 10.0-RELEASE.

Please.

Tim
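The manual steps such a grow-on-first-boot script would wrap look roughly like this, assuming a GPT disk ada0 whose second partition holds the root filesystem:

	gpart recover ada0	# relocate the GPT backup header after dd'ing to a larger medium
	gpart resize -i 2 ada0	# grow the partition to fill the disk
	growfs -y /dev/ada0p2	# grow the filesystem to fill the partition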
Re: RFC: support for first boot rc.d scripts
On 2013-10-15 15:33, Tim Kientzle wrote:
> Wonderful! This capability is long overdue.
>
> On Oct 13, 2013, at 3:58 PM, Colin Percival <cperc...@freebsd.org> wrote:
>> As examples of what such scripts could do:
>
> More examples:
>
> I've been experimenting with putting gpart resize and growfs into rc.d
> scripts to construct images that can be dd'ed onto some medium and then
> automatically grow to fill the medium.

I didn't think of that; that is a 'killer app' for rpi and other such devices, or any kind of 'embedded' image really.

> When cross-installing ports, there are certain operations (e.g.,
> updating the 'info' database) that can really only be done after the
> system next boots.
>
>> I'd like to get this into HEAD in the near future in the hope that I
>> can convince re@ that this is a simple enough (and safe enough) change
>> to merge before 10.0-RELEASE.
>
> Please.
>
> Tim

-- 
Allan Jude
Re: RFC: support for first boot rc.d scripts
On Sun, Oct 13, 2013 at 3:58 PM, Colin Percival <cperc...@freebsd.org> wrote:
> Hi all,
>
> I've attached a very simple patch which makes /etc/rc:
> 1. Skip any rc.d scripts with the "firstboot" keyword if
>    /var/db/firstboot does not exist,
> 2. If /var/db/firstboot and /var/db/firstboot-reboot exist after
>    running rc.d scripts, reboot.
> 3. Delete /var/db/firstboot (and firstboot-reboot) after the first
>    boot.

We use something like this at work. However, our version creates a file after the firstboot scripts have run, and doesn't run if the file exists. Is there a reason to prefer one choice over the other? Naively I'd expect it to be better to run when the file doesn't exist, creating it when done; it solves the problem of making sure the magic file exists before first boot, for the other polarity.

Thanks,
matthew
Re: RFC: support for first boot rc.d scripts
On 10/15/13 01:58, Nick Hibma wrote:
>> Indeed... the way my patch currently does things, it looks for the
>> firstboot sentinel at the start of /etc/rc, which means it *has* to be
>> on /. Making the path an rcvar is a good idea (updated patch attached)
>> but we still need some way to re-probe for that file after mounting
>> extra filesystems.
>
> In many cases a simple
>
>	test -f /firstboot && bla_enable='YES' || bla_enable='NO'
>	rm -f /firstboot
>
> in your specific rc.d script would suffice.
[...]
> I am not quite sure why we need /firstboot handling in /etc/rc.

Your suggestion wouldn't work if you have several scripts doing it; the first one would remove the sentinel and the others wouldn't run. In my EC2 code I have a single script which runs after all the others and removes the sentinel file, but that still means that every script has to be executed on every boot (even if just to check if it should do anything); putting the logic into /etc/rc would allow rcorder to skip those scripts entirely.

> Perhaps it is a better idea to make this more generic: move the rc.d
> script containing a 'runonce' keyword to a subdirectory as the last
> step in rc (or make that an rc.d script in itself!). That way you could
> consider moving it back if you need to re-run it. Or have an rc.d
> script set up something like a database after installing a package by
> creating an rc.d runonce script. Default dir could be ./run-once
> relative to the rc.d dir it is in, configurable through
> runonce_directory.
>
> Note: The move would need to be done at the very end of rc.d to prevent
> rcorder returning a different ordering and skipping scripts because of
> that.

I considered this, but decided that the most common use of "run once" would be "run when the system is first booted", and it would be much simpler to provide just the firstboot functionality.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Re: RFC: support for first boot rc.d scripts
On 10/15/13 13:09, Matthew Fleming wrote:
> We use something like this at work. However, our version creates a file
> after the firstboot scripts have run, and doesn't run if the file
> exists. Is there a reason to prefer one choice over the other? Naively
> I'd expect it to be better to run when the file doesn't exist, creating
> it when done; it solves the problem of making sure the magic file
> exists before first boot, for the other polarity.

I don't see that making sure that the magic file exists is a problem, since you'd also need to make sure you have knobs turned on in /etc/rc.conf and/or extra rc.d scripts installed. In a very marginal sense, deleting a file is safer than creating one, since if the filesystem is full you can delete but not create. It also seems to me that the sensible polarity is that having something extra lying around makes extra things happen rather than inhibiting them.

But probably the best argument has to do with upgrading systems -- if you update a 9.2-RELEASE system to 10.1-RELEASE and there's a first boot script in that new release, you don't want to have it accidentally get run simply because you failed to create a /firstboot file during the upgrade process.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Re: panic: uma_zfree: Freeing to non free bucket index.
On Monday, October 14, 2013 4:44:28 am Anton Shterenlikht wrote:
> BTW, I see in dmesg:
>
> Starting ddb.
> ddb: sysctl: debug.ddb.scripting.scripts: Invalid argument
> /etc/rc: WARNING: failed to start ddb
>
> What is that about?
>
> panic: uma_zfree: Freeing to non free bucket index.
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self(0x9ffc00158380) at db_trace_self+0x40
> db_trace_self_wrapper(0x9ffc00607370) at db_trace_self_wrapper+0x70
> kdb_backtrace(0x9ffc00ed0e10, 0x9ffc0058e660, 0x40c, 0x9ffc010a44a0) at kdb_backtrace+0xc0
> vpanic(0x9ffc00dfc468, 0xa000e26e0fd8, 0x9ffc00ef9670, 0x9ffc00ed0bc0) at vpanic+0x260
> kassert_panic(0x9ffc00dfc468, 0xe00015e25f90, 0xe00015e243e0, 0xe0027ffd5200) at kassert_panic+0x120
> uma_zfree_arg(0xe0027ffccfc0, 0xe00015e243e0, 0x0) at uma_zfree_arg+0x2d0
> g_destroy_bio(0xe00015e243e0, 0x9ffc004ad4a0, 0x30a, 0x30a) at g_destroy_bio+0x30
> g_disk_done(0xe00015e243e0, 0xe00015e15d10, 0xe00012672700, 0x9ffc006b18c0) at g_disk_done+0x140
> biodone(0xe00015e243e0, 0x9ffc00e0e150, 0xe00010c24030, 0x0, 0x0, 0x0, 0x9ffc00066890, 0x614) at biodone+0x180
> dadone(0xe00012672600, 0xe00012541000, 0xe00015e243e0, 0x7) at dadone+0x620
> camisr_runqueue(0xe00011a2dc00, 0xe00012541054, 0xe00012541000, 0x135d) at camisr_runqueue+0x6c0
> camisr(0xe00011a2dc20, 0xe00011a2dc00, 0x9ffc00bee9d0, 0xa000e26e1548) at camisr+0x260
> intr_event_execute_handlers(0xe000111764a0, 0xe0001118d998, 0xe00011191c00, 0x0) at intr_event_execute_handlers+0x280
> ithread_loop(0xe00011192f00, 0xa000e26e1550, 0xe00011192f14, 0xe0001118d99c) at ithread_loop+0x1b0
> fork_exit(0x9ffc00e12a90, 0xe00011192f00, 0xa000e26e1550) at fork_exit+0x120
> enter_userland() at enter_userland
> KDB: enter: panic
> [ thread pid 12 tid 100015 ]
> Stopped at	kdb_enter+0x92:	[I2]	addl r14=0xffe2c990,gp ;;
>
> db> scripts
> lockinfo=show locks; show alllocks; show lockedvnods
> db> run lockinfo
> db:0:lockinfo
> show locks
> db:0:locks
> show alllocks
> db:0:alllocks
> show lockedvnods
> Locked vnodes
>
> 0xe0001ab39ba8: tag ufs, type VDIR
>     usecount 1, writecount 0, refcount 3 mountedhere 0
>     flags (VI_ACTIVE)
>     v_object 0xe0001cd30900 ref 0 pages 0
>      lock type ufs: EXCL by thread 0xe000183d9680 (pid 41389, cpdup, tid 100121)
>	ino 5467932, on dev da5p1
>
> 0xe00015ed3ba8: tag ufs, type VDIR
>     usecount 1, writecount 0, refcount 3 mountedhere 0
>     flags (VI_ACTIVE)
>     v_object 0xe0001cd33e00 ref 0 pages 0
>      lock type ufs: EXCL by thread 0xe00012a28900 (pid 41421, cpdup, tid 100092)
>	ino 5467948, on dev da5p1
>
> 0xe0001ab16938: tag ufs, type VREG
>     usecount 1, writecount 0, refcount 3 mountedhere 0
>     flags (VI_ACTIVE)
>     v_object 0xe0001cd98a00 ref 0 pages 1
>      lock type ufs: EXCL by thread 0xe00018494000 (pid 41337, cpdup, tid 100137)
>	ino 5469420, on dev da5p1
>
> 0xe0001b2503b0: tag ufs, type VREG
>     usecount 1, writecount 0, refcount 1 mountedhere 0
>     flags (VI_ACTIVE)
>      lock type ufs: EXCL by thread 0xe00012a28900 (pid 41421, cpdup, tid 100092)
>	ino 5469421, on dev da5p1
>
> 0xe0001ab2a760: tag ufs, type VREG
>     usecount 1, writecount 0, refcount 1 mountedhere 0
>     flags (VI_ACTIVE)
>      lock type ufs: EXCL by thread 0xe000183d9680 (pid 41389, cpdup, tid 100121)
>	ino 5469422, on dev da5p1
>
> db> script zzz=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; reset
> db> run zzz

I think 'reset' is going to reset without doing a dump?

-- 
John Baldwin
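As John notes, the 'zzz' script shown resets without ever dumping; for a textdump the 'call doadump' step has to come before 'reset'. A corrected sketch of the same script:

	db> script zzz=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset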
Please shorten ZFS disk names.
BLACKIE:/root# uname -a
FreeBSD BLACKIE.housenet.jrv 10.0-BETA1 FreeBSD 10.0-BETA1 #0 r256428M: Sun Oct 13 23:46:54 CDT 2013     r...@clank.housenet.jrv:/usr/obj/usr/src/sys/GENERIC  amd64

This pool is on da{0,1,2,3,4,5,6,7} - I think; only da4 is sure:

	NAME                                                      STATE     READ WRITE CKSUM
	z03                                                       ONLINE       0     0     0
	  raidz2-0                                                ONLINE       0     0     0
	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HSTK  ONLINE   0     0     0
	    da4                                                   ONLINE       0     0     0
	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HTCQ  ONLINE   0     0     0
	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300JDT5  ONLINE   0     0     0
	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HTCE  ONLINE   0     0     0
	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HTS7  ONLINE   0     0     0
	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300JBN1  ONLINE   0     0     0
	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HTAP  ONLINE   0     0     0

another example:

BLACKIE:/usr/src# zpool status
  pool: BLACKIE
 state: ONLINE
  scan: none requested
config:

	NAME                                          STATE     READ WRITE CKSUM
	BLACKIE                                       ONLINE       0     0     0
	  gptid/3d882ab0-3588-11e3-b6bc-002590c08004  ONLINE       0     0     0

Based on the hardware config that's either ada0p3 or ada1p3. Whichever it is, I want to mirror it onto the other, but I don't know the names to use for src and dst.
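To map the diskid/gptid aliases back to device nodes before attaching a mirror, the label and partition tools can be cross-referenced; a sketch:

	# show which provider each label alias points at
	glabel status
	# for GPT partitions, match rawuuid against the gptid/... name
	gpart list ada0 | grep -E 'Name|rawuuid'
	gpart list ada1 | grep -E 'Name|rawuuid'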
Re: Please shorten ZFS disk names.
On Tue, Oct 15, 2013 at 05:49:06PM -0500, James R. Van Artsdalen wrote:
> BLACKIE:/root# uname -a
> FreeBSD BLACKIE.housenet.jrv 10.0-BETA1 FreeBSD 10.0-BETA1 #0 r256428M:
> Sun Oct 13 23:46:54 CDT 2013
> r...@clank.housenet.jrv:/usr/obj/usr/src/sys/GENERIC amd64
>
> This pool is on da{0,1,2,3,4,5,6,7} - I think; only da4 is sure:
>
>	NAME                                                      STATE     READ WRITE CKSUM
>	z03                                                       ONLINE       0     0     0
>	  raidz2-0                                                ONLINE       0     0     0
>	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HSTK  ONLINE   0     0     0
[...]
> Based on the hardware config that's either ada0p3 or ada1p3. Whichever
> it is, I want to mirror it onto the other, but I don't know the names
> to use for src and dst.

You can set kern.geom.label.gptid.enable=0 in loader.conf(5), which will use the gptid.

Glen
Re: Please shorten ZFS disk names.
On Tue, Oct 15, 2013 at 07:07:47PM -0400, Glen Barber wrote:
>> Based on the hardware config that's either ada0p3 or ada1p3. Whichever
>> it is, I want to mirror it onto the other, but I don't know the names
>> to use for src and dst.
>
> You can set kern.geom.label.gptid.enable=0 in loader.conf(5), which
> will use the gptid.

Which will *not* use the gptid...

Glen
Re: Please shorten ZFS disk names.
On 2013-10-15 19:07, Glen Barber wrote:
> On Tue, Oct 15, 2013 at 05:49:06PM -0500, James R. Van Artsdalen wrote:
>> BLACKIE:/root# uname -a
>> FreeBSD BLACKIE.housenet.jrv 10.0-BETA1 FreeBSD 10.0-BETA1 #0 r256428M:
>> Sun Oct 13 23:46:54 CDT 2013
>> r...@clank.housenet.jrv:/usr/obj/usr/src/sys/GENERIC amd64
>>
>> This pool is on da{0,1,2,3,4,5,6,7} - I think; only da4 is sure:
>>
>>	NAME                                                      STATE     READ WRITE CKSUM
>>	z03                                                       ONLINE       0     0     0
>>	  raidz2-0                                                ONLINE       0     0     0
>>	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HSTK  ONLINE   0     0     0
>>	    da4                                                   ONLINE       0     0     0
>>	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HTCQ  ONLINE   0     0     0
>>	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300JDT5  ONLINE   0     0     0
>>	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HTCE  ONLINE   0     0     0
>>	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HTS7  ONLINE   0     0     0
>>	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300JBN1  ONLINE   0     0     0
>>	    diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20%20Z300HTAP  ONLINE   0     0     0
[...]
> You can set kern.geom.label.gptid.enable=0 in loader.conf(5), which
> will use the gptid.
>
> Glen

In this case it is the disk_ident that is being used, not the GPT label, so you want to set:

	kern.geom.label.disk_ident.enable=0

in /boot/loader.conf, and then zfs won't see that device alias and will show the expected device name.

-- 
Allan Jude
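Putting Allan's advice together, a sketch of disabling both alias classes and letting a pool pick up plain device names on its next import:

	# /boot/loader.conf
	kern.geom.label.disk_ident.enable="0"
	kern.geom.label.gptid.enable="0"

	# after a reboot, a non-root pool can re-resolve its vdev paths:
	zpool export z03
	zpool import z03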
Re: vi drop outs with Bus error
Hi,

Reference:
> From: Julian H. Stacey <j...@berklix.com>
> Date: Mon, 14 Oct 2013 08:33:35 +0200

Julian H. Stacey wrote:
> Anyone else seeing vi dropping out after a while with "Bus error", no
> core? Seen on 10.0-ALPHA4, now on 10.0-ALPHA5 (after buildkernel,
> installkernel, buildworld, installworld). It's not hardware; the laptop
> is stable and has compiled 594 ports so far.
>
> cd /usr/bin ; ls -l nvi*
> -r-xr-xr-x  1 root  wheel  402064 Oct 12 02:42 nvi.4*
> -r-xr-xr-x  1 root  wheel  402432 Oct 12 02:42 nvi.5*
> file nvi*
> nvi.4: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), dynamically \
>	linked (uses shared libs), for FreeBSD 10.0 (155), stripped
> nvi.5: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), dynamically \
>	linked (uses shared libs), for FreeBSD 10.0 (155), stripped

I'm no longer seeing drop outs with "Bus error"; instead, after

	xterm -sl 1024 -g 80x24 -j -n lapr -e rlogin -D 10beta1host

vi freezes within the xterm after I do an X11 mouse resize (maybe that same SIGWINCH was causing the Bus Error before, as resizing is something I tend to do a lot without remembering :-)

Anyone else see it?

Cheers,
Julian
-- 
Julian Stacey, BSD Unix Linux C Sys Eng Consultant, Munich http://berklix.com
 Interleave replies below like a play script. Indent old text with "> ".
 Send plain text, not quoted-printable, HTML, base64, or multipart/alternative.
amd64 minidump slowness
Hi,

At $JOB, we have machines with 400GB of RAM on which even the smallest, 15GB amd64 minidump takes well over an hour. The major cause of the slowness is that in minidumpsys(), blk_write() is called PAGE_SIZE at a time. This causes blk_write() to poll the console for the Ctrl-C abort once per page.

The attached patch changes blk_write() to be called with a run of physically contiguous pages. This reduced the dump time by over an order of magnitude. Of course, blk_write() could also be changed to poll the console less frequently (like only on every IO).

If anybody else dumps on machines with lots of RAM, it would be nice to know the difference this patch makes. I've got a second set of patches that further reduces the dump time by over half that I'll try to clean up soon.

http://people.freebsd.org/~bryanv/patches/minidump.patch

commit 25f9e82e4ac93e71c6cf06fe2faa1899967db725
Author: Bryan Venteicher <bryanventeic...@gmail.com>
Date:   Sun Sep 29 13:56:42 2013 -0500

    Call blk_write() with a run of physically contiguous pages

    Previously, blk_write() was being called one page at a time, which
    would cause it to poll the console for every page. This change makes
    dumping a magnitude faster, and is especially useful on large memory
    machines.

diff --git a/sys/amd64/amd64/minidump_machdep.c b/sys/amd64/amd64/minidump_machdep.c
index f14c539..26b2b31 100644
--- a/sys/amd64/amd64/minidump_machdep.c
+++ b/sys/amd64/amd64/minidump_machdep.c
@@ -221,7 +221,8 @@ minidumpsys(struct dumperinfo *di)
 	vm_offset_t va;
 	int error;
 	uint64_t bits;
-	uint64_t *pml4, *pdp, *pd, *pt, pa;
+	uint64_t *pml4, *pdp, *pd, *pt, start_pa, pa;
+	size_t sz;
 	int i, ii, j, k, n, bit;
 	int retry_count;
 	struct minidumphdr mdhdr;
@@ -412,18 +413,29 @@ minidumpsys(struct dumperinfo *di)
 	}
 
 	/* Dump memory chunks */
-	/* XXX cluster it up and use blk_dump() */
-	for (i = 0; i < vm_page_dump_size / sizeof(*vm_page_dump); i++) {
+	for (i = 0, start_pa = 0, sz = 0;
+	    i < vm_page_dump_size / sizeof(*vm_page_dump); i++) {
 		bits = vm_page_dump[i];
 		while (bits) {
 			bit = bsfq(bits);
 			pa = (((uint64_t)i * sizeof(*vm_page_dump) * NBBY) + bit) * PAGE_SIZE;
-			error = blk_write(di, 0, pa, PAGE_SIZE);
-			if (error)
-				goto fail;
+			if (sz == 0 || start_pa + sz == pa) {
+				if (sz == 0)
+					start_pa = pa;
+				sz += PAGE_SIZE;
+			} else {
+				error = blk_write(di, 0, start_pa, sz);
+				if (error)
+					goto fail;
+				start_pa = pa;
+				sz = PAGE_SIZE;
+			}
 			bits &= ~(1ul << bit);
 		}
 	}
+	error = blk_write(di, 0, start_pa, sz);
+	if (error)
+		goto fail;
 	error = blk_flush(di);
 	if (error)
Re: What happened to nslookup?
On Sun, Oct 13, 2013 at 5:47 PM, Julian Elischer <jul...@freebsd.org> wrote:
> On 10/12/13 10:28 AM, David Wolfskill wrote:
>> On Sat, Oct 12, 2013 at 02:14:28AM +0000, Thomas Mueller wrote:
>> ...
>>> Thanks for info!
>>
>> Glad to help.
>>
>>> I saw that bind was removed from the current branch because of
>>> security problems,
>>
>> It was removed, but I believe that there was a bit more to it than
>> security problems.
>
> I think it was just a personal preference that managed to get
> communicated as important, and no-one had the energy or will to argue
> about it. (that's the way software projects often work.. loudest and
> most persistent voice wins).
>
>>> but didn't know nslookup was part of BIND. Now I see in
>>> $PORTSDIR/dns/bind-tools/pkg-plist
>>>
>>>	bin/dig
>>>	bin/host
>>>	bin/nslookup
>>>
>>> so host is also part of BIND? :-}
>>
>> The version of host we had when BIND was part of base was part of
>> BIND, yes. Looking in src/usr.bin/host/Makefile, I see:
>>
>>	# $FreeBSD: head/usr.bin/host/Makefile 255949 2013-09-30 17:23:45Z des $
>>
>>	LDNSDIR=	${.CURDIR}/../../contrib/ldns
>>	LDNSHOSTDIR=	${.CURDIR}/../../contrib/ldns-host
>>
>> ... which indicates that this is a re-implementation of host as
>> provided by contrib/ldns.
>>
>>> I will remember to use host in the future.
>>
>> I have found it generally easy to use (easier by far than nslookup).
>>
>> Peace,
>> david

nslookup(1) was deprecated about a decade ago because it often provides misleading results when used for DNS troubleshooting. It generally works fine for simply turning a name into an address or vice versa.

People should really use host(1) for simple lookups. It provides the same information and does it in a manner that will not cause misdirection when things are broken.

If you REALLY want to dig (sorry) into DNS behavior or problems, learn to use dig(1). It does the same as host(1) or nslookup(1) in its simplest form, but has an extremely large number of options to adjust the query in a variety of ways to allow real analysis of DNS behavior.

I'd love to see nslookup simply vanish, but I expect it to be around and causing grief until the day I die (which I hope will still be at least a couple of decades down the road.)

-- 
R. Kevin Oberman, Network Engineer
E-mail: rkober...@gmail.com
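For comparison, the simple and the diagnostic forms look like this (the hostname is only an example):

	$ host freebsd.org
	$ dig freebsd.org mx +short	# terse answer only
	$ dig freebsd.org any		# full query/response detail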