Re: 10.3 and reboot -r (reroot)
On Tue, Apr 19, 2016 at 12:46:54PM +0200, Edward Tomasz Napierała wrote:
> On 0419T0906, Melissa Jenkins wrote:
> > I've been trying to get reboot -r to work but get an error that
> > kern.proc.pathname is undefined. It then drops to single user mode.
> >
> > Interestingly I've checked the value of kern.proc.pathname and it
> > appears to be undefined on all the OS boxes we have from 9.3 up to
> > current. In fact the kern.proc tree doesn't appear to contain
> > anything though it does exist at least on some of the boxes.
>
> The kern.proc.pathname is a weird sysctl. It's per-process, and it's
> impossible to access it via name, only by numeric ID. So, this is
> normal.
>
> The fact that reroot doesn't work because of this is not normal,
> though. I have no idea why this would fail; I'll investigate.

I can make it fail this way easily by installing a new init(8) binary.
This makes the kern.proc.pathname sysctl fail because /sbin/init has
been moved away or deleted. The command procstat -b 1 uses the same
vnode-to-pathname translation code and fails similarly.

If only a single install has been done, a command
ls -l /sbin/init*
will make the kernel realize that /sbin/init.bak is in fact the
pathname of process 1's executable, and both procstat -b 1 and
reboot -r start working.

However, the reroot will use the old init binary to perform
reboot(RB_REROOT) and to find init in the new root file system, which
may be undesirable.

It may be better to use the original argv[0]. The kernel passes a full
pathname here.

While reading the code, I noticed another issue. The kill(-1, SIGKILL)
may fail with [ESRCH] if there is no process to kill. In this case, the
reroot should continue. This problem sometimes occurs for me when
rerooting from single user mode.

--
Jilles Tjoelker
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: cp from NFS to ZFS hung in "fifoor"
On Sat, Nov 28, 2015 at 10:42:28AM -0500, Mikhail T. wrote:
> I was copying /home from an old server (narawntapu) to a new one
> (aldan). The narawntapu:/home is mounted on aldan as /mnt with flags
> ro,intr. On narawntapu /home was simply located on an SSD, but on aldan
> I created a ZFS filesystem for it.
>
> The copying was started thus:
>
> root@aldan:/home (435) cp -Rpn /mnt/* .
>
> for a while this was proceeding at a decent clip with cp making
> newnfsreq-uests:
>
> load: 0.78 cmd: cp 38711 [newnfsreq] 802.84r 1.57u 140.63s 20% 10768k
> /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
> ->
> ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
> 100%
> load: 1.23 cmd: cp 38711 [newnfsreq] 874.19r 1.66u 154.74s 17% 4576k
> /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
> ->
> ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
> 100%
>
> ZFS on the destination compressing and writing stuff out and the traffic
> between the two ranging from 30 to 50Mb/s (according to systat), but
> then something happened and the cp-process is now hung:
>
> load: 0.55 cmd: cp 38711 [fifoor] 1107.67r 2.09u 194.12s 0% 3300k
> load: 0.50 cmd: cp 38711 [fifoor] 1112.66r 2.09u 194.12s 0% 3300k
> load: 0.22 cmd: cp 38711 [fifoor] 1642.37r 2.09u 194.12s 0% 3300k

This normally means that the process is opening a fifo for reading and
is waiting for a writer.

Although cp -R will normally copy a fifo by calling mkfifo at the
destination, it may open one if a regular file is replaced with a fifo
between the time it reads the directory and it copies that file. This is
not that unlikely if large directory trees are copied during that time.
On the other hand, cp without -R/-r/-l/-s will always open a fifo.

You can make cp continue by opening the fifo (which you'll need to find
first, for example by checking what has been copied already) for
writing, like : >/path/to/some/fifo. It will be replaced with an empty
file at the destination.

--
Jilles Tjoelker
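The unblocking trick described above can be tried safely on a scratch fifo. This is only a sketch: the temporary directory and file names are made up, and a background cat stands in for the hung cp.

```sh
#!/bin/sh
# Sketch: unblock a reader that is stuck opening a fifo (the "fifoor" state).
# All paths are temporary and hypothetical.
tmpdir=$(mktemp -d)
fifo=$tmpdir/stuck.fifo
out=$tmpdir/copied

mkfifo "$fifo"

# Stand-in for the hung cp: opening a fifo for reading blocks until
# some process opens it for writing.
cat "$fifo" > "$out" &
reader=$!

# Open the fifo for writing and close it again without writing anything.
# The reader's open() returns, it sees EOF at once and finishes.
: > "$fifo"

wait "$reader"
reader_status=$?
copied_size=$(wc -c < "$out" | tr -d ' ')
echo "reader exit status: $reader_status, bytes copied: $copied_size"
rm -rf "$tmpdir"
```

The reader ends up with an empty output file, which matches the remark that the fifo is replaced with an empty file at the destination.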
Re: Latest stable (r287104) bash leaves zombies on exit
On Fri, Aug 28, 2015 at 07:18:47PM +0300, Konstantin Belousov wrote:
> On Fri, Aug 28, 2015 at 05:52:42PM +0200, Michiel Boland wrote:
> > set -e
> > for a in `seq 1000`
> > do
> > 	echo -n $a
> > 	xterm -e ssh nonexisting
> > done
> > echo
> >
> > (The idea here is that 'ssh nonexisting' should do some work and then
> > exit, xterm -e false, etc. don't appear to trigger the bug.)
> >
> > Prior to the patch, one of the xterms would hang after the counter
> > reaches a random (reasonably small) number.
> >
> > After the patch the script runs till completion.
>
> Thank you for testing. Funny detail is that your loop does not hang
> for me, I see flapping xterms until the completion. How many cpus does
> your machine have ?
>
> Below is a slightly improved version of the change, to avoid
> unnecessary relocations. Would be good to rebuild the world and confirm
> that you see no regression (the patch also affects rtld in some way).

Looks good to me, except that I think a vforked child (in system() and
posix_spawn*()) should use the system calls and not libthr's wrappers.
This reduces the probability of weird things happening between vfork and
exec, and also avoids an unexpected error when
posix_spawnattr_setsigdefault()'s mask contains SIGTHR.

--
Jilles Tjoelker
Re: Latest stable (r287104) bash leaves zombies on exit
On Sat, Aug 29, 2015 at 04:41:30PM +0300, Konstantin Belousov wrote:
> On Sat, Aug 29, 2015 at 03:01:38PM +0200, Jilles Tjoelker wrote:
> > Looks good to me, except that I think a vforked child (in system() and
> > posix_spawn*()) should use the system calls and not libthr's wrappers.
> > This reduces the probability of weird things happening between vfork
> > and exec, and also avoids an unexpected error when
> > posix_spawnattr_setsigdefault()'s mask contains SIGTHR.
>
> Thank you for the review, I agree with the note about vfork. Updated
> patch is below. Also, I removed the PIC_PROLOGUE from the i386 setjmp,
> it has no use after the plt calls are removed.

[snip]

> diff --git a/lib/libc/gen/posix_spawn.c b/lib/libc/gen/posix_spawn.c
> index e3124b2..673c760 100644
> --- a/lib/libc/gen/posix_spawn.c
> +++ b/lib/libc/gen/posix_spawn.c
> @@ -118,15 +118,18 @@ process_spawnattr(const posix_spawnattr_t sa)
>  			return (errno);
>  	}
>
> -	/* Set signal masks/defaults */
> +	/*
> +	 * Set signal masks/defaults.
> +	 * Use unwrapped syscall, libthr is in undefined state after vfork().
> +	 */
>  	if (sa->sa_flags & POSIX_SPAWN_SETSIGMASK) {
> -		_sigprocmask(SIG_SETMASK, &sa->sa_sigmask, NULL);
> +		__libc_sigprocmask(SIG_SETMASK, &sa->sa_sigmask, NULL);
>  	}
>
>  	if (sa->sa_flags & POSIX_SPAWN_SETSIGDEF) {
>  		for (i = 1; i <= _SIG_MAXSIG; i++) {
>  			if (sigismember(&sa->sa_sigdefault, i))
> -				if (_sigaction(i, &sigact, NULL) != 0)
> +				if (__libc_sigaction(i, &sigact, NULL) != 0)
>  					return (errno);
>  		}
>  	}

Hmm, the comments say direct syscalls are being used, but in fact
libthr's interposer is called. The change to system() does correctly use
__sys_sigprocmask().

--
Jilles Tjoelker
Re: script(1), cfmakeraw() and Ctrl-Z
On Mon, Jul 15, 2013 at 12:14:19AM +0700, Eugene Grosbein wrote:
> I've noted that commands like script -qa /tmp/log sleep 100 cannot be
> suspended with Ctrl-Z keys. The reason is call to cfmakeraw() in
> script.c - if I comment it out, Ctrl-Z starts to work as expected.
> portupgrade uses script(1) so build/install process cannot be
> suspended too. (I'm building libreoffice-4.04 now)
>
> The function cfmakeraw() is used since CVS revision 1.1 when script
> was imported with other BSD 4.4 Lite Usr.bin Sources.
>
> Is cfmakeraw() really needed?

The cfmakeraw() call ensures that the processes running within script
get all control characters. For example, you can suspend a job in the
inner shell using Ctrl+Z. This indeed makes it impossible to suspend
script itself.

--
Jilles Tjoelker
Re: FreeBSD 9: fdisk -It crashes kernel
On Thu, Apr 25, 2013 at 11:58:42AM -0500, Guy Helmer wrote:
> On Apr 25, 2013, at 10:58 AM, Jeremy Chadwick j...@koitsu.org wrote:
> > On Thu, Apr 25, 2013 at 09:06:49AM -0500, Guy Helmer wrote:
> > > Encountered a surprise when my disk resizing rc.d script caused
> > > FreeBSD 9.1-STABLE to crash. I used fdisk -It ada0 to determine
> > > what the available size of the disk (which happened to be the root
> > > disk), and on FreeBSD 9.1 the kernel comes crashing down:

The shell output can be explained.

> > > + fdisk -It ada0
> > > + /rescue/sed -En 's,.*start ([0-9]+).*size ([0-9]+).*,\1 + \2,p'
> > > vnode_pager_getpages: I/O read error
> > > vm_fault: pager read error, pid 65 (fdisk)
> > > pid 65 (fdisk), uid 0: exited on signal 11
> > > eval: arithmetic expression: expecting primary:

The subshell for the growfs_vm script exits here because of the error.
The eval is the eval in /etc/rc.subr _run_rc_doit.

> > > Entropy harvesting: point_to_pointeval: date: Device not configured
> > > eval: df: Device not configured
> > > eval: dmesg: Device not configured
> > > cat: /bin/ls: Device not configured
> > > kickstart.

After growfs_vm has run (unsuccessfully), rc continues with initrandom.

> > > eval: cannot open /etc/fstab: Device not configured
> > > eval: cannot open /etc/fstab: Device not configured
> > > eval: swapon: Device not configured
> > > Warning! No /etc/fstab: skipping disk checks
> > > fstab: /etc/fstab:0: Device not configured
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 1; apic id = 01
> > > fault virtual address = 0x0
> > > fault code = supervisor read, page not present
> > > instruction pointer = 0x20:0xc0825fc4
> > > stack pointer = 0x28:0xc5a088c8
> > > frame pointer = 0x28:0xc5a08914
> > > code segment = base 0x0, limit 0xf, type 0x1b
> > > = DPL 0, pres 1, def32 1, gran 1
> > > processor eflags = interrupt enabled, resume, IOPL = 0
> > > current process = 91 (mount)
> > > [ thread pid 91 tid 100056 ]
> > > Stopped at g_access+0x24: movl 0(%ebx),%eax
> > > db> where
> > > Tracing pid 91 tid 100056 td 0xc84c42f0
> > > g_access(c8481d34,0,1,1,0,…) at g_access+0x24/frame 0xc5a08914
> > > ffs_mount(c8481d34,c0d78380,2,c5a08c00,c829ae6c,…) at ffs_mount+0xf74/frame 0xc5a08a34
> > > vfs_donmount(c84c42f0,1,0,c84cf200,c84cf200,…) at vfs_donmount+0x1423/frame 0xc5a08c24
> > > sys_nmount(c84c42f0,c5a08ccc,c5a08cc4,1010006,c5a08d08,…) at sys_nmount+0x7f/frame 0xc5a08c48
> > > syscall(c5a08d08) at syscall+0x443/frame 0xc508cfc
> > > Xint0x80_syscall() at Xint0x80_syscall+0x21/frame 0xc5a08cfc
> > > --- syscall (378, FreeBSD ELF32, sys_nmount), eip = 0x480d5feb, esp = 0xbfbfce1c, ebp = 0xbfbfd378 ---

Apparently a subsequent mount command kills it.

> I'll fix my script to not do this, but it seems odd that fdisk -It can
> make the disk go away.

Yes, that seems wrong.

--
Jilles Tjoelker
Re: ipv6_addrs_IF aliases in rc.conf(5)
On Thu, Dec 20, 2012 at 01:04:34PM +0200, Kimmo Paasiala wrote:
> A question related to this for those who have been doing work on the
> rc(8) scripts. Can I assume that /usr/bin is available when
> network.subr functions are used? Doing calculations on hexadecimal
> numbers is going to be very awkward if I can't use for example bc(1).

You cannot assume that /usr/bin is available when setting up the
network. It may be that /usr is mounted via NFS.

You can use hexadecimal numbers (prefixed with 0x) in $((...))
expressions. In FreeBSD 9.0 or newer, sh has a printf builtin you can
use; in older versions you can use hexdigit and hexprint from
network.subr.

--
Jilles Tjoelker
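As a small illustration of the two builtins mentioned above, the following sketch does a hexadecimal increment without touching /usr/bin. The value 0x00ff is made up for the example; it is not taken from network.subr.

```sh
#!/bin/sh
# Sketch: hexadecimal arithmetic using only shell builtins.
# The 16-bit group below is a made-up example value.
group=0x00ff

# $((...)) accepts constants with a 0x prefix; the result is decimal.
next=$(( $group + 1 ))

# printf (a builtin in FreeBSD 9.0 and newer sh) converts back to hex.
next_hex=$(printf '%x' "$next")

echo "decimal: $next, hex: $next_hex"
# prints "decimal: 256, hex: 100"
```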
Re: /bin/sh arithmetic doesn't seem to like leading 0 now
On Fri, Sep 21, 2012 at 10:26:37PM +, David O'Brien wrote:
> On Fri, Sep 21, 2012 at 07:34:06PM +0200, Jilles Tjoelker wrote:
> > On Fri, Sep 21, 2012 at 10:09:02AM -0700, David Wolfskill wrote:
> > > $ echo $(( ( $( date +%m ) - 1 ) / 3 + 1 ))
> > > arithmetic expression: expecting ')': ( 09 - 1 ) / 3 + 1
> > ...
> > This was done to avoid an inconsistency where constants starting with
> > 0 and containing 8 or 9 were decimal, so something like $((018-017))
> > expanded to 3.
>
> Jilles, Would it be possible to improve on the error message? If David
> had been given the Bash error message, I suspect he would have figured
> out the issue right away.

It would certainly be possible to add a new error message, but from the
embedded point of view the extra code size may not be worth it (also
considering that error messages can be enhanced in many other places as
well).

--
Jilles Tjoelker
Re: /bin/sh arithmetic doesn't seem to like leading 0 now
On Fri, Sep 21, 2012 at 10:09:02AM -0700, David Wolfskill wrote:
> I have a construct in a shell script that I had been using under
> stable/8 (most recently, @r240259), but which throws an error under
> stable/9 (at least as early as @r238602):
>
> $ echo $(( ( $( date +%m ) - 1 ) / 3 + 1 ))
> 3
> $ uname -r
> 8.3-PRERELEASE
>
> $ echo $(( ( $( date +%m ) - 1 ) / 3 + 1 ))
> arithmetic expression: expecting ')': ( 09 - 1 ) / 3 + 1
> $ uname -r
> 9.1-PRERELEASE
>
> Trying bits & pieces of the above, I narrowed the issue down to:
>
> $ echo $(( 09 + 0 ))
> arithmetic expression: expecting EOF: 09 + 0
>
> while
>
> $ echo $(( 9 + 0 ))
> 9
> $
>
> is not a problem.
>
> Is this intentional?

Yes, it was changed with r216547, December 2010.

This was done to avoid an inconsistency where constants starting with 0
and containing 8 or 9 were decimal, so something like $((018-017))
expanded to 3. There are indeed various cases where this inconsistency
does not matter (because the numbers with leading zeroes do not exceed
10).

> (I can work around it -- e.g., by using sed to strip leading 0 from
> the month number (since strftime() doesn't appear to have a format
> that provides the value in a form that lacks the leading zero for
> values < 10). But I'd rather not do that if I don't need to.)

You can use date +%-m although it is not in POSIX.

With POSIX only, it is still possible to do it reasonably efficiently,
for example $(( 1$(date +%m) - 100 )) or v=$(date +%m); v=${v#0}.

--
Jilles Tjoelker
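The two POSIX-only workarounds above can be sketched as follows. A fixed month string replaces $(date +%m) here so the result is reproducible; the variable names are only illustrative.

```sh
#!/bin/sh
# Sketch: compute the quarter from a zero-padded month without tripping
# over octal parsing. A fixed value stands in for $(date +%m).
month=09

# Variant 1: prepend 1 so the constant is decimal (109), then subtract 100.
m1=$(( 1$month - 100 ))

# Variant 2: strip one leading zero with parameter expansion.
m2=${month#0}

quarter=$(( ($m2 - 1) / 3 + 1 ))
echo "month=$m1/$m2 quarter=$quarter"
# prints "month=9/9 quarter=3"
```

Both variants only work for two-digit fields such as %m; a value with more than one leading zero would need ${v#0} applied repeatedly.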
Re: Too many open files
On Mon, Mar 26, 2012 at 10:15:59AM +0100, Matthew Seaman wrote:
> Does 'procstat -fa' give better results for you?
>
> It seems to be one of those little hidden secrets that FreeBSD comes
> with a bunch of native applications that provide pretty much
> equivalent functionality to lsof(1). See: fstat(1), procstat(1),
> sockstat(1).
>
> Which is odd, given that since these sort of applications have to read
> and interpret kernel memory -- an action for which there isn't a nice
> well defined ABI -- the application has to be kept rigorously in synch
> with the kernel it is used against. Something that is intrinsically
> easier to do when kernel and application are compiled at the same time
> and from the same source tree.

procstat (in all versions that have it) and fstat (in FreeBSD 9.0 and
newer) use a well-defined sysctl-based API to access the information.
This API was extended in FreeBSD 9.0 and a library libprocstat provides
a convenient interface.

Reading from kernel memory not only couples the application tightly to
the kernel implementation, but can also be considered a security issue
because there is a lot of sensitive information in kernel memory; it
cannot be permitted in a jail.

--
Jilles Tjoelker
Re: A problem with MAXPATHLEN on a back
On Sun, Feb 26, 2012 at 02:40:09PM +0100, Willem Jan Withagen wrote:
> I'm running into this on a backup-backupserver.
> (8.2-STABLE #134: Wed Feb 1 15:05:59 CET 2012 amd64)
>
> Haven't checked which paths are too long.
> But is there any easy way out?
> Like making MAXPATHLEN 2048 and rebuilding locate.
> Or is that going to propagate and major impact all and everything.
>
> Rebuilding locate database:
> locate: integer out of +-MAXPATHLEN (1024): 1031
> locate: integer out of +-MAXPATHLEN (1024): 1031

It should be possible to replace (sed -i) MAXPATHLEN with something else
in the locate source and recompile it.

Changing the value of MAXPATHLEN itself is not safe because it defines
the size of various buffers in the ABI (such as the one passed to
realpath() if its resolved_path parameter is not NULL); in any case, it
is a very intrusive change.

Locate uses find(1) to generate its list of files, and find's output is
not subject to MAXPATHLEN (unless the -L option or the -follow primary
is used).

Almost any use of the very long pathnames will require a manual split-up
though (cd'ing to an initial part shorter than MAXPATHLEN, then
repeating the process with relative pathnames until the remaining part
is shorter than MAXPATHLEN).

--
Jilles Tjoelker
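The manual split-up described above could be automated roughly as follows. This is a sketch only: cd_long is a hypothetical helper, not an existing utility, and it takes the simplest approach of changing directory one component at a time instead of in the largest chunks that fit. It uses cd -P on the assumption that the shell then passes the short relative name to chdir(2) unchanged; some shells may still complain when they cannot represent the resulting working directory.

```sh
#!/bin/sh
# Hypothetical helper: reach a directory whose full pathname may exceed
# MAXPATHLEN by changing directory one component at a time, so every
# pathname handed to the kernel is short and relative.
cd_long() {
    rest=$1
    case $rest in
        /*) cd / || return 1 ;;
    esac
    while [ -n "$rest" ]; do
        component=${rest%%/*}
        case $rest in
            */*) rest=${rest#*/} ;;
            *)   rest= ;;
        esac
        # Skip empty components produced by leading or doubled slashes.
        [ -n "$component" ] || continue
        cd -P "./$component" || return 1
    done
}

# Demonstration on a short scratch tree (names are made up).
base=$(mktemp -d)
mkdir -p "$base/a/b/c"
reached=$(cd_long "$base/a/b/c" && pwd -P)
expected=$(cd "$base/a/b/c" && pwd -P)
rm -rf "$base"
```

Once inside the deep directory, the files there can be handled with short relative names.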
Re: GENERIC make buildkernel error / fails - posix_fadvise
On Sun, Jan 22, 2012 at 01:00:46PM -0600, clift...@volcano.org wrote:
> On 12.01.2012 15:52, Doug Barton wrote:
> > > chflags -R noschg /usr/obj/usr
> > > rm -rf /usr/obj/usr
> > It's much faster to do:
> > /bin/rm -rf ${obj}/* 2> /dev/null || /bin/chflags -R 0 ${obj}/* && /bin/rm -rf ${obj}/*
>
> If I could just add one thing here, for those who might be tempted to
> immediately cut and paste that elegant command line:
>
> Consider, how does that command evaluate if the shell variable obj is
> not set, and you're running that literal string as root?
> A: You will very systematically wipe your entire server, starting at
> the root, and doing a second pass to get any protected files you
> missed.
>
> I'd recommend something safer like approximately this (untested):
> if [X${obj} != X -a -d ${obj}]; then cd ${obj} && (rest of cmds); fi
>
> Sorry for the wasted bandwidth, for those to whom it was obvious, but
> anybody who has ever had to clean up after a junior admin's attempt to
> do something a little too clever will appreciate why I'm posting this.

An easier way is to replace the first ${obj} with ${obj:?}, causing an
error if obj is unset or null. One limitation is that it does not work
with (t)csh.

> On the efficiency front, for the core file deletion operators, I've
> had good results with this trick (requires Perl and makes use of its
> implicit-operand idioms):
> find ${obj} | perl -nle unlink
>
> If rm had an option to take files from standard input, or if there's
> another program I'm not aware of which does this, it could serve as
> the right-hand side of this.

This does not handle all possible characters in filenames, such as a
newline. The perlrun manpage suggests something with find's -print0
primary. Alternatively, use find's -unlink primary.

--
Jilles Tjoelker
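The effect of ${obj:?} can be checked without deleting anything. In this sketch echo stands in for rm, and the cleanup function and the status variables are made up for the demonstration.

```sh
#!/bin/sh
# Sketch: ${obj:?} aborts the command when obj is unset or empty, so the
# glob can never expand to /*. echo stands in for rm here.
cleanup() {
    # Run in a subshell: a failed ${obj:?} makes a non-interactive
    # shell exit, and only the subshell should be lost.
    ( echo rm -rf "${obj:?}"/* ) 2>/dev/null
}

unset obj
if cleanup; then unset_status=ran; else unset_status=refused; fi

obj=$(mktemp -d)
touch "$obj/stale.o"
if cleanup >/dev/null; then set_status=ran; else set_status=refused; fi
rm -rf "$obj"

echo "unset: $unset_status, set: $set_status"
# prints "unset: refused, set: ran"
```

Only the first expansion needs the :? form, since the shell stops at the first failing expansion in a command line.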
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote:
> If the algorithm ULE does not contain problems - it means the problem
> has Core2Duo, or in a piece of code that uses the ULE scheduler.
> I already wrote in a mailing list that specifically in my case
> (Core2Duo) partially helps the following patch:
> --- sched_ule.c.orig	2011-11-24 18:11:48.0 +0200
> +++ sched_ule.c	2011-12-10 22:47:08.0 +0200
> @@ -794,7 +794,8 @@
>  	 * 1.5 * balance_interval.
>  	 */
>  	balance_ticks = max(balance_interval / 2, 1);
> -	balance_ticks += random() % balance_interval;
> +//	balance_ticks += random() % balance_interval;
> +	balance_ticks += ((int)random()) % balance_interval;
>  	if (smp_started == 0 || rebalance == 0)
>  		return;
>  	tdq = TDQ_SELF();

This avoids a 64-bit division on 64-bit platforms but seems to have no
effect otherwise. Because this function is not called very often, the
change seems unlikely to help.

> @@ -2118,13 +2119,21 @@
>  	struct td_sched *ts;
>
>  	THREAD_LOCK_ASSERT(td, MA_OWNED);
> +	if (td->td_pri_class & PRI_FIFO_BIT)
> +		return;
> +	ts = td->td_sched;
> +	/*
> +	 * We used up one time slice.
> +	 */
> +	if (--ts->ts_slice > 0)
> +		return;

This skips most of the periodic functionality (long term load balancer,
saving switch count (?), insert index (?), interactivity score update
for long running thread) if the thread is not going to be rescheduled
right now.

It looks wrong but it is a data point if it helps your workload.

>  	tdq = TDQ_SELF();
>  #ifdef SMP
>  	/*
>  	 * We run the long term load balancer infrequently on the first cpu.
>  	 */
> -	if (balance_tdq == tdq) {
> -		if (balance_ticks && --balance_ticks == 0)
> +	if (balance_ticks && --balance_ticks == 0) {
> +		if (balance_tdq == tdq)
>  			sched_balance();
>  	}
>  #endif

The main effect of this appears to be to disable the long term load
balancer completely after some time. At some point, a CPU other than the
first CPU (which uses balance_tdq) will set balance_ticks = 0, and
sched_balance() will never be called again.

It also introduces a hypothetical race condition because the access to
balance_ticks is no longer restricted to one CPU under a spinlock.

If the long term load balancer may be causing trouble, try setting
kern.sched.balance_interval to a higher value with unpatched code.

> @@ -2144,9 +2153,6 @@
>  		if (TAILQ_EMPTY(&tdq->tdq_timeshare.rq_queues[tdq->tdq_ridx]))
>  			tdq->tdq_ridx = tdq->tdq_idx;
>  	}
> -	ts = td->td_sched;
> -	if (td->td_pri_class & PRI_FIFO_BIT)
> -		return;
>  	if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE) {
>  		/*
>  		 * We used a tick; charge it to the thread so
> @@ -2157,11 +2163,6 @@
>  		sched_priority(td);
>  	}
>  	/*
> -	 * We used up one time slice.
> -	 */
> -	if (--ts->ts_slice > 0)
> -		return;
> -	/*
>  	 * We're out of time, force a requeue at userret().
>  	 */
>  	ts->ts_slice = sched_slice;
>
> and refusal to use options FULL_PREEMPTION
> But no one has unsubscribed to my letter, my patch helps or not in the
> case of Core2Duo...
> There is a suspicion that the problems stem from the sections of code
> associated with the SMP...
> Maybe I'm in something wrong, but I want to help in solving this
> problem ...

--
Jilles Tjoelker
Re: /usr/bin/script eating 100% cpu with portupgrade and xargs
On Wed, Oct 12, 2011 at 11:25:35PM +0100, Adrian Wontroba wrote:
> On Sat, Oct 08, 2011 at 01:27:07AM +0100, Adrian Wontroba wrote:
> > I won't be in a position to create a simpler test case, raise a PR or
> > try patches till Tuesday evening (UK) at the earliest.
>
> So far I have been unable to reproduce the problem with portupgrade
> (and will probably move to portmaster).
>
> I have however found a different but possibly related problem with the
> new version of script in RELENG_8, for which I have raised this PR:
>
> misc/161526: script outputs corrupt if input is not from a terminal
>
> Blast, should of course been bin/

The extra ^D\b\b are the EOF character being echoed. These EOF
characters are being generated by the new script(1) to pass through the
EOF condition on stdin. One fix would be to change the termios settings
temporarily to disable the echoing but this may cause problems if the
application is changing termios settings concurrently and generally
feels bad. It may be best to remove writing EOF characters, perhaps
adding an option to enable it again if there is a concrete use case for
it.

--
Jilles Tjoelker
Re: sigwait return 4
On Thu, Aug 25, 2011 at 12:29:29AM +0300, Kostik Belousov wrote:
> On Wed, Aug 24, 2011 at 10:56:09PM +0200, Jilles Tjoelker wrote:
> > sigwait() was fixed not to return EINTR in 9-current in r212405
> > (fixed up in r219709). The discussion started at
> > http://lists.freebsd.org/pipermail/freebsd-threads/2010-September/004892.html
> >
> > Solaris is simply wrong in the same way we were wrong. Although POSIX
> > may not be as clear on this as one may like, its intention is clear
> > and additionally not returning EINTR reduces subtle portability
> > problems.
>
> Can you, please, describe why do you consider the behaviour
> prohibiting return of EINTR reasonable ? I do consider that the
> Solaris behaviour is useful.

Applications need to cope with EINTR returns (usually by retrying the
call); if they do not do this, bugs arise in uncommon cases.

In the case of sigwait(), applications do not really need EINTR: they
can include the respective signal into the signal set and do the work
inline that was originally in the signal handler. This might require
additional pthread_sigmask() calls. This also fixes the race condition
almost always associated with EINTR.

Historically, this is because sigwait() came with POSIX threads, which
also explains why it returns an error number rather than setting errno.
The threads group considered EINTR errors not useful enough, given that
they may lead to subtle bugs. This is fully standardized for functions
like pthread_cond_wait() and pthread_mutex_lock().

In the case of sigwait(), it also plays a role that glibc has decided
not to return EINTR, so that returning EINTR may lead to subtle bugs
appearing on FreeBSD in software originally written for GNU/Linux.

The functions sigwaitinfo() and sigtimedwait() came with POSIX realtime
and therefore follow different conventions.

--
Jilles Tjoelker
Re: sigwait return 4
On Wed, Aug 24, 2011 at 10:07:03PM +0300, Kostik Belousov wrote:
> On Wed, Aug 24, 2011 at 10:19:07PM +0400, Slawa Olhovchenkov wrote:
> > System is 8.2-RELEASE (GENERIC), amd64.
> > Application -- i386 for freebsd7.
> >
> > In ktrace dump I find some strange result:
> >
> > 22951 100556 kas-milter CALL sigwait(0xffdfdf80,0xffdfdf7c)
> > 22951 100556 kas-milter RET sigwait 4
> > 22951 100556 kas-milter PSIG SIGUSR2 caught handler=0x804c0f0 mask=0x4003 code=0x0
> >
> > RET sigwait 4 confused me, and, I think, confused application too.
> >
> > man sigwait:
> >
> > ERRORS
> > The sigwait() system call will fail if:
> >
> > [EINVAL] The set argument specifies one or more invalid signal
> > numbers.
> >
> > [EFAULT] Any arguments point outside the allocated address space or
> > there is a memory protection fault.
> >
> > How sigwait can return '4'? May be EINTR, converted from ERESTART?
> > But kern_sigtimedwait from sigwait must be called with
> > timeout == NULL...
>
> What should the system do for a delivered signal not present in the
> set ? I guess this is the case of your ktrace.
>
> Looking at the SUSv4, I see no mention of the situation, but in Oracle
> SunOS 5.10 man page for sigwait(2), it is said explicitly
> EINTR The wait was interrupted by an unblocked, caught signal.
>
> So I think that we have a bug in the man page.
>
> diff --git a/lib/libc/sys/sigwait.2 b/lib/libc/sys/sigwait.2
> index 8c00cf4..b462201 100644
> --- a/lib/libc/sys/sigwait.2
> +++ b/lib/libc/sys/sigwait.2
> @@ -27,7 +27,7 @@
>  .\"
>  .\" $FreeBSD$
>  .\"
> -.Dd November 11, 2005
> +.Dd August 24, 2011
>  .Dt SIGWAIT 2
>  .Os
>  .Sh NAME
> @@ -94,6 +94,8 @@ The
>  .Fn sigwait
>  system call will fail if:
>  .Bl -tag -width Er
> +.It Bq Er EINTR
> +The system call was interrupted by an unblocked, caught signal.
>  .It Bq Er EINVAL
>  The
>  .Fa set

This patch would be wrong, except to document existing behaviour in
-stable branches.

sigwait() was fixed not to return EINTR in 9-current in r212405 (fixed
up in r219709). The discussion started at
http://lists.freebsd.org/pipermail/freebsd-threads/2010-September/004892.html

Solaris is simply wrong in the same way we were wrong. Although POSIX
may not be as clear on this as one may like, its intention is clear and
additionally not returning EINTR reduces subtle portability problems.

Note that sigwaitinfo() and sigtimedwait() may return EINTR. SA_RESTART
applies to sigwaitinfo() but not to sigtimedwait() (because the timeout
cannot be restarted).

--
Jilles Tjoelker
Re: Change in behavior to stat(1)
On Mon, Feb 28, 2011 at 11:15:39AM -0600, Stephen Montgomery-Smith wrote:
> I had a little script that would remove broken links. I used to do it
> like this:
>
> if ! stat -L $link > /dev/null; then rm $link; fi
>
> But recently (some time in February according to the CVS records) stat
> was changed so that stat -L would use lstat(2) if the link is broken.
>
> So I had to change it to
>
> if stat -L $link | awk '{print $3}' | grep l > /dev/null;
> then rm $link; fi
>
> but it is a lot less elegant.
>
> What is the proper accepted way to remove broken links?

A better answer to your original question was already given, but for
that command, isn't it sufficient to do

if ! [ -e $link ]; then rm $link; fi

All test(1)'s primaries that test things about files follow symlinks,
except for -h/-L.

--
Jilles Tjoelker
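The test(1) approach above can be tried on a scratch directory. This sketch adds a -h check in front of the -e test so that only symlinks are ever removed; that extra check and all file names are assumptions made for the example, not part of the original suggestion.

```sh
#!/bin/sh
# Sketch: remove dangling symlinks using test(1) primaries only.
# -h examines the link itself; -e follows it to the target.
dir=$(mktemp -d)
touch "$dir/target"
ln -s "$dir/target" "$dir/good"
ln -s "$dir/missing" "$dir/broken"

for link in "$dir"/*; do
    if [ -h "$link" ] && ! [ -e "$link" ]; then
        rm "$link"
    fi
done

remaining=$(ls "$dir" | sort | tr '\n' ' ')
echo "remaining: $remaining"
rm -rf "$dir"
```

After the loop only the regular file and the working link are left; the link pointing at the missing target is gone.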
Re: Policy on static linking ?
On Sat, Jan 15, 2011 at 09:11:50PM +1100, Jean-Yves Avenard wrote: On Friday, 14 January 2011, Pete French petefre...@ticketswitch.com wrote: I build code using static linking for deployment across a set of machines. For me this has a lot of advantages - I know that the code will run, no matter what the state of the ports is on the machine, and if there is a need to upgrade a library then I do it once on the build machine, rebuild the executable, and rsync it out to the leaf nodes. Only one place to track security updates, only one place where I need to have all the porst the code depends on installed. I actually tried to compile a port against another and have it link statically, but I couldn't find a way to do so without hacking the configure script. I was wondering if there was another (and easier) way to do so... I use ldap for authentication purposes, along with pam_ldap and nss_ldap If I compile openldap-client against openssl from ports, then it creates massive problems elsewhere. For example, base ssh server will now crash due to using different libcrypto. compiling ports will also become impossible as bsd tar itself crash (removing ldap call from nsswitch.conf is required to work again) I was then advised in the freebsd forums to uninstall openssl port, compile openldap against openssl base, install it, then re-install openssl port. (I have to use openssl from ports with apache/subversion to fix a bug with TLSv1 making svn commit crash under some circumstances) I dislike this method, because should openldap gets upgraded again and be linked against openssl port, I will lock myself out of the machine again due to sshd crashing. Just like what happened today :( So how can I configure openldap-client to link against libssl and libcrypto statically? I think this can be solved with a symbol versioning trick. 
By applying a different version to all symbols from each OpenSSL version, the dynamic linker will be able to distinguish between different versions of OpenSSL and will use the correct OpenSSL version each object was linked to, even if there are multiple OpenSSL versions in the process. Note that each OpenSSL version should still have its own SONAME (the security/openssl port does this). The version script can be as simple as (substituting the version) OPENSSL_0.9.8 { global: *; }; and needs to be in the top level and in the engines directory. For it to work completely, both base and the port need to be patched. Old binaries continue to work but the benefit only appears after recompilation. The SONAME needs to be bumped when the version string is changed or deleted (but not when it is initially added), otherwise binaries will stop working. This also means that making the change cannot be undone without breaking binary compatibility. What will not work is allocating an OpenSSL structure in one object linked to one OpenSSL version and then using it in another object linked to another OpenSSL version. That would require true symbol versioning, keeping compatibility with old versions in the same library with the same SONAME. Unlike the approach I propose, that would be a lot of work and can only be done by the OpenSSL project, and I think their policy is not to do such extra work for ABI compatibility. If they change their mind they will probably start with the symver version of the previous release so as to remain compatible with what various Linux distributions are doing. Also, a side effect is that it is no longer possible to cheat by symlinking different OpenSSL versions. The approach has been used by Debian for some time. 
Links:
http://chris.dzombak.name/blog/2010/03/building-openssl-with-symbol-versioning/
http://chris.dzombak.name/files/openssl/openssl-0.9.8l-symbolVersioning.diff
http://rt.openssl.org/Ticket/Display.html?id=1222&user=guest&pass=guest

-- 
Jilles Tjoelker
Re: Policy on static linking ?
On Fri, Jan 14, 2011 at 02:07:37PM +0000, Pete French wrote:
> I build code using static linking for deployment across a set of machines. For me this has a lot of advantages - I know that the code will run, no matter what the state of the ports is on the machine, and if there is a need to upgrade a library then I do it once on the build machine, rebuild the executable, and rsync it out to the leaf nodes. Only one place to track security updates, only one place where I need to have all the ports the code depends on installed.
>
> I recently wanted to use libdespatch, but I found that the port didn't install the static libraries. I filed a PR, and found out from the response that this was deliberate, and that a number of other ports were deliberately excluding static libraries too. Some good reasons were given, which I won't reproduce here, as you can read them at:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=151306
>
> Today I finally hit the problem where a critical library I am using has stopped working with static libraries (or so it appears at first glance).
>
> I was wondering what the general policy here was - should I just bite the bullet and go dynamic, and accept the maintenance headache that causes, or could we define something like 'WITH_STATIC_LIBRARIES' that could be set which would make ports install a set of static libraries (maybe into a separate /usr/local/lib/static?) so that the likes of me could continue to build static code?
>
> I'd very much like to be able to continue to ship single executables that just run, but if there's some policy to only have dynamic libraries in ports going forward then fair enough...

Various features do not work with static linking because dlopen() does not work from static executables. Libraries that are also used by dlopen()ed modules should generally be linked dynamically, particularly if these libraries have global state. Things that use dlopen() include NSS (getpwnam() and the like), PAM and most plugin systems.
If libc is statically linked, NSS falls back to a traditional mode that only supports the traditional things (e.g. no LDAP user information); I think PAM and most plugin systems do not work at all. For some system libraries, there can be kernel compatibility problems that prevent old libraries from working, although an ABI-compatible shared library is available. This has happened with 6.x's libkse: binaries statically linked to it do not run on 8.x or newer, while libkse can be remapped to libthr for binaries dynamically linked to libkse. For these reasons, static linking to libc, libpthread and similar system libraries should be reserved for /rescue/rescue and similar programs, and not used in general. Another feature only available with dynamic linking is hidden symbols that are available only inside the shared object. Compiling a library that uses this feature as a static library will make the hidden symbols visible to the application or other libraries. This may cause name clashes that otherwise wouldn't have been a problem or invite API abuse. Proper use of hidden symbols can also speed up linking and load times considerably, particularly if the code is written in C++. Other issues are static linking's requirement to list all libraries a library depends on and in the correct order. With dynamic linking, listing the indirect dependencies is unnecessary and best avoided. This is generally not very hard to fix but still needs extra effort. (For example, pkg-config has Libs.private to help with it.) If you want to link dynamically but avoid too much management overhead, consider using PCBSD's PBI system which allows you to ship all necessary .so files (except system ones) with your application. -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: utmp.h exists or not in RELENG_8?
On Sat, Oct 02, 2010 at 12:22:22PM -0500, Jeremy Messenger wrote: My system is RELENG_8 and I have checkout by via csup today. It shows that utmp.h still exists in RELENG_8. But when I see this PR: http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/149945 I have decided to check in the http://www.freebsd.org/cgi/cvsweb.cgi/src/include/?only_with_tag=RELENG_8 ... It shows that utmp.h has been removed. But in the http://sources.freebsd.org/RELENG_8/src/include/ shows a different story as it exists. I am confusing... Is it supposed to be deleted in CVS when it did the SVN-CVS? Or what? I don't have svn installed in my system at the moment, so can't check it now. utmp.h has been removed in HEAD (9.x) but is still present in 8.x and earlier branches. It looks like cvsweb is buggy in this area. The build error in ports/149945 may be caused by a stray utmpx related file found by the configure process. Partly because the various unix variant developers have made a mess of utmp/utmpx, the code to use it is rather fragile. -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)
On Thu, Sep 30, 2010 at 09:03:33AM +0200, Ed Schouten wrote:
> * Jeremy Chadwick free...@jdc.parodius.com wrote:
> > 1) mysqld_safe >/dev/null 2>&1 never released the tty
> > 2) nohup mysqld_safe >/dev/null 2>&1 did release the tty
>
> What happens if you run the following command?
>
>     daemon -cf mysqld_safe
>
> The point is that FreeBSD's pts(4) driver only deallocates TTYs when it's really sure nothing uses them anymore. Even if there is not a single file descriptor referring to the slave device, it has to wait until there exist no processes which have the TTY as their controlling TTY.

In fact, POSIX allows dissociating the controlling terminal from the session when all file descriptors for it (in any session) have been closed. See SUSv4 XBD 11.1.3 The Controlling Terminal. Once the terminal has been dissociated, it is no longer in use at all and can, in case of a pty, be cleaned up.

Implementing this may be an interesting idea. Of course, this will cause opening /dev/tty to fail in some cases where it previously succeeded, but it seems uncommon.

Somewhat unrelated, I think that starting daemons with daemon(8), </dev/null >/dev/null 2>&1 or similar is inferior to implementing daemonizing in the program itself. Think of the poor soul who needs to install and start N daemons full of bugs and configuration errors: it is better if such errors show up on the console instead of being hidden away in a log file.

-- 
Jilles Tjoelker
Re: daily run output 800.scrub-zfs fixups
On Sun, Aug 22, 2010 at 03:08:42PM +0200, Alexander Leidinger wrote:
> On Sat, 21 Aug 2010 00:17:08 -0400 jhell jh...@dataix.net wrote:
> > Hi Alexander,
> >
> > Attached is a fix for one problem and one slight oversight for 800.scrub-zfs. The first & second change was probably just an oversight but nonetheless they both give a false impression of actions taken.
> >
> > Change1: ${daily_scrub_zfs_default_threshold=30} is missing the ':' which would ultimately reset the user's supplied value in periodic.conf to 30.
>
> Sorry, but it is not missing the ':'. There is one in front of it. A lot of start scripts in ports use this. You need to use a := instead of a = if you use
>
>     var=${var:=default_val}
>
> but not if you use
>
>     : ${var=default_val}
>
> I have the impression that the ':' in front of the variable is the way it is supposed to be in the start scripts in ports. I adopted this style (one variable name less to type... especially with expressive names this is some amount less to type).

As described in sh(1) and POSIX, ${var=default_val} assigns the default if var was not set, while ${var:=default_val} assigns the default if var was not set or if it was set to the empty string.

The double assignment in the construct var=${var:=default_val} is a workaround for bugs in very old Bourne shells (see the Autoconf documentation for more). Our sh(1) has never had that bug, so simply

    : "${var:=default_val}"

is better. The double-quotes prevent unnecessary pathname generation, which could be slow. However, even without the double-quotes, the correct value is assigned and no other side effects occur.

> And I remember having tested a lot of cases for the timeout value, overriding a pool specific value and overriding the default, and all of them worked. If you have a case where it does not work, it would be nice if you could add a set -x in the beginning of the script and send me the output of a failing run.
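The difference between the two expansion forms discussed above can be checked with a short script. This is a minimal sketch; the variable names a through e are hypothetical stand-ins for settings such as daily_scrub_zfs_default_threshold.

```sh
#!/bin/sh
# Unset variables: both forms assign the default.
unset a b
: "${a=30}"
: "${b:=30}"

# Set but empty: only the := form assigns the default.
c=
d=
: "${c=30}"
: "${d:=30}"

# Already set by the user (as in periodic.conf): neither form overrides it.
e=7
: "${e=30}"

echo "a=$a b=$b c=$c d=$d e=$e"
# prints: a=30 b=30 c= d=30 e=7
```

So the form without the colon never resets a value the user supplied; it only differs from := when the variable is set to the empty string.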
-- 
Jilles Tjoelker
Re: SIGEPIPE after update to 8.1-RC2
On Sun, Jul 18, 2010 at 12:20:33PM +1000, Sean wrote: I'm getting the same thing; what shell are you using? I changed my shell on one machine from /bin/tcsh to /usr/local/bin/bash and problem disappeared. That this workaround helps confirms that masked/ignored SIGPIPE is the problem. From a few shells I have tried, bash and zsh reset SIGPIPE to caught or default even if it was ignored (only in interactive mode, however), while tcsh, sh, mksh and ksh93 leave it ignored. The underlying problem is the program that is passing the ignored/masked signal to child processes. Please check if the problem occurs with various ways to log in (text console, ssh, xterm, etc). Things like PAM modules may also cause problems here. For example, sshd sets SIGPIPE to ignored, but resets it back to default before starting a child process, so assuming I read the code correctly it does not cause problems. -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: SIGEPIPE after update to 8.1-RC2
On Sat, Jul 17, 2010 at 06:24:55PM +0300, Alex Kozlov wrote:
> After updating my buildbox from 26 April 8-STABLE to 8.1-RC2 I am constantly getting SIGEPIPE
>
> portsnap:
> Fetching 4 metadata patches... done.
> Applying metadata patches... done.
> Fetching 0 metadata files... done.
> Fetching 27 patches.1020... done.
> Applying patches... done.
> Fetching 3 new ports or files... done.
> sort: write failed: standard output: Broken pipe
> sort: write error
> Removing old files and directories... done.
>
> sudo make -C /usr/ports/converters/ascii2binary:
> ===> Patching for ascii2binary-2.13_2
> ===> Applying FreeBSD patches for ascii2binary-2.13_2
> ===> ascii2binary-2.13_2 depends on shared library: intlgrep: writing output: Broken pipe
> grep: writing output: Broken pipe
> [snip repetition]
> - found
> ===> Configuring for ascii2binary-2.13_2
>
> Does anyone know something about this issue?

This looks more like the absence of SIGPIPE than an inappropriate SIGPIPE. I can reproduce both of those error messages by running the commands with SIGPIPE ignored. grep(1) seems to behave strangely on write errors, not aborting; for example

    yes | { trap '' PIPE; grep -v foo; echo $? >&2; } | :

prints an endless stream of error messages.

Note that sh(1) silently ignores attempts to change the disposition of signals that were ignored on entry to the shell, so a trap - PIPE is unlikely to help you.

Similarly, SIGPIPE may be blocked (masked). Few programs expect this.

The -i and -j options in procstat should be helpful in finding what exactly is wrong with SIGPIPE. (These options are relatively new, but should be in 8.1.)

-- 
Jilles Tjoelker
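The remark above that sh(1) silently ignores attempts to reset a signal that was ignored on entry can be demonstrated with a small sketch. The function name try_reset is hypothetical, and the behaviour shown assumes a POSIX-conforming non-interactive sh.

```sh
#!/bin/sh
# Child shell that tries to restore the default SIGPIPE action and then
# sends itself SIGPIPE. It only reaches the echo if SIGPIPE was ignored
# on entry, because sh then silently ignores "trap - PIPE".
try_reset() {
    sh -c 'trap - PIPE; kill -s PIPE "$$"; echo survived'
}

# Run the child with SIGPIPE ignored, as a misbehaving parent would.
ignored=$(trap '' PIPE; try_reset)
echo "with SIGPIPE ignored on entry: ${ignored:-killed}"
```

If the child prints "survived", the ignored disposition was inherited and could not be reset, which is the situation in which sort and grep report "Broken pipe" write errors instead of being terminated quietly.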
Re: calcru: runtime went backwards messages
On Fri, Jun 11, 2010 at 12:35:05AM +0800, Jansen Gotis wrote: Hi, for the past couple of months since moving to RELENG_8 I've been receiving calcru: runtime went backwards messages on the console. My machine is a dual Pentium III 1.26GHz with an Intel SAI2 board. Disabling EIST is not an option in my BIOS, and I've tried disabling the ACPI timer as well as setting kern.timecounter.hardware=i8254. I've also tried disabling cpufreq in my kernel configuration. For what it's worth, I'm running base ntpd. I've also tried openntpd, but no dice. I did a binary search of the commit with which this started, and apparently it's svn r204546, a summary of which can be seen here: http://freshbsd.org/2010/03/02/01/56/55 The calcru messages appear whether vesa is loaded as a module or compiled into the kernel. If anyone needs more information, I'll be happy to provide it. = snippet of /var/log/messages relating to calcru messages = Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 3502 usec to 3297 usec for pid 1106 (mksh) Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 36785 usec to 35858 usec for pid 1114 (csh) Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 13438 usec to 12652 usec for pid 1113 (su) Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 14956 usec to 14081 usec for pid (mksh) Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 3323 usec to 3128 usec for pid (mksh) Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 610 usec to 574 usec for pid 549 (devd) Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 517 usec to 486 usec for pid 548 (dhclient) Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 1912 usec to 1800 usec for pid 532 (dhclient) Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 39738 usec to 37412 usec for pid 532 (dhclient) Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 3369010 usec 
to 3334846 usec for pid 1 (init) This may well be a manifestation of a brokenness (which should not be unknown) in how FreeBSD stores CPU time utilization. The time is maintained in CPU ticks (CPU clock cycles), so if the clock frequency changes, the values of existing processes will be wrong (a jump when converted to seconds). When calcru detects this, it generates messages like the above. If this analysis is right, the messages can be ignored, but indicate that CPU time statistics may be inaccurate. I suppose fairly arbitrary changes can cause the messages to appear or disappear. -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: locale-related build problems (was: kernel build failed)
On Sat, Jan 02, 2010 at 11:01:06PM +0200, E. O. wrote: pls help me , kernel always returns an error make buildkernel and error ./pci_if.h:214: error: expected '=', ',', ';', 'asm' or '__attribute__' before '_RELEASE_MS' ./pci_if.h:214: error: stray '\335' in program ./pci_if.h:226: error: stray '\335' in program ./pci_if.h:226: error: expected '=', ',', ';', 'asm' or '__attribute__' before '_MS' ./pci_if.h:226: error: stray '\335' in program ./pci_if.h:238: error: stray '\335' in program ./pci_if.h:238: error: expected '=', ',', ';', 'asm' or '__attribute__' before '_MS' ./pci_if.h:238: error: stray '\335' in program [...] The build system appears not to cope with locales. My guess is that you're using Turkish locale, which has dotless and dotted uppercase and lowercase 'i', where the uppercase version of the dotted 'i' is an uppercase dotted 'I'. awk(1) knows about this and generates a file that conforms to Turkish conventions but obviously will not work. As a workaround, you can run your builds with LC_ALL=C in the environment, disabling locale support. (e.g. env LC_ALL=C make buildkernel). This should be fixed by adding LC_ALL=C somewhere in the build process, possibly only for these awk commands. -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
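A minimal sketch of the workaround, assuming the generated header gets its macro names from awk's toupper(); the helper name to_upper is hypothetical and this is not the actual build-system code.

```sh
#!/bin/sh
# Upper-case an identifier the way a header generator might.
# With LC_ALL=C the mapping is plain ASCII, so 'i' always becomes 'I',
# whatever locale the user has configured.
to_upper() {
    LC_ALL=C awk -v s="$1" 'BEGIN { print toupper(s) }'
}

to_upper pci_if
# prints: PCI_IF
```

Setting LC_ALL only on the awk invocation, as here, is the narrower of the two fixes suggested above; exporting it for the whole build is the broader one.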
Re: Can close-ing a pipe trigger a SIGPIPE?
On Sat, Oct 17, 2009 at 01:41:22PM -0400, Mikhail T. wrote: Kostik Belousov wrote: Take ktrace of both parent and child. Great idea! Here is the kdump's listing for both (after ktrace -i): http://aldan.algebra.com/~mi/tmp/tclx-kdump.txt (it is large, so be sure to use a compressing browser). Once loaded, look for substring: Error SIGPIPE signal received while closing file5. The parent process-ID is 92722. The child -- 92723. Thanks! Yours, The interesting part of the ktrace: 92723 tclsh8.5 CALL exit(0) 92722 tclsh8.5 CALL sigaction(SIGPIPE,0x7fffa9e0,0) 92722 tclsh8.5 RET sigaction 0 92722 tclsh8.5 CALL write(0x4,0x800e24028,0) 92722 tclsh8.5 RET write -1 errno 32 Broken pipe 92722 tclsh8.5 PSIG SIGPIPE caught handler=0x800f126d0 mask=0x0 code=0x0 92722 tclsh8.5 CALL sigreturn(0x7fffa0c0) 92722 tclsh8.5 RET sigreturn JUSTRETURN 92722 tclsh8.5 CALL close(0x5) 92722 tclsh8.5 RET close 0 92722 tclsh8.5 CALL close(0x4) 92722 tclsh8.5 RET close 0 It seems unwise to assume that a write(2) of 0 bytes is a noop. Even if it is, doing it is a waste of a system call. -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Value of $? lost in the beginning of a function.
On Sun, Jul 19, 2009 at 10:26:38PM +0200, Romain Tartière wrote:
> Hi!
>
> Simple test case:
>
> 8<--
> #!/bin/sh
> foo()
> {
>     echo \$?=$? \$1=$1
> }
> false
> foo $?
> 8<--
>
> % sh foo.sh
> $?=0 $1=1
> % zsh foo.sh
> $?=1 $1=1
> % bash foo.sh
> $?=1 $1=1
>
> As you can see, the value of $? is « lost » when FreeBSD sh enters a function. Is this supposed to behave this way?

This has been fixed in 8.x:

Revision 185231 - Directory Listing
Modified Sun Nov 23 20:23:57 2008 UTC (7 months, 3 weeks ago) by stefanf

Fix $? at the first command of a function. The previous exit status was saved twice and thus lost.

-- 
Jilles Tjoelker
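On shells that still have the bug, the test case above already shows a way around it: $1 carried the right value even where $? did not. A minimal sketch of that pattern, with a hypothetical function name report:

```sh
#!/bin/sh
# Take the exit status as an explicit argument instead of reading $?
# inside the function body, so the result does not depend on whether
# the shell preserves $? across function entry.
report() {
    echo "status=$1"
}

false
report "$?"
# prints: status=1
```

The caller expands $? before the function is entered, so the function only ever sees an ordinary positional parameter.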
Re: Why old files in /etc ?
On Sun, Jun 14, 2009 at 07:33:23AM -0500, Michael Gass wrote: Just installed 7.2-release in an old Pavilion 4455 (PII, 256M) and it runs great. Csuped src and ports and rebuilt world and generic kernel for 7.2-stable and that went well. My question is why are the files in /etc in 7.2-stable older versions (generally) than in 7.2-release? I do not just mean older by date, but older versions of the files - like many of the config files for sendmail or the net. Why is stable using older versions of these files than release? Mostly I did not let mergemaster install the files from the build because they were so much older than the original release versions. Again, why are the files for stable so much older? This is because of a weakness in the svn-to-cvs exporter. Formerly, only CVS was used and tagging a release did not require a commit. So right after a release, both the release and -stable would have the same revision numbers. With Subversion, tagging a release requires a commit. The CVS exporter keeps this commit, so all files will have a changed CVS Id. This looks newer, until/unless the file is changed on -stable again. To cope with this, it's best to use mergemaster's -F or -U options. This question has been asked before. -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Unnamed POSIX shared semaphores
On Mon, Jun 01, 2009 at 06:33:42PM +0300, Vlad Galu wrote:
> According to sem_init(3), we can't have shared unnamed semaphores. However, the following code snippet seems to work just fine:
>
> -- cut here --
>     sem_t semaphore;
>     if (sem_init(&semaphore, 1, 10) < 0)
>         std::cout << "Couldn't init semaphore: " << strerror(errno) << std::endl;
>     if (sem_wait(&semaphore) < 0)
>         std::cout << "Couldn't decrement semaphore: " << strerror(errno) << std::endl;
>     int val;
>     sem_getvalue(&semaphore, &val);
>     std::cout << "Value is " << val << std::endl;
> -- and here --
>
> Is this a case where sem_init() silently reports success, or should the man page get an update?

Reading the code, it seems like this should work, but only between related processes where the parent process creates the semaphore before forking and no exec is done. This is because a sem_t is a pointer to a structure containing the kernel level semid_t (and a mutex, condvar and the like for process-private semaphores). sem_init() will allocate this structure using malloc(3).

Changing sem_t to be a structure would be the obvious way to fix this, but I think this cannot be versioned properly. For example, if someone puts in the public header file of their .so:

    struct my_struct {
        int foo;
        sem_t bar;
        int quux;
    };

Changing the size of sem_t will break this. Also, assuming symbol versioning were to be used, if you compile the .so for FreeBSD 7 and the app for FreeBSD 8, the FreeBSD 8 sem_* functions will get FreeBSD 7 style sem_t's. If process-shared semaphores really work, then the above structure is not a pathological case.

Effectively, sem_t is carved in stone. So process-private semaphores should continue to have most of their stuff in a separately allocated structure, to preserve flexibility.

Perhaps a better method is to set bit 0 of the sem_t to 1 and use the other bits to store the semid_t.
-- 
Jilles Tjoelker
Re: 7.2-RC2 Install Feedback
On Thu, Apr 30, 2009 at 02:50:22PM -0400, Mehmet Erol Sanliturk wrote: (1) I am using always Gnome in FreeBSD ( because KDE in 7.0 was opening forms of executables just like X , and for multiple docked forms this was very annoying) . During installations , I am selecting both Gnome and KDE because some packages are inserted only into KDE menus . If the KDE is not selected , those packages are not available in Gnome menus , and they should be inserted into Gnome menus one by one ( for me , this is a difficult task , because it requires to find proper executable to insert it into menu ) . IMHO, that would be a bug in those packages. Desktop files should be installed in /usr/local/share/applications (or similar) and contain a Categories field to determine where to place them in the menu. The OnlyShowIn and NotShowIn fields are available for desktop files specific to certain desktop environments. Many ports install desktop files to /usr/local/share/applnk (KDE) or /usr/local/share/gnome/apps (GNOME); if these files contain Categories it should be safe to move them to the new location. Many upstreams are already following this new standard, and some ports override the upstream build system's code to install to the new location with an installation to the old location. A user-level workaround is to make symlinks to the necessary desktop files in $HOME/.local/share/applications/; while it is annoying to have to do this, it is much easier than adding menu entries manually. XFCE also needs the desktop files in the new location. See http://standards.freedesktop.org/menu-spec/latest/ for more information. -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
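The user-level symlink workaround mentioned above could be scripted roughly as follows. This is a sketch under assumptions: the function name link_desktop_files is hypothetical, it only looks at the top level of the source directory (legacy trees may keep desktop files in subdirectories), and it expects an absolute source path so that the symlinks resolve.

```sh
#!/bin/sh
# Symlink every *.desktop file found directly in a legacy directory
# into a per-user applications directory.
#   $1: source directory (absolute path)
#   $2: destination directory (created if missing)
link_desktop_files() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/*.desktop; do
        [ -e "$f" ] || continue      # the glob matched nothing
        ln -sf "$f" "$dst/"
    done
}

# Assumed usage, with the legacy KDE path named in the mail above:
#   link_desktop_files /usr/local/share/applnk "$HOME/.local/share/applications"
```

As the mail notes, this is only safe for desktop files that already contain a Categories field; files without one would still not be placed in a menu.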
Re: expand_number(3) silently truncates numeric part of the argument to 32 bit on i386, light impact on gjournal
On Sun, Jun 29, 2008 at 04:16:25PM -0400, Alexandre Sunny Kovalenko wrote:
> I honestly don't know whether it should or should not do it, and if it should not, what errno should be set to. Program below gives following output on RELENG_7 as of June 28th:
>
> sunny:RabbitsDen>./expand_number 5368709120k
> Result is 1099511627776
> sunny:RabbitsDen>./expand_number 5120G
> Result is 5497558138880
> sunny:RabbitsDen>
>
> One of the more interesting manifestations in the userland is that
>
>     gjournal label -s 5368709120 -f /dev/da0s1a
>
> quietly gives you 1G of the journal in the resulting file system.
>
> [snip program calling expand_number(3)]

This happens because src/lib/libutil/expand_number.c does not include the necessary header <inttypes.h> for calling strtoimax(3). The file is compiled without compiler warnings, so the bug shows up as wrong behaviour.

Adding #include <inttypes.h> fixes it. The file is slightly changed in CURRENT but the same patch should apply.

-- 
Jilles Tjoelker

--- src/lib/libutil/expand_number.c.orig 2007-09-05 16:27:13.0 +0200
+++ src/lib/libutil/expand_number.c 2008-07-06 13:11:02.766238000 +0200
@@ -33,6 +33,7 @@
 #include <errno.h>
 #include <libutil.h>
 #include <stdint.h>
+#include <inttypes.h>
 
 /*
  * Convert an expression of the following forms to a int64_t.
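The two results in the report are consistent with the numeric part being cut down to its low 32 bits before the suffix is applied. A small worked check, assuming the shell evaluates $(( )) with at least 64-bit integers (the variable names are hypothetical):

```sh
#!/bin/sh
# The numeric part of "5368709120k".
n=5368709120

# What expand_number() should return: n * 1024.
expected=$(( n * 1024 ))

# What results if the numeric part is first reduced to its low 32 bits,
# as happens when strtoimax() is called without a prototype on i386.
truncated=$(( (n & 4294967295) * 1024 ))

echo "expected=$expected truncated=$truncated"
# prints: expected=5497558138880 truncated=1099511627776
```

The truncated value matches the reported output for 5368709120k, and the expected value matches the output for 5120G, whose numeric part fits in 32 bits and is therefore unaffected.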
Re: panic(): vinvalbuf: dirty bufs: perhaps a ffs_syncvnode bug?
On Thu, Nov 16, 2006 at 09:24:07AM +0100, Rink Springer wrote: Over the night, we reset the shelf in order to activate its new management IP address, causing the /dev/da[12] devices to be temporarily unavailable. This resulted in the following panic on the rather busy mailstorage server (the other server has minor load and was fine): --- (da0:isp0:0:1:0): lost device (da0:isp0:0:1:0): removing device entry (da1:isp0:0:2:0): lost device g_vfs_done():da1s1[WRITE(offset=292316823552, length=16384)]error = 6 g_vfs_done():da1s1[WRITE(offset=240287318016, length=16384)]error = 6 g_vfs_done():da1s1[READ(offset=12175362048, length=2048)]error = 6 g_vfs_done():da1s1[WRITE(offset=240287318016, length=16384)]error = 6 g_vfs_done():da1s1[READ(offset=18370689024, length=2048)]error = 6 g_vfs_done():da1s1[READ(offset=25829486592, length=512)]error = 6 vnode_pager_getpages: I/O read error vm_fault: pager read error, pid 78035 (lmtpd) g_vfs_done():da1s1[WRITE(offset=240287318016, length=1638(da1:isp0:0:2:0): Invalidating pack 4)]error = 6 g_vfs_done():da1s1[READ(offset=13768671232, length=6144)]error = 6 g_vfs_done():da1s1[READ(offset=102126977024, length=16384)]error = 6 g_vfs_done():da1s1[READ(offset=13768671232, length=6144)]error = 6 g_vfs_dpone():da1s1[READ(offset=102319669248, length=16384)]error = 6a nic: vinvalbuf: dirty bufs cpuid = 2 Uptime: 54d15h48m38s When looking at the source code of vinvalbuf(), which calls bufobj_invalbuf(), it seems that this panic is raised after a bufobj still contains dirty data after waiting for it to complete without error. The code can be found at /sys/kern/vfs_subr.c Note that this panic can only occur if vinvalbuf() is called with V_SAVE (save modified data first). The exact condition for the panic is better described as: a bufobj still contains dirty data or still has output in progress after a successful synchronous BO_SYNC operation. bufobj_wwait() cannot return an error unless msleep() fails (e.g. 
interruptible sleep requested via slpflag and signal occured). If the I/O has failed, bufobj_wwait() will return success. The sync routine called eventually translates to bufsync(), as in /sys/kern/vfs_bio.c, which calls the filesystem's sync routine. It seems as if the return status of vfs_bio_awrite() in ffs_syncvnode() is not checked; all the other parts are checked. I believe this could provoke this panic. There does not seem much point in checking an asynchronous write result anyway, as the I/O is not completed yet. I don't understand well what the code is doing with async writes. For all but the last pass (see further), it will call bawrite() on the buffer, which sets B_ASYNC then calls bwrite(). For the last pass, it calls bwrite() directly (has something cleared B_ASYNC?), and returns an error if it fails. bwrite() itself is an inline function defined in /sys/sys/buf.h, which calls BO_WRITE after some KASSERTs. As the machine is in production use, it was instantly rebooted by a collegue and thus I have no vmcore, backtrace or anything. I therefore hope the information provided here is adequate. Can someone with more FreeBSD-VFS knowledge please look at this? There is another possible problem, from this comment in /sys/ufs/ffs/ffs_vnops.c ffs_syncvnode(): /* * Block devices associated with filesystems may * have new I/O requests posted for them even if * the vnode is locked, so no amount of trying will * get them clean. Thus we give block devices a * good effort, then just give up. For all other file * types, go around and try again until it is clean. */ Actually it just does NIADDR + 1 (four) passes and then gives up. If DIAGNOSTIC is enabled, it will then print the affected vnode, if it is not a disk. This failure is not reflected in ffs_syncvnode()'s return value, so if it occurs when ffs_syncvnode() is called from bufobj_invalbuf(), a panic will result. Suppose ffs_syncvnode() would be changed to return some error in this case. 
bufobj_invalbuf()/vinvalbuf() will handle a BO_SYNC/ffs_syncvnode() error by aborting with an error return. It seems that in most cases this will cause the operation invoking the vinvalbuf() to fail. However, in at least one case (vm_object_terminate()), the error will be ignored; this may cause old garbage/dangling references? -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: rc-ng problem with [procname] (e.g. kernel threaded procs)
On Tue, Aug 16, 2005 at 07:24:41PM +0200, Andy Hilker wrote:
> Hmh, no one interested in this issue? Or am I wrong with this issue?

Apparently no one is interested. Be sure to file a PR if you haven't done so yet.

> You (Andy Hilker) wrote:
> > I think I have found a problem with rc-ng scripts and procnames including brackets (e.g. kernel threaded, like mysqld). Brackets [] are ignored, the process will not be found and is regarded as not running. This breaks the stop+status functions of rcng.
> >
> > The following patch allows brackets in the variable procname in rc-ng scripts. Maybe someone can review and fix this issue. It was relevant for me when using [mysqld].

This happens if the argv is larger than kern.ps_arg_cache_limit and either /proc is not mounted or the user running ps is not allowed to read the command's memory.

Your patch needs \[, \] by the way, not ( ).

Alternatively, you could use ps -o pid,ucomm for the $_interpreter = "." case and only look for $_procnamebn.

This whole ps stuff has the potential of killing the wrong process; how about using pidfiles?

> > # $FreeBSD: src/etc/rc.subr,v 1.31.2.1 2005/01/17 11:51:00 keramida Exp $
> >
> > --- rc.subr Thu Aug 11 15:18:52 2005
> > +++ /etc/rc.subr Thu Aug 11 15:14:06 2005
> > @@ -267,7 +267,7 @@
> >  _procnamebn=${_procname##*/}
> >  _fp_args='_arg0 _argv'
> >  _fp_match='case $_arg0 in
> > - $_procname|$_procnamebn|${_procnamebn}:|(${_procnamebn}))'
> > + $_procname|$_procnamebn|${_procnamebn}:|(${_procnamebn}))'
> >  fi
> >  _proccheck='

-- 
Jilles Tjoelker
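To illustrate why the brackets need escaping in a case pattern, here is a stand-alone sketch of the matching step. It is not the rc.subr code (which builds the pattern inside an eval'ed string, where backslash escapes are the natural choice); the function name match_proc is hypothetical, and quoting is used here to make the brackets and parentheses literal.

```sh
#!/bin/sh
# Match a ps(1) argv[0] against the forms a process name can take:
# name, "name:", "(name)" and, for commands whose arguments could not
# be read, "[name]". Unquoted, [..] would be a bracket expression
# matching a single character rather than the literal text.
match_proc() {
    _procnamebn=$1
    _arg0=$2
    case $_arg0 in
    "$_procnamebn"|"${_procnamebn}:"|"(${_procnamebn})"|"[${_procnamebn}]")
        echo match ;;
    *)
        echo nomatch ;;
    esac
}

match_proc mysqld '[mysqld]'      # prints: match
match_proc mysqld '(mysqld)'      # prints: match
match_proc mysqld 'mysqld_safe'   # prints: nomatch
```

Name matching of this kind can still pick up an unrelated process with the same name, which is the reason the mail above suggests pidfiles instead.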
Re: [PATCH] Re: /etc/rc.d/sshd : kldload random missing?
On Sun, Apr 24, 2005 at 07:02:42PM -0700, Rob wrote: My conclusion is: sshd does simply not call random at all, although I have added it in the # REQUIRE: line. Is this a general bug in 5-Stable? Or am I testing this in the wrong way? RCNG doesn't work that way. rc.d scripts do not call one another. The # REQUIRE: lines affect rcorder(8) so it should be ok on bootup. -- Jilles Tjoelker ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: USB mouse troubles
On Tue, Apr 05, 2005 at 09:17:55PM +0200, Michael Nottebrock wrote:
> FreeBSD 5.x has had funky issues with usb mice for as long as I've
> been using a usb mouse with it, but since it almost works ok with the
> default configuration, I never got around to complaining about it. ;-)

AFAIK FreeBSD 4.x has the same issues and some more, e.g. systems with
more than one USB bus require a manual MAKEDEV for hotplugging to work
correctly (as only /dev/usb0 is created by default).

> However: In various situations and configurations, USB mice are not
> picked up.

I have not tested this at this point, but it agrees with my
understanding of how the code works.

> - With a GENERIC kernel and all of the usb support in the kernel, usb
> mice are usually recognized on boot, but they will cease to work after
> going single user and back to multiuser again. The /dev/ums0 device
> doesn't even get removed, but the mouse is dead. Unplugging and
> replugging usually gets it going again.

On the first startup, usbd will get the initial events (the mouse being
attached at boot time) and start moused. When going to single user,
moused is killed. If usbd is started later, it will not get the initial
events again (also the set of attached devices may have changed since
boot), so it will not start moused again.

One fix could be to change usbd to throw away the initial events and
instead act as if attach events were received for all present USB
devices. This would be nasty if usbd is restarted without a
reboot/single user, which could be fixed by making the new behaviour
optional. This is also nasty if usbd.conf contains an action for a
device instead of starting a process, e.g. automatically copying files
from a umass device. This might be fixed by distinguishing the two in
usbd.conf.

> - With all of usb compiled as modules and usbd enabled in rc.conf, ums
> usually doesn't even get loaded, but usbdevs will show the mouse
> plugged in. Even subsequent unplugging and replugging will not get ums
> loaded. Manually loading ums doesn't get the mouse working either, an
> unplug/replug is necessary first.

usbd loads only the usb kld, which brings in the drivers for the USB
controllers, ugen and uhub and not much more. When the mouse is plugged
in, ugen will grab it (there being nothing better) and will not release
it, even when you subsequently load ums. This happens before usbd gets
to know about the device.

A somewhat crude workaround would be to load the drivers for the devices
to be used before starting usbd. This would mean /boot/loader.conf, most
likely.

-- 
Jilles Tjoelker
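The loader.conf workaround might look like the fragment below. This is a sketch under the assumption that the mouse driver is built as the ums module; only the ums line relates to the mouse case discussed here, and the usb line is an assumption for systems where USB support itself is modular. loader.conf uses sh-style variable assignments, so the fragment is shown in that syntax.

```shell
# /boot/loader.conf fragment (assumed module names usb and ums)
# Load the USB stack and the mouse driver at boot, before usbd starts,
# so that ums can attach to the mouse before ugen claims it.
usb_load="YES"
ums_load="YES"
```

Drivers for other USB devices that should not fall back to ugen would be listed the same way, one module per line.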
Re: Just a sanity check before I submit a bug report
On Fri, Mar 11, 2005 at 02:13:54PM -0600, Jon Noack wrote:
> Pete French wrote:
> > Why does sysconf(_SC_CLK_TCK) always return 128?

> Check out sysconf() in src/lib/libc/gen/sysconf.c (lines 83-84 of rev.
> 1.10):

[follow through of code showing it is defined as a constant snipped]

> sysconf(3) states that _SC_CLK_TCK is the frequency of the statistics
> clock in ticks per second. Considering this value varies, returning a
> constant is wrong. Feel free to attach my email on the PR.

An important use for sysconf(_SC_CLK_TCK) is to specify the rate of the
results of times(3). (I don't know how many applications call that
stupid function, getrusage() having been available for so long ;-)
Currently, src/lib/libc/gen/times.c compiles this in just like sysconf.c
does. So that's all ok; times.c will have to be modified too if
sysconf(_SC_CLK_TCK) changes.

getrusage(2) says that ru_ixrss is based on statistics clock ticks with
a frequency of sysconf(_SC_CLK_TCK). This cannot be right. In other
systems, getrusage often only really supports the timeval fields and
perhaps the fault and swap counts; if it is supported, the ru_i?rss
ticks are often not described at all or they are something strange like
one per second. Consequently, this facility is nonportable and the tick
frequency should be described using sysctl().

-- 
Jilles Tjoelker
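To see which value a given system reports, the same configuration variable can be queried from the shell with getconf(1). The sketch below assumes a POSIX getconf that knows CLK_TCK; the tick count of 640 is a made-up number used only to show how a times(3) result would be scaled to seconds.

```shell
#!/bin/sh
# Query the tick rate that times(3) results are expressed in.
clk_tck=$(getconf CLK_TCK)
echo "CLK_TCK: $clk_tck ticks per second"

# Scale a hypothetical tick count (as times(3) might return) to seconds.
ticks=640
echo "$ticks ticks is about $((ticks / clk_tck)) s (integer division)"
```

A program that hardcodes the divisor instead of querying it would break if the value ever stopped being a compiled-in constant, which is the concern raised for times.c above.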
5.3-RELEASE-p5 panic bundirty: buffer 0xd63d85e0 still on queue 1
ip_moptions             1     1K     1K         1  128
IpFw/IpAcct             1     1K     1K         1  64
in_multi                4     1K     1K         4  32
igmp                    1     1K     1K         1  16
routetbl              228    41K    42K     23065  16,32,64,128,256
kqueue                  0     0K    15K    184464  128,1024
kenv                  104     6K     6K       105  16,32,64,2048
sigio                   2     1K     1K        67  32
lo                      1     1K     1K         1  1024
clone                   5    20K    20K         5  4096
ether_multi            77     4K     4K        77  16,32,64
ifaddr                 50    12K    12K        53  16,32,64,256,512,2048
BPF                     4     1K     1K         4  64
mount                  43    22K    22K        49  16,32,128,512,1024
vnodes                 42     8K     8K       359  16,32,64,128,256
Export Host             2     1K     1K         4  256
cluster_save buffer     0     0K     1K      3956  32,64
vfscache                1   512K   512K         1
BIO buffer             72   114K  5697K    130296  1024,2048
file desc             247    75K   107K   1518255  16,32,64,256,512,1024,2048,4096
pcb                   118     6K     8K    346499  16,32,64,2048
soname                103    11K    45K   5116733  16,32,64,128
tag                     0     0K     7K   2161374  32,64
mbextcnt                0     0K     2K     11818  16
ptys                   31     4K     4K        31  128
ttys                 3983   516K   595K     36848  128,512
shm                     2    13K    15K        70  256
sem                     4     7K     7K         4  512,1024,4096
msg                     4    25K    25K         4  512,4096
iov                     0     0K     1K   2041064  16,64,128,256,512
ioctlops                0     0K     4K        45  512,1024,2048,4096
cdev                   92    23K    23K        92  256
acpica                  0     0K     1K        15  16,32,64
turnstiles            651    41K    41K       671  64
taskqueue               6     1K     1K         6  64
ISOFS mount             1   256K   256K         1
sleep queues          651    21K    21K       671  32
sbuf                    0     0K    37K      2129  16,32,64,128,256,512,1024,2048,4096
rman                  118     8K     8K       520  16,64
isadev                 42     3K     3K        42  64
GEOM                   63     9K    14K       241  16,32,64,128,256,512,1024
kobj                  101   202K   202K       121  2048
pfs_vncache             2     1K    52K     12858  32
eventhandler           27     2K     2K        27  32,128
devstat                 6    13K    13K         6  16,4096
pfs_fileno              1    20K    20K         1
bus-sc                 38    43K    48K       371  16,64,128,256,512,1024,2048,4096
bus                   541    24K    82K      2056  16,32,64,128,1024
SWAP                    2   345K   345K         2  64
sysctltmp               0     0K     1K     53047  16,32,64,128
sysctloid            1486    45K    45K      1486  16,32,64
sysctl                  0     0K     1K     45958  16,32,64
uidinfo                18     2K     2K      4270  32,1024
plimit                 39    10K    12K   1812147  256
pfs_nodes              20     3K     3K        20  128
cred                  174    22K    30K   4610954  128
subproc               359   849K  1252K   3865370  32,4096
proc                    2     8K     8K         2  4096
session                91    12K    14K     20158  128
pgrp                   99     7K     8K     25399  64
mtx_pool                1     8K     8K         1
module                186    12K    12K       186  64,128
MSDOSFS mount           1   128K   128K         1
ip6ndp                  9     1K     1K        13  64,128
ip6opt                  0     0K     2K    136543  128
temp                 3080   258K   288K   2236282  16,32,64,128,256,512,1024,2048,4096
devbuf               2292  4517K  4677K  28998409  16,32,64,128,256,512,1024,2048,4096

# ### netstat -m crashes after printing a (high) number of mbufs in use
# netstat -m -N kernel.debug.28 -M vmcore.28
4045 mbufs in use
Segmentation fault
# vmstat -z -N kernel.debug.28 -M vmcore.28
vmstat: not implemented
# ^D
Script done on Mon Feb 21 17:21:35 2005

-- 
Jilles Tjoelker
Re: USB scanner not attached when connected after system startup]
On Wed, Jan 05, 2005 at 10:44:05PM +0100, Harald Weis wrote:
> On Mon, Jan 03, 2005 at 09:32:20PM +0100, Roland Smith wrote:
> > That's not good. AFAICT, the code that reports a new scanner to
> > dmesg is also present in 4.x, so if you don't see a message, it
> > looks like the scanner is not found for one reason or another.

Try all USB ports on your computer; use MAKEDEV to create /dev/usb1,
/dev/usb2, etc. (they are not created by default).

-- 
Jilles Tjoelker
unkillable processes after debugging on 5.3R
I have two unkillable processes.

System:
FreeBSD turtle.stack.nl 5.3-RELEASE-p2 FreeBSD 5.3-RELEASE-p2 #5: Thu Dec 2 17:25:55 CET 2004 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SNAIL i386
The system is SMP with two CPUs.

Quoting the user:

I was debugging some C++ code (with a strange segmentation fault) with
the system gdb. I was logged in via ssh and the connection seemed to
freeze (no response to keyboard input), so I disconnected (~. sequence
in ssh). Logging in again on the machine, I killed the debugger and
shell (I don't remember in which order) and tried to kill the program
skilllist (pid 20326). The skilllist program then appeared to be using
100% CPU time and did not respond to any of the signals I sent.

About 24 hours later, I discovered that a zsh process (pid 20328) was
also using a lot of CPU time.

The program was initially not run in the background; the nicing and
placing into the idle queue was done later.

My code is C++ code, using both fd 1 and 2 for output. It is not
threaded. It does not use fork, exec etc. It's basically a simple
program, generating only output, not listening for input. The working
directory is mounted over nfs (but my code does not open files).

After the nicing and placing into the idle queue the system is properly
responsive.
Output of some commands about the processes:

 UID   PID  PPID CPU PRI NI  VSZ  RSS MWCHAN STAT TT         TIME COMMAND
1711 20326     1 475 171 20 2280 1416 -      RN   ph-  2421:35.76 ./skilllist
1711 20328     1 106  -8 20 2696 2328 -      RNE  ph-   729:34.15 -zsh (zsh)

RTPRIO
idle:25
idle:76

db> trace 20328
sched_switch(c29cd320,0,1) at sched_switch+0x143
mi_switch(1,0,c29cd320,1,c29cd320) at mi_switch+0x1ba
sleepq_switch(c2302a80) at sleepq_switch+0x133
sleepq_wait(c2302a80,0,0,0,0) at sleepq_wait+0xb
msleep(c2302a80,c2302bd8,4c,c06d14b3,0) at msleep+0x322
pipeclose(c2302a80,c2302b14,c3eba484,e9e9eb94,c050736c) at pipeclose+0x88
pipe_close(c3eba484,c29cd320) at pipe_close+0x2a
fdrop_locked(c3eba484,c29cd320,c25b0c8c,e9e9ec04,c050616f) at fdrop_locked+0xa8
fdrop(c3eba484,c29cd320,0,2,c388a000) at fdrop+0x41
closef(c3eba484,c29cd320) at closef+0x23f
fdfree(c29cd320,c3c22d70) at fdfree+0x383
exit1(c29cd320,2,1,c29cd320,c388a000) at exit1+0x4d4
sigexit(c29cd320,2,0,c3c22c5c,c29cd320) at sigexit+0xd3
postsig(2) at postsig+0x13f
ast(e9e9ed48) at ast+0x4ba
doreti_ast() at doreti_ast+0x17
db> trace 20326
sched_switch(,c22e3000,400,8067000,df42c340) at sched_switch+0x143
db> c

[EMAIL PROTECTED] /home/jilles% fstat -vp20326
USER     CMD          PID   FD MOUNT                  INUM MODE         SZ|DV R/W
peters   skilllist  20326 root /                         2 drwxr-xr-x    1024  r
peters   skilllist  20326   wd /toad.mnt/capitalism 892355 drwxr-xr-x     512  r
peters   skilllist  20326 text /toad.mnt/capitalism 892490 -rwxr-xr-x  235868  r
peters   skilllist  20326    0 -                         - bad              -
peters   skilllist  20326    1* pipe c2302b2c <-> c2302a80      0 rw
peters   skilllist  20326    2* pipe c2302b2c <-> c2302a80      0 rw
[EMAIL PROTECTED] /home/jilles% fstat -vp20328
USER     CMD          PID   FD MOUNT                  INUM MODE         SZ|DV R/W
peters   zsh        20328 root /                         2 drwxr-xr-x    1024  r
peters   zsh        20328   wd /toad.mnt/capitalism 892355 drwxr-xr-x     512  r
peters   zsh        20328 text /usr                 921879 -r-xr-xr-x    3156  r
can't read sock at 0x0
peters   zsh        20328   10* error
peters   zsh        20328   12 -                         - bad              -
can't read pipe at 0x0
peters   zsh        20328   13* error
[EMAIL PROTECTED] /home/jilles%

The address c2302a80, the second pipe address shown for fds 1 and 2 of
pid 20326, occurs in the 20328 backtrace as well (as the first argument
to sleepq_switch, sleepq_wait, msleep and pipeclose).

The fstat output of 20328 is unreliable: a later query returned this:

[EMAIL PROTECTED] /home/jilles% fstat -vp20328
USER     CMD          PID   FD MOUNT                  INUM MODE         SZ|DV R/W
peters   zsh        20328 root /                         2 drwxr-xr-x    1024  r
peters   zsh        20328   wd /toad.mnt/capitalism 892355 drwxr-xr-x     512  r
peters   zsh        20328 text /usr                 921879 -r-xr-xr-x    3156  r
unknown file type 5 for file 10 of pid 20328
unknown file type 5 for file 12 of pid 20328
can't read pipe at 0x0
peters   zsh        20328   13* error
[EMAIL PROTECTED] /home/jilles%

-- 
Jilles Tjoelker