Re: 10.3 and reboot -r (reroot)

2016-05-02 Thread Jilles Tjoelker
On Tue, Apr 19, 2016 at 12:46:54PM +0200, Edward Tomasz Napierała wrote:
> On 0419T0906, Melissa Jenkins wrote:
> > I've been trying to get reboot -r to work but get an error that
> > kern.proc.pathname is undefined.  It then drops to single user mode.

> > Interestingly I've checked the value of kern.proc.pathname and it
> > appears to be undefined on all the OS boxes we have from 9.3 up to
> > current.  In fact the kern.proc tree doesn't appear to contain
> > anything though it does exist at least on some of the boxes.

> The kern.proc.pathname is a weird sysctl.  It's per-process, and it's
> impossible to access it via name, only by numeric ID.  So, this is
> normal.

> The fact that reroot doesn't work because of this is not normal,
> though. I have no idea why this would fail; I'll investigate.

I can make it fail this way easily by installing a new init(8) binary.
This makes the kern.proc.pathname sysctl fail because /sbin/init has
been moved away or deleted. The command  procstat -b 1  uses the same
vnode-to-pathname translation code and fails similarly.

If only a single install has been done, a command  ls -l /sbin/init*
will make the kernel realize that /sbin/init.bak is in fact the pathname
of process 1's executable, and both  procstat -b 1  and  reboot -r
start working. However, the reroot will use the old init binary to
perform reboot(RB_REROOT) and to find init in the new root file system,
which may be undesirable.

It may be better to use the original argv[0]. The kernel passes a full
pathname here.

While reading the code, I noticed another issue. The kill(-1, SIGKILL)
may fail with [ESRCH] if there is no process to kill. In this case, the
reroot should continue. This problem sometimes occurs for me when
rerooting from single user mode.

-- 
Jilles Tjoelker
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: cp from NFS to ZFS hung in "fifoor"

2015-11-28 Thread Jilles Tjoelker
On Sat, Nov 28, 2015 at 10:42:28AM -0500, Mikhail T. wrote:
> I was copying /home from an old server (narawntapu) to a new one
> (aldan). The narawntapu:/home is mounted on aldan as /mnt with flags
> ro,intr. On narawntapu /home was simply located on an SSD, but on aldan
> I created a ZFS filesystem for it.

> The copying was started thus:

> root@aldan:/home (435) cp -Rpn /mnt/* .

> for a while this was proceeding at a decent clip with cp making
> newnfsreq-uests:

> load: 0.78  cmd: cp 38711 [newnfsreq] 802.84r 1.57u 140.63s 20% 10768k
> 
> /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
> ->
> 
> ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
> 100%
> load: 1.23  cmd: cp 38711 [newnfsreq] 874.19r 1.66u 154.74s 17% 4576k
> 
> /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
> ->
> 
> ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
> 100%

> ZFS on the destination compressing and writing stuff out and the traffic
> between the two ranging from 30 to 50Mb/s (according to systat), but
> then something happened and the cp-process is now hung:

> load: 0.55  cmd: cp 38711 [fifoor] 1107.67r 2.09u 194.12s 0% 3300k
> load: 0.50  cmd: cp 38711 [fifoor] 1112.66r 2.09u 194.12s 0% 3300k
> load: 0.22  cmd: cp 38711 [fifoor] 1642.37r 2.09u 194.12s 0% 3300k

This normally means that the process is opening a fifo for reading and
is waiting for a writer. Although cp -R will normally copy a fifo by
calling mkfifo at the destination, it may open one if a regular file is
replaced with a fifo between the time it reads the directory and it
copies that file. This is not that unlikely if large directory trees are
copied during that time.

On the other hand, cp without -R/-r/-l/-s will always open a fifo.

You can make cp continue by opening the fifo (which you'll need to find
first, for example by checking what has been copied already) for
writing, like : >/path/to/some/fifo. It will be replaced with an empty
file at the destination.
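
The unblocking step can be reproduced in isolation. What follows is only a minimal sketch and is not taken from the thread: the temporary directory and file names are made up for illustration, and a background cat stands in for the hung cp.

```shell
# Sketch: a reader blocks in open(2) on a fifo until a writer shows up.
tmpdir=$(mktemp -d)
fifo="$tmpdir/demo.fifo"        # hypothetical path, for illustration only
mkfifo "$fifo"

# The reader blocks while opening the fifo, much like the hung cp.
cat "$fifo" > "$tmpdir/copy" &
reader=$!

# Opening the fifo for writing and closing it right away gives the reader
# an immediate EOF, so it finishes and leaves an empty file behind.
: > "$fifo"
wait "$reader"
status=$?

size=$(wc -c < "$tmpdir/copy" | tr -d ' ')
echo "reader exit status: $status, bytes copied: $size"
rm -rf "$tmpdir"
```

The writer side also blocks until a reader is present, so the order in which the two sides open the fifo does not matter.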

-- 
Jilles Tjoelker


Re: Latest stable (r287104) bash leaves zombies on exit

2015-08-29 Thread Jilles Tjoelker
On Fri, Aug 28, 2015 at 07:18:47PM +0300, Konstantin Belousov wrote:
> On Fri, Aug 28, 2015 at 05:52:42PM +0200, Michiel Boland wrote:
> > set -e
> > for a in `seq 1000`
> > do
> > echo -n $a
> > xterm -e ssh nonexisting
> > done
> > echo

> > (The idea here is that 'ssh nonexisting' should do some work and then exit,
> > xterm -e false, etc. don't appear to trigger the bug.)

> > Prior to the patch, one of the xterms would hang after the counter
> > reaches a random (reasonably small) number.

> > After the patch the script runs till completion.

> Thank you for testing.  Funny detail is that your loop does not hangs for
> me, I see flapping xterms until the completion.  How many cpus does your
> machine have ?

> Below is a slightly improved version of the change, to avoid unnecessary
> relocations.  Would be good to rebuild the world and confirm that you
> see no regression (the patch also affects rtld in some way).

Looks good to me, except that I think a vforked child (in system() and
posix_spawn*()) should use the system calls and not libthr's wrappers.
This reduces the probability of weird things happening between vfork and
exec, and also avoids an unexpected error when
posix_spawnattr_setsigdefault()'s mask contains SIGTHR.

-- 
Jilles Tjoelker


Re: Latest stable (r287104) bash leaves zombies on exit

2015-08-29 Thread Jilles Tjoelker
On Sat, Aug 29, 2015 at 04:41:30PM +0300, Konstantin Belousov wrote:
> On Sat, Aug 29, 2015 at 03:01:38PM +0200, Jilles Tjoelker wrote:
> > Looks good to me, except that I think a vforked child (in system() and
> > posix_spawn*()) should use the system calls and not libthr's wrappers.
> > This reduces the probability of weird things happening between vfork and
> > exec, and also avoids an unexpected error when
> > posix_spawnattr_setsigdefault()'s mask contains SIGTHR.

> Thank you for the review, I agree with the note about vfork. Updated
> patch is below. Also, I removed the PIC_PROLOGUE from the i386 setjmp,
> it has no use after the plt calls are removed.

> [snip]
> diff --git a/lib/libc/gen/posix_spawn.c b/lib/libc/gen/posix_spawn.c
> index e3124b2..673c760 100644
> --- a/lib/libc/gen/posix_spawn.c
> +++ b/lib/libc/gen/posix_spawn.c
> @@ -118,15 +118,18 @@ process_spawnattr(const posix_spawnattr_t sa)
>  			return (errno);
>  	}
>  
> -	/* Set signal masks/defaults */
> +	/*
> +	 * Set signal masks/defaults.
> +	 * Use unwrapped syscall, libthr is in undefined state after vfork().
> +	 */
>  	if (sa->sa_flags & POSIX_SPAWN_SETSIGMASK) {
> -		_sigprocmask(SIG_SETMASK, &sa->sa_sigmask, NULL);
> +		__libc_sigprocmask(SIG_SETMASK, &sa->sa_sigmask, NULL);
>  	}
>  
>  	if (sa->sa_flags & POSIX_SPAWN_SETSIGDEF) {
>  		for (i = 1; i <= _SIG_MAXSIG; i++) {
>  			if (sigismember(&sa->sa_sigdefault, i))
> -				if (_sigaction(i, &sigact, NULL) != 0)
> +				if (__libc_sigaction(i, &sigact, NULL) != 0)
>  					return (errno);
>  		}
>  	}

Hmm, the comments say direct syscalls are being used, but in fact
libthr's interposer is called. The change to system() does correctly use
__sys_sigprocmask().

-- 
Jilles Tjoelker


Re: script(1), cfmakeraw() and Ctrl-Z

2013-07-14 Thread Jilles Tjoelker
On Mon, Jul 15, 2013 at 12:14:19AM +0700, Eugene Grosbein wrote:
> I've noted that commands like script -qa /tmp/log sleep 100
> cannot be suspended with Ctrl-Z keys. The reason is call to cfmakeraw()
> in script.c - if I comment it out, Ctrl-Z starts to work as expected.

> portupgrade uses script(1) so build/install process cannot be suspended too.
> (I'm building libreoffice-4.04 now)

> The function cfmakeraw() is used since CVS revision 1.1 when script
> was imported with other BSD 4.4 Lite Usr.bin Sources.

> Is cfmakeraw() really needed?

The cfmakeraw() call ensures that the processes running within script
get all control characters. For example, you can suspend a job in the
inner shell using Ctrl+Z. This indeed makes it impossible to suspend
script itself.

-- 
Jilles Tjoelker


Re: FreeBSD 9: fdisk -It crashes kernel

2013-04-25 Thread Jilles Tjoelker
On Thu, Apr 25, 2013 at 11:58:42AM -0500, Guy Helmer wrote:
 On Apr 25, 2013, at 10:58 AM, Jeremy Chadwick <j...@koitsu.org> wrote:
  On Thu, Apr 25, 2013 at 09:06:49AM -0500, Guy Helmer wrote:
  Encountered a surprise when my disk resizing rc.d script caused
  FreeBSD 9.1-STABLE to crash. I used fdisk -It ada0 to determine
  what the available size of the disk (which happened to be the root
  disk), and on FreeBSD 9.1 the kernel comes crashing down:

The shell output can be explained.

  + fdisk -It ada0
  + /rescue/sed -En 's,.*start ([0-9]+).*size ([0-9]+).*,\1 + \2,p'
  vnode_pager_getpages: I/O read error
  vm_fault: pager read error, pid 65 (fdisk)
  pid 65 (fdisk), uid 0: exited on signal 11
  eval: arithmetic expression: expecting primary: 

The subshell for the growfs_vm script exits here because of the error.
The eval is the eval in /etc/rc.subr _run_rc_doit.

  Entropy harvesting: point_to_pointeval: date: Device not configured
  eval: df: Device not configured
  eval: dmesg: Device not configured
  cat: /bin/ls: Device not configured
  kickstart.

After growfs_vm has run (unsuccessfully), rc continues with initrandom.

  eval: cannot open /etc/fstab: Device not configured
  eval: cannot open /etc/fstab: Device not configured
  eval: swapon: Device not configured
  Warning! No /etc/fstab: skipping disk checks
  fstab: /etc/fstab:0: Device not configured

  Fatal trap 12: page fault while in kernel mode
  cpuid = 1; apic id = 01
  fault virtual address   = 0x0
  fault code = supervisor read, page not present
  instruction pointer  = 0x20:0xc0825fc4
  stack pointer   = 0x28:0xc5a088c8
  frame pointer  = 0x28:0xc5a08914
  code segment= base 0x0, limit 0xf, type 0x1b
  = DPL 0, pres 1, def32 1, gran 1
  processor eflags   = interrupt enabled, resume, IOPL = 0
  current process = 91 (mount)
  [ thread pid 91 tid 100056 ]
  Stopped at  g_access+0x24: movl 0(%ebx),%eax
  db> where
  Tracing pid 91 tid 100056 td 0xc84c42f0
  g_access(c8481d34,0,1,1,0,…) at g_access+0x24/frame 0xc5a08914
  ffs_mount(c8481d34,c0d78380,2,c5a08c00,c829ae6c,…) at 
  ffs_mount+0xf74/frame 0xc5a08a34
  vfs_donmount(c84c42f0,1,0,c84cf200,c84cf200,…) at 
  vfs_donmount+0x1423/frame 0xc5a08c24
  sys_nmount(c84c42f0,c5a08ccc,c5a08cc4,1010006,c5a08d08,…) at 
  sys_nmount+0x7f/frame 0xc5a08c48
  syscall(c5a08d08) at syscall+0x443/frame 0xc508cfc
  Xint0x80_syscall() at Xint0x80_syscall+0x21/frame 0xc5a08cfc
  --- syscall (378, FreeBSD ELF32, sys_nmount), eip = 0x480d5feb, esp = 
  0xbfbfce1c, ebp = 0xbfbfd378 ---

Apparently a subsequent mount command kills it.

  I'll fix my script to not do this, but it seems odd that fdisk -It
  can make the disk go away.

Yes, that seems wrong.

-- 
Jilles Tjoelker

Re: ipv6_addrs_IF aliases in rc.conf(5)

2012-12-20 Thread Jilles Tjoelker
On Thu, Dec 20, 2012 at 01:04:34PM +0200, Kimmo Paasiala wrote:
> A question related to this for those who have been doing work on the
> rc(8) scripts. Can I assume that /usr/bin is available when
> network.subr functions are used? Doing calculations on hexadecimal
> numbers is going to be very awkward if I can't use for example bc(1).

You cannot assume that /usr/bin is available when setting up the
network. It may be that /usr is mounted via NFS.

You can use hexadecimal numbers (prefixed with 0x) in $((...))
expressions. In FreeBSD 9.0 or newer, sh has a printf builtin you can
use; in older versions you can use hexdigit and hexprint from
network.subr.
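
As a minimal sketch of that suggestion, the snippet below adds one to a hexadecimal value using only sh arithmetic and printf. The variable names and the sample value are hypothetical and not taken from network.subr.

```shell
# Hypothetical input: the last 16-bit group of an IPv6 address, as hex text.
group=00ff

# Prefixing with 0x makes $((...)) read the digits as hexadecimal.
next=$(( 0x$group + 1 ))

# printf (a builtin in FreeBSD 9.0 or newer sh) converts back to hexadecimal.
hex=$(printf '%x' "$next")
echo "$group + 1 = $hex"
```

No program from /usr/bin is needed where printf is a shell builtin.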

-- 
Jilles Tjoelker


Re: /bin/sh arithmetic doesn't seem to like leading 0 now

2012-09-23 Thread Jilles Tjoelker
On Fri, Sep 21, 2012 at 10:26:37PM +, David O'Brien wrote:
> On Fri, Sep 21, 2012 at 07:34:06PM +0200, Jilles Tjoelker wrote:
> > On Fri, Sep 21, 2012 at 10:09:02AM -0700, David Wolfskill wrote:
> > > $ echo $(( ( $( date +%m ) - 1 ) / 3 + 1 ))
> > > arithmetic expression: expecting ')':  ( 09 - 1 ) / 3 + 1
> ...
> > This was done to avoid an inconsistency where constants starting with
> > 0 and containing 8 or 9 were decimal, so something like
> > $((018-017)) expanded to 3.

> Jilles,
> Would it be possible to improve on the error message?
> If David had been given the Bash error message, I suspect he would have
> figured out the issue right away.

It would certainly be possible to add a new error message, but from the
embedded point of view the extra code size may not be worth it (also
considering that error messages can be enhanced in many other places as
well).

-- 
Jilles Tjoelker


Re: /bin/sh arithmetic doesn't seem to like leading 0 now

2012-09-21 Thread Jilles Tjoelker
On Fri, Sep 21, 2012 at 10:09:02AM -0700, David Wolfskill wrote:
> I have a construct in a shell script that I had been using under
> stable/8 (most recently, @r240259), but which throws an error under
> stable/9 (at least as early as @r238602):

> $ echo $(( ( $( date +%m ) - 1 ) / 3 + 1 ))
> 3
> $ uname -r
> 8.3-PRERELEASE

> $ echo $(( ( $( date +%m ) - 1 ) / 3 + 1 ))
> arithmetic expression: expecting ')':  ( 09 - 1 ) / 3 + 1
> $ uname -r
> 9.1-PRERELEASE

> Trying bits & pieces of the above, I narrowed the issue down to:

> $ echo $(( 09 + 0 ))
> arithmetic expression: expecting EOF:  09 + 0

> while

> $ echo $(( 9 + 0 ))
> 9
> $

> is not a problem.

> Is this intentional?

Yes, it was changed with r216547, December 2010.

This was done to avoid an inconsistency where constants starting with
0 and containing 8 or 9 were decimal, so something like
$((018-017)) expanded to 3.

There are indeed various cases where this inconsistency does not matter
(because the numbers with leading zeroes do not exceed 10).

> (I can work around it -- e.g., by using sed to strip leading 0 from the
> month number (since strftime() doesn't appear to have a format that
> provides the value in a form that lacks the leading zero for values <
> 10).  But I'd rather not do that if I don't need to.)

You can use  date +%-m  although it is not in POSIX.

With POSIX only, it is still possible to do it reasonably efficiently,
for example $(( 1$(date +%m) - 100 )) or v=$(date +%m); v=${v#0}.
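
Both POSIX-only forms can be checked with a fixed month. This is a small sketch: the variable names are illustrative, and month=09 stands in for $(date +%m) so the result is reproducible.

```shell
# Stand-in for $(date +%m); 09 is the value that triggers the error.
month=09

# Prepend 1 and subtract 100: "109 - 100" contains no leading zero.
quarter_a=$(( ( 1$month - 100 - 1 ) / 3 + 1 ))

# Or strip a single leading zero with parameter expansion first.
v=$month; v=${v#0}
quarter_b=$(( ( $v - 1 ) / 3 + 1 ))

echo "$quarter_a $quarter_b"
```

Both expressions give the same quarter number for every month from 01 to 12.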

-- 
Jilles Tjoelker


Re: Too many open files

2012-03-27 Thread Jilles Tjoelker
On Mon, Mar 26, 2012 at 10:15:59AM +0100, Matthew Seaman wrote:
> Does 'procstat -fa' give better results for you?

> It seems to be one of those little hidden secrets that FreeBSD comes
> with a bunch of native applications that provide pretty much equivalent
> functionality to lsof(1).  See: fstat(1), procstat(1), sockstat(1).

> Which is odd, given that since these sort of applications have to read
> and interpret kernel memory -- an action for which there isn't a nice
> well defined ABI -- the application has to be kept rigorously in synch
> with the kernel it is used against.  Something that is intrinsically
> easier to do when kernel and application are compiled at the same time
> and from the same source tree.

procstat (in all versions that have it) and fstat (in FreeBSD 9.0 and
newer) use a well-defined sysctl-based API to access the information.
This API was extended in FreeBSD 9.0 and a library libprocstat provides
a convenient interface.

Reading from kernel memory not only couples the application tightly to
the kernel implementation, but can also be considered a security
issue because there is a lot of sensitive information in kernel memory;
it cannot be permitted in a jail.

-- 
Jilles Tjoelker
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: A problem with MAXPATHLEN on a back

2012-03-03 Thread Jilles Tjoelker
On Sun, Feb 26, 2012 at 02:40:09PM +0100, Willem Jan Withagen wrote:
> I'm running into this on a backup-backupserver.
> (8.2-STABLE #134: Wed Feb  1 15:05:59 CET 2012 amd64)

> Haven't checked which paths are too long.
> But is there any easy way out? Like making MAXPATHLEN 2048 and
> rebuilding locate.
> Or is that going to propagate and major impact all and everything.

> Rebuilding locate database:
> locate: integer out of +-MAXPATHLEN (1024): 1031
> locate: integer out of +-MAXPATHLEN (1024): 1031

It should be possible to replace (sed -i) MAXPATHLEN with something else
in the locate source and recompile it. Changing the value of MAXPATHLEN
itself is not safe because it defines the size of various buffers in the
ABI (such as the one passed to realpath() if its resolved_path parameter
is not NULL); in any case, it is a very intrusive change.

Locate uses find(1) to generate its list of files, and find's output is
not subject to MAXPATHLEN (unless the -L option or the -follow primary
is used). Almost any use of the very long pathnames will require a
manual split-up though (cd'ing to an initial part shorter than
MAXPATHLEN, then repeating the process with relative pathnames until the
remaining part is shorter than MAXPATHLEN).

-- 
Jilles Tjoelker


Re: GENERIC make buildkernel error / fails - posix_fadvise

2012-01-22 Thread Jilles Tjoelker
On Sun, Jan 22, 2012 at 01:00:46PM -0600, clift...@volcano.org wrote:
 On 12.01.2012 15:52, Doug Barton wrote:
  chflags -R noschg /usr/obj/usr
  rm -rf /usr/obj/usr

  It's much faster to do:

  /bin/rm -rf ${obj}/* 2> /dev/null || /bin/chflags -R 0 ${obj}/* &&
  /bin/rm -rf ${obj}/*

 If I could just add one thing here, for those who might be tempted
 to immediately cut and paste that elegant command line:

 Consider, how does that command evaluate if the shell variable obj
 is not set, and you're running that literal string as root?

 A: You will very systematically wipe your entire server, starting
 at the root, and doing a second pass to get any protected files you
 missed.

 I'd recommend something safer like approximately this (untested):

 if [ "X${obj}" != X -a -d "${obj}" ]; then cd "${obj}" && (rest of cmds);
 fi

 Sorry for the wasted bandwidth, for those to whom it was obvious,
 but anybody who has ever had to clean up after a junior admin's
 attempt to do something a little too clever will appreciate why I'm
 posting this.

An easier way is to replace the first ${obj} with ${obj:?}, causing an
error if obj is unset or null.
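
The effect of ${obj:?} can be seen without deleting anything. In the sketch below, ":" stands in for the rm and chflags commands, and the subshell only keeps the demonstration itself running, since a failed expansion makes a non-interactive shell exit. The message text and variable names are illustrative.

```shell
unset obj

# The expansion fails before the command runs, so nothing would be removed.
if ( : "${obj:?obj is unset or empty}" ) 2>/dev/null; then
    guarded=no
else
    guarded=yes
fi

# With a value set, the expansion succeeds and the command would proceed.
obj=/usr/obj/usr    # example value from the thread; nothing is deleted here
if ( : "${obj:?obj is unset or empty}" ) 2>/dev/null; then
    passes_when_set=yes
else
    passes_when_set=no
fi

echo "$guarded $passes_when_set"
```

In the real command it would be enough to write the first occurrence as "${obj:?}"/* because the shell stops at the first failing expansion.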

One limitation is that it does not work with (t)csh.

 On the efficiency front, for the core file deletion operators, I've
 had good results with this trick (requires Perl and makes use of
 its implicit-operand idioms):

find ${obj} | perl -nle unlink

 If rm had an option to take files from standard input, or if
 there's another program I'm not aware of which does this, it
 could serve as the right-hand side of this.

This does not handle all possible characters in filenames, such as a
newline. The perlrun manpage suggests something with find's -print0
primary. Alternatively, use find's -unlink primary.
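
A minimal sketch of the -print0 approach follows. It pairs find with xargs -0 and rm rather than perl, which is a substitution for illustration, and it works in a scratch directory with made-up file names.

```shell
tmpdir=$(mktemp -d)
nl='
'
# One ordinary name and one containing a newline.
touch "$tmpdir/plain" "$tmpdir/with${nl}newline"

# NUL-separated names survive any character a filename may contain.
find "$tmpdir" -type f -print0 | xargs -0 rm -f

left=$(find "$tmpdir" -type f | wc -l | tr -d ' ')
echo "files left: $left"
rm -rf "$tmpdir"
```

With newline-separated output, the second file would have been split into two names that do not exist.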

-- 
Jilles Tjoelker


Re: SCHED_ULE should not be the default

2011-12-13 Thread Jilles Tjoelker
On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote:
> If the algorithm ULE does not contain problems - it means the problem
> has Core2Duo, or in a piece of code that uses the ULE scheduler.
> I already wrote in a mailing list that specifically in my case (Core2Duo)
> partially helps the following patch:
> --- sched_ule.c.orig  2011-11-24 18:11:48.0 +0200
> +++ sched_ule.c   2011-12-10 22:47:08.0 +0200
> @@ -794,7 +794,8 @@
>  	 * 1.5 * balance_interval.
>  	 */
>  	balance_ticks = max(balance_interval / 2, 1);
> -	balance_ticks += random() % balance_interval;
> +//	balance_ticks += random() % balance_interval;
> +	balance_ticks += ((int)random()) % balance_interval;
>  	if (smp_started == 0 || rebalance == 0)
>  		return;
>  	tdq = TDQ_SELF();

This avoids a 64-bit division on 64-bit platforms but seems to have no
effect otherwise. Because this function is not called very often, the
change seems unlikely to help.

> @@ -2118,13 +2119,21 @@
>  	struct td_sched *ts;
>  
>  	THREAD_LOCK_ASSERT(td, MA_OWNED);
> +	if (td->td_pri_class & PRI_FIFO_BIT)
> +		return;
> +	ts = td->td_sched;
> +	/*
> +	 * We used up one time slice.
> +	 */
> +	if (--ts->ts_slice > 0)
> +		return;

This skips most of the periodic functionality (long term load balancer,
saving switch count (?), insert index (?), interactivity score update
for long running thread) if the thread is not going to be rescheduled
right now.

It looks wrong but it is a data point if it helps your workload.

>  	tdq = TDQ_SELF();
>  #ifdef SMP
>  	/*
>  	 * We run the long term load balancer infrequently on the first cpu.
>  	 */
> -	if (balance_tdq == tdq) {
> -		if (balance_ticks && --balance_ticks == 0)
> +	if (balance_ticks && --balance_ticks == 0) {
> +		if (balance_tdq == tdq)
>  			sched_balance();
>  	}
>  #endif

The main effect of this appears to be to disable the long term load
balancer completely after some time. At some point, a CPU other than the
first CPU (which uses balance_tdq) will set balance_ticks = 0, and
sched_balance() will never be called again.

It also introduces a hypothetical race condition because the access to
balance_ticks is no longer restricted to one CPU under a spinlock.

If the long term load balancer may be causing trouble, try setting
kern.sched.balance_interval to a higher value with unpatched code.

> @@ -2144,9 +2153,6 @@
>  		if (TAILQ_EMPTY(&tdq->tdq_timeshare.rq_queues[tdq->tdq_ridx]))
>  			tdq->tdq_ridx = tdq->tdq_idx;
>  	}
> -	ts = td->td_sched;
> -	if (td->td_pri_class & PRI_FIFO_BIT)
> -		return;
>  	if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE) {
>  		/*
>  		 * We used a tick; charge it to the thread so
> @@ -2157,11 +2163,6 @@
>  		sched_priority(td);
>  	}
>  	/*
> -	 * We used up one time slice.
> -	 */
> -	if (--ts->ts_slice > 0)
> -		return;
> -	/*
>  	 * We're out of time, force a requeue at userret().
>  	 */
>  	ts->ts_slice = sched_slice;

> and refusal to use options FULL_PREEMPTION
> But no one has unsubscribed to my letter, my patch helps or not in the
> case of Core2Duo...
> There is a suspicion that the problems stem from the sections of code
> associated with the SMP...
> Maybe I'm in something wrong, but I want to help in solving this
> problem ...

-- 
Jilles Tjoelker


Re: /usr/bin/script eating 100% cpu with portupgrade and xargs

2011-10-14 Thread Jilles Tjoelker
On Wed, Oct 12, 2011 at 11:25:35PM +0100, Adrian Wontroba wrote:
 On Sat, Oct 08, 2011 at 01:27:07AM +0100, Adrian Wontroba wrote:
  I won't be in a position to create a simpler test case, raise a PR or
  try patches till Tuesday evening (UK) at the earliest.

 So far I have been unable to reproduce the problem with portupgrade (and
 will probably move to portmaster).

 I have however found a different but possibly related problem with the
 new version of script in RELENG_8, for which I have raised this PR:

 misc/161526: script outputs corrupt if input is not from a terminal

 Blast, should of course been bin/

The extra ^D\b\b are the EOF character being echoed. These EOF
characters are being generated by the new script(1) to pass through the
EOF condition on stdin.

One fix would be to change the termios settings temporarily to disable
the echoing but this may cause problems if the application is changing
termios settings concurrently and generally feels bad.

It may be best to remove writing EOF characters, perhaps adding an
option to enable it again if there is a concrete use case for it.

-- 
Jilles Tjoelker


Re: sigwait return 4

2011-08-27 Thread Jilles Tjoelker
On Thu, Aug 25, 2011 at 12:29:29AM +0300, Kostik Belousov wrote:
> On Wed, Aug 24, 2011 at 10:56:09PM +0200, Jilles Tjoelker wrote:
> > sigwait() was fixed not to return EINTR in 9-current in r212405 (fixed
> > up in r219709). The discussion started at
> > http://lists.freebsd.org/pipermail/freebsd-threads/2010-September/004892.html
> >
> > Solaris is simply wrong in the same way we were wrong. Although POSIX
> > may not be as clear on this as one may like, its intention is clear and
> > additionally not returning EINTR reduces subtle portability problems.

> Can you, please, describe why do you consider the behaviour prohibiting
> return of EINTR reasonable ? I do consider that the Solaris behaviour is
> useful.

Applications need to cope with EINTR returns (usually by retrying the
call); if they do not do this, bugs arise in uncommon cases.

In the case of sigwait(), applications do not really need EINTR: they
can include the respective signal into the signal set and do the work
inline that was originally in the signal handler. This might require
additional pthread_sigmask() calls. This also fixes the race condition
almost always associated with EINTR.

Historically, this is because sigwait() came with POSIX threads, which
also explains why it returns an error number rather than setting errno.
The threads group considered EINTR errors not useful enough, given that
they may lead to subtle bugs. This is fully standardized for functions
like pthread_cond_wait() and pthread_mutex_lock().

In the case of sigwait(), it also plays a role that glibc has decided
not to return EINTR, so that returning EINTR may lead to subtle bugs
appearing on FreeBSD in software originally written for GNU/Linux.

The functions sigwaitinfo() and sigtimedwait() came with POSIX realtime
and therefore follow different conventions.

-- 
Jilles Tjoelker


Re: sigwait return 4

2011-08-24 Thread Jilles Tjoelker
On Wed, Aug 24, 2011 at 10:07:03PM +0300, Kostik Belousov wrote:
 On Wed, Aug 24, 2011 at 10:19:07PM +0400, Slawa Olhovchenkov wrote:
  System is 8.2-RELEASE (GENERIC), amd64.
  Application -- i386 for freebsd7.

  In ktrace dump I find some strange result:

   22951 100556 kas-milter CALL  sigwait(0xffdfdf80,0xffdfdf7c)
   22951 100556 kas-milter RET   sigwait 4
   22951 100556 kas-milter PSIG  SIGUSR2 caught handler=0x804c0f0 mask=0x4003 
  code=0x0

  RET   sigwait 4 confused me, and, I think, confused application too.

  man sigwait:

  ERRORS
   The sigwait() system call will fail if:

   [EINVAL]   The set argument specifies one or more invalid 
  signal
  numbers.

   [EFAULT]   Any arguments point outside the allocated address
  space or there is a memory protection fault.

  How sigwait can return '4'?
  May be EINTR, converted from ERESTART? But kern_sigtimedwait from
  sigwait must be called with timeout == NULL...

 What should the system do for a delivered signal not present in the set ?
 I guess this is the case of your ktrace.

 Looking at the SUSv4, I see no mention of the situation, but in Oracle
 SunOS 5.10 man page for sigwait(2), it is said explicitely
 EINTR The wait was interrupted by an unblocked, caught signal.

 So I think that we have a bug in the man page.

 diff --git a/lib/libc/sys/sigwait.2 b/lib/libc/sys/sigwait.2
 index 8c00cf4..b462201 100644
 --- a/lib/libc/sys/sigwait.2
 +++ b/lib/libc/sys/sigwait.2
 @@ -27,7 +27,7 @@
  .\
  .\ $FreeBSD$
  .\
 -.Dd November 11, 2005
 +.Dd August 24, 2011
  .Dt SIGWAIT 2
  .Os
  .Sh NAME
 @@ -94,6 +94,8 @@ The
  .Fn sigwait
  system call will fail if:
  .Bl -tag -width Er
 +.It Bq Er EINTR
 +The system call was interrupted by an unblocked, caught signal.
  .It Bq Er EINVAL
  The
  .Fa set

This patch would be wrong, except to document existing behaviour in
-stable branches.

sigwait() was fixed not to return EINTR in 9-current in r212405 (fixed
up in r219709). The discussion started at
http://lists.freebsd.org/pipermail/freebsd-threads/2010-September/004892.html

Solaris is simply wrong in the same way we were wrong. Although POSIX
may not be as clear on this as one may like, its intention is clear and
additionally not returning EINTR reduces subtle portability problems.

Note that sigwaitinfo() and sigtimedwait() may return EINTR. SA_RESTART
applies to sigwaitinfo() but not to sigtimedwait() (because the timeout
cannot be restarted).

-- 
Jilles Tjoelker


Re: Change in behavior to stat(1)

2011-03-04 Thread Jilles Tjoelker
On Mon, Feb 28, 2011 at 11:15:39AM -0600, Stephen Montgomery-Smith wrote:
> I had a little script that would remove broken links.  I used to do it
> like this:

> if ! stat -L $link > /dev/null; then rm $link; fi

> But recently (some time in February according to the CVS records) stat
> was changed so that stat -L would use lstat(2) if the link is broken.

> So I had to change it to

> if stat -L $link | awk '{print $3}' | grep l > /dev/null;
> then rm $link; fi

> but it is a lot less elegant.

> What is the proper accepted way to remove broken links?

A better answer to your original question was already given, but for
that command, isn't it sufficient to do

  if ! [ -e $link ]; then rm $link; fi

All test(1)'s primaries that test things about files follow symlinks,
except for -h/-L.
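
The difference between -e and -h can be tried in a scratch directory. This is a minimal sketch with made-up file names; the extra -h check is an addition not in the command above, included so that a name that is not a symlink at all is never removed.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/target"
ln -s "$tmpdir/target"  "$tmpdir/good"
ln -s "$tmpdir/missing" "$tmpdir/broken"

for link in "$tmpdir/good" "$tmpdir/broken"; do
    # -h is true for any symlink; -e follows it and fails if the target is gone.
    if [ -h "$link" ] && ! [ -e "$link" ]; then
        rm "$link"
    fi
done

if [ -h "$tmpdir/good" ]; then good_kept=yes; else good_kept=no; fi
if [ -h "$tmpdir/broken" ]; then broken_removed=no; else broken_removed=yes; fi
echo "$good_kept $broken_removed"
rm -rf "$tmpdir"
```

Quoting "$link" keeps the test correct for names that contain spaces.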

-- 
Jilles Tjoelker


Re: Policy on static linking ?

2011-01-15 Thread Jilles Tjoelker
On Sat, Jan 15, 2011 at 09:11:50PM +1100, Jean-Yves Avenard wrote:
> On Friday, 14 January 2011, Pete French <petefre...@ticketswitch.com> wrote:
> > I build code using static linking for deployment across a set of
> > machines. For me this has a lot of advantages - I know that the
> > code will run, no matter what the state of the ports is on the
> > machine, and if there is a need to upgrade a library then I do it
> > once on the build machine, rebuild the executable, and rsync it out
> > to the leaf nodes. Only one place to track security updates, only
> > one place where I need to have all the ports the code depends on
> > installed.

> I actually tried to compile a port against another and have it link
> statically, but I couldn't find a way to do so without hacking the
> configure script. I was wondering if there was another (and easier)
> way to do so...

> I use ldap for authentication purposes, along with pam_ldap and nss_ldap

> If I compile openldap-client against openssl from ports, then it
> creates massive problems elsewhere.

> For example, base ssh server will now crash due to using different
> libcrypto. compiling ports will also become impossible as bsd tar
> itself crash (removing ldap call from nsswitch.conf is required to
> work again)

> I was then advised in the freebsd forums to uninstall openssl port,
> compile openldap against openssl base, install it, then re-install
> openssl port.
> (I have to use openssl from ports with apache/subversion to fix a bug
> with TLSv1 making svn commit crash under some circumstances)

> I dislike this method, because should openldap gets upgraded again and
> be linked against openssl port, I will lock myself out of the machine
> again due to sshd crashing. Just like what happened today :(

> So how can I configure openldap-client to link against libssl and
> libcrypto statically?

I think this can be solved with a symbol versioning trick. By applying a
different version to all symbols from each OpenSSL version, the dynamic
linker will be able to distinguish between different versions of OpenSSL
and will use the correct OpenSSL version each object was linked to, even
if there are multiple OpenSSL versions in the process. Note that each
OpenSSL version should still have its own SONAME (the security/openssl
port does this).

The version script can be as simple as (substituting the version)

OPENSSL_0.9.8 {
global:
*;
};

and needs to be in the top level and in the engines directory.
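
As a rough sketch of "substituting the version", the script could be
generated per OpenSSL version with a small shell function. The function
name gen_version_script and the file name openssl.map are made up for
illustration; --version-script is the GNU ld option for passing such a
file, and how it is wired into the OpenSSL build is not shown here:

```sh
# gen_version_script VERSION
# Print a catch-all version script that tags every exported symbol
# with the version node OPENSSL_<VERSION>.
gen_version_script() {
    printf 'OPENSSL_%s {\n\tglobal:\n\t\t*;\n};\n' "$1"
}

# Hypothetical usage:
#   gen_version_script 0.9.8 > openssl.map
#   cc -shared ... -Wl,--version-script=openssl.map
```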

For it to work completely, both base and the port need to be patched.
Old binaries continue to work but the benefit only appears after
recompilation.

The SONAME needs to be bumped when the version string is changed or
deleted (but not when it is initially added), otherwise binaries will
stop working. This also means that making the change cannot be undone
without breaking binary compatibility.

What will not work is allocating an OpenSSL structure in one object
linked to one OpenSSL version and then using it in another object linked
to another OpenSSL version. That would require true symbol versioning,
keeping compatibility with old versions in the same library with the
same SONAME. Unlike the approach I propose, that would be a lot of work
and can only be done by the OpenSSL project, and I think their policy is
not to do such extra work for ABI compatibility. If they change their
mind they will probably start with the symver version of the previous
release so as to remain compatible with what various Linux distributions
are doing.

Also, a side effect is that it is no longer possible to cheat by
symlinking different OpenSSL versions.

The approach has been used by Debian for some time.

Links:
http://chris.dzombak.name/blog/2010/03/building-openssl-with-symbol-versioning/
http://chris.dzombak.name/files/openssl/openssl-0.9.8l-symbolVersioning.diff
http://rt.openssl.org/Ticket/Display.html?id=1222&user=guest&pass=guest

-- 
Jilles Tjoelker


Re: Policy on static linking ?

2011-01-14 Thread Jilles Tjoelker
On Fri, Jan 14, 2011 at 02:07:37PM +, Pete French wrote:
> I build code using static linking for deployment across a set of
> machines. For me this has a lot of advantages - I know that the
> code will run, no matter what the state of the ports is on the
> machine, and if there is a need to upgrade a library then I do it
> once on the build machine, rebuild the executable, and rsync it out
> to the leaf nodes. Only one place to track security updates, only
> one place where I need to have all the ports the code depends on
> installed.

> I recently wanted to use libdespatch, but I found that the port
> didn't install the static libraries. I filed a PR, and found out
> from the response that this was deliberate, and that a number of
> other ports were deliberately excluding static libraries too. Some
> good reasons were given, which I won't reproduce here,
> as you can read them at: http://www.freebsd.org/cgi/query-pr.cgi?pr=151306

> Today I finally hit the problem where a critical library I am using
> has stopped working with static libraries (or so it appears at first glance).
> I was wondering what the general policy here was - should I just bite the
> bullet and go dynamic, and accept the maintenance headache that causes, or
> could we define something like 'WITH_STATIC_LIBRARIES' that could be set
> which would make ports install a set of static libraries (maybe into
> a separate /usr/local/lib/static?) so that the likes of me could
> continue to build static code ? I'd very much like to be able to continue
> to ship single executables that just run, but if there's some policy
> to only have dynamic libraries in ports going forward then fair enough...

Various features do not work with static linking because dlopen() does
not work from static executables. Libraries that are also used by
dlopen()ed modules should generally be linked dynamically, particularly
if these libraries have global state.

Things that use dlopen() include NSS (getpwnam() and the like), PAM and
most plugin systems. If libc is statically linked, NSS falls back to a
traditional mode that only supports the traditional things (e.g. no LDAP
user information); I think PAM and most plugin systems do not work at
all.

For some system libraries, there can be kernel compatibility problems
that prevent old libraries from working, although an ABI-compatible
shared library is available. This has happened with 6.x's libkse:
binaries statically linked to it do not run on 8.x or newer, while
libkse can be remapped to libthr for binaries dynamically linked to
libkse.

For these reasons, static linking to libc, libpthread and similar system
libraries should be reserved for /rescue/rescue and similar programs,
and not used in general.

Another feature only available with dynamic linking is hidden symbols
that are available only inside the shared object. Compiling a library
that uses this feature as a static library will make the hidden symbols
visible to the application or other libraries. This may cause name
clashes that otherwise wouldn't have been a problem or invite API abuse.
Proper use of hidden symbols can also speed up linking and load times
considerably, particularly if the code is written in C++.

Another issue is static linking's requirement to list all the libraries
a library depends on, and to list them in the correct order. With
dynamic linking, listing the indirect dependencies is unnecessary and
best avoided. This is generally not very hard to fix but still needs
extra effort. (For example, pkg-config has Libs.private to help with it.)

If you want to link dynamically but avoid too much management overhead,
consider using PCBSD's PBI system which allows you to ship all necessary
.so files (except system ones) with your application.

-- 
Jilles Tjoelker


Re: utmp.h exists or not in RELENG_8?

2010-10-02 Thread Jilles Tjoelker
On Sat, Oct 02, 2010 at 12:22:22PM -0500, Jeremy Messenger wrote:
> My system is RELENG_8 and I checked it out via csup today. It shows
> that utmp.h still exists in RELENG_8. But when I see this PR:

> http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/149945

> I have decided to check in the
> http://www.freebsd.org/cgi/cvsweb.cgi/src/include/?only_with_tag=RELENG_8
> ... It shows that utmp.h has been removed. But in the
> http://sources.freebsd.org/RELENG_8/src/include/ shows a different
> story as it exists. I am confused... Is it supposed to be deleted in
> CVS when it did the SVN->CVS? Or what? I don't have svn installed in
> my system at the moment, so can't check it now.

utmp.h has been removed in HEAD (9.x) but is still present in 8.x and
earlier branches. It looks like cvsweb is buggy in this area.

The build error in ports/149945 may be caused by a stray utmpx related
file found by the configure process. Partly because the various unix
variant developers have made a mess of utmp/utmpx, the code to use it is
rather fragile.

-- 
Jilles Tjoelker


Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)

2010-10-01 Thread Jilles Tjoelker
On Thu, Sep 30, 2010 at 09:03:33AM +0200, Ed Schouten wrote:
> * Jeremy Chadwick <free...@jdc.parodius.com> wrote:
> > 1) mysqld_safe > /dev/null 2>&1 &  never released the tty
> > 2) nohup mysqld_safe > /dev/null 2>&1 &  did release the tty

> What happens if you run the following command?

>   daemon -cf mysqld_safe

> The point is that FreeBSD's pts(4) driver only deallocates TTYs when
> it's really sure nothing uses it anymore. Even if there is not a single
> file descriptor referring to the slave device, it has to wait until
> there exist no processes which have the TTY as its controlling TTY.

In fact, POSIX allows dissociating the controlling terminal from the
session when all file descriptors for it (in any session) have been
closed. See SUSv4 XBD 11.1.3 The Controlling Terminal. Once the terminal
has been dissociated, it is no longer in use at all and can, in case of
a pty, be cleaned up. Implementing this may be an interesting idea. Of
course, this will cause opening /dev/tty to fail in some cases where it
previously succeeded, but it seems uncommon.

Somewhat unrelated, I think that starting daemons with daemon(8),
</dev/null >/dev/null 2>&1 or similar is inferior to implementing
daemonizing in the program itself. Think of the poor soul who needs to
install and start N daemons full of bugs and configuration errors: it is
better if such errors show up on the console instead of being hidden
away in a log file.

-- 
Jilles Tjoelker


Re: daily run output 800.scrub-zfs fixups

2010-08-22 Thread Jilles Tjoelker
On Sun, Aug 22, 2010 at 03:08:42PM +0200, Alexander Leidinger wrote:
> On Sat, 21 Aug 2010 00:17:08 -0400 jhell <jh...@dataix.net> wrote:
> > Hi Alexander,

> > Attached is a fix for one problem and one slight overlook for
> > 800.scrub-zfs.

> > The first & second change was probably just an oversight but none the
> > less they both give a false impression of actions taken.

> > Change1:
> > ${daily_scrub_zfs_default_threshold=30} is missing the ':'
> > which would ultimately reset the users supplied value in
> > periodic.conf to 30.

> Sorry, but it is not missing the ':'. There is one in front of it. A
> lot of start scripts in ports use this. You need to use a := instead of
> a = if you use
>   var=${var:=default_val}
> but not if you use
>   : ${var=default_val}

> I have the impression that the ':' in front of the variable is the way
> it is supposed to be in the start scripts in ports. I adopted this
> style (one variable name less to type... specially with expressive
> names this is some amount less to type).

As described in sh(1) and POSIX, ${var=default_val} assigns the default
if var was not set, while ${var:=default_val} assigns the default if var
was not set or if it was set to the empty string.

The double assignment in the construct
  var=${var:=default_val}
is a workaround for bugs in very old Bourne shells (see Autoconf
documentation for more). Our sh(1) has never had that bug, so simply
  : ${var:=default_val}
is better.

Enclosing the expansion in double-quotes prevents unnecessary pathname
generation, which could be slow. However, even without the
double-quotes, the correct value is assigned and no other side effects
occur.
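
The difference between the two forms can be seen with a small test, a
sketch assuming any POSIX sh. The function names are invented for this
example, and each body runs in a subshell so that var does not leak into
the caller:

```sh
# assign_if_unset STATE: STATE is "unset" or "empty" and selects the
# initial state of var; then ${var=...} is applied.
assign_if_unset() (
    if [ "$1" = unset ]; then unset var; else var=''; fi
    : "${var=default_val}"
    printf '[%s]\n' "$var"
)

# Same, but with ${var:=...}, which also replaces an empty value.
assign_if_unset_or_empty() (
    if [ "$1" = unset ]; then unset var; else var=''; fi
    : "${var:=default_val}"
    printf '[%s]\n' "$var"
)

# assign_if_unset unset            prints [default_val]
# assign_if_unset empty            prints []
# assign_if_unset_or_empty empty   prints [default_val]
```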

> And I remember to have tested a lot of cases for the timeout value,
> overriding a pool specific value and overriding the default where some
> of them and all worked.

> If you have a case where it does not work, it would be nice if you
> could add a set -x in the beginning of the script and send me the
> output of a failing run.

-- 
Jilles Tjoelker


Re: SIGEPIPE after update to 8.1-RC2

2010-07-25 Thread Jilles Tjoelker
On Sun, Jul 18, 2010 at 12:20:33PM +1000, Sean wrote:
> I'm getting the same thing; what shell are you using? I changed my shell
> on one machine from /bin/tcsh to /usr/local/bin/bash and problem
> disappeared.

That this workaround helps confirms that masked/ignored SIGPIPE is the
problem. From a few shells I have tried, bash and zsh reset SIGPIPE to
caught or default even if it was ignored (only in interactive mode,
however), while tcsh, sh, mksh and ksh93 leave it ignored.

The underlying problem is the program that is passing the ignored/masked
signal to child processes. Please check if the problem occurs with
various ways to log in (text console, ssh, xterm, etc). Things like PAM
modules may also cause problems here.

For example, sshd sets SIGPIPE to ignored, but resets it back to default
before starting a child process, so assuming I read the code correctly
it does not cause problems.

-- 
Jilles Tjoelker


Re: SIGEPIPE after update to 8.1-RC2

2010-07-17 Thread Jilles Tjoelker
On Sat, Jul 17, 2010 at 06:24:55PM +0300, Alex Kozlov wrote:
> After updating my buildbox from 26 April 8-STABLE
> to 8.1-RC2 I am constantly getting SIGEPIPE

> portsnap:
> Fetching 4 metadata patches... done.
> Applying metadata patches... done.
> Fetching 0 metadata files... done.
> Fetching 27 patches.1020... done.
> Applying patches... done.
> Fetching 3 new ports or files... done.
> sort: write failed: standard output: Broken pipe
> sort: write error
> Removing old files and directories... done.

> sudo make -C /usr/ports/converters/ascii2binary:
> ===>  Patching for ascii2binary-2.13_2
> ===>  Applying FreeBSD patches for ascii2binary-2.13_2
> ===>   ascii2binary-2.13_2 depends on shared library: intlgrep: writing
> output: Broken pipe
> grep: writing output: Broken pipe
[snip repetition]
>  - found
>  ===>  Configuring for ascii2binary-2.13_2

> Does anyone know something about this issue?

This looks more like the absence of SIGPIPE than an inappropriate
SIGPIPE. I can reproduce both of those error messages by running the
commands with SIGPIPE ignored. grep(1) seems to behave strangely on
write errors, not aborting, for example
  yes | { trap '' PIPE; grep -v foo; echo $? >&2; } | :
prints an endless stream of error messages.
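
A sketch of the same effect with a writer that does give up on a write
error: the function below (its name is invented here) runs yes(1) into a
pipe whose reader exits at once and prints the exit status of yes. With
SIGPIPE at its default disposition the writer is killed by the signal;
most shells report that as 128 plus the signal number. With SIGPIPE
ignored, the write fails with EPIPE instead and the writer exits with an
ordinary error status.

```sh
# writer_status [ignore]
# Print the exit status of a writer whose reader has gone away.
# With "ignore", SIGPIPE is set to ignored before the writer starts.
writer_status() {
    (
        if [ "$1" = ignore ]; then trap '' PIPE; fi
        # fd 3 is the function's stdout; fd 1 of the writer is the pipe.
        { yes 2>/dev/null; echo "$?" >&3; } | :
    ) 3>&1
}

# writer_status           killed by SIGPIPE (typically reported as 141)
# writer_status ignore    plain write error (a small non-zero status)
```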

Note that sh(1) silently ignores attempts to change the disposition of
signals that were ignored on entry to the shell, so a
  trap - PIPE
is unlikely to help you.

Similarly, SIGPIPE may be blocked (masked). Few programs expect this.

The -i and -j options in procstat should be helpful in finding what
exactly is wrong with SIGPIPE. (These options are relatively new, but
should be in 8.1.)

-- 
Jilles Tjoelker


Re: calcru: runtime went backwards messages

2010-06-10 Thread Jilles Tjoelker
On Fri, Jun 11, 2010 at 12:35:05AM +0800, Jansen Gotis wrote:
> Hi, for the past couple of months since moving to RELENG_8 I've been
> receiving calcru: runtime went backwards messages on the console.

> My machine is a dual Pentium III 1.26GHz with an Intel SAI2 board.
> Disabling EIST is not an option in my BIOS, and I've tried disabling
> the ACPI timer as well as setting kern.timecounter.hardware=i8254.
> I've also tried disabling cpufreq in my kernel configuration.

> For what it's worth, I'm running base ntpd. I've also tried openntpd,
> but no dice.

> I did a binary search of the commit with which this started, and
> apparently it's svn r204546, a summary of which can be seen here:
> http://freshbsd.org/2010/03/02/01/56/55

> The calcru messages appear whether vesa is loaded as a module
> or compiled into the kernel.

> If anyone needs more information, I'll be happy to provide it.

> = snippet of /var/log/messages relating to calcru messages =
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from
> 3502 usec to 3297 usec for pid 1106 (mksh)
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from
> 36785 usec to 35858 usec for pid 1114 (csh)
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from
> 13438 usec to 12652 usec for pid 1113 (su)
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from
> 14956 usec to 14081 usec for pid  (mksh)
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from
> 3323 usec to 3128 usec for pid  (mksh)
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 610
> usec to 574 usec for pid 549 (devd)
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from 517
> usec to 486 usec for pid 548 (dhclient)
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from
> 1912 usec to 1800 usec for pid 532 (dhclient)
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from
> 39738 usec to 37412 usec for pid 532 (dhclient)
> Jun 10 22:41:42 hobbes kernel: calcru: runtime went backwards from
> 3369010 usec to 3334846 usec for pid 1 (init)

This may well be a manifestation of a weakness (which should not be
unknown) in how FreeBSD stores CPU time utilization. The time is
maintained in CPU ticks (CPU clock cycles), so if the clock frequency
changes, the values of existing processes will be wrong (there is a jump
when they are converted to seconds). When calcru detects this, it
generates messages like the above. If this analysis is right, the
messages can be ignored, but they indicate that CPU time statistics may
be inaccurate.
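
To illustrate the arithmetic only (this is not the kernel's code, and
the numbers are made up), the conversion from ticks to microseconds
divides by the current tick rate, so the same stored tick count yields a
smaller runtime once the rate is believed to be higher. The sketch
assumes a shell with 64-bit arithmetic:

```sh
# ticks_to_usec TICKS HZ
# Convert a CPU tick count to microseconds, assuming the CPU ran at
# HZ ticks per second for the whole time.
ticks_to_usec() {
    echo $(( $1 * 1000000 / $2 ))
}

# Hypothetical process with 1260000 ticks accumulated at 1.26 GHz:
#   ticks_to_usec 1260000 1260000000   prints 1000
# The same counter converted after the rate changes to 1.4 GHz:
#   ticks_to_usec 1260000 1400000000   prints 900
# The runtime appears to have gone backwards by 100 usec.
```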

I suppose fairly arbitrary changes can cause the messages to appear or
disappear.

-- 
Jilles Tjoelker


Re: locale-related build problems (was: kernel build failed)

2010-01-03 Thread Jilles Tjoelker
On Sat, Jan 02, 2010 at 11:01:06PM +0200, E. O. wrote:
> pls help me , kernel always returns an error

> make buildkernel

> and error

> ./pci_if.h:214: error: expected '=', ',', ';', 'asm' or '__attribute__'
> before '_RELEASE_MS'
> ./pci_if.h:214: error: stray '\335' in program
>
> ./pci_if.h:226: error: stray '\335' in program
> ./pci_if.h:226: error: expected '=', ',', ';', 'asm' or '__attribute__'
> before '_MS'
> ./pci_if.h:226: error: stray '\335' in program
> ./pci_if.h:238: error: stray '\335' in program
> ./pci_if.h:238: error: expected '=', ',', ';', 'asm' or '__attribute__'
> before '_MS'
> ./pci_if.h:238: error: stray '\335' in program
> [...]

The build system appears not to cope with locales. My guess is that
you're using Turkish locale, which has dotless and dotted uppercase and
lowercase 'i', where the uppercase version of the dotted 'i' is an
uppercase dotted 'I'. awk(1) knows about this and generates a file that
conforms to Turkish conventions but obviously will not work.

As a workaround, you can run your builds with LC_ALL=C in the
environment, disabling locale support. (e.g. env LC_ALL=C make
buildkernel).

This should be fixed by adding LC_ALL=C somewhere in the build process,
possibly only for these awk commands.
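
A sketch of what pinning the locale for just the awk invocation could
look like; the function name upper_c is invented for this example and
this is not the actual build-system code:

```sh
# upper_c STRING
# Uppercase STRING with awk in the C locale, so that "i" always maps to
# plain ASCII "I" whatever LANG or LC_ALL the user has set.
upper_c() {
    printf '%s\n' "$1" | LC_ALL=C awk '{ print toupper($0) }'
}

# upper_c pci_if   prints PCI_IF
```

Because the assignment is placed directly in front of awk, it applies to
that one command only and leaves the locale of the rest of the build
alone.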

-- 
Jilles Tjoelker


Re: Can close-ing a pipe trigger a SIGPIPE?

2009-10-17 Thread Jilles Tjoelker
On Sat, Oct 17, 2009 at 01:41:22PM -0400, Mikhail T. wrote:
> Kostik Belousov wrote:
> > Take ktrace of both parent and child.

> Great idea! Here is the kdump's listing for both (after ktrace -i):

> http://aldan.algebra.com/~mi/tmp/tclx-kdump.txt

> (it is large, so be sure to use a compressing browser). Once loaded,
> look for substring:

> Error SIGPIPE signal received while closing file5.

> The parent process-ID is 92722. The child -- 92723. Thanks! Yours,

The interesting part of the ktrace:

 92723 tclsh8.5 CALL  exit(0)
 92722 tclsh8.5 CALL  sigaction(SIGPIPE,0x7fffa9e0,0)
 92722 tclsh8.5 RET   sigaction 0
 92722 tclsh8.5 CALL  write(0x4,0x800e24028,0)
 92722 tclsh8.5 RET   write -1 errno 32 Broken pipe
 92722 tclsh8.5 PSIG  SIGPIPE caught handler=0x800f126d0 mask=0x0
code=0x0
 92722 tclsh8.5 CALL  sigreturn(0x7fffa0c0)
 92722 tclsh8.5 RET   sigreturn JUSTRETURN
 92722 tclsh8.5 CALL  close(0x5)
 92722 tclsh8.5 RET   close 0
 92722 tclsh8.5 CALL  close(0x4)
 92722 tclsh8.5 RET   close 0

It seems unwise to assume that a write(2) of 0 bytes is a noop.
Even if it is, doing it is a waste of a system call.

-- 
Jilles Tjoelker


Re: Value of $? lost in the beginning of a function.

2009-07-19 Thread Jilles Tjoelker
On Sun, Jul 19, 2009 at 10:26:38PM +0200, Romain Tartière wrote:
> Hi!
>
> Simple test case:
>
> 8<--
> #!/bin/sh
> foo()
> {
>   echo \$?=$? \$1=$1
> }
> false
> foo $?
> 8<--
>
> % sh foo.sh
> $?=0 $1=1
> % zsh foo.sh
> $?=1 $1=1
> % bash foo.sh
> $?=1 $1=1
>
> As you can see, the value of $? is « lost » when FreeBSD sh enters a
> function.  Is this supposed to behave this way?

This has been fixed in 8.x:

Revision 185231 - Directory Listing
Modified Sun Nov 23 20:23:57 2008 UTC (7 months, 3 weeks ago) by stefanf

Fix $? at the first command of a function.  The previous exit status was saved
twice and thus lost.
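
On a shell without that fix, one way to avoid depending on $? inside the
function is to hand the status over explicitly, as the test case already
does with $1. A small sketch (the function names are invented here):

```sh
# report STATUS: print a status that the caller passed in explicitly.
report() {
    printf 'status=%s\n' "$1"
}

# run_and_report CMD [ARGS...]: run CMD, capture its exit status in a
# variable straight away, then pass it on as an argument.
run_and_report() {
    rc=0
    "$@" || rc=$?
    report "$rc"
}

# run_and_report false   prints status=1
# run_and_report true    prints status=0
```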

-- 
Jilles Tjoelker


Re: Why old files in /etc ?

2009-06-14 Thread Jilles Tjoelker
On Sun, Jun 14, 2009 at 07:33:23AM -0500, Michael Gass wrote:
> Just installed 7.2-release in an old Pavilion 4455 (PII, 256M)
> and it runs great.  Csuped src and ports and rebuilt world and
> generic kernel for 7.2-stable and that went well.

> My question is why are the files in /etc in 7.2-stable older
> versions (generally) than in 7.2-release?  I do not just mean
> older by date, but older versions of the files - like many of
> the config files for sendmail or the net.  Why  is stable using
> older versions of these files than release?

> Mostly I did not let mergemaster install the files from the build
> because they were so much older than the original release versions.
> Again, why are the files for stable so much older?

This is because of a weakness in the svn-to-cvs exporter.

Formerly, only CVS was used and tagging a release did not require a
commit. So right after a release, both the release and -stable would
have the same revision numbers.

With Subversion, tagging a release requires a commit. The CVS exporter
keeps this commit, so all files will have a changed CVS Id. This looks
newer, until/unless the file is changed on -stable again.

To cope with this, it's best to use mergemaster's -F or -U options.

This question has been asked before.

-- 
Jilles Tjoelker


Re: Unnamed POSIX shared semaphores

2009-06-01 Thread Jilles Tjoelker
On Mon, Jun 01, 2009 at 06:33:42PM +0300, Vlad Galu wrote:
> According to sem_init(3), we can't have shared unnamed semaphores.
> However, the following code snippet seems to work just fine:

> -- cut here --
> sem_t semaphore;
> if (sem_init(&semaphore, 1, 10) < 0)
>         std::cout << "Couldn't init semaphore: " <<
>             strerror(errno) << std::endl;
> if (sem_wait(&semaphore) < 0)
>         std::cout << "Couldn't decrement semaphore: " <<
>             strerror(errno) << std::endl;
> int val;
> sem_getvalue(&semaphore, &val);
> std::cout << "Value is " << val << std::endl;
> -- and here --

> Is this a case where sem_init() silently reports success, or should
> the man page get an update?

Reading the code, it seems like this should work, but only between
related processes where the parent process creates the semaphore before
forking and no exec is done. This is because a sem_t is a pointer to a
structure containing the kernel level semid_t (and a mutex, condvar and
the like for process-private semaphores). sem_init() will allocate this
structure using malloc(3).

Changing sem_t to be a structure would be the obvious way to fix this,
but I think this cannot be versioned properly. For example, if someone
puts in the public header file of their .so:

struct my_struct
{
	int foo;
	sem_t bar;
	int quux;
};

Changing the size of sem_t will break this. Also, assuming symbol
versioning were to be used, if you compile the .so for FreeBSD 7 and the
app for FreeBSD 8, the FreeBSD 8 sem_* functions will get FreeBSD 7
style sem_t's.

If process-shared semaphores really work, then the above structure is
not a pathological case. Effectively, sem_t is carved in stone. So
process-private semaphores should continue to have most of their stuff
in a separately allocated structure, to preserve flexibility.

Perhaps a better method is to set bit 0 of the sem_t to 1 and use the
other bits to store the semid_t.

-- 
Jilles Tjoelker


Re: 7.2-RC2 Install Feedback

2009-04-30 Thread Jilles Tjoelker
On Thu, Apr 30, 2009 at 02:50:22PM -0400, Mehmet Erol Sanliturk wrote:
> (1)
> I am using always Gnome in FreeBSD  ( because KDE in 7.0 was opening forms
> of executables just like  X , and for multiple docked forms this was very
> annoying) .

> During installations , I am selecting both Gnome and KDE because some
> packages are inserted only into KDE menus . If the KDE is not selected ,
> those packages are not available in Gnome menus , and they should be
> inserted into Gnome menus one by one ( for me , this is a difficult task ,
> because it requires to find proper executable to insert it into menu ) .

IMHO, that would be a bug in those packages. Desktop files should be
installed in /usr/local/share/applications (or similar) and contain a
Categories field to determine where to place them in the menu. The
OnlyShowIn and NotShowIn fields are available for desktop files specific
to certain desktop environments. Many ports install desktop files to
/usr/local/share/applnk (KDE) or /usr/local/share/gnome/apps (GNOME); if
these files contain Categories it should be safe to move them to the new
location. Many upstreams are already following this new standard, and
some ports override the upstream build system's code to install to the
new location with an installation to the old location.

A user-level workaround is to make symlinks to the necessary desktop
files in $HOME/.local/share/applications/; while it is annoying to have
to do this, it is much easier than adding menu entries manually.
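
The workaround could be scripted along these lines. This is only a
sketch: the function name link_desktop_files is made up, the source
directory must be given as an absolute path so the symlinks resolve, and
the directories in the usage lines are simply the ones mentioned above:

```sh
# link_desktop_files SRCDIR [DESTDIR]
# Symlink every *.desktop file found directly in SRCDIR into DESTDIR
# (default: $HOME/.local/share/applications).
link_desktop_files() {
    src=$1
    dest=${2:-$HOME/.local/share/applications}
    mkdir -p "$dest"
    for f in "$src"/*.desktop; do
        # Skip the unexpanded pattern when SRCDIR has no desktop files.
        [ -e "$f" ] || continue
        ln -sf "$f" "$dest/"
    done
}

# Hypothetical usage:
#   link_desktop_files /usr/local/share/applnk
#   link_desktop_files /usr/local/share/gnome/apps
```

Files in subdirectories are not picked up by this sketch.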

XFCE also needs the desktop files in the new location.

See http://standards.freedesktop.org/menu-spec/latest/ for more
information.

-- 
Jilles Tjoelker


Re: expand_number(3) silently truncates numeric part of the argument to 32 bit on i386, light impact on gjournal

2008-07-06 Thread Jilles Tjoelker
On Sun, Jun 29, 2008 at 04:16:25PM -0400, Alexandre Sunny Kovalenko wrote:
> I honestly don't know whether it should or should not do it, and if it
> should not, what errno should be set to. Program below gives following
> output on RELENG_7 as of June 28th:

> sunny:RabbitsDen>./expand_number 5368709120k
> Result is 1099511627776
> sunny:RabbitsDen>./expand_number 5120G
> Result is 5497558138880
> sunny:RabbitsDen>

> One of the more interesting manifestations in the userland is that

> gjournal label -s 5368709120 -f /dev/da0s1a

> quietly gives you 1G of the journal in the resulting file system.
 [snip program calling expand_number(3)]

This happens because src/lib/libutil/expand_number.c does not include
the necessary header <inttypes.h> for calling strtoimax(3). The file is
compiled without compiler warnings, so the bug shows up as wrong
behaviour.

Adding #include <inttypes.h> fixes it.

The file is slightly changed in CURRENT but the same patch should apply.
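
The reported numbers are consistent with the numeric part being cut down
to its low 32 bits before the suffix is applied, presumably because
without a prototype the compiler assumes strtoimax() returns an int. A
sketch of that arithmetic in shell (the function name is invented, a
shell with 64-bit arithmetic is assumed, and sign extension of the
truncated value is ignored):

```sh
# expand_truncated NUMBER MULTIPLIER
# Mimic the buggy result: keep only the low 32 bits of NUMBER, then
# scale by MULTIPLIER (1024 for a "k" suffix).
expand_truncated() {
    echo $(( ($1 & 4294967295) * $2 ))
}

# expand_truncated 5368709120 1024   prints 1099511627776 (the wrong
#                                    result shown in the report)
# expand_truncated 5368709120 1      prints 1073741824, the 1G journal
```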

-- 
Jilles Tjoelker
--- src/lib/libutil/expand_number.c.orig	2007-09-05 16:27:13.000000000 +0200
+++ src/lib/libutil/expand_number.c	2008-07-06 13:11:02.766238000 +0200
@@ -33,6 +33,7 @@
 #include <errno.h>
 #include <libutil.h>
 #include <stdint.h>
+#include <inttypes.h>
 
 /*
  * Convert an expression of the following forms to a int64_t.

Re: panic(): vinvalbuf: dirty bufs: perhaps a ffs_syncvnode bug?

2006-11-27 Thread Jilles Tjoelker
On Thu, Nov 16, 2006 at 09:24:07AM +0100, Rink Springer wrote:
> Over the night, we reset the shelf in order to activate its new
> management IP address, causing the /dev/da[12] devices to be temporarily
> unavailable. This resulted in the following panic on the rather busy
> mailstorage server (the other server has minor load and was fine):

> ---
> (da0:isp0:0:1:0): lost device
> (da0:isp0:0:1:0): removing device entry
> (da1:isp0:0:2:0): lost device
> g_vfs_done():da1s1[WRITE(offset=292316823552, length=16384)]error = 6
> g_vfs_done():da1s1[WRITE(offset=240287318016, length=16384)]error = 6
> g_vfs_done():da1s1[READ(offset=12175362048, length=2048)]error = 6
> g_vfs_done():da1s1[WRITE(offset=240287318016, length=16384)]error = 6
> g_vfs_done():da1s1[READ(offset=18370689024, length=2048)]error = 6
> g_vfs_done():da1s1[READ(offset=25829486592, length=512)]error = 6
> vnode_pager_getpages: I/O read error
> vm_fault: pager read error, pid 78035 (lmtpd)
> g_vfs_done():da1s1[WRITE(offset=240287318016,
> length=1638(da1:isp0:0:2:0): Invalidating pack
> 4)]error = 6
> g_vfs_done():da1s1[READ(offset=13768671232, length=6144)]error = 6
> g_vfs_done():da1s1[READ(offset=102126977024, length=16384)]error = 6
> g_vfs_done():da1s1[READ(offset=13768671232, length=6144)]error = 6
> g_vfs_dpone():da1s1[READ(offset=102319669248, length=16384)]error = 6a
> nic: vinvalbuf: dirty bufs
> cpuid = 2
> Uptime: 54d15h48m38s

> When looking at the source code of vinvalbuf(), which calls
> bufobj_invalbuf(), it seems that this panic is raised after a bufobj
> still contains dirty data after waiting for it to complete without
> error. The code can be found at /sys/kern/vfs_subr.c

Note that this panic can only occur if vinvalbuf() is called with
V_SAVE (save modified data first).

The exact condition for the panic is better described as: a bufobj still
contains dirty data or still has output in progress after a successful
synchronous BO_SYNC operation.  bufobj_wwait() cannot return an error
unless msleep() fails (e.g. interruptible sleep requested via slpflag
and signal occurred).  If the I/O has failed, bufobj_wwait() will return
success.

> The sync routine called eventually translates to bufsync(), as in
> /sys/kern/vfs_bio.c, which calls the filesystem's sync routine. It seems
> as if the return status of vfs_bio_awrite() in ffs_syncvnode() is not
> checked; all the other parts are checked. I believe this could provoke
> this panic.

There does not seem to be much point in checking the result of an
asynchronous write anyway, as the I/O has not completed yet.

I do not fully understand what the code is doing with async writes.  For
all but the last pass (see below), it calls bawrite() on the buffer,
which sets B_ASYNC and then calls bwrite(). For the last pass, it calls
bwrite() directly (has something cleared B_ASYNC?) and returns an error
if that fails. bwrite() itself is an inline function defined in
/sys/sys/buf.h, which calls BO_WRITE after some KASSERTs.

 As the machine is in production use, it was instantly rebooted by a
 colleague and thus I have no vmcore, backtrace or anything. I therefore
 hope the information provided here is adequate.

 Can someone with more FreeBSD-VFS knowledge please look at this?

There is another possible problem, from this comment in
/sys/ufs/ffs/ffs_vnops.c ffs_syncvnode():
/*
 * Block devices associated with filesystems may
 * have new I/O requests posted for them even if
 * the vnode is locked, so no amount of trying will
 * get them clean. Thus we give block devices a
 * good effort, then just give up. For all other file
 * types, go around and try again until it is clean.
 */
Actually it just does NIADDR + 1 (four) passes and then gives up.  If
DIAGNOSTIC is enabled, it will then print the affected vnode, if it is
not a disk.  This failure is not reflected in ffs_syncvnode()'s return
value, so if it occurs when ffs_syncvnode() is called from
bufobj_invalbuf(), a panic will result.

Suppose ffs_syncvnode() were changed to return an error in this case.
bufobj_invalbuf()/vinvalbuf() handle a BO_SYNC/ffs_syncvnode() error by
aborting with an error return.  It seems that in most cases this will
cause the operation invoking vinvalbuf() to fail.  However, in at least
one case (vm_object_terminate()), the error will be ignored; might this
leave old garbage or dangling references behind?

-- 
Jilles Tjoelker


Re: rc-ng problem with [procname] (e.g. kernel threaded procs)

2005-09-16 Thread Jilles Tjoelker
On Tue, Aug 16, 2005 at 07:24:41PM +0200, Andy Hilker wrote:
 Hmh, no one interested in this issue? Or am I wrong about this issue?

Apparently no one is interested.

Be sure to file a PR if you haven't done so yet.

 You (Andy Hilker) wrote:
  i think I have found a problem with rc-ng scripts and procnames
  including brackets (e.g. kernel threaded, like mysqld).

  Brackets [] are ignored, process will not be found and is regarded
  as not running. This breaks stop+status functions of rcng. The
  following patch allows brackets in variable procname rc-ng scripts.
  Maybe someone can review and fix this issue.

  It was relevant for me when using [mysqld].

This happens if the argv is larger than kern.ps_arg_cache_limit and
either /proc is not mounted or the user running ps is not allowed to
read the command's memory.

Your patch needs \[, \] by the way, not ( ).
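
To see why the backslashes matter, here is a small sketch using
fnmatch(3), which implements the same pattern notation that sh uses in
case statements. It is only an illustration and not code from rc.subr;
the helper names are hypothetical.

```c
#include <fnmatch.h>
#include <stdio.h>

/* Return 1 if the shell pattern matches the string, 0 otherwise. */
int
pattern_matches(const char *pattern, const char *string)
{
	return (fnmatch(pattern, string, 0) == 0);
}

/* Hypothetical helper: show how two patterns treat a bracketed name. */
void
show_bracket_matching(void)
{
	const char *arg0 = "[mysqld]";	/* bracketed name as printed by ps */

	/* Unescaped: a bracket expression matching one of m,y,s,q,l,d. */
	printf("unescaped: %d\n", pattern_matches("[mysqld]", arg0));
	/* Escaped: a literal '[', the name, and a literal ']'. */
	printf("escaped:   %d\n", pattern_matches("\\[mysqld\\]", arg0));
}
```

The unescaped pattern matches a single character such as "m" but not the
string "[mysqld]"; the escaped one matches "[mysqld]" literally.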

Alternatively, you could use ps -o pid,ucomm for the $_interpreter = .
case and only look for $_procnamebn.

This whole ps stuff has the potential of killing the wrong process; how
about using pidfiles?

  # $FreeBSD: src/etc/rc.subr,v 1.31.2.1 2005/01/17 11:51:00 keramida Exp $
  --- rc.subr Thu Aug 11 15:18:52 2005
  +++ /etc/rc.subr Thu Aug 11 15:14:06 2005
  @@ -267,7 +267,7 @@
  _procnamebn=${_procname##*/}
  _fp_args='_arg0 _argv'
  _fp_match='case $_arg0 in
  -   $_procname|$_procnamebn|${_procnamebn}:|(${_procnamebn}))'
  +   $_procname|$_procnamebn|${_procnamebn}:|(${_procnamebn}))'
  fi
   
  _proccheck='

-- 
Jilles Tjoelker


Re: [PATCH] Re: /etc/rc.d/sshd : kldload random missing?

2005-04-25 Thread Jilles Tjoelker
On Sun, Apr 24, 2005 at 07:02:42PM -0700, Rob wrote:
 My conclusion is:
 sshd does simply not call random at all, although I
 have added it in the # REQUIRE: line.

 Is this a general bug in 5-Stable?
 Or am I testing this in the wrong way?

RCNG doesn't work that way: rc.d scripts do not call one another. The
# REQUIRE: lines only affect the ordering computed by rcorder(8), so it
should be OK at boot.

-- 
Jilles Tjoelker


Re: USB mouse troubles

2005-04-07 Thread Jilles Tjoelker
On Tue, Apr 05, 2005 at 09:17:55PM +0200, Michael Nottebrock wrote:
 FreeBSD 5.x has had funky issues with usb mice for as long as I've
 been using a usb mouse with it, but since it almost works ok with the
 default configuration, I never got around to complain about it. ;-)

AFAIK FreeBSD 4.x has the same issues and some more, e.g. systems with
more than one USB bus require a manual MAKEDEV for hotplugging to work
correctly (as only /dev/usb0 is created by default).

 However: In various situations and configurations, USB mice are not
 picked up.

I have not tested this at this point, but it agrees with my
understanding of how the code works.

 - With a GENERIC kernel and all of the usb support in the kernel, usb
 mice are usually recognized on boot, but they will cease to work after
 going single user and back to multiuser again. The /dev/ums0 device
 doesn't even get removed, but the mouse is dead. Unplugging and
 replugging usually gets it going again.

On the first startup, usbd gets the initial events (the mouse being
attached at boot time) and starts moused. When going to single user,
moused is killed. If usbd is started later, it does not get the initial
events again (and the set of attached devices may have changed since
boot), so it does not start moused again.

One fix could be to change usbd to throw away the initial events and
instead act as if attach events had been received for all USB devices
present. This would be nasty if usbd is restarted without a reboot or a
trip through single user mode, which could be fixed by making the new
behaviour optional. It is also nasty if usbd.conf contains an action for
a device rather than starting a process, e.g. automatically copying
files from a umass device. This might be fixed by distinguishing the two
in usbd.conf.

 - With all of usb compiled as modules and usbd enabled in rc.conf, ums
 usually doesn't even get loaded, but usbdevs will show the mouse
 plugged in. Even subsequent unplugging and replugging will not get ums
 loaded. Manually loading ums doesn't get the mouse working either, an
 unplug/replug is necessary first.

usbd loads only the usb kld, which brings in the drivers for the USB
controllers, ugen and uhub, and not much more. When the mouse is plugged
in, ugen grabs it (there being nothing better) and does not release it,
even when you subsequently load ums. This happens before usbd gets to
know about the device.

A somewhat crude workaround would be to load the drivers for the devices
to be used before starting usbd. This would mean /boot/loader.conf, most
likely.

-- 
Jilles Tjoelker


Re: Just a sanity check before I sumbit a buig report

2005-03-13 Thread Jilles Tjoelker
On Fri, Mar 11, 2005 at 02:13:54PM -0600, Jon Noack wrote:
 Pete French wrote:
 Why does sysconf(_SC_CLK_TCK) always return 128?  Check out sysconf() 
 in src/lib/libc/gen/sysconf.c (lines 83-84 of rev. 1.10):

 [follow through of code showing it is defined as a constant snipped]

 sysconf(3) states that _SC_CLK_TCK is the frequency of the statistics 
 clock in ticks per second.  Considering this value varies, returning a 
 constant is wrong.  Feel free to attach my email on the PR.

An important use for sysconf(_SC_CLK_TCK) is to specify the rate of the
results of times(3). (I don't know how many applications call that
stupid function, getrusage() having been available for so long ;-)
Currently, src/lib/libc/gen/times.c compiles this in just like sysconf.c
does. So that's all ok; times.c will have to be modified too if
sysconf(_SC_CLK_TCK) changes. 
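
For reference, this is roughly how an application would combine the two.
It is a minimal sketch; the function names ticks_to_seconds() and
user_cpu_seconds() are made up for the example, while times(3) and
sysconf(3) are the standard interfaces.

```c
#include <sys/times.h>
#include <unistd.h>

/* Convert a tick count as reported by times(3) to seconds. */
double
ticks_to_seconds(clock_t ticks, long ticks_per_second)
{
	return ((double)ticks / (double)ticks_per_second);
}

/*
 * User CPU time of the calling process in seconds, or -1.0 on failure.
 * The tick rate is queried at run time instead of being hard-coded.
 */
double
user_cpu_seconds(void)
{
	struct tms t;
	long hz = sysconf(_SC_CLK_TCK);

	if (hz <= 0 || times(&t) == (clock_t)-1)
		return (-1.0);
	return (ticks_to_seconds(t.tms_utime, hz));
}
```

The conversion only gives correct results as long as times() and
sysconf(_SC_CLK_TCK) agree on the rate, which is why the two have to
change together.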

getrusage(2) says that ru_ixrss is based on statistics clock ticks
with a frequency of sysconf(_SC_CLK_TCK). This cannot be right.
In other systems, getrusage often only really supports the timeval
fields and perhaps the fault and swap counts; if it is supported, the
ru_i?rss ticks are often not described at all or they are something
strange like one per second. Consequently, this facility is nonportable
and the tick frequency should be described using sysctl().
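
A portable caller would therefore stick to the timeval fields, for
example as in the following sketch (the helper names are hypothetical;
getrusage(2) and struct rusage are the standard interface):

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Convert a struct timeval to seconds as a double. */
double
timeval_to_seconds(struct timeval tv)
{
	return ((double)tv.tv_sec + (double)tv.tv_usec / 1e6);
}

/*
 * User plus system CPU time of the calling process in seconds,
 * or -1.0 on failure.  No tick frequency is involved here.
 */
double
self_cpu_seconds(void)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return (-1.0);
	return (timeval_to_seconds(ru.ru_utime) +
	    timeval_to_seconds(ru.ru_stime));
}
```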

-- 
Jilles Tjoelker


5.3-RELEASE-p5 panic bundirty: buffer 0xd63d85e0 still on queue 1

2005-02-21 Thread Jilles Tjoelker
        ip_moptions     1     1K      1K         1  128
        IpFw/IpAcct     1     1K      1K         1  64
           in_multi     4     1K      1K         4  32
               igmp     1     1K      1K         1  16
           routetbl   228    41K     42K     23065  16,32,64,128,256
             kqueue     0     0K     15K    184464  128,1024
               kenv   104     6K      6K       105  16,32,64,2048
              sigio     2     1K      1K        67  32
                 lo     1     1K      1K         1  1024
              clone     5    20K     20K         5  4096
        ether_multi    77     4K      4K        77  16,32,64
             ifaddr    50    12K     12K        53  16,32,64,256,512,2048
                BPF     4     1K      1K         4  64
              mount    43    22K     22K        49  16,32,128,512,1024
             vnodes    42     8K      8K       359  16,32,64,128,256
        Export Host     2     1K      1K         4  256
cluster_save buffer     0     0K      1K      3956  32,64
           vfscache     1   512K    512K         1
         BIO buffer    72   114K   5697K    130296  1024,2048
          file desc   247    75K    107K   1518255  16,32,64,256,512,1024,2048,4096
                pcb   118     6K      8K    346499  16,32,64,2048
             soname   103    11K     45K   5116733  16,32,64,128
                tag     0     0K      7K   2161374  32,64
           mbextcnt     0     0K      2K     11818  16
               ptys    31     4K      4K        31  128
               ttys  3983   516K    595K     36848  128,512
                shm     2    13K     15K        70  256
                sem     4     7K      7K         4  512,1024,4096
                msg     4    25K     25K         4  512,4096
                iov     0     0K      1K   2041064  16,64,128,256,512
           ioctlops     0     0K      4K        45  512,1024,2048,4096
               cdev    92    23K     23K        92  256
             acpica     0     0K      1K        15  16,32,64
         turnstiles   651    41K     41K       671  64
          taskqueue     6     1K      1K         6  64
        ISOFS mount     1   256K    256K         1
       sleep queues   651    21K     21K       671  32
               sbuf     0     0K     37K      2129  16,32,64,128,256,512,1024,2048,4096
               rman   118     8K      8K       520  16,64
             isadev    42     3K      3K        42  64
               GEOM    63     9K     14K       241  16,32,64,128,256,512,1024
               kobj   101   202K    202K       121  2048
        pfs_vncache     2     1K     52K     12858  32
       eventhandler    27     2K      2K        27  32,128
            devstat     6    13K     13K         6  16,4096
         pfs_fileno     1    20K     20K         1
             bus-sc    38    43K     48K       371  16,64,128,256,512,1024,2048,4096
                bus   541    24K     82K      2056  16,32,64,128,1024
               SWAP     2   345K    345K         2  64
          sysctltmp     0     0K      1K     53047  16,32,64,128
          sysctloid  1486    45K     45K      1486  16,32,64
             sysctl     0     0K      1K     45958  16,32,64
            uidinfo    18     2K      2K      4270  32,1024
             plimit    39    10K     12K   1812147  256
          pfs_nodes    20     3K      3K        20  128
               cred   174    22K     30K   4610954  128
            subproc   359   849K   1252K   3865370  32,4096
               proc     2     8K      8K         2  4096
            session    91    12K     14K     20158  128
               pgrp    99     7K      8K     25399  64
           mtx_pool     1     8K      8K         1
             module   186    12K     12K       186  64,128
      MSDOSFS mount     1   128K    128K         1
             ip6ndp     9     1K      1K        13  64,128
             ip6opt     0     0K      2K    136543  128
               temp  3080   258K    288K   2236282  16,32,64,128,256,512,1024,2048,4096
             devbuf  2292  4517K   4677K  28998409  16,32,64,128,256,512,1024,2048,4096
# ### netstat -m crashes after printing a (high) number of mbufs in use
# netstat -m -N kernel.debug.28 -M vmcore.28
4045 mbufs in use
Segmentation fault
# vmstat -z -N kernel.debug.28 -M vmcore.28
vmstat: not implemented
# ^D
Script done on Mon Feb 21 17:21:35 2005

-- 
Jilles Tjoelker


Re: USB scanner not attached when connected after system startup]

2005-01-09 Thread Jilles Tjoelker
On Wed, Jan 05, 2005 at 10:44:05PM +0100, Harald Weis wrote:
 On Mon, Jan 03, 2005 at 09:32:20PM +0100, Roland Smith wrote:

  That's not good. AFAICT, the code that reports a new scanner to dmesg is
  also present in 4.x, so if you don't see a message, it looks like the
  scanner is not found for one reason or another.

Try all USB ports on your computer; use MAKEDEV to create /dev/usb1,
/dev/usb2, etc. (they are not created by default).

-- 
Jilles Tjoelker


unkillable processes after debugging on 5.3R

2005-01-03 Thread Jilles Tjoelker
I have two unkillable processes.

System:
FreeBSD turtle.stack.nl 5.3-RELEASE-p2 FreeBSD 5.3-RELEASE-p2 #5: Thu Dec  2 
17:25:55 CET 2004 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SNAIL  i386

The system is SMP with two CPUs.

Quoting the user:
 I was debugging with the system gdb some c++-code (with a strange
 segmentation fault). I was logged in via ssh and the connection seemed to
 freeze (no response from keyboard input) so I disconnected (~.-sequence in
 ssh).

 Logging in again on the machine, I killed the debugger and shell (I don't
 remember in which order) and tried to kill the program skilllist (pid
 20326). The skilllist program then appeared to be using 100% CPU time and
 did not respond to any of the signals I sent. About 24 hours later, I
 discovered that a zsh-process (pid 20328) was also running at lots of
 cpu-time. The program was initially not run in the background, the nicing
 and placing into the idle queue has been done later.

 My code is c++-code, using both fd 1 and 2 for output. It is not threaded.
 It does not use fork, exec etc. It's basically a simple prog, generating
 only output, not listening for input. The working directory is mounted over
 nfs (but my code does not open files).

After the nicing and placing into the idle queue the system is properly
responsive.

Output of some commands about the processes:

  UID   PID  PPID CPU PRI NI   VSZ  RSS MWCHAN STAT  TT         TIME COMMAND
 1711 20326     1 475 171 20  2280 1416 -      RN    ph-  2421:35.76 ./skilllist
 1711 20328     1 106  -8 20  2696 2328 -      RNE   ph-   729:34.15 -zsh (zsh)
 RTPRIO
idle:25
idle:76

db> trace 20328
sched_switch(c29cd320,0,1) at sched_switch+0x143
mi_switch(1,0,c29cd320,1,c29cd320) at mi_switch+0x1ba
sleepq_switch(c2302a80) at sleepq_switch+0x133
sleepq_wait(c2302a80,0,0,0,0) at sleepq_wait+0xb
msleep(c2302a80,c2302bd8,4c,c06d14b3,0) at msleep+0x322
pipeclose(c2302a80,c2302b14,c3eba484,e9e9eb94,c050736c) at pipeclose+0x88
pipe_close(c3eba484,c29cd320) at pipe_close+0x2a
fdrop_locked(c3eba484,c29cd320,c25b0c8c,e9e9ec04,c050616f) at fdrop_locked+0xa8
fdrop(c3eba484,c29cd320,0,2,c388a000) at fdrop+0x41
closef(c3eba484,c29cd320) at closef+0x23f
fdfree(c29cd320,c3c22d70) at fdfree+0x383
exit1(c29cd320,2,1,c29cd320,c388a000) at exit1+0x4d4
sigexit(c29cd320,2,0,c3c22c5c,c29cd320) at sigexit+0xd3
postsig(2) at postsig+0x13f
ast(e9e9ed48) at ast+0x4ba
doreti_ast() at doreti_ast+0x17
db> trace 20326
sched_switch(,c22e3000,400,8067000,df42c340) at sched_switch+0x143
db> c

[EMAIL PROTECTED] /home/jilles% fstat -vp20326
USER     CMD          PID   FD MOUNT                  INUM MODE         SZ|DV R/W
peters   skilllist  20326 root /                         2 drwxr-xr-x    1024  r
peters   skilllist  20326   wd /toad.mnt/capitalism 892355 drwxr-xr-x     512  r
peters   skilllist  20326 text /toad.mnt/capitalism 892490 -rwxr-xr-x  235868  r
peters   skilllist  20326    0 - - bad -
peters   skilllist  20326    1* pipe c2302b2c - c2302a80  0 rw
peters   skilllist  20326    2* pipe c2302b2c - c2302a80  0 rw
[EMAIL PROTECTED] /home/jilles% fstat -vp20328
USER     CMD          PID   FD MOUNT                  INUM MODE         SZ|DV R/W
peters   zsh        20328 root /                         2 drwxr-xr-x    1024  r
peters   zsh        20328   wd /toad.mnt/capitalism 892355 drwxr-xr-x     512  r
peters   zsh        20328 text /usr                 921879 -r-xr-xr-x    3156  r
can't read sock at 0x0
peters   zsh        20328   10* error
peters   zsh        20328   12 - - bad -
can't read pipe at 0x0
peters   zsh        20328   13* error
[EMAIL PROTECTED] /home/jilles%

The address c2302a80 occurs in the 20328 backtrace as well.

The fstat output of 20328 is unreliable: a later query returned this:

[EMAIL PROTECTED] /home/jilles% fstat -vp20328
USER     CMD          PID   FD MOUNT                  INUM MODE         SZ|DV R/W
peters   zsh        20328 root /                         2 drwxr-xr-x    1024  r
peters   zsh        20328   wd /toad.mnt/capitalism 892355 drwxr-xr-x     512  r
peters   zsh        20328 text /usr                 921879 -r-xr-xr-x    3156  r
unknown file type 5 for file 10 of pid 20328
unknown file type 5 for file 12 of pid 20328
can't read pipe at 0x0
peters   zsh        20328   13* error
[EMAIL PROTECTED] /home/jilles%

-- 
Jilles Tjoelker