Re: Trying to use /bin/sh
On Fri, Sep 27, 2013 at 12:20:32PM -0700, Sean Bruno wrote: I was attempting to simply set my EDITOR env variable for use on my system today and discovered that I do *not* understand how computers work. After realizing that I simply do not understand what's going on, I broke down and put an echo test into /etc/profile and ~/.profile. I can't see why, but logging in via xdm/xfce4 isn't doing any of the normal login things and nothing in my .profile is ever executed by any shell, ever. None of these get executed unless its a login shell. xfce4 doesn't seem to pick these up when starting up, so I have no idea what to do. For now, I've put the needed things into a .bashrc and switched my loging shell to bash, which is not really what I wanted to do. sh's model of startup files (only login shells use startup files with fixed names, other interactive shells only use $ENV) assumes that every session will load /etc/profile and ~/.profile at some point. This includes graphical sessions. The ENV file typically contains only shell options, aliases, function definitions and unexported variables but no environment variables. Some graphical environments actually source shell startup files like ~/.profile when logging in. I remember this from CDE for example. It is important to have some rule where this should happen to avoid doing it twice or never in strange configurations. As a workaround, I made ~/.xsession a script interpreted by my login shell and source some startup files. A problem here is that different login shells have incompatible startup files. A different direction is to accept that the environment variables in the X session will be incorrect and make all xterms start login shells (e.g. xterm -ls). You may also be able to use setenv in ~/.login_conf. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [kde-freebsd] virtualbox file dialog problem
, if setting KDE_FORK_SLAVES=1 works around the problem, perhaps KDE should behave that way automatically if it is called from a process with issetugid() true. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Importing tradcpp (traditional (KR-style) C macro preprocessor) into base?
On Wed, Jun 12, 2013 at 12:23:44AM +0200, Matthias Andree wrote: Am 12.06.2013 00:11, schrieb Baptiste Daroussin: I have been working in importing tradcpp (developped by David A. Holland from NetBSD) into the ports tree, it is a traditional (KR-style) C macro preprocessor BSD licensed. I first worked on it so that imake can work properly without gcc. I discovered that some part of the base system still needs a traditional preprocessor, like (calendar), what I propose it to import tradcpp into the base system (not the version in port right now but what will become version 0.2). It mostly behave like gcpp, and I'm able to properly use calendar along with tradcpp with this small patch: http://people.freebsd.org/~bapt/tradcpp.diff Any objections against me importing it? Shouldn't we fix calendar and imake so that they can use a modern cpp, instead of going back 25 years? Or am I missing the point here? A modern cpp is only suitable for preprocessing C code. It makes various assumptions about syntax that make it work better for preprocessing actual C code but worse for text that is not C. Calendar and imake use the traditional cpp for preprocessing non-C text. There is already a patch for removing the dependency on cpp in calendar. I think that makes more sense than importing a traditional cpp (assuming there are no other users in base that cannot be fixed). The fix for imake is probably not to use it. Ports that need imake can continue to use it if it depends on a traditional cpp port. A further point of confusion is that things that want a traditional cpp in fact want an equivalent of gcpp -traditional. This is in fact the point of tradcpp and why the old preprocessors do not work. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: /bin/sh = STDIN functions, var scope messing
On Tue, May 28, 2013 at 11:48:47AM +0200, Václav Zeman wrote: On 27 May 2013 21:58, Reid Linnemann wrote: from SH(1) Note that unlike some other shells, sh executes each process in a pipe- line with more than one command in a subshell environment and as a child of the sh process. I'm taking this to mean that redirecting to sh_f has sh_f execute in a subshell in which global_scope_var changes, but the original shell's copy is uncahnged. Curious. Which of the two behaviours is POSIXly correct? Both. As per XCU 2.12 Shell Execution Environment, each command in a multi-command pipeline may or may not be executed in a subshell environment. Behaviour different from our sh is most often encountered in the various versions of the real Korn shell (ksh88 and ksh93), which execute the last command in a pipeline in the current shell environment. If things like jobs | cat work, that can also be explained using this rule. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: find -delete broken, or just used improperly?
On Tue, May 21, 2013 at 11:06:39AM -0400, John Baldwin wrote: On Monday, May 20, 2013 5:47:31 pm Jilles Tjoelker wrote: The below patch allows deleting the pathname given to find itself: Index: usr.bin/find/function.c === --- usr.bin/find/function.c (revision 250661) +++ usr.bin/find/function.c (working copy) @@ -442,7 +442,8 @@ errx(1, -delete: forbidden when symlinks are followed); /* Potentially unsafe - do not accept relative paths whatsoever */ - if (strchr(entry-fts_accpath, '/') != NULL) + if (entry-fts_level FTS_ROOTLEVEL + strchr(entry-fts_accpath, '/') != NULL) errx(1, -delete: %s: relative path potentially not safe, entry-fts_accpath); I'm curious, how would you instruct a patched find to avoid deleteing the /tmp/foo directory (e.g. if you wanted this to be a job that pruned empty dirs from /tmp/foo but never pruned the directory itself). Would -mindepth 1 do it? (Just asking. I have also found this message annoying but most of the jobs I have seen it on probably don't want to delete the root path, just descendants.) -mindepth 1 works, as does cd /tmp/foo find . -... (-delete silently ignores . and ..). -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: find -delete broken, or just used improperly?
On Mon, May 20, 2013 at 03:23:16PM -0400, Kurt Lidl wrote: OK, maybe I'm missing something obvious, but... find(1) says: -delete Delete found files and/or directories. Always returns true. This executes from the current working directory as find recurses down the tree. It will not attempt to delete a filename with a ``/'' character in its pathname relative to ``.'' for security reasons. Depth-first traversal processing is implied by this option. Following symlinks is incompatible with this option. However, it fails even when the path is absolute: bhyve9# mkdir /tmp/foo bhyve9# find /tmp/foo -empty -delete find: -delete: /tmp/foo: relative path potentially not safe Shouldn't this work? The relative path refers to a pathname that contains a slash other than at the beginning or end and may therefore refer to somewhere else if a directory is concurrently replaced by a symlink. When -L is not specified and . can be opened, the fts(3) code underlying find(1) is careful to avoid following symlinks or being dropped in different locations by moving the directory fts is currently traversing. If a problematic concurrent modification is detected, fts will not enter the directory or abort. Files found in the search are returned via the current working directory and a pathname not containing a slash. For paranoia, find(1) verifies this when -delete is used. However, it is too paranoid about the root of the traversal. It is already assumed that the initial pathname does not refer to directories or symlinks that might be replaced by untrusted users; otherwise, the whole traversal would be unsafe. Therefore, it is not necessary to do the check for fts_level == FTS_ROOTLEVEL. The below patch allows deleting the pathname given to find itself: Index: usr.bin/find/function.c === --- usr.bin/find/function.c (revision 250661) +++ usr.bin/find/function.c (working copy) @@ -442,7 +442,8 @@ errx(1, -delete: forbidden when symlinks are followed); /* Potentially unsafe - do not accept relative paths whatsoever */ - if (strchr(entry-fts_accpath, '/') != NULL) + if (entry-fts_level FTS_ROOTLEVEL + strchr(entry-fts_accpath, '/') != NULL) errx(1, -delete: %s: relative path potentially not safe, entry-fts_accpath); -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Managing userland data pointers in kqueue/kevent
On Wed, May 15, 2013 at 01:34:58PM +0100, Paul LeoNerd Evans wrote: On Wed, 15 May 2013 13:29:59 +0100 Paul LeoNerd Evans leon...@leonerd.org.uk wrote: Is that not the exact thing I suggested? The extension to create register a kevent to catch these events is that you put the EV_DROPWATCH bit flag in the event at the time you register it. The returned event [that] could have all the appropriate informaiton for the event being dropped is that you receive an event with EV_DROPPED set on it. It being a real event includes of course the udata pointer, so you can handle it. In fact, to requote the original PR I wrote[1] on the subject: --- I propose the addition of a new flag applicable to any kevent watch structure, documented thusly: The flags field can contain the following values: .. EV_DROPWATCH Requests that the kernel will send an EV_DROPPED event on this watch when it has finished watching it for any reason, including EV_DELETE, expiry because of EV_ONESHOT, or because the filehandle was closed by close(2). EV_DROPPED This flag is returned by the kernel if it is now about to drop the watch. After this flag has been received, no further events will occur on this watch. This flag then makes it trivial to build a generic wrapper for kqueue that can always manage its memory correctly. a) at EV_ADD time, simply set flags |= EV_DROPWATCH b) after an event has been processed that included the EV_DROPPED flag, free() the pointer given in the udata field. An important detail is missing: how do you avoid using up all kernel memory on knotes if someone keeps adding new file descriptors with EV_ADD | EV_DROPWATCH and closing the file descriptors again without ever draining the kqueue? This problem did not use to exist for file descriptor events before: the number of such knotes was limited to the number of open file descriptors. However, it does already exist for most of the other event types. For example, pwait -v will return the exit status even if it was suspended (^Z) while the process terminated and the parent reaped the zombie. For EVFILT_TIMER, the worst effect is a denial of service of EVFILT_TIMER on all other processes in the system. EVFILT_USER does not appear to check anything and appears to allow arbitrary kernel memory consumption. The EVFILT_TIMER needs to keep its global limit and EVFILT_USER needs something similar. For the rest, call an event that is no longer associated to a kernel object (e.g. EVFILT_READ whose file descriptor is closed, EVFILT_PROC whose process has terminated and been reaped by the parent or EVFILT_AIO whose I/O request is completed) unbound. The number of events that are not unbound is limited by existing limits on the other kernel objects. A possible fix is to reject (such as with [ENOMEM]) adding new events when there are too many unbound events in the queue already. The application should then allow kevent() to return pending events first before it adds new ones. If the kernel returns unbound events in preference to other events, a kevent() call with nevents = 2 * nchanges cannot result in a net increase in the number of current and potential unbound events, since it allows the kernel to return (and forget) as many unbound events as it may add (nchanges entries are required for EV_ERROR leaving nchanges for returning other events). It is not required that these two flags have distinct values; since one is userland-kernel and the other kernel-userland, they could for neatness reuse the same bit field. I think it would be consistent with other EV_* to use the same name and value for both. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: SystemV IPC. Segment info
On Tue, Apr 30, 2013 at 07:49:40AM +0400, Vagner wrote: Tell me please, may I send this to PR, or this changes is not valid? A few weeks ago, I ran into problem, which related to SystemV IPC. More than 20 processes attached to a segment shared queue. Process-initiator for create segment was killed, as process which was accessed to segment last. Segment didn't free memory, but tagged it as SHMSEG_REMOVED as the result. This is a reason of memory overflow (memory assotiated as shm). Moreover, processes, which was attached to this segment did't get a new data. I have one resolve. I need to restarted all process, which still attached to segment. But this reason have a problem. We haven't list of this processes at system. Moreover, struct shmid_ds, which described segment, haven't this info too. This patch is a resolve of problem. It: - added a linked list of structures shmid_pi in struct shmid_ds. PID (and last access time) recorded to this struct consistently. Memory allocates with ident 'shminfo' for this list of struct shmid_pi. - added syscall shminf for get all elements from list shmid_ds. - added option [-P] in ipcs(1) for system call shminf. I think it is strange to maintain pids in the VM system. This makes the VM system more complex for little reason (because the information is only needed for monitoring, not normal operation). Perhaps it is possible to do what you want using procstat -v or a slight extension to it. Alternatively, you may find a different way to get rid of your stale worker processes. For example, before invoking shmctl(IPC_RMID), set a flag inside the memory segment that all workers should exit and activate whatever mechanism you use for telling the workers that they need to do something. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Some improvements to rm(1)
On Thu, Apr 25, 2013 at 10:56:10PM -0400, Eitan Adler wrote: On 25 April 2013 22:50, Brooks Davis bro...@freebsd.org wrote: On Thu, Apr 25, 2013 at 10:16:32PM -0400, Eitan Adler wrote: Anyone have thoughts on the following? commit 82c78ba923d8ce4a1bfbb309658c49021c8bb384 Author: Eitan Adler li...@eitanadler.com Date: Thu Apr 25 22:14:49 2013 -0400 Take some improvements from DragonFlyBSD: - add const where appropriate - add static where appropriate - fix a whitespace issues The no-op changes look more correct to me. I think the -x option seems a bit odd. What is the use case? At a first thought, it seems to raise more questions than it resolves. It goes along with cp -x, find -x, and others. Quick example #1: You have /usr/ports /usr/ports/distfiles as different mount points it lets you wipe /usr/ports without wiping your distfile cache. Quick example #2: You have /usr/src/ null mounted in every user's /home/ and you want to wipe one home directory. Hmm, isn't this already possible using find -x DIR -delete ? There will be an error message 'Device busy' about attempting to delete the mount point but this does not even affect the exit status and all files not under the mount point are removed. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Some improvements to rm(1)
On Fri, Apr 26, 2013 at 08:33:54AM -0500, Joshua Isom wrote: On 4/26/2013 7:23 AM, Eitan Adler wrote: Yes, rm's functionality can be fully replicated by find. As well as anything using -R. Emulating other -R things using find becomes quite slow when you don't want to impose {PATH_MAX} limits or open up symlink-based race windows because the only safe option is -execdir UTILITY {} \;. Any find command based on -exec, -print or -print0 passes pathnames which are subject to {PATH_MAX} limits and directories concurrently replaced with symlinks. The construct -execdir ... {} + is unusably broken in older FreeBSD versions and gives no advantage compared to -execdir ... {} \; in recent -CURRENT. With -L, this is not a new problem because symlinks are followed anyway and the underlying code (fts(3)) always imposes the {PATH_MAX} limit in that case. The -delete primary is safe like -execdir. I'm not entirely sure about this because the rm(1) patch is simple and the new syntax is fairly clear. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [patch] pipe2
On Mon, Apr 22, 2013 at 08:31:18PM +0300, Konstantin Belousov wrote: On Sun, Apr 21, 2013 at 11:02:43PM +0200, Jilles Tjoelker wrote: On Sat, Apr 20, 2013 at 12:48:39AM +0200, Jilles Tjoelker wrote: I'm also working on pipe2() (using linuxulator work) and dup3() (patch from Jukka A. Ukkonen). This is an implementation of pipe2. As with the accept4 patch, make sysent needs to be run in sys/kern and sys/compat/freebsd32. As a bonus, new architectures might implement pipe(p) as pipe2(p, 0) avoiding the need for assembler (but behaviour is different if the pointer is invalid). I do not see anything wrong with the patch. That said, I prefer the pipe(2) approach of delegating the access to the usermode memory to usermode, which avoids the need of the calls to kern_close(). Delegating the access gives better semantics (a signal instead of an [EFAULT] error) but I don't like adding architecture-specific assembler wrappers like pipe.S. Also, different calling conventions than expected from the C level prototype may make things harder to understand. Perhaps the wrapper can be implemented in C with the below declarations but it is quite likely that this is not fully portable. struct pipereturn { register_t a, b; }; struct pipereturn __sys_pipe(void); -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
[patch] pipe2
On Sat, Apr 20, 2013 at 12:48:39AM +0200, Jilles Tjoelker wrote: I'm also working on pipe2() (using linuxulator work) and dup3() (patch from Jukka A. Ukkonen). This is an implementation of pipe2. As with the accept4 patch, make sysent needs to be run in sys/kern and sys/compat/freebsd32. As a bonus, new architectures might implement pipe(p) as pipe2(p, 0) avoiding the need for assembler (but behaviour is different if the pointer is invalid). I will do a __FreeBSD_version bump once these are in. commit 256ab750c5e70db85d87c884994eb13038b182a2 Author: Jilles Tjoelker jil...@stack.nl Date: Sat Mar 30 23:32:34 2013 +0100 Add pipe2() system call. diff --git a/include/unistd.h b/include/unistd.h index dabf178..9df0777 100644 --- a/include/unistd.h +++ b/include/unistd.h @@ -533,6 +533,7 @@ char*mktemp(char *); #endif int nfssvc(int, void *); int nlm_syscall(int, int, int, char **); +int pipe2(int *, int); int profil(char *, size_t, vm_offset_t, int); int rcmd(char **, int, const char *, const char *, const char *, int *); int rcmd_af(char **, int, const char *, diff --git a/lib/libc/sys/Makefile.inc b/lib/libc/sys/Makefile.inc index 105f469..8e918bf 100644 --- a/lib/libc/sys/Makefile.inc +++ b/lib/libc/sys/Makefile.inc @@ -352,6 +352,7 @@ MLINKS+=pathconf.2 lpathconf.2 MLINKS+=pdfork.2 pdgetpid.2\ pdfork.2 pdkill.2 \ pdfork.2 pdwait4.2 +MLINKS+=pipe.2 pipe2.2 MLINKS+=read.2 pread.2 \ read.2 preadv.2 \ read.2 readv.2 diff --git a/lib/libc/sys/Symbol.map b/lib/libc/sys/Symbol.map index 149fa41..24f4621 100644 --- a/lib/libc/sys/Symbol.map +++ b/lib/libc/sys/Symbol.map @@ -393,6 +393,7 @@ FBSD_1.3 { ffclock_getcounter; ffclock_getestimate; ffclock_setestimate; + pipe2; posix_fadvise; wait6; }; diff --git a/lib/libc/sys/pipe.2 b/lib/libc/sys/pipe.2 index 92d137f..bb3db51 100644 --- a/lib/libc/sys/pipe.2 +++ b/lib/libc/sys/pipe.2 @@ -28,7 +28,7 @@ .\ @(#)pipe.2 8.1 (Berkeley) 6/4/93 .\ $FreeBSD$ .\ -.Dd January 30, 2006 +.Dd March 31, 2013 .Dt PIPE 2 .Os .Sh NAME @@ -40,6 +40,8 @@ .In unistd.h .Ft int .Fn pipe int fildes[2] +.Ft int +.Fn pipe2 int fildes[2] int flags .Sh DESCRIPTION The .Fn pipe @@ -50,6 +52,29 @@ which is an object allowing bidirectional data flow, and allocates a pair of file descriptors. .Pp +The +.Fn pipe2 +system call allows control over the attributes of the file descriptors +via the +.Fa flags +argument. +Values for +.Fa flags +are constructed by a bitwise-inclusive OR of flags from the following +list, defined in +.In fcntl.h : +.Bl -tag -width .Dv O_NONBLOCK +.It Dv O_CLOEXEC +Set the close-on-exec flag for the new file descriptors. +.It Dv O_NONBLOCK +Set the non-blocking flag for the ends of the pipe. +.El +.Pp +If the +.Fa flags +argument is 0, the behavior is identical to a call to +.Fn pipe . +.Pp By convention, the first descriptor is normally used as the .Em read end of the pipe, @@ -88,7 +113,9 @@ pipe in one direction. .Sh ERRORS The .Fn pipe -system call will fail if: +and +.Fn pipe2 +system calls will fail if: .Bl -tag -width Er .It Bq Er EMFILE Too many descriptors are active. @@ -97,6 +124,16 @@ The system file table is full. .It Bq Er ENOMEM Not enough kernel memory to establish a pipe. .El +.Pp +The +.Fn pipe2 +system call will also fail if: +.Bl -tag -width Er +.It Bq Er EINVAL +The +.Fa flags +argument is invalid. +.El .Sh SEE ALSO .Xr sh 1 , .Xr fork 2 , @@ -111,3 +148,8 @@ function appeared in .Pp Bidirectional pipes were first used on .At V.4 . +.Pp +The +.Fn pipe2 +function appeared in +.Fx 10.0 . diff --git a/sys/compat/freebsd32/syscalls.master b/sys/compat/freebsd32/syscalls.master index 2cbdf31..be245be 100644 --- a/sys/compat/freebsd32/syscalls.master +++ b/sys/compat/freebsd32/syscalls.master @@ -1026,3 +1026,4 @@ struct sockaddr * __restrict name, \ __socklen_t * __restrict anamelen, \ int flags); } +542AUE_PIPENOPROTO { int pipe2(int *fildes, int flags); } diff --git a/sys/kern/capabilities.conf b/sys/kern/capabilities.conf index 0b64503..d2fa51c 100644 --- a/sys/kern/capabilities.conf +++ b/sys/kern/capabilities.conf @@ -490,6 +490,7 @@ pdkill ## Allow pipe(2). ## pipe +pipe2 ## ## Allow poll(2), which will be scoped by capability rights. diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c index 90c3022..493fee5e 100644 --- a/sys/kern/sys_pipe.c +++ b/sys/kern/sys_pipe.c @@ -477,6 +477,24 @@ sys_pipe(struct thread *td, struct pipe_args *uap) return (0); } +int +sys_pipe2(struct thread *td, struct pipe2_args *uap) +{ + int error, fildes[2]; + + if (uap-flags ~(O_CLOEXEC | O_NONBLOCK)) + return (EINVAL); + error = kern_pipe2(td, fildes, uap-flags); + if (error) + return (error); + error
[patch] accept4
The accept4() function, compared to accept(), allows setting the new file descriptor atomically close-on-exec and explicitly controlling the non-blocking status on the new socket. (Note that the latter point means that accept() is not equivalent to any form of accept4().) The linuxulator's accept4 implementation leaves a race window where the new file descriptor is not close-on-exec because it calls sys_accept(). This implementation leaves no such race window (by using falloc() flags). The linuxulator could be fixed and simplified by using the new code. The comms/openobex port now compiles again without the patch-lib_cloexec.h patch. More information about API extensions to set new file descriptors atomically close-on-exec is at https://wiki.freebsd.org/AtomicCloseOnExec I'm also working on pipe2() (using linuxulator work) and dup3() (patch from Jukka A. Ukkonen). commit 2e410d9260d12328b689b0938e09d09649b0460a Author: Jilles Tjoelker jil...@stack.nl Date: Sat Mar 30 21:44:14 2013 +0100 Add accept4() system call. diff --git a/lib/libc/sys/Makefile.inc b/lib/libc/sys/Makefile.inc index fef0f3c..105f469 100644 --- a/lib/libc/sys/Makefile.inc +++ b/lib/libc/sys/Makefile.inc @@ -270,6 +270,7 @@ MAN+= sctp_generic_recvmsg.2 \ wait.2 \ write.2 +MLINKS+=accept.2 accept4.2 MLINKS+=access.2 eaccess.2 \ access.2 faccessat.2 MLINKS+=brk.2 sbrk.2 diff --git a/lib/libc/sys/Symbol.map b/lib/libc/sys/Symbol.map index 6faa0af..149fa41 100644 --- a/lib/libc/sys/Symbol.map +++ b/lib/libc/sys/Symbol.map @@ -378,6 +378,7 @@ FBSD_1.2 { }; FBSD_1.3 { + accept4; bindat; cap_fcntls_get; cap_fcntls_limit; @@ -461,6 +462,8 @@ FBSDprivate_1.0 { __sys_abort2; _accept; __sys_accept; + _accept4; + __sys_accept4; _access; __sys_access; _acct; diff --git a/lib/libc/sys/accept.2 b/lib/libc/sys/accept.2 index 978b948..e8bccaf 100644 --- a/lib/libc/sys/accept.2 +++ b/lib/libc/sys/accept.2 @@ -41,6 +41,8 @@ .In sys/socket.h .Ft int .Fn accept int s struct sockaddr * restrict addr socklen_t * restrict addrlen +.Ft int +.Fn accept4 int s struct sockaddr * restrict addr socklen_t * restrict addrlen int flags .Sh DESCRIPTION The argument .Fa s @@ -66,6 +68,26 @@ and signals from the original socket .Fa s . .Pp +The +.Fn accept4 +system call is similar, +but the +.Dv O_NONBLOCK +property of the new socket is instead determined by the +.Dv SOCK_NONBLOCK +flag in the +.Fa flags +argument, +the +.Dv O_ASYNC +property is cleared, +the signal destination is cleared +and the close-on-exec flag on the new file descriptor can be set via the +.Dv SOCK_CLOEXEC +flag in the +.Fa flags +argument. +.Pp If no pending connections are present on the queue, and the original socket is not marked as non-blocking, @@ -141,13 +163,15 @@ properties and the signal destination being inherited, but should set them explicitly using .Xr fcntl 2 . .Sh RETURN VALUES -The call returns \-1 on error. -If it succeeds, it returns a non-negative +These calls return \-1 on error. +If they succeed, they return a non-negative integer that is a descriptor for the accepted socket. .Sh ERRORS The .Fn accept -system call will fail if: +and +.Fn accept4 +system calls will fail if: .Bl -tag -width Er .It Bq Er EBADF The descriptor is invalid. @@ -180,6 +204,16 @@ are present to be accepted. A connection arrived, but it was closed while waiting on the listen queue. .El +.Pp +The +.Fn accept4 +system call will also fail if: +.Bl -tag -width Er +.It Bq Er EINVAL +The +.Fa flags +argument is invalid. +.El .Sh SEE ALSO .Xr bind 2 , .Xr connect 2 , @@ -194,3 +228,8 @@ The .Fn accept system call appeared in .Bx 4.2 . +.Pp +The +.Fn accept4 +system call appeared in +.Fx 10.0 . diff --git a/lib/libthr/pthread.map b/lib/libthr/pthread.map index 355edea..bbbd930e 100644 --- a/lib/libthr/pthread.map +++ b/lib/libthr/pthread.map @@ -181,6 +181,7 @@ FBSDprivate_1.0 { ___wait; ___waitpid; __accept; + __accept4; __aio_suspend; __close; __connect; @@ -408,3 +409,7 @@ FBSD_1.2 { setcontext; swapcontext; }; + +FBSD_1.3 { + accept4; +}; diff --git a/lib/libthr/thread/thr_syscalls.c b/lib/libthr/thread/thr_syscalls.c index 2327d74..7a08302 100644 --- a/lib/libthr/thread/thr_syscalls.c +++ b/lib/libthr/thread/thr_syscalls.c @@ -101,6 +101,7 @@ extern pid_t__waitpid(pid_t, int *, int); extern int __sys_aio_suspend(const struct aiocb * const[], int, const struct timespec *); extern int __sys_accept(int, struct sockaddr *, socklen_t *); +extern int __sys_accept4(int, struct sockaddr *, socklen_t *, int); extern int __sys_connect(int, const struct sockaddr *, socklen_t); extern int __sys_fsync(int); extern int __sys_msync(void *, size_t, int); @@ -129,6 +130,7 @@ int ___usleep(useconds_t useconds); pid_t
Re: GSOC: Qt front-end for freebsd-update
On Sun, Apr 14, 2013 at 02:11:44AM -0400, Justin Edward Muniz wrote: I am excited for this year's Google Summer of Code, and I have a project in mind that I am working to propose. I am a CS major and have experience with Qt, C++ and shell scripting. I have been developing on FreeBSD for several years, and I am looking to tackle developing a new Qt front-end for the freebsd-update command. I am thinking a rather straight-forward user interface with more advanced options available (such as selecting a specific server). I am looking for constructive feedback, and also a developer interested in mentoring me this Summer. If you want to work on freebsd-update, I think it will be more useful to improve freebsd-update itself. Some issues are: * It is not particularly robust in the face of errors (such as a full disk). From reading the documentation, you might get the impression that it is better than it actually is. The state for the rollback command is only set up after installation of the update so that command is not useful for backing out a partially installed update. * The upgrade feature takes large amounts of time and network bandwidth. * Less required user interaction would be nice. Think of upgrading many machines. * freebsd-update and pkgng are two upgrade systems. Perhaps freebsd-upgrade should be scrapped altogether in favour of a pkgng-based solution. If you want to write GUIs, perhaps a pkgng GUI is more interesting. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: GSOC 2013 project Kernel Size Reduction for Embedded System
On Mon, Apr 08, 2013 at 08:28:04PM +, Amit Rawat wrote: GSOC posted the list of selected organization for GSOC 2013 and I am highly happy that FreeBSD is among the selected organization. I am a third year student interested to work in the field of embedded system. I applied last year and the title of my project was Kernel Size Reduction for Embedded System. The link to my last year proposal is https://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/amitrawat10/1#c8001 But due to some flaws it doesn't get selected. I am looking to improve my proposal for this year and apply again. I explain some portion of my project pictorially on my blog https://amit10rawat.wordpress.com/2013/02/26/kernel-size-reduction-for-embedded-system/ I am looking for suggestion and new ideas by which I can reduce the size of kernel. It looks like the overlay idea could be implemented more simply by taking advantage of the VM system: make part of the kernel code pageable. Memory formerly occupied by rarely used kernel code can then be used by userland applications. You will need some sort of backing store where the kernel code can be read after booting; this is not normally available. However, almost no kernel code is safe in a situation where an instruction fetch may fault. Reading or writing the secondary storage can easily cause a deadlock. It causes the thread to sleep, which is not allowed while holding a mutex. It would help if you could wire down pieces which will need to be used in the near future from a place where a fault is safe, but this may also be very slow even if the code is already in memory. Some other ideas for kernel size reduction: * Find pieces of code that are required but seem big for what they do for you, and try to make them smaller. The proposal should list concrete parts. * Find variables and functions that are required only during kernel initialization, place them in a special section and add this section to the free memory pool after kernel initialization. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Fwd: Where does FreeBSD tr -C differ from tr -c?
On Sun, Apr 07, 2013 at 09:12:57PM +0200, Cedric Blancher wrote: Forwarding to freebsd-hackers@/freebsd-standa...@freebsd.org The question remain open and I need help. tr -C is implemented by FreeBSD tr -C but I can't find examples (or a testcase) where tr -c and tr -C differ. Reading the rationale of POSIX, here is an example of a difference: % printf 'a\200'|LC_ALL=en_US.US-ASCII tr -cd '\000-\177'|hd 61|a| 0001 % printf 'a\200'|LC_ALL=en_US.US-ASCII tr -Cd '\000-\177'|hd 61 80 |a.| 0002 Because the bytes 128..255 are not characters in us-ascii, they cannot be removed with -Cd, only with -cd. Here is another difference (using LC_CTYPE=en_US.UTF-8, rest C): % echo $'\U0001a000'|tr -cd '\U0001a000'|hd % echo $'\U0001a000'|tr -Cd '\U0001a000'|hd f0 9a 80 80 || 0004 The cause is that iswrune(3) returns false for the unassigned code point U+0001A000. This may well contain bugs because Unicode adds new characters from time to time and our tables seem to be updated very rarely. POSIX also says things about collation order. You may not have detected this because FreeBSD does not implement LC_COLLATE for multibyte locales yet. PS: Who wrote tr -C and how can I contact the author? You can read the Subversion logs but people may no longer be around. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: close(2) while accept(2) is blocked
On Thu, Apr 04, 2013 at 08:43:17PM +0300, Andriy Gapon wrote: on 01/04/2013 18:22 John Baldwin said the following: I think you need to split the 'struct file' reference count into two different counts similar to the how we have vref/vrele vs vhold/vdrop for vnodes. The fget for accept and probably most other system calls should probably be equivalent to vhold, whereas things like open/dup (and storing an fd in a cmsg) should be more like vref. close() should then be a vrele(). This model makes perfect sense. Unfortunately, ENOTIME to work on this. This looks like it can work but I don't know whether it's worth the trouble. Meanwhile I am using the following patch specific to local domain sockets, accept(2) and shutdown(2). Turns out that the problematic application does both shutdown(RDWR) and close(2), but that doesn't help on FreeBSD. BTW, this is the application: http://thread.gmane.org/gmane.os.freebsd.devel.office/1754 The patch does help. Author: Andriy Gapon a...@icyb.net.ua Date: Thu Mar 28 20:08:13 2013 +0200 [test!] uipc_shutdown: use soisdisconnected instead of socantsendmore So that in addition to disabling sends we also wake up threads blocked in accept (on so_timeo in general). diff --git a/sys/kern/uipc_usrreq.c b/sys/kern/uipc_usrreq.c index 9b60eab..e93d46c 100644 --- a/sys/kern/uipc_usrreq.c +++ b/sys/kern/uipc_usrreq.c @@ -1074,7 +1074,7 @@ uipc_shutdown(struct socket *so) UNP_LINK_WLOCK(); UNP_PCB_LOCK(unp); - socantsendmore(so); + soisdisconnected(so); unp_shutdown(unp); UNP_PCB_UNLOCK(unp); UNP_LINK_WUNLOCK(); I think this patch makes shutdown(SHUT_WR) on unix domain sockets act like shutdown(SHUT_RDWR), i.e. receives are incorrectly denied. The below patch also makes shutdown(SHUT_RDWR) abort a blocking accept on a unix domain socket, but it should work for all domains: Index: sys/kern/uipc_socket.c === --- sys/kern/uipc_socket.c (revision 248873) +++ sys/kern/uipc_socket.c (working copy) @@ -2428,9 +2428,11 @@ soshutdown(struct socket *so, int how) sorflush(so); if (how != SHUT_RD) { error = (*pr-pr_usrreqs-pru_shutdown)(so); + wakeup(so-so_timeo); CURVNET_RESTORE(); return (error); } + wakeup(so-so_timeo); CURVNET_RESTORE(); return (0); } A blocking accept (and some other operations) is waiting on so-so_timeo. Once it wakes up, it will detect the SBS_CANTRCVMORE bit. A spurious wakeup on so-so_timeo appears harmless (sleep retried) except when lingering on close (SO_LINGER) so this should be fairly safe. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: close(2) while accept(2) is blocked
On Thu, Mar 28, 2013 at 06:54:31PM +0200, Andriy Gapon wrote: So, this started as a simple question, but the answer was quite unexpected to me. Let's say we have an opened and listen-ed socket and let's assume that we know that one thread is blocked in accept(2) and another thread is calling close(2). What is going to happen? Turns out that practically nothing. For kernel the close call would be almost a nop. My understanding is this: - when socket is created, its reference count is 1 - when accept(2) is called, fget in kernel increments the reference count (kept in an associated struct file) - when close(2) is called, the reference count is decremented The reference count is still greater than zero, so fdrop does not call fo_close. That means that in the case of a socket soclose is not called. I am sure that the reference counting in this case is absolutely correct with respect to managing kernel side structures. I agree this is expected and correct from the kernel point of view. But I am not that it is correct with respect to hiding the explicit close(2) call from other threads that may be waiting on the socket. In other words, I am not sure if fo_close is supposed to signify that there are no uses of a file, or that userland close-d the file. Or perhaps these should be two different methods. It would be possible to keep track of the file descriptor number but I think it is not worth the large amount of extra code. Keeping track of file descriptor number would be necessary to interrupt waits after a close or dup2 on the file descriptor that was passed to the blocking call, even if the object remains open on a different file descriptor number or in a different process. Also, most people would use the new functionality incorrectly anyway. A close() on a file descriptor another thread is using is risky since it is in most cases impossible to prove that the other thread is in fact blocked on the file descriptor and not preempted right before making the system call. In the latter case, the other thread might accept a connection from a different socket created later. A dup2() of /dev/null onto the file descriptor would be safer. Additional note is that shutdown(2) doesn't wake up the thread in accept(2) either. At least that's true for unix domain sockets. Not sure if this is a bug. I think it is a bug. It works properly for IPv4 TCP sockets. The resulting error is [ECONNABORTED] for a blocking socket which likely leads to infinite looping if the thread does not know about the shutdown(2) (because that error normally means the accept should be retried later). For a non-blocking socket the error is [EWOULDBLOCK] which also leads to infinite looping and is certainly wrong because select/poll do report the socket as readable. Both of these are in kern_accept() in sys/kern/uipc_syscalls.c. POSIX does not say which error code we should return here but these two are almost certainly wrong (it is usable for waking up threads stuck in accept() if those threads check a variable after every accept() failure and do not rely on the exact value of errno). Linux returns [EINVAL] for both blocking and non-blocking sockets, probably from the POSIX error condition The socket is not accepting connections. In our man page that error condition is formulated listen(2) has not been called on the socket descriptor. which is clearly not the case. Also, I think a non-blocking accept() should immediately fail with the head-so_error if it is set, rather than returning [EWOULDBLOCK] until another connection arrives. Likewise, filt_solisten() in sys/kern/uipc_socket.c only returns true if there is a connection, not if there was an error or shutdown() has been called. On the other hand, sopoll_generic() looks correct. Error reporting on non-blocking accept() might usefully be postponed until there is a connection or the socket has been shut down, to reduce context switches. But the summary seems to be is that currently it is not possible to break a thread out of accept(2) (at least without resorting to signals). Pthread cancellation works better than raw signals for this use case. In a proper implementation such as in FreeBSD 9.0 or newer and used properly, it allows avoiding the resource leak that may happen when calling longjmp() or pthread_exit() in a signal handler just after accept() has created a new socket. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [patch] SOCK_CLOEXEC, SOCK_NONBLOCK and MSG_CMSG_CLOEXEC
On Mon, Mar 18, 2013 at 10:59:57PM +0200, Konstantin Belousov wrote: On Sun, Mar 17, 2013 at 10:23:53PM +0100, Jilles Tjoelker wrote: Here are some more modifications to allow creating file descriptors with close-on-exec set. Like in linux/glibc, SOCK_CLOEXEC and SOCK_NONBLOCK can be OR'ed in socket() and socketpair()'s type parameter, and MSG_CMSG_CLOEXEC to recvmsg() makes file descriptors (SCM_RIGHTS) atomically close-on-exec. The numerical values for SOCK_CLOEXEC and SOCK_NONBLOCK are as in NetBSD. MSG_CMSG_CLOEXEC is the first free bit for MSG_*. I do not pass the SOCK_* flags to MAC because this may cause incorrect failures and can be done later via fcntl() anyway. I expect audit to cope with the new flags. For MSG_CMSG_CLOEXEC, I had to change unp_externalize to take a flags argument. This looks fine to me. Thanks for the review. The only note I have, which is not directly related to your patch, is the recvmsg(2) behaviour when the undefined flag is passed. The syscall silently ignores the flags. I think this is quite wrong, and would cause interesting (security) implications if the program using the MSG_CMSG_CLOEXEC is run on older kernel which does not interpret the flag. This is indeed unfortunate and it also affects open(2). It may have originally been done for reasons of modularity; the sys_open() and kern_open() need not know about the valid flag bits this way and a device driver or filesystem can define new bits. The effect of an older kernel ignoring MSG_CMSG_CLOEXEC is probably not that bad, considering that running new userland on an old kernel is not supported. Most compatibility shims for MSG_CMSG_CLOEXEC and similar flags (such as the one in Wayland) simply contain the race condition, so you lose anyway with an old kernel. It would be possible to use closefrom() or an emulation of it or to lock fork() conditionally on the necessary CLOEXEC flags not being available but it appears that people do not go to that trouble. Might be, we should start returning EINVAL for unknown flag, despite SUSv4 not specifying the condition ? Per POSIX it is valid to do this but it might cause a compatibility problem if there are applications that pass garbage in flags. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
[patch] SOCK_CLOEXEC, SOCK_NONBLOCK and MSG_CMSG_CLOEXEC
(). */ - error = socreate(uap-domain, so, uap-type, uap-protocol, + error = socreate(uap-domain, so, type, uap-protocol, td-td_ucred, td); if (error) { fdclose(td-td_proc-p_fd, fp, fd, td); } else { - finit(fp, FREAD | FWRITE, DTYPE_SOCKET, so, socketops); + finit(fp, FREAD | FWRITE | fflag, DTYPE_SOCKET, so, socketops); + if ((fflag FNONBLOCK) != 0) + (void) fo_ioctl(fp, FIONBIO, fflag, td-td_ucred, td); td-td_retval[0] = fd; } fdrop(fp, td); @@ -648,9 +663,20 @@ struct filedesc *fdp = td-td_proc-p_fd; struct file *fp1, *fp2; struct socket *so1, *so2; - int fd, error; + int fd, error, oflag, fflag; AUDIT_ARG_SOCKET(domain, type, protocol); + + oflag = 0; + fflag = 0; + if ((type SOCK_CLOEXEC) != 0) { + type = ~SOCK_CLOEXEC; + oflag |= O_CLOEXEC; + } + if ((type SOCK_NONBLOCK) != 0) { + type = ~SOCK_NONBLOCK; + fflag |= FNONBLOCK; + } #ifdef MAC /* We might want to have a separate check for socket pairs. */ error = mac_socket_check_create(td-td_ucred, domain, type, @@ -665,12 +691,12 @@ if (error) goto free1; /* On success extra reference to `fp1' and 'fp2' is set by falloc. */ - error = falloc(td, fp1, fd, 0); + error = falloc(td, fp1, fd, oflag); if (error) goto free2; rsv[0] = fd; fp1-f_data = so1; /* so1 already has ref count */ - error = falloc(td, fp2, fd, 0); + error = falloc(td, fp2, fd, oflag); if (error) goto free3; fp2-f_data = so2; /* so2 already has ref count */ @@ -686,8 +712,14 @@ if (error) goto free4; } - finit(fp1, FREAD | FWRITE, DTYPE_SOCKET, fp1-f_data, socketops); - finit(fp2, FREAD | FWRITE, DTYPE_SOCKET, fp2-f_data, socketops); + finit(fp1, FREAD | FWRITE | fflag, DTYPE_SOCKET, fp1-f_data, + socketops); + finit(fp2, FREAD | FWRITE | fflag, DTYPE_SOCKET, fp2-f_data, + socketops); + if ((fflag FNONBLOCK) != 0) { + (void) fo_ioctl(fp1, FIONBIO, fflag, td-td_ucred, td); + (void) fo_ioctl(fp2, FIONBIO, fflag, td-td_ucred, td); + } fdrop(fp1, td); fdrop(fp2, td); return (0); -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [patch] statfs does not detect -t nullfs -o union as a union mount
On Thu, Feb 28, 2013 at 12:21:44AM +0200, Konstantin Belousov wrote: On Wed, Feb 27, 2013 at 10:31:42PM +0100, Jilles Tjoelker wrote: While testing recent changes to opendir(), I noticed that fstatfs() does not return the MNT_UNION flag for a -t nullfs -o union mount. As a result, opendir()/readdir() return files that exist in both top and bottom directories twice (at least . and ..). Other -o union mounts and -t unionfs mounts work correctly in this regard. The below patch passes through just the MNT_UNION flag of the nullfs mount itself. Perhaps more flags should be passed through. commit fce32a779af4eb977c9b96feb6e4f811d89f2881 Author: Jilles Tjoelker jil...@stack.nl Date: Sat Feb 23 22:22:39 2013 +0100 nullfs: Preserve the MNT_UNION flag of the nullfs mount itself. This is needed so that opendir() can properly detect a union mount like mount -t nullfs -o union dir1 dir2. diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c index 3724e0a..ff06f57 100644 --- a/sys/fs/nullfs/null_vfsops.c +++ b/sys/fs/nullfs/null_vfsops.c @@ -313,7 +313,7 @@ nullfs_statfs(mp, sbp) /* now copy across the interesting information and fake the rest */ sbp-f_type = mstat.f_type; - sbp-f_flags = mstat.f_flags; + sbp-f_flags = (sbp-f_flags MNT_UNION) | mstat.f_flags; sbp-f_bsize = mstat.f_bsize; sbp-f_iosize = mstat.f_iosize; sbp-f_blocks = mstat.f_blocks; Would it make sense to preserve more flags from the upper mount ? I see a use for MNT_NOEXEC as well, at least. Yes, preserving MNT_NOEXEC will make -t nullfs -o noexec work better, in particular rtld's check for libraries loaded via environment variables. In the same way MNT_RDONLY, MNT_NOSUID and MNT_NOSYMFOLLOW could be preserved. On the other hand, MNT_ROOTFS should probably not be passed through from the underlying filesystem. This would give sbp-f_flags = (sbp-f_flags (MNT_RDONLY | MNT_NOEXEC | MNT_NOSUID | MNT_UNION | MNT_NOSYMFOLLOW) | (mstat.f_flags ~MNT_ROOTFS); -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
[patch] statfs does not detect -t nullfs -o union as a union mount
While testing recent changes to opendir(), I noticed that fstatfs() does not return the MNT_UNION flag for a -t nullfs -o union mount. As a result, opendir()/readdir() return files that exist in both top and bottom directories twice (at least . and ..). Other -o union mounts and -t unionfs mounts work correctly in this regard. The below patch passes through just the MNT_UNION flag of the nullfs mount itself. Perhaps more flags should be passed through. commit fce32a779af4eb977c9b96feb6e4f811d89f2881 Author: Jilles Tjoelker jil...@stack.nl Date: Sat Feb 23 22:22:39 2013 +0100 nullfs: Preserve the MNT_UNION flag of the nullfs mount itself. This is needed so that opendir() can properly detect a union mount like mount -t nullfs -o union dir1 dir2. diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c index 3724e0a..ff06f57 100644 --- a/sys/fs/nullfs/null_vfsops.c +++ b/sys/fs/nullfs/null_vfsops.c @@ -313,7 +313,7 @@ nullfs_statfs(mp, sbp) /* now copy across the interesting information and fake the rest */ sbp-f_type = mstat.f_type; - sbp-f_flags = mstat.f_flags; + sbp-f_flags = (sbp-f_flags MNT_UNION) | mstat.f_flags; sbp-f_bsize = mstat.f_bsize; sbp-f_iosize = mstat.f_iosize; sbp-f_blocks = mstat.f_blocks; -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Request for review, time_pps_fetch() enhancement
On Wed, Feb 06, 2013 at 05:58:30PM +0200, Konstantin Belousov wrote: On Tue, Feb 05, 2013 at 09:41:38PM -0700, Ian Lepore wrote: I'd like feedback on the attached patch, which adds support to our time_pps_fetch() implementation for the blocking behaviors described in section 3.4.3 of RFC 2783. The existing implementation can only return the most recently captured data without blocking. These changes add the ability to block (forever or with timeout) until a new event occurs. Index: sys/kern/kern_tc.c === --- sys/kern/kern_tc.c (revision 246337) +++ sys/kern/kern_tc.c (working copy) @@ -1446,6 +1446,50 @@ * RFC 2783 PPS-API implementation. */ +static int +pps_fetch(struct pps_fetch_args *fapi, struct pps_state *pps) +{ [snip] + aseq = pps-ppsinfo.assert_sequence; + cseq = pps-ppsinfo.clear_sequence; + while (aseq == pps-ppsinfo.assert_sequence + cseq == pps-ppsinfo.clear_sequence) { Note that compilers are allowed to optimize these accesses even over the sequential point, which is the tsleep() call. Only accesses to volatile objects are forbidden to be rearranged. I suggest to add volatile casts to pps in the loop condition. The memory pointed to by pps is global (other code may have a pointer to it); therefore, the compiler must assume that the tsleep() call (which invokes code in a different compilation unit) may modify it. Because volatile does not make concurrent access by multiple threads defined either, adding it here only seems to slow down the code (potentially). + err = tsleep(pps, PCATCH, ppsfch, timo); + if (err == EWOULDBLOCK fapi-timeout.tv_sec == -1) { + continue; + } else if (err != 0) { + return (err); + } + } + } -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: use after free in grep?
On Thu, Dec 20, 2012 at 01:19:07PM +0100, Dimitry Andric wrote: On 2012-12-20 08:13, Eitan Adler wrote: in xrealloc_impl 338 new_ptr = realloc(ptr, new_size); 339 if (new_ptr != NULL) 340 { 341 hash_table_del(xmalloc_table, ptr); ^^^ isn't this a use-after-free of ptr? Yes, realloc does not guarantee the realloc'd space will be at the same address, so it may free ptr at its discretion. Even if you somehow know realloc() is not going to move the block, it is still wrong to use any pointer not derived from its return value to access the block. Comparing the old and the new pointers (normally or with memcmp()) does not help; it has an indeterminate result. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm Also, there is a memory leak if realloc() returns NULL. This is a very usual mistake when using realloc(). :-) No, this would be correct if a successful realloc() call did not make the old pointer indeterminate. The hash table remains unchanged if realloc() fails. Probably, the code should do the hash_table_del() before the realloc(), but I am not sure if hash_table_del() will already free ptr. Yes, and add it back if realloc() fails. A smarter internal interface to the hash table would avoid freeing and reallocating hash table entries here (which might fail). -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Fix overlinking in base aka import pkgconf
On Sun, Dec 16, 2012 at 04:03:40PM +0200, Konstantin Belousov wrote: On Sun, Dec 16, 2012 at 02:01:00PM +0100, Jeremie Le Hen wrote: I think this could be solved by an implicit linker script contained in .so and .a files, pointing to the real libraries. We have already the SHLIB_LDSCRIPT variable to accomplish this for .so files. It may be possible to do the same for .a files, though this would require renaming the real archives to something like .a.0 (here I'm just taking a similar scheme to the one used for DSO). As a matter of fact, libprocstat.a would contain: GROUP ( /usr/lib/libprocstat.a.0 /usr/lib/libkvm.a /usr/lib/libutil.a ) Note that libkvm.a and libutil.a could be linker scripts as well. Kib, do you see any problem to this proposition? Wouldn't you need to completely rewrite the handling of the .a files in the share/mk ? I somewhat dislike the mere thought that .a is not an archive any longer. Does it make sense from the overhead and complexity POV, for such small goal ? For base this may not make much sense since there is not much need to run an old base binary against newer base libraries. Apart from that the only advantage is some form of philosophical correctness. For ports this may be a good way to keep static linking working after fixing overlinking caused by libtool's dependency_libs in .la files. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Incorrect use of posix_memalign() (was: Re: svn commit: r243405 - in stable/9: include lib/libc/stdlib)
On Thu, Nov 22, 2012 at 03:19:53PM +, Ed Schouten wrote: Author: ed Date: Thu Nov 22 15:19:53 2012 New Revision: 243405 URL: http://svnweb.freebsd.org/changeset/base/243405 Log: MFC r229848: Add aligned_alloc(3). The C11 folks reinvented the wheel by introducing an aligned version of malloc(3) called aligned_alloc(3), instead of posix_memalign(3). Instead of returning the allocation by reference, it returns the address, just like malloc(3). I'm MFCing this now, as it seems aligned_alloc(3) is needed to make the new version of libc++ work, which was merged back to FreeBSD 9 in r243376. The C11 committee knew about posix_memalign() and had several reasons for creating a new function; see for example http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1397.htm . In particular, posix_memalign() is a little annoying to use correctly, often requiring a temporary variable of type void *. It is tempting to do something like error = posix_memalign((void **)some_ptr, aln, sz); and some FreeBSD code does this, but it violates strict-aliasing. A further mostly theoretical objection is that assumes that the representation of some_ptr and void * are compatible which C does not guarantee. The problem can be fixed by adding the temporary pointer variable like void *tmp_ptr; error = posix_memalign(tmp_ptr, aln, sz); some_ptr = tmp_ptr; or by using aligned_alloc() instead of posix_memalign() some_ptr = aligned_alloc(aln, sz); with error checking against some_ptr instead of error. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
[patch] rtld: fix fd leak with parallel dlopen and fork/exec
); if (error != 0) errc(1, error, posix_spawnp); do wpid = waitpid(pid, status, 0); while (wpid == -1 errno == EINTR); if (wpid == -1) err(1, waitpid); if (status != 0) errx(1, sh -c failed); } } int main(int argc, char *argv[]) { pthread_t td; int error; if (argc != 2) { fprintf(stderr, Usage: %s dso\n, argv[0]); return 1; } error = pthread_create(td, NULL, dlopen_thread, argv[1]); if (error != 0) errc(1, error, pthread_create); filestat_loop(); return 0; } -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: -lpthread vs -pthread: does -D_REENTRANT matter?
On Sun, Oct 14, 2012 at 04:55:07PM -0400, Eitan Adler wrote: On 14 October 2012 10:42, Jilles Tjoelker jil...@stack.nl wrote: Because C99 does not specify threading, it allows these transformations. In C11, they are forbidden. Passing -pthread disables them as well. Is the man page wrong or do I misunderstand? This option sets flags for both the preprocessor and linker. It does not affect the thread safety of object code produced by the compiler or that of libraries supplied with it. OK, that would explain why I could not find such things. I seem to recall GCC must sometimes be instructed not to do thread-unsafe transformations but I cannot find how to do so. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: -lpthread vs -pthread: does -D_REENTRANT matter?
On Mon, Oct 08, 2012 at 12:17:08PM -0400, Eitan Adler wrote: The only difference between -lpthread and -pthread that I could see is that the latter also sets -D_REENTRANT. However, I can't find any uses of _REENTRANT anywhere outside of a few utilities that seem to define it manually. Testing with various manually written pthread programs resulted in identical binaries, let alone identical results. Is there an actual difference between -pthread and -lpthread or is this just a historical artifact? In some cases, -pthread also affects the compiler's code generation. On some RISC architectures, compilers may try to avoid loads and stores of less than 32 bits. For example (untested): struct { int n; char a, b, c, d; } *p; p-a = p-b = p-c = 0; The compiler might load p-d and then store the 32 bits containing a, b, c and d at once. This causes a race condition if p-d is written concurrently. Because C99 does not specify threading, it allows these transformations. In C11, they are forbidden. Passing -pthread disables them as well. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Change vfork() to posix_spawn()?
On Fri, Sep 14, 2012 at 01:45:49PM +0200, Erik Cederstrand wrote: Den 14/09/2012 kl. 13.03 skrev Ivan Voras ivo...@freebsd.org: On 14/09/2012 09:49, Erik Cederstrand wrote: I'm looking through the Clang Analyzer scans on http://scan.freebsd.your.org/freebsd-head looking for false positives to report back to LLVM. There are quite a list of reports suggesting to change vfork() calls to posix_spawn(). Example from /bin/rpc: http://scan.freebsd.your.org/freebsd-head/bin.rcp/2012-09-12-amd64/report-nsOV80.html#EndPath I know nothing about this but I can see fork and posix_spawn have been discussed on this list previously. Is this a legitimate warning (in this case and in general in FreeBSD base)? Currently (on 9-stable at least), posix_spawn() is implemented as a wrapper around vfork(), so I doubt replacing one with the other would do much. vfork() returns twice, possibly reanimating variables from the dead. Calling posix_spawn() limits this issue to the posix_spawn() implementation only. For example, in case of unwilling compiler developers, the optimization level for that file might be lowered or more volatile keywords might be added. I think it makes more sense to disable various optimizations in the compiler automatically in functions that call vfork(), longjmp() and similar functions, but I do not decide what compiler developers do. The analyzer added this warning in January. The release notes link to this explanation: https://www.securecoding.cert.org/confluence/display/seccode/POS33-C.+Do+not+use+vfork() I guess this is the important part: Because of the implementation of the vfork() function, the parent process is suspended while the child process executes. If a user sends a signal to the child process, delaying its execution, the parent process (which is privileged) is also blocked. This means that an unprivileged process can cause a privileged process to halt, which is a privilege inversion resulting in a denial of service. This problem only occurs if privileges are dropped between vfork and exec, which is uncommon. If no privileges are dropped, the user can affect the parent directly. Furthermore, this exact problem does not happen in FreeBSD because child processes between vfork and exec/exit are not affected by stop signals (this is stronger than the vfork(2) man page documents). However, related issues are still present. If there is a signal handler that blocks for a long time (many functions which do this are async-signal-safe) for a signal permitted by security.bsd.conservative_signals, an unprivileged user will be able to trigger it and delay the thread calling vfork(). A function may also be async-signal-safe but not suitable for a vforked child (for example, libthr makes many functions async-signal-safe by postponing signal handlers which is not good enough if a vforked child is SIGKILL'ed). An unprivileged user may also trigger priority inversion by lowering the priority of the child process and consuming CPU time at a higher priority. Obviously, the child process should not lower its priority voluntarily either. These problems can be fixed in various ways. The direct priority inversion problem can be fixed by using fork() instead in that case or by adding a priority inheritance scheme in the kernel for vforked children (but only for the static priority; the parent's dynamic priority will increase because it is sleeping). The privilege manipulation available via POSIX_SPAWN_RESETIDS seems safe enough. Since it only affects the effective UID/GID, it does not affect the ability to modify scheduling parameters (real UID) or to send signals (real or saved UID). Since the seteuid() call itself will set the issetugid flag to true if it changed anything, it does not affect the ability to debug before exec. More general privilege dropping frequently involves frameworks such as PAM which are not async-signal-safe and certainly not vfork-safe. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [patch] libc: Do not export .cerror
On Tue, Aug 28, 2012 at 02:03:22PM +0300, Konstantin Belousov wrote: On Sat, Aug 25, 2012 at 12:16:55AM +0200, Jilles Tjoelker wrote: Not exporting .cerror causes it to be jumped to directly instead of via the PLT. The below patch is for i386 only and also takes advantage of .cerror's new status by not saving and loading %ebx before jumping to it. (Therefore, .cerror now saves and loads %ebx itself.) Where there was a conditional jump to a jump to .cerror, the conditional jump has been changed to jump to .cerror directly (many modern CPUs don't do static prediction and in any case it is not much of a benefit anyway). Why do you need to save/restore the %ebx at all ? %ebx == __GLOBAL_OFFSET_TABLE__ is only needed when you access GOT, but .cerror only works with PLT, which is addressed using the instruction capable of relative addressing. The old .cerror does not need it as well, but it is just engraved in the function ABI. On i386, a shared object's PLT entry needs %ebx set up to work properly. This is because such a PLT entry needs to access the GOT to find the address to jump to (the first instruction is jmp *d32(%ebx)). An executable's PLT entry accesses the GOT via absolute addressing and therefore does not need %ebx. The patch decreases the size of libc.so.7 by a few kilobytes. Similar changes could be made to other architectures, and there may be more symbols that are exported but need not be. Sure, would you handle at least amd64 too ? The below patch handles amd64. I'm a bit annoyed that most of the syscall stubs are 17 bytes long now and have the maximum 15 bytes of padding. This means that the patch provides virtually no gain in code size. Index: lib/libc/amd64/Symbol.map === --- lib/libc/amd64/Symbol.map (revision 239865) +++ lib/libc/amd64/Symbol.map (working copy) @@ -66,7 +66,6 @@ .curbrk; .minbrk; _brk; - .cerror; _end; __sys_vfork; _vfork; Index: lib/libc/amd64/SYS.h === --- lib/libc/amd64/SYS.h(revision 239865) +++ lib/libc/amd64/SYS.h(working copy) @@ -36,38 +36,20 @@ #include sys/syscall.h #include machine/asm.h -#ifdef PIC #defineRSYSCALL(x) ENTRY(__CONCAT(__sys_,x)); \ .weak CNAME(x); \ .set CNAME(x),CNAME(__CONCAT(__sys_,x));\ .weak CNAME(__CONCAT(_,x)); \ .set CNAME(__CONCAT(_,x)),CNAME(__CONCAT(__sys_,x)); \ - mov __CONCAT($SYS_,x),%rax; KERNCALL; jb 2f; ret; \ - 2: movq PIC_GOT(HIDENAME(cerror)),%rcx; jmp *%rcx; \ + mov __CONCAT($SYS_,x),%eax; KERNCALL; \ + jb HIDENAME(cerror); ret; \ END(__CONCAT(__sys_,x)) #definePSEUDO(x) ENTRY(__CONCAT(__sys_,x)); \ .weak CNAME(__CONCAT(_,x)); \ .set CNAME(__CONCAT(_,x)),CNAME(__CONCAT(__sys_,x)); \ - mov __CONCAT($SYS_,x),%rax; KERNCALL; jb 2f; ret ; \ - 2: movq PIC_GOT(HIDENAME(cerror)),%rcx; jmp *%rcx; \ + mov __CONCAT($SYS_,x),%eax; KERNCALL; \ + jb HIDENAME(cerror); ret; \ END(__CONCAT(__sys_,x)) -#else -#defineRSYSCALL(x) ENTRY(__CONCAT(__sys_,x)); \ - .weak CNAME(x); \ - .set CNAME(x),CNAME(__CONCAT(__sys_,x));\ - .weak CNAME(__CONCAT(_,x)); \ - .set CNAME(__CONCAT(_,x)),CNAME(__CONCAT(__sys_,x)); \ - mov __CONCAT($SYS_,x),%rax; KERNCALL; jb 2f; ret; \ - 2: jmp HIDENAME(cerror);\ - END(__CONCAT(__sys_,x)) -#definePSEUDO(x) ENTRY(__CONCAT(__sys_,x)); \ - .weak CNAME(__CONCAT(_,x)); \ - .set CNAME(__CONCAT(_,x)),CNAME(__CONCAT(__sys_,x)); \ - mov __CONCAT($SYS_,x),%rax; KERNCALL; jb 2f; ret; \ - 2: jmp HIDENAME(cerror);\ - END(__CONCAT(__sys_,x)) -#endif - #define KERNCALL movq %rcx, %r10; syscall Index: lib/libc/amd64/gen/rfork_thread.S === --- lib/libc/amd64/gen/rfork_thread.S (revision 239865) +++ lib/libc/amd64/gen/rfork_thread.S (working copy) @@ -93,12 +93,7 @@ 2: popq%r12
Re: system() using vfork() or posix_spawn() and libthr
On Sun, Aug 19, 2012 at 03:26:36PM +0300, Konstantin Belousov wrote: On Fri, Aug 17, 2012 at 06:08:47PM +0200, Jilles Tjoelker wrote: On Fri, Aug 17, 2012 at 03:43:12PM +0300, Konstantin Belousov wrote: On Fri, Aug 17, 2012 at 12:39:33AM +0200, Jilles Tjoelker wrote: On Thu, Aug 16, 2012 at 02:44:26PM +0300, Konstantin Belousov wrote: BTW, returning to Jilles proposal, can we call vfork() only for single-threaded parent ? I think it gives good boost for single-threaded case, and also eliminates the concerns of non-safety. This would probably fix the safety issues but at a price. There is a correlation between processes so large that they benefit greatly from vfork and threaded processes. Ok, so I will continue with my patch. On the other hand, I think direct calls to vfork() in applications are risky and it may not be possible to support them safely in all circumstances. However, if libc is calling vfork() such as via popen(), system() or posix_spawn(), it should be possible even in a multi-threaded process. In that case, the rtld and libthr problems can be avoided by using aliases with hidden visibility for all functions the vforked child needs to call (or any other method that prevents interposition and hard-codes a displacement into libc.so). I do not see how using any aliases could help there. Basically, if mt process is not single-threaded for vfork, you can have both some parent thread and child enter rtld. This is complete mess. If libc calls a function with hidden visibility, this proceeds directly (not via the PLT) and rtld is not involved. Oh, so you are only concerned with libc there ? If spending so much efforts on this issue, IMO it is pity to only fix libc and not fix vfork for all consumers. True. If vfork can be fixed more generally without too much penalty, it is better to do that. And it looks like that is possible, including detecting unexpected kills, see below. Several ways to do this are in section 2.2.7 Avoid Using Exported Symbols of Ulrich Drepper's dsohowto. One of them is something like extern __typeof(next) next_int __attribute((alias(next), visibility(hidden))); in the same source as the definition of the function int next(void) { ...; } As Drepper notes, the visibility attribute is not strictly required if a version script keeps the symbol local but it might lead to better code. At least on i386, though, the optimal direct near call instruction is generated even without it. For example, _nsdispatch() calls libc_dlopen() (kept local by the version script) without going through the PLT (from the output of objdump -dS on the libc.so.7 in /usr/obj). I am not confident, so this might be a hallucination, but 'optimization' in the recent gnu ld might rewrite the call target to avoid PLT when symbol visibility is hidden. If symbol visibility is hidden, rtld does not know about the symbol's name; therefore, the only way ld can make it work is to make the call instruction point directly at the called function, not to a PLT entry. Different from what I wrote earlier, on i386, there is still a penalty in omitting the visibility attribute: the compiler will generate code to load %ebx which may not be necessary. In the assembler syscall stubs using code from lib/libc/arch/SYS.h this can be done by adding another .set (we currently have foo, _foo and __sys_foo for a syscall foo; some syscalls have only _foo and __sys_foo) such as __syshidden_foo. The ones that are actually used then need prototypes (similar to the __sys_* ones in lib/libthr/?). For some reason, the symbol .cerror (HIDENAME(cerror)) is also exported. Either this should be kept local or a local uninterposable alias should be added and used (as with the syscalls). I sent a patch for .cerror (i386 only) in a new mailing list thread. The function __error() (to get errno's address for the current thread) is and must be called via the PLT (because libthr is separate). Therefore, we should ensure it is called at least once before vfork so calls in the child do not involve rtld. The implementations for the various architectures use the TLS register (or memory location for ARM), so they seem safe. This should suffice to fix posix_spawn() but the _execvpe() used by posix_spawnp also uses various string functions. If not all of these have already been called earlier, this will not work. Making calls to them not go through the PLT seems fairly hard, even though it would make sense in general, so perhaps I should instead reimplement it such that the parent does almost all of the work. An alternative is to write the core of posix_spawn() in assembler using system calls directly but I would rather avoid that :( BTW, we have very gross inefficiency in our libc syscall stubs on i386. Namely, the PIC_PROLOGUE generates 'call 1f; 1f:' sequence, which
[patch] libc: Do not export .cerror
- jmp PIC_PLT(HIDENAME(cerror)) END(__sys_getcontext) .section .note.GNU-stack,,%progbits Index: lib/libc/i386/sys/cerror.S === --- lib/libc/i386/sys/cerror.S (revision 239195) +++ lib/libc/i386/sys/cerror.S (working copy) @@ -48,13 +48,14 @@ .globl CNAME(__error) .type CNAME(__error),@function HIDENAME(cerror): +#ifdef PIC + PIC_PROLOGUE pushl %eax -#ifdef PIC - /* The caller must execute the PIC prologue before jumping to cerror. */ callPIC_PLT(CNAME(__error)) popl%ecx PIC_EPILOGUE #else + pushl %eax callCNAME(__error) popl%ecx #endif Index: lib/libc/i386/sys/sbrk.S === --- lib/libc/i386/sys/sbrk.S(revision 239195) +++ lib/libc/i386/sys/sbrk.S(working copy) @@ -59,7 +59,7 @@ addl%eax,4(%esp) mov $SYS_break,%eax KERNCALL - jb err + jb HIDENAME(cerror) PIC_PROLOGUE movlPIC_GOT(HIDENAME(curbrk)),%edx movl(%edx),%eax @@ -67,9 +67,6 @@ PIC_EPILOGUE back: ret -err: - PIC_PROLOGUE - jmp PIC_PLT(HIDENAME(cerror)) #else /* !PIC */ @@ -80,13 +77,11 @@ addl%eax,4(%esp) mov $SYS_break,%eax KERNCALL - jb err + jb HIDENAME(cerror) movlHIDENAME(curbrk),%eax addl%ecx,HIDENAME(curbrk) back: ret -err: - jmp HIDENAME(cerror) #endif /* PIC */ END(sbrk) Index: lib/libc/i386/sys/Ovfork.S === --- lib/libc/i386/sys/Ovfork.S (revision 239195) +++ lib/libc/i386/sys/Ovfork.S (working copy) @@ -50,8 +50,7 @@ jmp *%ecx 1: pushl %ecx - PIC_PROLOGUE - jmp PIC_PLT(HIDENAME(cerror)) + jmp HIDENAME(cerror) END(__sys_vfork) .section .note.GNU-stack,,%progbits Index: lib/libc/i386/sys/ptrace.S === --- lib/libc/i386/sys/ptrace.S (revision 239195) +++ lib/libc/i386/sys/ptrace.S (working copy) @@ -50,11 +50,8 @@ #endif mov $SYS_ptrace,%eax KERNCALL - jb err + jb HIDENAME(cerror) ret -err: - PIC_PROLOGUE - jmp PIC_PLT(HIDENAME(cerror)) END(ptrace) .section .note.GNU-stack,,%progbits Index: lib/libc/i386/sys/exect.S === --- lib/libc/i386/sys/exect.S (revision 239195) +++ lib/libc/i386/sys/exect.S (working copy) @@ -47,8 +47,7 @@ pushl %edx popf KERNCALL - PIC_PROLOGUE - jmp PIC_PLT(HIDENAME(cerror)) /* exect(file, argv, env); */ + jmp HIDENAME(cerror)/* exect(file, argv, env); */ END(exect) .section .note.GNU-stack,,%progbits Index: lib/libc/i386/sys/syscall.S === --- lib/libc/i386/sys/syscall.S (revision 239195) +++ lib/libc/i386/sys/syscall.S (working copy) @@ -45,11 +45,8 @@ KERNCALL push%ecx/* need to push a word to keep stack frame intact upon return; the word must be the return address. */ - jb 1f + jb HIDENAME(cerror) ret -1: - PIC_PROLOGUE - jmp PIC_PLT(HIDENAME(cerror)) END(syscall) .section .note.GNU-stack,,%progbits -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: system() using vfork() or posix_spawn() and libthr
On Fri, Aug 17, 2012 at 03:43:12PM +0300, Konstantin Belousov wrote: On Fri, Aug 17, 2012 at 12:39:33AM +0200, Jilles Tjoelker wrote: On Thu, Aug 16, 2012 at 02:44:26PM +0300, Konstantin Belousov wrote: BTW, returning to Jilles proposal, can we call vfork() only for single-threaded parent ? I think it gives good boost for single-threaded case, and also eliminates the concerns of non-safety. This would probably fix the safety issues but at a price. There is a correlation between processes so large that they benefit greatly from vfork and threaded processes. Ok, so I will continue with my patch. On the other hand, I think direct calls to vfork() in applications are risky and it may not be possible to support them safely in all circumstances. However, if libc is calling vfork() such as via popen(), system() or posix_spawn(), it should be possible even in a multi-threaded process. In that case, the rtld and libthr problems can be avoided by using aliases with hidden visibility for all functions the vforked child needs to call (or any other method that prevents interposition and hard-codes a displacement into libc.so). I do not see how using any aliases could help there. Basically, if mt process is not single-threaded for vfork, you can have both some parent thread and child enter rtld. This is complete mess. If libc calls a function with hidden visibility, this proceeds directly (not via the PLT) and rtld is not involved. Several ways to do this are in section 2.2.7 Avoid Using Exported Symbols of Ulrich Drepper's dsohowto. One of them is something like extern __typeof(next) next_int __attribute((alias(next), visibility(hidden))); in the same source as the definition of the function int next(void) { ...; } As Drepper notes, the visibility attribute is not strictly required if a version script keeps the symbol local but it might lead to better code. At least on i386, though, the optimal direct near call instruction is generated even without it. For example, _nsdispatch() calls libc_dlopen() (kept local by the version script) without going through the PLT (from the output of objdump -dS on the libc.so.7 in /usr/obj). In the assembler syscall stubs using code from lib/libc/arch/SYS.h this can be done by adding another .set (we currently have foo, _foo and __sys_foo for a syscall foo; some syscalls have only _foo and __sys_foo) such as __syshidden_foo. The ones that are actually used then need prototypes (similar to the __sys_* ones in lib/libthr/?). For some reason, the symbol .cerror (HIDENAME(cerror)) is also exported. Either this should be kept local or a local uninterposable alias should be added and used (as with the syscalls). The function __error() (to get errno's address for the current thread) is and must be called via the PLT (because libthr is separate). Therefore, we should ensure it is called at least once before vfork so calls in the child do not involve rtld. The implementations for the various architectures use the TLS register (or memory location for ARM), so they seem safe. This should suffice to fix posix_spawn() but the _execvpe() used by posix_spawnp also uses various string functions. If not all of these have already been called earlier, this will not work. Making calls to them not go through the PLT seems fairly hard, even though it would make sense in general, so perhaps I should instead reimplement it such that the parent does almost all of the work. An alternative is to write the core of posix_spawn() in assembler using system calls directly but I would rather avoid that :( There may still be a problem in programs that install signal handlers because the set of async-signal-safe functions is larger than what can be done in a vforked child. Userland can avoid this by masking affected signals before calling vfork() and resetting them to SIG_DFL before unmasking them. This will add many syscalls if the code does not know which signals are affected (such as libc). Alternatively, the kernel could map caught signals to the default action for processes with P_PPWAIT (just like it does not stop such processes because of signals or TTY job control). If rtld does not work, then any library function call from a signal handler is problematic. My goal is to get a working rtld and possibly malloc. BTW, not quite related, it seems that the placement of openat, setcontext and swapcontext in the pthread.map is wrong. openat is at FBSD_1.1 in libc, and *context are at FBSD_1.0 version, while libthr exports them at 1.2. App or library gets linked to arbitrary version depending on whether libphread was specified at link time, and interposition from libthr does not work. Oops :( Can this still be fixed (like by exporting identical functions in multiple versions)? Below is the latest version of my patch for vfork, which survives (modified) tools/test/pthread_vfork_test. Patch is only for x86 right now. Why does
Re: system() using vfork() or posix_spawn() and libthr
On Thu, Aug 16, 2012 at 02:44:26PM +0300, Konstantin Belousov wrote: My point is that the fact that fork() is called from dynamic context that was identified as the signal handler does not mean anything. It can be mis-identified for many reasons, which both me and you tried to partially enumerate above. The really important thing is the atomicity of the current context, e.g. synchronous SIGSEGV caused by a language runtime GC is very much safe place to call atfork handlers, since runtimes usually cause signal generations only at the safe place. I do not think that such approach as you described can be completed, the _Fork() seems the only robust way. Agreed, that way (detecting signal handler) lies madness. If need be, _Fork() is easily implemented and used. BTW, returning to Jilles proposal, can we call vfork() only for single-threaded parent ? I think it gives good boost for single-threaded case, and also eliminates the concerns of non-safety. This would probably fix the safety issues but at a price. There is a correlation between processes so large that they benefit greatly from vfork and threaded processes. On the other hand, I think direct calls to vfork() in applications are risky and it may not be possible to support them safely in all circumstances. However, if libc is calling vfork() such as via popen(), system() or posix_spawn(), it should be possible even in a multi-threaded process. In that case, the rtld and libthr problems can be avoided by using aliases with hidden visibility for all functions the vforked child needs to call (or any other method that prevents interposition and hard-codes a displacement into libc.so). There may still be a problem in programs that install signal handlers because the set of async-signal-safe functions is larger than what can be done in a vforked child. Userland can avoid this by masking affected signals before calling vfork() and resetting them to SIG_DFL before unmasking them. This will add many syscalls if the code does not know which signals are affected (such as libc). Alternatively, the kernel could map caught signals to the default action for processes with P_PPWAIT (just like it does not stop such processes because of signals or TTY job control). -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: system() using vfork() or posix_spawn() and libthr
On Tue, Aug 14, 2012 at 11:15:06PM +0800, David Xu wrote: But in real word, pthread atfork handlers are not async-signal safe, they mostly do mutex locking and unlocking to keep consistent state, mutex is not async-signal safe. The malloc prefork and postfork handlers happen to work because I have some hacking code in library for malloc locks. Otherwise, you even can not use fork() in signal handler. This problem was also reported to the Austin Group at http://austingroupbugs.net/view.php?id=62 Atfork handlers are inherently async-signal-unsafe. An interpretation was issued suggesting to remove fork() from the list of async-signal-safe functions and add a new async-signal-safe function _Fork() which does not call the atfork handlers. This change will however not be in POSIX.1-2008 TC1 but only in the next issue (SUSv5). A slightly earlier report http://austingroupbugs.net/view.php?id=18 just requested the _Fork() function because an existing application deadlocked when calling fork() from a signal handler. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: system() using vfork() or posix_spawn() and libthr
On Fri, Aug 10, 2012 at 10:16:04AM +0800, David Xu wrote: On 2012/08/09 18:56, Jilles Tjoelker wrote: On Mon, Aug 06, 2012 at 11:25:35AM +0300, Konstantin Belousov wrote: On Sun, Aug 05, 2012 at 11:54:32PM +0200, Jilles Tjoelker wrote: On Mon, Jul 30, 2012 at 01:53:03PM +0300, Konstantin Belousov wrote: On Mon, Jul 30, 2012 at 12:24:08PM +0200, Jilles Tjoelker wrote: People sometimes use system() from large address spaces where it would improve performance greatly to use vfork() instead of fork(). A simple approach is to change fork() to vfork(), although I have not tried this. It seems safe enough to use sigaction and sigprocmask system calls in the vforked process. Alternatively, we can have posix_spawn() do the vfork() with signal changes. This avoids possible whining from compilers and static analyzers about using vfork() in system.c. However, I do not like the tricky code for signals and that it adds lines of code. This is lightly tested. It is interesting to note that for some time our vfork(2) no longer stops the whole forked process (parent), only the forking thread is waiting for the child exit or exec. I am not sure is this point important for system(3), but determined code can notice the difference from the fork-vfork switch. Neither fork nor vfork call thread_single(SINGLE_BOUNDARY), so this is not a difference. It is the difference, because vforked child shares parent address space. Thread singling may be noticeable from a failing execve() (but only in the process doing execve()) and in the rare case of rfork() without RFPROC. No, other running threads in parent affect vforked child till exec or exit. In fact, I would classify this as bug, but not a serious one. There are some ugly ways this parallel execution is depended on. If the vforked child calls sigaction() while another thread is also in sigaction() for that signal, the vforked child needs to wait for the other thread to release the lock. This uses a per-process lock to synchronize threads in different processes, which may not work properly. If the vforked child is killed (such as by SIGKILL) while holding the lock, the parent is not killed but its _thr_sigact is damaged. These problems could be avoided in libthr by skipping the lock in _sigaction() if a signal action is being set to SIG_DFL or SIG_IGN and the old action is not queried. In those cases, _thr_sigact is not touched so no lock is required. This change also helps applications, provided they call sigaction() and not signal(). Alternatively, posix_spawn() and system() could use the sigaction system call directly, bypassing libthr (if present). However, this will not help applications that call vfork() and sigaction() themselves (such as a shell that wants to implement ... using vfork()). posix_spawn() likely still needs some adjustment so that having it reset all signals (sigfillset()) to the default action will not cause it to [EINVAL] because libthr does not allow changing SIGTHR's disposition. Index: lib/libthr/thread/thr_sig.c === --- lib/libthr/thread/thr_sig.c (revision 238970) +++ lib/libthr/thread/thr_sig.c (working copy) @@ -519,8 +519,16 @@ return (-1); } - if (act) + if (act) { newact = *act; + /* +* Short-circuit cases where we do not touch _thr_sigact. +* This allows performing these safely in a vforked child. +*/ + if ((newact.sa_handler == SIG_DFL || + newact.sa_handler == SIG_IGN) oact == NULL) + return (__sys_sigaction(sig,newact, NULL)); + } __sys_sigprocmask(SIG_SETMASK,_thr_maskset,oldset); _thr_rwl_wrlock(_thr_sigact[sig-1].lock); Your patch is better than nothing, I don't object. The problem is visble to you, but there is also invisible user - rtld. If a symbol is never used in parent process, but now it is used in a vforked child, the rtld will be involved, if the child is killed, the rtld's data structure may be in inconsistent state, such as locking or link list etcs... I think this problem might be a non-fixable problem. Hmm. Rtld cannot be fixed like libthr because its data structures are inherently in userland. Perhaps signal handling should be different for a vforked child, like the default action of a signal sent to a thread affects the entire process and not just the thread. This cannot be implemented in the calling code because resolving execve() itself also needs rtld (ugly hacks like performing an execve() call that is guaranteed to fail aside). The rtld problem can be avoided specifically by linking with '-z now'. This might be acceptable for sh and csh; most applications can use posix_spawn() which would have to become a system call. -- Jilles Tjoelker
Re: system() using vfork() or posix_spawn() and libthr
On Mon, Aug 06, 2012 at 11:25:35AM +0300, Konstantin Belousov wrote: On Sun, Aug 05, 2012 at 11:54:32PM +0200, Jilles Tjoelker wrote: On Mon, Jul 30, 2012 at 01:53:03PM +0300, Konstantin Belousov wrote: On Mon, Jul 30, 2012 at 12:24:08PM +0200, Jilles Tjoelker wrote: People sometimes use system() from large address spaces where it would improve performance greatly to use vfork() instead of fork(). A simple approach is to change fork() to vfork(), although I have not tried this. It seems safe enough to use sigaction and sigprocmask system calls in the vforked process. Alternatively, we can have posix_spawn() do the vfork() with signal changes. This avoids possible whining from compilers and static analyzers about using vfork() in system.c. However, I do not like the tricky code for signals and that it adds lines of code. This is lightly tested. It is interesting to note that for some time our vfork(2) no longer stops the whole forked process (parent), only the forking thread is waiting for the child exit or exec. I am not sure is this point important for system(3), but determined code can notice the difference from the fork-vfork switch. Neither fork nor vfork call thread_single(SINGLE_BOUNDARY), so this is not a difference. It is the difference, because vforked child shares parent address space. Thread singling may be noticeable from a failing execve() (but only in the process doing execve()) and in the rare case of rfork() without RFPROC. No, other running threads in parent affect vforked child till exec or exit. In fact, I would classify this as bug, but not a serious one. There are some ugly ways this parallel execution is depended on. If the vforked child calls sigaction() while another thread is also in sigaction() for that signal, the vforked child needs to wait for the other thread to release the lock. This uses a per-process lock to synchronize threads in different processes, which may not work properly. If the vforked child is killed (such as by SIGKILL) while holding the lock, the parent is not killed but its _thr_sigact is damaged. These problems could be avoided in libthr by skipping the lock in _sigaction() if a signal action is being set to SIG_DFL or SIG_IGN and the old action is not queried. In those cases, _thr_sigact is not touched so no lock is required. This change also helps applications, provided they call sigaction() and not signal(). Alternatively, posix_spawn() and system() could use the sigaction system call directly, bypassing libthr (if present). However, this will not help applications that call vfork() and sigaction() themselves (such as a shell that wants to implement ... using vfork()). posix_spawn() likely still needs some adjustment so that having it reset all signals (sigfillset()) to the default action will not cause it to [EINVAL] because libthr does not allow changing SIGTHR's disposition. Index: lib/libthr/thread/thr_sig.c === --- lib/libthr/thread/thr_sig.c (revision 238970) +++ lib/libthr/thread/thr_sig.c (working copy) @@ -519,8 +519,16 @@ return (-1); } - if (act) + if (act) { newact = *act; + /* +* Short-circuit cases where we do not touch _thr_sigact. +* This allows performing these safely in a vforked child. +*/ + if ((newact.sa_handler == SIG_DFL || + newact.sa_handler == SIG_IGN) oact == NULL) + return (__sys_sigaction(sig, newact, NULL)); + } __sys_sigprocmask(SIG_SETMASK, _thr_maskset, oldset); _thr_rwl_wrlock(_thr_sigact[sig-1].lock); -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: system() using vfork() or posix_spawn()
On Mon, Jul 30, 2012 at 01:53:03PM +0300, Konstantin Belousov wrote: On Mon, Jul 30, 2012 at 12:24:08PM +0200, Jilles Tjoelker wrote: People sometimes use system() from large address spaces where it would improve performance greatly to use vfork() instead of fork(). A simple approach is to change fork() to vfork(), although I have not tried this. It seems safe enough to use sigaction and sigprocmask system calls in the vforked process. Alternatively, we can have posix_spawn() do the vfork() with signal changes. This avoids possible whining from compilers and static analyzers about using vfork() in system.c. However, I do not like the tricky code for signals and that it adds lines of code. This is lightly tested. It is interesting to note that for some time our vfork(2) no longer stops the whole forked process (parent), only the forking thread is waiting for the child exit or exec. I am not sure is this point important for system(3), but determined code can notice the difference from the fork-vfork switch. Neither fork nor vfork call thread_single(SINGLE_BOUNDARY), so this is not a difference. Thread singling may be noticeable from a failing execve() (but only in the process doing execve()) and in the rare case of rfork() without RFPROC. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
system() using vfork() or posix_spawn()
People sometimes use system() from large address spaces where it would improve performance greatly to use vfork() instead of fork(). A simple approach is to change fork() to vfork(), although I have not tried this. It seems safe enough to use sigaction and sigprocmask system calls in the vforked process. Alternatively, we can have posix_spawn() do the vfork() with signal changes. This avoids possible whining from compilers and static analyzers about using vfork() in system.c. However, I do not like the tricky code for signals and that it adds lines of code. This is lightly tested. Index: lib/libc/stdlib/system.c === --- lib/libc/stdlib/system.c(revision 238371) +++ lib/libc/stdlib/system.c(working copy) @@ -42,16 +42,21 @@ #include unistd.h #include paths.h #include errno.h +#include spawn.h #include un-namespace.h #include libc_private.h +extern char **environ; + int __system(const char *command) { pid_t pid, savedpid; - int pstat; + int error, pstat; struct sigaction ign, intact, quitact; - sigset_t newsigblock, oldsigblock; + sigset_t newsigblock, oldsigblock, defmask; + const char *argv[4]; + posix_spawnattr_t attr; if (!command) /* just checking... */ return(1); @@ -65,28 +70,36 @@ ign.sa_flags = 0; (void)_sigaction(SIGINT, ign, intact); (void)_sigaction(SIGQUIT, ign, quitact); + (void)sigemptyset(defmask); + if ((intact.sa_flags SA_SIGINFO) != 0 || + intact.sa_handler != SIG_IGN) + (void)sigaddset(defmask, SIGINT); + if ((quitact.sa_flags SA_SIGINFO) != 0 || + quitact.sa_handler != SIG_IGN) + (void)sigaddset(defmask, SIGQUIT); (void)sigemptyset(newsigblock); (void)sigaddset(newsigblock, SIGCHLD); (void)_sigprocmask(SIG_BLOCK, newsigblock, oldsigblock); - switch(pid = fork()) { - case -1:/* error */ - break; - case 0: /* child */ - /* -* Restore original signal dispositions and exec the command. -*/ - (void)_sigaction(SIGINT, intact, NULL); - (void)_sigaction(SIGQUIT, quitact, NULL); - (void)_sigprocmask(SIG_SETMASK, oldsigblock, NULL); - execl(_PATH_BSHELL, sh, -c, command, (char *)NULL); - _exit(127); - default:/* parent */ + argv[0] = sh; + argv[1] = -c; + argv[2] = command; + argv[3] = NULL; + if ((error = posix_spawnattr_init(attr)) != 0 || + (error = posix_spawnattr_setsigmask(attr, oldsigblock)) != 0 || + (error = posix_spawnattr_setsigdefault(attr, defmask)) != 0 || + (error = posix_spawnattr_setflags(attr, + POSIX_SPAWN_SETSIGDEF | POSIX_SPAWN_SETSIGMASK)) != 0 || + (error = posix_spawn(pid, _PATH_BSHELL, NULL, attr, + __DECONST(char **, argv), environ)) != 0) { + pid = -1; + errno = error; + } else { savedpid = pid; do { pid = _wait4(savedpid, pstat, 0, (struct rusage *)0); } while (pid == -1 errno == EINTR); - break; } + posix_spawnattr_destroy(attr); (void)_sigaction(SIGINT, intact, NULL); (void)_sigaction(SIGQUIT, quitact, NULL); (void)_sigprocmask(SIG_SETMASK, oldsigblock, NULL); -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [RFC] last(1) with security.bsd.see_other_uids support
On Wed, Jun 06, 2012 at 01:20:12PM +0200, Pawel Jakub Dawidek wrote: On Tue, Jun 05, 2012 at 11:31:01PM +0200, Jilles Tjoelker wrote: Also, the attack surface of such a daemon may be smaller than that of a setuid/setgid program. Really? I don't see that. With current patch and setgid to utmp the process can only read some files that don't even contain very sensitive data (like passwords). Any privileged daemon is much bigger threat. Also, do we really want a daemon running all the time just to be able to parse utx files? The daemon would run with non-root privileges just sufficient to read the utmpx files. If we have a good way to start it, the attack surface is limited to what you can do with its socket and this can be cut down tightly. On the other hand, an attacker can control various process attributes of a setgid program such as the output file, a subset of signals, rlimits and a subset of environment variables. For example, last, w and who have some degree of locale support (time/date formats). Also, in this particular case, dropping privileges does not help much since the utmpx file descriptor is almost as valuable as the group credentials. I agree that leaving a daemon running for this is ugly. Alternatively, the daemon could be a setgid program that is spawned by the utmpx APIs when needed. Still seems a bit too far for my taste. Spawning a daemon somewhere from within library doesn't sound like a good idea to me... At least until we have something like launchd that can start such services on demand. The suggested approach is used by old implementations of grantpt(). If the kernel does not set up the ownership of a new pseudo terminal properly, grantpt() can invoke a setuid root binary, for example /usr/libexec/pt_chown. Similarly, the utmpx APIs might invoke a setgid helper if they cannot read the files themselves. Communication would be over a pipe. This has the downside of having a setgid program at all but things like locale support are handled by the calling (unprivileged) application and not by the setgid program. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [RFC] last(1) with security.bsd.see_other_uids support
On Sun, Jun 03, 2012 at 08:42:04PM -0500, Bryan Drewery wrote: I've written up a patch to add some privacy to last(1) while still giving non-privileged users access to their own login history. This is still a work in progress. I am reaching out to make sure my approach is proper and to get some input on code sharing. My goal is to add this support to w(1) and who(1) as well. FWIW I have been running a similar patch on my own shared-hosting systems (pre-utmpx) for a few years. Changes: * Added utmp group * All utmpx files are 660 root:utmp The utmp group was previously (on other systems) used to allow applications write access to utmp without making them setuid root (making them setgid utmp instead). * last(1) runs setgid(utmp) and drops this as soon as the utmpx files are opened. * Users in the wheel or utmp group can see all entries * IFF security.bsd.see_other_uids=0: users only see their own entries, as well as shutdown/boot/init times. * If security.bsd.see_other_uids=1, all entries are always shown, as it does now. Justifications: Why the changes? This makes sense for shared hosting environments where jails are not practical. A user should be able to see their own login history, to see if someone has been accessing their account, but not to see the IPs of other users. The intention is *not* to disallow them to see that other users of the system. Obviously they can just cat /etc/passwd. This is just about IP privacy. Why the setgid? Allow reading the entries, but disallow directly opening the utx files. I've seen some shared hosts incorrectly chmod 0 /usr/bin/last, but still leave the utmp files wide open for reading! Why the utmp group? It's consistent with other systems (OpenBSD, Linux), and allows giving a user access to see all entries, without granting them wheel or allowing a non-privileged user to run as setgid(wheel). It also helps mitigates security concerns by using a specific group only having extra privilege to utmpx files. I originally had not planned for security.bsd.see_other_uids, but considering POLA and consistency, it makes sense. This requires every utmpx access to go through a setgid binary, regardless of the value of security.bsd.see_other_uids. If called by a user that is not root or in the utmp group, getutxent(3) and related APIs become worthless. While POSIX permits this (security restriction denying all visibility of utmpx), this is not what applications expect. For example, tcsh and zsh offer a watch feature that reports on logins and logouts by calling utmpx APIs. To avoid this, the utmpx APIs could communicate with a privileged daemon if the files are not readable. The daemon can check the identity of the caller via getpeereid(3). (Unfortunately, even if getpeereid() is bypassed and LOCAL_PEERCRED called directly, only 16 groups can be queried. Therefore the daemon cannot check the process credential for the groups but will have to check the group database for the user.) Also, the attack surface of such a daemon may be smaller than that of a setuid/setgid program. Alternatively, the daemon could be a setgid program that is spawned by the utmpx APIs when needed. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: mmap segmentation fault
On Fri, Apr 13, 2012 at 12:25:32AM +0530, Maninya M wrote: I want to allocate memory at a specified address location 'a' of size 'b'. I wrote code below to do it, but I'm getting a seg fault. How can I solve this? How can I get the allocated memory at the required address? int main() { unsigned int *addr,*newaddr; unsigned long a=134516736,a1; unsigned long b=3895296; unsigned long flags =6; a1=(a0x); printf(%x\n,(void *)a); newaddr=(unsigned int *)mmap((void *)a,b,6,MAP_ANONYMOUS|MAP_FIXED,-1,0); if(newaddr==MAP_FAILED) printf(mmap failed); else printf(sucess %x,newaddr); return 0; } Output is 8049000 Segmentation fault If this is i386, you're mapping onto the executable itself. If you really want to map something there, you will have to move your code somewhere else or manipulate your executable to contain a suitable memory area at the required address. Try, for example, procstat -v $$ -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: a sysctl for process binary osreldate
On Sat, Mar 17, 2012 at 09:30:05PM +0200, Mikolaj Golub wrote: I added osrel output to procstat -b option: kopusha:~% procstat -b 2975 PID COMMOSREL PATH 2975 emacs 101 /usr/local/bin/emacs-23.3 Would this be ok or someone see a better way? Hmm, this means that procstat is not supposed to be used from scripts as it is apparently OK to change its output format like this? In some ways, querying via ps would be better for scripts since it allows things like ps -p PID -o KEYWORD= which do not need additional parsing except that many of the newer things in procstat do not have ps keywords. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: xargs short-circuit
On Tue, Feb 14, 2012 at 01:34:49PM -0500, Matthew Story wrote: After reading the man-page, and browsing around the internet for a minute, I was just wondering if there is an option in (any) xargs to short-circuit on first failure of [utility [arguments]]. e.g. $ jot - 1 10 | xargs -e -n1 sh -c 'echo $*; echo exit 1' worker || echo $? 1 1 such that any non-0 exit code in a child process would cause xargs to stop processing. seems like this would be a nice feature to have. As per xargs(1), you can do this by having the command exit on a signal or with a value of 255. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: sem(4) lockup in python?
On Sun, Feb 05, 2012 at 01:32:42PM -0500, Daniel Eischen wrote: On Sun, 5 Feb 2012, Ivan Voras wrote: On 5 February 2012 11:44, Garrett Cooper yaneg...@gmail.com wrote: 'make MAKE_JOBS_NUMBER=1' is the workground used right now.. David Xu suggested that it is a bug in Python - it doesn't set process-shared attribute when it calls sem_init(), but i've tried patching it (replacing the port patchfile file the one I've attached) and I still get the hang. I don't understand how process shared semaphores can work. Perhaps I'm dumb and ignorant, but a sem_id_t is an allocated struct. The actual kernel sem_id is inside the struct. Isn't this the same reason pthread_mutex_t and pthread_cond_t cannot be process-shared? That's how the old implementation works. It does not support process-shared semaphores although they may happen to work in some specific cases. However, in 9.0, sem_t works differently and contains the actual lock word directly, so that process-shared semaphores work. The implementation is in lib/libc/gen/sem_new.c. The pshared flag to sem_init() is not a no-op because it tells the kernel to allow for use from multiple processes. Note that the old implementation is still present as well, for compatibility with old binaries. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
sh(1) vfork patch, with benchmarks
Here is a new version of the vfork patch. The concept is the same, trying to use vfork in simple common cases. Compared to http://lists.freebsd.org/pipermail/freebsd-hackers/2011-June/035618.html I have extended vfork use to some command substitutions and allowed setting an environment variable SH_DISABLE_VFORK to disable all vfork use. The test machine is a 4-core Phenom II X4 in 32-bit mode with 4GB RAM, stable/8 with newer sh. I have run tests with a dummy environment variable SH_DISABLE_VFORL=1 (y) and with SH_DISABLE_VFORK=1 (n). A microbenchmark sh -c 'x=0; while [ $x -lt 1 ]; do /bin/kill -0 $$; x=$(($x + 1)); done' is much faster: x micro-vfork-timings1-n + micro-vfork-timings1-y +--+ |+ | |+ x | | + ++ x xx | |++ ++ xx xx x| | |_A||___AM_| | +--+ N Min MaxMedian AvgStddev x 9 3.86 4.05 4 3.963 0.076648549 + 9 2.52 2.6 2.58 2.572 0.033082389 Difference at 95.0% confidence -1.39111 +/- 0.0589948 -35.0995% +/- 1.48851% (Student's t, pooled s = 0.0590315) A make -j4 buildkernel is about 0.5% faster: x buildkernel-vfork-timings-n + buildkernel-vfork-timings-y +--+ | + x + | |+ + +* + + +x*+xx+ x xx x + x x x *+ * x+ x| ||__|M_AMA|___|| +--+ N Min MaxMedian AvgStddev x 17 435.3443.65 438.8 439.03588 2.378805 + 17 432.8442.86436.76 437.06412 3.20108 Difference at 95.0% confidence -1.97176 +/- 1.97034 -0.449112% +/- 0.448789% (Student's t, pooled s = 2.82007) In both cases, the difference comes mainly from the system time, but the user time is also a bit lower (statistically significant). In a virtual machine with 10-current (default kernel config + capsicum and procdesc) and the patch, the microbenchmark is similarly much faster: x micro-vfork-timings-n + micro-vfork-timings-y +--+ |+ + | |+++ x | |xx x | | xx | |+ x xxx| ||_A| |_A_| | +--+ N Min MaxMedian AvgStddev x 18 15.14 15.85 15.5715.5550.17088868 + 18 9.79 10.14 9.92 9.9127778 0.096820365 Difference at 95.0% confidence -5.64222 +/- 0.0940703 -36.2727% +/- 0.604759% (Student's t, pooled s = 0.138883) Ian Lepore has been kind enough to try an earlier version of this patch on some sort of ARM board and reports an improvement in boot time from 54 to 51 seconds, and a large difference in microbenchmarks. commit f55a350fa9c3792e10f93160a93d016a7bfdd630 Author: Jilles Tjoelker jil...@stack.nl Date: Mon May 30 00:31:45 2011 +0200 sh: Use vfork in a few common cases. This uses vfork() for simple commands and command substitutions containing a single simple command, invoking an external program under certain conditions (no redirections or variable assignments, non-interactive shell, no job control). The use of vfork() can be disabled by setting a variable named SH_DISABLE_VFORK. diff --git a/bin/sh/eval.c b/bin/sh/eval.c index a5f0aff..2d90921 100644 --- a/bin/sh/eval.c +++ b/bin/sh/eval.c @@ -921,6 +921,15 @@ evalcommand(union node *cmd, int flags, struct backcmd *backcmd) if (pipe(pip) 0) error(Pipe call failed: %s, strerror(errno)); } + if (cmdentry.cmdtype == CMDNORMAL + cmd
Re: sh(1) vfork patch, with benchmarks
On Wed, Jan 25, 2012 at 11:54:46PM +0100, Jilles Tjoelker wrote: [snip] x micro-vfork-timings1-n + micro-vfork-timings1-y +--+ |+ | |+ x | | + ++ x xx | |++ ++ xx xx x| | |_A| |___AM_| | +--+ N Min MaxMedian AvgStddev x 9 3.86 4.05 4 3.963 0.076648549 + 9 2.52 2.6 2.58 2.572 0.033082389 Difference at 95.0% confidence -1.39111 +/- 0.0589948 -35.0995% +/- 1.48851% (Student's t, pooled s = 0.0590315) [snip] I forgot to mention, the numbers are time in seconds (measured with /usr/bin/time). -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: intent of tab-completion in /bin/sh in 9.0
On Wed, Jan 18, 2012 at 08:46:18PM -0500, Matthew Story wrote: On Wed, Jan 18, 2012 at 5:16 PM, Jilles Tjoelker jil...@stack.nl wrote: POSIX itself has gradually adopted ksh features, so seeing more of them in future is not unlikely. Most of the new language features in 9.0 are either from POSIX.1-2008 or on the roadmap for a new version of POSIX (in collaboration with other shell authors). Tab completion is a welcome addition, I was unaware that this had been (or is slated to be) added to the POSIX specification. This makes far more sense than my proposed explanations. Thanks for the clarification. Tab completion is not in POSIX and not planned to be (as far as I know), although there is some fairly ugly pathname completion functionality required in 'set -o vi' mode (we do not implement it). The reason is more like that I noticed that NetBSD had it and found someone willing to port it and add some small features (escaping special characters in what is inserted). In using /bin/sh as login shell on virtual machines (to avoid the need to install something else from ports), I have found the filename completion to be remarkably useful. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: intent of tab-completion in /bin/sh in 9.0
On Wed, Jan 18, 2012 at 11:00:44AM -0500, Matthew Story wrote: Just noticed that tab-completion in /bin/sh has been added in 9.0 (verified that it is not there in 8.0, dunno if it's there in 8.2, could probably go digging to figure it out). In addition to the command history via up:down (which is present in 8.0) FreeBSD sh is now actually a pretty usable interactive shell. I also noticed that the following bit has been removed from the sh(1): This version has many features which make it appear similar in some respects to the Korn shell, but it is not a Korn shell clone like pdksh. That sentence is an implicit comparison to the Bourne shell. When it was written, many other Un*x variants had a version of the Bourne shell as /bin/sh, and it was not taken for granted to have things like $(...), $((...)) and ${...#...} in a /bin/sh. Nowadays, this is often taken for granted; the last OS to have a Bourne shell as /bin/sh is probably Solaris 10 (11 has ksh93). On the contrary, our /bin/sh is minimalistic compared to many other shells used in that role, like bash, pdksh, mksh and ksh93. It (the 9.0 version) has only slightly more features than dash or NetBSD's sh, and dash has instead some other features. Just wondering if the general direction here is attempting to provide a minimal POSIX shell, that is useful enough interactively to become the default root shell (supplanting csh)? Or if there is just a general trend towards adopting more of the ksh feature-set. POSIX itself has gradually adopted ksh features, so seeing more of them in future is not unlikely. Most of the new language features in 9.0 are either from POSIX.1-2008 or on the roadmap for a new version of POSIX (in collaboration with other shell authors). Adding other ksh features is not very likely. It is certainly possible to use /bin/sh as root's shell, but the distributed master.passwd entry will probably continue to use /bin/csh for a long time. Some plans for sh in 10.0 are in this mailing list post: http://lists.freebsd.org/pipermail/freebsd-arch/2011-December/011976.html -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: ps -e without procfs(5)
On Sat, Oct 29, 2011 at 01:32:39PM +0300, Mikolaj Golub wrote: [KERN_PROC_AUXV requires just p_cansee()] If we are ever going to do ASLR, the AUXV information tells an attacker where the stack, executable and RTLD are located, which defeats much of the point of randomizing the addresses in the first place. Given that the AUXV information seems to be used by debuggers only anyway, I think it would be good to move it to p_candebug() now. The full virtual memory maps (KERN_PROC_VMMAP, procstat -v) are already under p_candebug(). -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: ps -e without procfs(5)
On Sun, Dec 04, 2011 at 10:58:10PM +0200, Mikolaj Golub wrote: RNMW Agreed. In general, my view is that p_cansee() should be used for very RNMW few of our process inspection APIs. I like your example of ASLR RNMW especially, as it illustrates how debugging information can aid even RNMW local attacks (i.e., user vs. setuid binary). What do you think about recently added kern.proc.ps_strings, which returns location of ps_strings structure? It uses p_cansee() too. The location is the same for all processes of the same ABI, so this does not look like sensitive information, on the other hand it also seems to be used by debuggers only. With stack ASLR, the address will not be the same for every process of the same ABI and will be sensitive information. Therefore I think this should be locked down too. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: cron(8) mis-feature with @reboot long after system startup
On Sat, Nov 26, 2011 at 02:58:46PM -0500, APseudoUtopia wrote: On Fri, Nov 25, 2011 at 4:36 PM, Christian Kastner deb...@kvr.at wrote: On 2011-11-25 08:02, Jason Hellenthal wrote: So with that said... is there a way we could actually make this run @reboot only ? Debian's cron[0] and Fedora's cronie[1] have solved this by touching a file on first startup and running @reboot only when this file does not yet exist. I like this idea, however it has a major caveat: Assuming the shutdown scripts remove said file (and the boot scripts create said file), what happens in the event that the disk was umount'ed uncleanly? For example, a power failure (I know, that's what UPSs are for, but lets ignore that for a second). If the system is configured to automatically boot after a power failure, the @reboot cron script wont run (since the said file still exists...). The file can be stored in /var/run, which is cleared at boot. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: xlocale patch
On Wed, Sep 21, 2011 at 08:28:54PM +0100, David Chisnall wrote: And here's an updated version of the patch. I've fixed some other bugs, including where wcstod() and wcstodl() in trunk return the wrong value for any input string starting with spaces, wchar.h's violation of POSIX by not declaring FILE, and a few C++ incompatibilities in other headers (e.g. putchar being a macro, which breaks things like std::putchar(foo)). All of my libcxxrt and libc++ changes have now been pushed upstream, so this should now be repeatable. The libunwind port still has an irritating bug in the header, where the extern C {} block ends with a semicolon, which causes it to be rejected in any C++ program, but with that fixed you can compile libcxxrt (I used cmake .. -DCMAKE_CXX_FLAGS=-I/usr/local/include -nostdlib -g - hopefully I'll work out how to make CMake add these flags automatically soon...). Libc++ should build out of the box with cmake. I think it is unfortunate that xlocale.h depends on #include order; we should be moving away from such ordering dependencies, not adding more. POSIX does not have such dependencies so if they add all these new _l functions, they will most likely do it differently. POSIX.1-2008 has some of the _l functions but adds them to the plain header; for example isalnum_l() is in ctype.h like isalnum(), nl_langinfo_l() is in langinfo.h like nl_langinfo() and strcasecmp_l() is in strings.h like strcasecmp(). This appears more sensible to me. The xlocale.h header can then be an empty file. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Does anyone use nscd?
On Wed, Oct 05, 2011 at 03:54:00PM -0700, Artem Belevich wrote: 2011/10/5 Dag-Erling Sm?rgrav d...@des.no: Michael Bushkov bush...@freebsd.org writes: 2. Consequences of the aforementioned problem can probably be corrected by using _setsockopt(..., SO_NOSIGPIPE) in __open_cached_connection() in nscachedcli.c That sounds like a workaround rather than a fix... Not necessarily. Using SO_NOSIGPIPE is a valid option when someone wants to see read/write on a closed socket fail and return -1 with errno=EPIPE. Quick grep in libc shows that resolver code in lib/libc/resolv/res_send.c also sets SO_NOSIGPIPE for exactly that reason. Disabling SIGPIPE is good anyway because a crashing/dying nscd should not cause applications to terminate. However, if EPIPE/SIGPIPE happens in normal operation, that is still a bug that should be fixed. By the way, SO_NOSIGPIPE is not in POSIX.1-2008 while the MSG_NOSIGNAL flag to send() is. It may be better to replace the write() call with send() with the MSG_NOSIGNAL flag and drop the setsockopt(). -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Re[2]: Sharing device driver between kernel and user space
On Wed, Sep 21, 2011 at 04:23:28PM +0200, Ivan Voras wrote: On 21 September 2011 16:09, geoffrey levand geoffrey.lev...@mail.ru wrote: Sure i can use the synchronization primitives, the problem is that the response to a request sent to PS3 VUART port is not available immediately, and i have to disallow kernel access to the PS3 VUART while i'm waiting for the response in user space. I send request with write syscall from user space and wait for response with read syscall. In the period of time between sending request and receiving response i could receive some other packets from VUART port, e.g. some kind of event notification, i have to skip them. But kernel should not interfer until i get my response. So i would need to lock out the kernel during this time. I think i found a good solution for this problem, just use a IOCTL which tells kernel device driver to stop processing kernel requests and events, something like SET_USER_MODE. After that i can use it in user space. Perhaps it is better to lock out kernel activity (and further opens) while userland has the device open. In any case, a close should probably release the device (possibly after some sort of cleanup), so that a crash of the userland program does not necessitate a reboot before the device can be used again. Have you read sema(9)? sema(9) is best avoided in new code. It is not used much in existing code, and it would be useful to be able to remove some day in order to reduce redundancy in the provided synchronization primitives. Instead, see mutex(9) and condvar(9). The implementation of sema(9) is not very optimized, using a mutex and a condition variable. (This is unlike the userland sem_post(3) and friends which only require a single atomic op in the uncontested case, except for named semaphores in 8.x and earlier.) Or if returning EBUSY is acceptable when the resource is in use by $whatever, maybe you just need a boolean variable. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Mixing Asynchronous Network I/O and POSIX Threads
On Sun, Sep 18, 2011 at 11:32:08AM -0400, Richard Yao wrote: I wrote a program for Linux that uses Asynchronous Network I/O and POSIX Threads. It uses a mix of gettid(), fcntl() and F_SETOWN to specify which thread handles each connection's SIGIO interrupts. gettid() is Linux-specific and I would prefer to do this in a way that also works with FreeBSD. Is that possible? In FreeBSD, you can only F_SETOWN processes or process groups, not threads, so a direct port is probably not possible. Another Linux feature not in FreeBSD is F_SETSIG (to send a realtime signal instead of SIGIO and provide siginfo_t information). My recommendation is not to use SIGIO and SIGURG at all. If you use signal handlers, your program is very likely to suffer from race condition problems unless you unmask the signal only at well-defined safe points. If you use sigtimedwait() or similar, select()/poll() can provide similar functionality. Alternatively, if you have many descriptors to watch, use non-portable facilities like BSD kqueue, Linux epoll, Solaris ports or portable wrappers around them. Realtime signals are nasty for this -- the signal queue has a finite size and when it overflows you need a resync step that is expensive both in CPU and programmer time. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
fdopendir() wrongly closes passed fd on error, union mess
While trying to use openat()/fdopendir()/fstatat() to improve performance of sh(1) pathname generation, I noticed that fdopendir() closes the passed file descriptor if it fails (such as when stable/8 silently ignores O_DIRECTORY even though it is defined in the header file). The below patch should fix this problem. I have changed the ugly DTF_REWIND code so it moves the reopened directory to the same file descriptor number, so as to minimize change there while ensuring the correct fd is closed. A later possibility is to fix the DTF_REWIND problem by reading union directories from a new descriptor obtained via fd2 = openat(fd, ., O_RDONLY | O_DIRECTORY | O_CLOEXEC); and doing this unconditionally, retiring the DTF_REWIND flag. Reading the kernel code, it appears that calling getdirentries() on an open file description in a filesystem mounted with MNT_UNION (mount -o union) may change that open file description irreversibly to one pointing to the lower layer. This must not happen to the descriptor passed to fdopendir() or the descriptor returned via dirfd() since fchdir() and *at functions may not work properly with such a changed descriptor. Note: unionfs does not appear to mangle the open file description like MNT_UNION does, but it does need the duplicate removal code like MNT_UNION. Index: lib/libc/gen/opendir.c === --- lib/libc/gen/opendir.c (revision 224739) +++ lib/libc/gen/opendir.c (working copy) @@ -75,6 +75,8 @@ __opendir2(const char *name, int flags) { int fd; struct stat statb; + DIR *dir; + int saved_errno; /* * stat() before _open() because opening of special files may be @@ -89,7 +91,13 @@ __opendir2(const char *name, int flags) if ((fd = _open(name, O_RDONLY | O_NONBLOCK | O_DIRECTORY)) == -1) return (NULL); - return __opendir_common(fd, name, flags); + dir = __opendir_common(fd, name, flags); + if (dir == NULL) { + saved_errno = errno; + _close(fd); + errno = saved_errno; + } + return (dir); } static int @@ -110,6 +118,7 @@ __opendir_common(int fd, const char *name, int fla int incr; int saved_errno; int unionstack; + int fd2; struct stat statb; dirp = NULL; @@ -199,14 +208,15 @@ __opendir_common(int fd, const char *name, int fla * which has also been read -- see fts.c. */ if (flags DTF_REWIND) { - (void)_close(fd); - if ((fd = _open(name, O_RDONLY | O_DIRECTORY)) == -1) { + if ((fd2 = _open(name, O_RDONLY | O_DIRECTORY)) == -1) { saved_errno = errno; free(buf); free(dirp); errno = saved_errno; return (NULL); } + (void)_dup2(fd2, fd); + _close(fd2); } /* @@ -309,7 +319,6 @@ __opendir_common(int fd, const char *name, int fla fail: saved_errno = errno; free(dirp); - (void)_close(fd); errno = saved_errno; return (NULL); } -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: eliminating a syscall on accept()+ioctl() combo
On Mon, Aug 01, 2011 at 08:11:04AM +0200, Vlad Galu wrote: On Jul 31, 2011, at 9:59 PM, Bernard van Gastel wrote: I want to reduce the number of syscalls for my networking application. The app handles incoming connections with the 'accept()' system call. Is there a way to specify to accept() that the newly created file descriptors should be non-blocking (FIONBIO)? This will avoid an ioctl() after the accept(). Thanks! You can make your listening socket non-blocking. Newly created file descriptors will inherit that property. However, that will require you to select()/poll()/kqueue() for that descriptor as well, instead of simply blocking in accept(). This is documented FreeBSD behaviour and common across BSDs, but is not portable. POSIX leaves it unspecified what the non-blocking state of the new socket is and in fact Linux always makes the new socket blocking (unless you request non-blocking using their new accept4() call). Because this portability issue can be very subtle, I suggest not blindly relying on it. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: invalid argument in select() when peer socket is in FD_SET
On Sun, Jul 31, 2011 at 04:20:08PM +0200, Christoph P.U. Kukulies wrote: I posted this on freebsd-questions also but maybe the expert density isn't that high as here in hackers. Since I think it may be a design or implementation issue in FreeBSDs' select(), I'm posting it here as well, hoping to get an experts' answer. I have written a small server to test TCP/IP roundtrip times of the packets in a proprietary protocol and while compiling and running this server on different platforms (Windows 7/cygwin, UbuntuLinux, FreeBSD 8.0 Release), I found that the server produces an error when the listening socket (on which the accept() is performed) is also member of the select() fd_set. On the other platforms the program works without error, just under FreeBSD I'm getting this invalid argument error. Comments appreciated (despite comments about the error checking logic [snip] tv.tv_sec = 0; tv.tv_usec = 500; /* 5 seconds */ [snip] n = select(nfds, readfds, (fd_set *) NULL, /* not interested in write */ (fd_set *) NULL, /* ...or exceptions */ tv);/* timeout */ The number of microseconds in a struct timeval must be nonnegative and less than one million (likewise, the number of nanoseconds in a struct timespec must be nonnegative and less than 1000 million). FreeBSD checks this strictly in most functions. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
[PATCH] [RFC] sh(1) vfork support
; + TRACE((In parent shell: child = %d\n, (int)pid)); + return pid; +} + + /* * Wait for job to finish. * -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: tr A-Z a-z in locales other than C
On Tue, Jun 07, 2011 at 04:24:43AM +0400, Andrey Chernov wrote: On Tue, Jun 07, 2011 at 12:41:05AM +0200, Jilles Tjoelker wrote: There is a related issue with ranges in regular expressions, glob and fnmatch (likewise unspecified by POSIX outside the POSIX locale), but this is less likely to cause problems. You care about ports, but suggested change is americano-centrism which kills tr usage for national language documents due to impossibility to specify whole national alphabet easily, just by two letters. Hmm, so that's with translation to a constant, or with the -d and/or -s options. In such cases, there may be a range for all letters with collation order, but not with codeset order (mainly if all letters includes letters with diacritical marks). In FreeBSD, upper case sorts before lower case, so cases can be distinguished this way but all letters may require two ranges. In most other operating systems the cases go together so a single range is sufficient, but cases cannot be distinguished. Making such things work on multiple operating systems requires careful testing. Moreover, having differently treated regex ranges in tr vs other places you mention will produce additional chaos. I think this is already inconsistent because some programs do not enable locale or use different locale code. With UTF-8 or other multibyte character sets, this is even more so because functions like isalpha work very poorly by definition and there is no collation support for such character sets in FreeBSD. Back to the ports: it is not hard to run _any_ port's make or configure with LANG=C directly by the ports Mk system to eliminate that problem. True, but some ports install scripts with problematic tr calls. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: tr A-Z a-z in locales other than C
On Wed, Jun 08, 2011 at 09:56:39AM +1200, Atom Smasher wrote: the man page makes it clear... Translate the contents of file1 to upper-case. tr [:lower:] [:upper:] file1 (This should be preferred over the traditional UNIX idiom of ``tr a-z A-Z'', since it works correctly in all locales.) for any other uses, either build the port with locale specified as C as mentioned, or patch the port so: tr '[a-z]' '[A-Z]' becomes: env LC_ALL=C tr '[a-z]' '[A-Z]' the only change that would be appropriate to the tr utility would be a command-line option to select a locale... something like: tr -l C '[a-z]' '[A-Z]' i don't think anyone would object to that, but it would still require patching some ports under some locales... That new option would provide zero benefit. If things are going to be patched anyway then patch them to be standards compliant. maybe another option would be modifying tr to recognize other [new] environment variables... TR_LANG, TR_LC_ALL, TR_LC_CTYPE and TR_LC_COLLATE. done that way, things could be set in /etc/make.conf (or sys.mk), not need any patching, and not interfere with other uses of locale. That would be rather ugly. If tr a-z A-Z is supposed to be deceiving in some locales, then let it remain so unconditionally. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
tr A-Z a-z in locales other than C
++ = cnt; - *p = OOBCH; - n = p - s-set; - - s-cnt = 0; - s-state = SET; - if (n 1) - mergesort(s-set, n, sizeof(*(s-set)), charcoll); + s-cnt = stopval - s-lastch + 1; + s-state = RANGE; + --s-lastch; return (1); } -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: int64_t and printf
On Sun, Jun 05, 2011 at 02:31:27PM -0400, Sean C. Farley wrote: On Sun, 5 Jun 2011, Ben Laurie wrote: So, for example int64_t has no printf modifier I am aware of. Likewise its many friends. In the past I've handled this by having a define somewhere along the lines of... #if something # define INT_64_T_FMT %ld #else # define INT_64_T_FMT %lld #endif but I have no idea where to put such a thing in FreeBSD. Opinions? Also, I guess I'd really need to do a modifier rather than a format, for full generality. You need to include inttypes.h, which includes machine/_inttypes.h. This will provide the appropriate macro which in this case is PRId64. The macros from inttypes.h are certainly valid, but the style in most of FreeBSD prefers casting to a type such as intmax_t or uintmax_t for which a constant format string is available (%jd/%ju). In the particular case of int64_t, it would seem that long long is better than intmax_t, as it is possibly shorter but still guaranteed large enough by C99, but there are objections against this that I do not understand. Also, note the format strings for size_t, %zu, and for ptrdiff_t, %td. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [RFC] rcexecr: rcorder in parallel
On Sat, Jun 04, 2011 at 12:45:37PM +0100, Julien Laffaye wrote: On Sat, Jun 4, 2011 at 10:10 AM, Buganini bugan...@gmail.com wrote: https://github.com/buganini/rcexecr Currently it is able to determine the exec/wait order There are something I haven't digged in deeply in the self modification part. patches/ideas are welcome. Thanks for doing that! Yes. You should use kqueue(2) instead of waitpid(2) so that you can efficiently monitor a pool of processes. See pwait(1) for an example. Hmm, I don't think kqueue() should be used here. Its main advantage is that it works regardless of parent-child relationships, but that advantage is not relevant here. On the other hand, waitpid() is still necessary to get rid of the zombies. Furthermore, waitpid() is standard while kqueue() is not, and I think non-standard interfaces should only be used if they provide a real benefit above standard interfaces. The current approach with waitpid() for specific processes should be good enough for a proof of concept. It will keep zombies longer than necessary, particularly for things that are not explicitly depended on. To avoid this, use waitpid(-1, ...) and maintain more tracking for processes that have already terminated. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Does FreeBSD have replacement for posix_fadvice() or fcntl(F_RDADVISE)?
On Thu, May 12, 2011 at 11:38:12AM +0400, Lev Serebryakov wrote: Does FreeBSD have some custom call, which can be used where Linux programs uses posix_fadvice() and DARWIN ones fcntl(F_RDADVISE)? It is like madvise(2) but for file descriptors. An effect like POSIX_FADV_SEQUENTIAL can be obtained with the F_READAHEAD or F_RDAHEAD fcntl(2) requests. The implementation is divided between the generic VFS layer and various filesystems (cd9660, ext2fs, msdosfs, nfs, xfs and ufs appear to use the information to some degree). Setting readahead to 0 with the fcntl requests has little effect: it resets the state of the sequential access detection heuristic but does not disable it. The O_DIRECT open(2) flag has an effect similar to what POSIX_FADV_NOREUSE should do, except that it changes semantics by adding alignment restrictions. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [LIBC] Modfied Version of sscanf
On Sat, Apr 30, 2011 at 06:44:43PM +0200, Martin Möller wrote: This is my first email to this list, so hello to all members. The current version of sscanf, stops when a whitespace characters occurs in a string when the %s (string) type is used. The following code: char name [20], value [20]; sscanf (Test 2-Test 3, %s-%s, name, value); printf (%s-%s\n, name, value); outputs total garbage on my FreeBSD-7.0-RELEASE #0 amd64. Is there already a way to do this or should we release a new version of sscanf, e.g. called sscanfWS. This modified version would output: Test 2-Test 3. I think you should use functions like memchr(), strchr() and strtok_r() rather than sscanf(). For one, your code has undefined behaviour if the name or the value exceed 19 bytes. If the input is untrusted, as your follow-up seems to indicate, this undefined behaviour likely manifests in allowing an attacker to execute code of his own choosing. Even if you avoid the buffer overflow using a format string like %19s-%19s it is still not very good as you may not get an error if the string is too long. Silent truncation might invalidate security checks done elsewhere and can lead to hard-to-diagnose bugs. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Basic UTF-8 support for sh(1)
Here is a patch that adds basic UTF-8 support to sh(1). This is enabled if the locale is set appropriately. Features: * ${#var} counts codepoints. (Really, bytes with (b 0xc0) != 0x80.) * ?, [...] patterns match codepoints instead of bytes. They do not match invalid sequences. This is so that ${var#?} removes the first codepoint, not the first byte. However, * continues to match any string and an invalid sequence matches an identical invalid sequence. (This differs from fnmatch(3).) Internal: * CTL* bytes are moved to bytes that cannot occur in UTF-8 so that mbrtowc(3) can be used directly. The new locations do occur in iso-8859-* encodings. Limitations: * Only UTF-8 support is added, not any other multibyte encodings. I do not want to bloat up sh with mbrtowc(3) and similar everywhere. * Invalid sequences may not be handled as desired. It seems aborting on invalid UTF-8 sequences would break things, so they are let through. This also avoids bloating the code up with checking everywhere. * There is no special treatment for combining characters, accented letters may match ? or ?? or even more depending on normalization form. This matches other code in FreeBSD and is usually good enough because normalization forms that use as few codepoints as possible tend to be used. * IFS remains byte-based as in ksh93 (but unlike bash and zsh). * Our version of libedit does not support UTF-8 so sh will still be rather unpleasant to use interactively with characters not in us-ascii. Is this useful and worth the (small) bloat? A somewhat related feature is support for \u and \U sequences in $'...' (this will be added to POSIX, see http://austingroupbugs.net/view.php?id=249 and I plan to add it to sh). Ideally, these are converted using iconv(3) but as long as it is not unconditionally available in base or if it is not supposed to be used, the codepoints can be encoded in UTF-8 for UTF-8 locales, leaving other locales with question marks. -- Jilles Tjoelker Index: parser.h === --- parser.h (revision 218371) +++ parser.h (working copy) @@ -34,16 +34,16 @@ */ /* control characters in argument strings */ -#define CTLESC '\201' -#define CTLVAR '\202' -#define CTLENDVAR '\203' -#define CTLBACKQ '\204' +#define CTLESC '\300' +#define CTLVAR '\301' +#define CTLENDVAR '\371' +#define CTLBACKQ '\372' #define CTLQUOTE 01 /* ored with CTLBACKQ code if in quotes */ /* CTLBACKQ | CTLQUOTE == '\205' */ -#define CTLARI '\206' -#define CTLENDARI '\207' -#define CTLQUOTEMARK '\210' -#define CTLQUOTEEND '\211' /* only for ${v+-...} */ +#define CTLARI '\374' +#define CTLENDARI '\375' +#define CTLQUOTEMARK '\376' +#define CTLQUOTEEND '\377' /* only for ${v+-...} */ /* variable substitution byte (follows CTLVAR) */ #define VSTYPE 0x0f /* type of variable substitution */ Index: sh.1 === --- sh.1 (revision 218467) +++ sh.1 (working copy) @@ -2510,4 +2510,7 @@ was originally written by .Sh BUGS The .Nm -utility does not recognize multibyte characters. +utility does not recognize multibyte characters other than UTF-8. +The line editing library +.Xr editline 3 +does not recognize multibyte characters. Index: expand.c === --- expand.c (revision 218371) +++ expand.c (working copy) @@ -52,6 +52,7 @@ __FBSDID($FreeBSD$); #include stdlib.h #include string.h #include unistd.h +#include wchar.h /* * Routines to expand arguments to commands. We have to deal with @@ -111,16 +112,16 @@ static void addfname(char *); static struct strlist *expsort(struct strlist *); static struct strlist *msort(struct strlist *, int); static char *cvtnum(int, char *); -static int collate_range_cmp(int, int); +static int collate_range_cmp(wchar_t, wchar_t); static int -collate_range_cmp(int c1, int c2) +collate_range_cmp(wchar_t c1, wchar_t c2) { - static char s1[2], s2[2]; + static wchar_t s1[2], s2[2]; s1[0] = c1; s2[0] = c2; - return (strcoll(s1, s2)); + return (wcscoll(s1, s2)); } /* @@ -665,6 +666,7 @@ evalvar(char *p, int flag) int special; int startloc; int varlen; + int varlenb; int easy; int quotes = flag (EXP_FULL | EXP_CASE | EXP_REDIR); @@ -712,8 +714,15 @@ again: /* jump here after setting a variable with if (special) { varvalue(var, varflags VSQUOTE, subtype, flag); if (subtype == VSLENGTH) { -varlen = expdest - stackblock() - startloc; -STADJUST(-varlen, expdest); +varlenb = expdest - stackblock() - startloc; +varlen = varlenb; +if (localeisutf8) { + val = stackblock() + startloc; + for (;val != expdest; val++) + if ((*val 0xC0) == 0x80) + varlen--; +} +STADJUST(-varlenb, expdest); } } else { char const *syntax = (varflags VSQUOTE) ? DQSYNTAX @@ -721,7 +730,9 @@ again: /* jump here after setting
Re: man 3 getopt char * const argv[] - is const wrong ?
On Sun, Feb 13, 2011 at 01:59:38PM +0100, Matthias Andree wrote: Am So, 13.02.2011, 13:20 schrieb Julian H. Stacey: Hi Hackers Ref.: man 3 getopt int getopt(int argc, char * const argv[], const char *optstring); Ref.: KR 2nd Ed P.211 last indent, 2nd sentence The purpose of const is to announce [objects] that may be placed in read-only memory, and perhaps to increas[e] opportunities for optimization const only rarely allows optimization, mostly because a declaration like const foo *p only says that the foo may not be modified via p, not that it may not be modified at all. Another important purpose of const is to allow the programmer to document what is not changed by a function, and have the compiler check this documentation. optstring is obviously const, but I don't see that argv can calim to be const ? The const basically states that the argv[] elements (i. e. the char * pointers) are const[ant], or, read another way, that getopt promises not to mess with the *pointers* but is free to modify the actual strings pointed to (with the usual constraints of not assuming they can be extended, for instance) - and I don't see the problem in that. getopt() should not modify the strings themselves either, but there are at least two reasons why the type is not const char *const argv[]: 1. Elements of argv are written into char *optarg. 2. Many programs declare main's second argument as char *argv[] which cannot be converted to const char *const [], other than via a cast. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Removing PATH=...%builtin... from sh
Most ash derivatives have an undocumented feature where the presence of an entry %builtin in $PATH will cause builtins to be checked at that point of the PATH search, rather than before looking at any directories as documented in the man page (very old versions do document this feature). I would like to remove this feature from sh, as it complicates the code, may violate expectations (for example, /usr/bin/alias is very close to a forkbomb with PATH=/usr/bin:%builtin, only /usr/bin/builtin not being another link saves it) and appears to be unused (all the %builtin google code search finds is in some sort of ash source code). The feature also has a bug in that it does not work properly if there is a PATH assignment before 'type', 'command -v' or 'command -V' and %builtin differs between the set and temporary paths. (This worked in 8.x for 'command'; 'type' ignores PATH assignment in 8.x.) It is true that dash ignores %builtin for the special builtins and the regular builtins that are found before PATH, using it only for the other builtins. I could do the same but I think it is not useful enough. See also http://lists.freebsd.org/pipermail/svn-src-head/2010-November/022613.html Which builtins this is about: POSIX special builtins: : . break continue eval exec exit export readonly return set shift times trap unset POSIX commands found before PATH (regular builtins): alias bg cd command false fc fg getopts jobs kill pwd read true umask unalias wait Other POSIX utilities that must be builtins: hash type ulimit Other POSIX utilities: [ echo printf test Non-POSIX utilities that must be builtins: builtin bind chdir jobid local setvar Undocumented builtins: exp let wordexp And for completeness one command that should really be a builtin or otherwise magic: Missing POSIX commands found before PATH (regular builtins): newgrp -- Jilles Tjoelker Index: bin/sh/exec.c === --- bin/sh/exec.c (revision 216690) +++ bin/sh/exec.c (working copy) @@ -92,7 +92,6 @@ static struct tblentry *cmdtable[CMDTABLESIZE]; -static int builtinloc = -1; /* index in path of %builtin, or -1 */ int exerrno = 0; /* Last exec error */ @@ -245,8 +244,7 @@ } while ((name = *argptr) != NULL) { if ((cmdp = cmdlookup(name, 0)) != NULL - (cmdp-cmdtype == CMDNORMAL - || (cmdp-cmdtype == CMDBUILTIN builtinloc = 0))) + cmdp-cmdtype == CMDNORMAL) delete_cmd_entry(); find_command(name, entry, DO_ERR, pathval()); if (verbose) { @@ -337,8 +335,8 @@ goto success; } - /* If %builtin not in path, check for builtin next */ - if (builtinloc 0 (i = find_builtin(name, spec)) = 0) { + /* Check for builtin next */ + if ((i = find_builtin(name, spec)) = 0) { INTOFF; cmdp = cmdlookup(name, 1); if (cmdp-cmdtype == CMDFUNCTION) @@ -354,7 +352,7 @@ prev = -1; /* where to start */ if (cmdp) { /* doing a rehash */ if (cmdp-cmdtype == CMDBUILTIN) - prev = builtinloc; + prev = -1; else prev = cmdp-param.index; } @@ -366,19 +364,7 @@ stunalloc(fullname); idx++; if (pathopt) { - if (prefix(builtin, pathopt)) { -if ((i = find_builtin(name, spec)) 0) - goto loop; -INTOFF; -cmdp = cmdlookup(name, 1); -if (cmdp-cmdtype == CMDFUNCTION) - cmdp = loc_cmd; -cmdp-cmdtype = CMDBUILTIN; -cmdp-param.index = i; -cmdp-special = spec; -INTON; -goto success; - } else if (prefix(func, pathopt)) { + if (prefix(func, pathopt)) { /* handled below */ } else { goto loop; /* ignore unimplemented options */ @@ -485,8 +471,7 @@ for (pp = cmdtable ; pp cmdtable[CMDTABLESIZE] ; pp++) { for (cmdp = *pp ; cmdp ; cmdp = cmdp-next) { - if (cmdp-cmdtype == CMDNORMAL - || (cmdp-cmdtype == CMDBUILTIN builtinloc = 0)) + if (cmdp-cmdtype == CMDNORMAL) cmdp-rehash = 1; } } @@ -506,13 +491,11 @@ const char *old, *new; int idx; int firstchange; - int bltin; old = pathval(); new = newval; firstchange = ; /* assume no change */ idx = 0; - bltin = -1; for (;;) { if (*old != *new) { firstchange = idx; @@ -523,19 +506,12 @@ } if (*new == '\0') break; - if (*new == '%' bltin 0 prefix(builtin, new + 1)) - bltin = idx; if (*new == ':') { idx++; } new++, old++; } - if (builtinloc 0 bltin = 0) - builtinloc = bltin; /* zap builtins */ - if (builtinloc = 0 bltin 0) - firstchange = 0; clearcmdentry(firstchange); - builtinloc = bltin; } @@ -556,9 +532,7 @@ pp = tblp; while ((cmdp = *pp) != NULL) { if ((cmdp-cmdtype == CMDNORMAL - cmdp-param.index = firstchange) - || (cmdp-cmdtype == CMDBUILTIN - builtinloc = firstchange)) { + cmdp-param.index = firstchange)) { *pp = cmdp-next; ckfree(cmdp); } else { ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd
Re: libkvm: consumers of kvm_getprocs for non-live kernels?
On Wed, Nov 10, 2010 at 09:41:52PM +0100, Ulrich Spörlein wrote: I have this cleanup of libkvm sitting in my tree and it needs a little bit of testing, especially the function kvm_proclist, which is only called from kvm_deadprocs which is only called from kvm_getprocs when kd is not ALIVE. The only consumer in our tree that I can make out is *probably* kgdb, as ps(1), top(1), w(1), pkill(1), fstat(1), systat(1), pmcstat(8) and bsnmpd don't really work on coredumps But, the kgdb file gnu/usr.bin/binutils/gdb/kvm-fbsd.c, where kvm_getprocs is probably called on a dead kernel is not even used during build! So I guess I'm staring at dead code here, any kvm people around that can clue me in? It is a while ago that I last used this, but ps and fstat definitely worked on crashdumps, to some extent. /usr/sbin/crashinfo uses this. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [PATCH] update to the latest libedit version and remove libreadline deps
On Fri, Nov 05, 2010 at 04:32:56PM +0100, Baptiste Daroussin wrote: I've updated libedit to the latest version available in the netbsd cvs. UTF8 support is disabled for now has it seems to be experimental and segfault. I also patch and tested all the sources that used to be linked against libreadline so that it now uses libedit making libreadline unused (I guess, perhaps I have missed some) beware that there are collision between libreadline and libedit (/usr/include/readline/readline.h) is provided by both of them. You can find the patch against current here: http://people.freebsd.org/~bapt/update-libedit.patch This patch reverts various local enhancements to libedit, such as: * r212191 more expected behaviour of ed-delete-next-char * r212235 delete key support * r209217,r209219,r209224 completion improvements for sh(1) * built-in \033[1~ and \033[4~ support (not sure how useful this is) * .An macros in man pages If the csh-style history expansion gets in, a set -o histexpand for sh could be useful. I think this is not important enough to add all the code for, but if we have the code anyway why not allow using it. On the other hand, if we can replace it with stubs and is very simplistic, perhaps it is better to stub it out and not change sh. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Space character in rc.conf variable
On Sat, Oct 30, 2010 at 01:11:51PM -0700, Doug Barton wrote: On 10/30/10 11:56, Harald Servat wrote: My SSID has an space character and I don't know how to deal with it. So my question is easy, how do I add an space in the SSID? I've tried wrapping the SSID using \ , \' also adding \(space), changing for ' and others without success. As an example my rc.conf entry looks like ifconfig_wlan0=ssid SSID WITH SPACE dhcp and after booting I see that ifconfig tried to connect to SSID instead to SSID WITH SPACE. As I said, if I run ifconfig wlan0 ssid SSID WITH SPACE from the command line, works fine. The best solution would be to change the SSID to one without spaces. :) Well, as mentioned elsewhere in the thread write the SSID in hex. You also need this if *, ? or [ occur in the SSID. If you can't or won't do that, one of these should work, please report back which one does: ='ssid ssid with space DHCP' ='ssid \ssid with space\ DHCP' =ssid 'ssid with space' DHCP =ssid \'ssid with space\' DHCP This will not work, as the value is stored in a variable and then split on IFS characters by expanding the variable outside quotes. There is no escape mechanism - either the whole variable is split or none of it (if in double-quotes). One workaround would be to set IFS to something not containing space, for example tabnewline, and then delimiting fields with tabs instead of spaces. Apart from the general ugliness all over the rc.confs (including the defaults/ one), breakage would probably also happen in rc.subr and network.subr. Doug's suggestions point to something like eval set -- ${ifconfig_FOO} ifconfig $_if $@ This has potential security issues with arbitrary stuff from untrusted sources (say, some sort of privilege separation where someone may configure the network but not do arbitrary things) in /etc/rc.conf: ifconfig_wlan0='ssid $(chown evilhacker topsecretfile) dhcp' is perfectly safe now, but will be insecure. (Pathname generation may lead to information disclosure, but not executing arbitrary commands.) Also, the syntax for the keywords such as DHCP is too flexible. This currently does not do much harm other than stopping you from using DHCP as an ssid or something, but is hard to remove from $@ in the above code. If they were restricted to appear at the start only, simple comparisons of $1 and shift would do the job, but removing from the middle requires constructing a string with quoting characters and then using eval on that string. Array support in the shell could make this easier, but not much unless the rc.conf syntax would be changed as well, like ifconfig_wlan0=(ssid SSID WITH SPACE dhcp) but then changed some more such that DHCP can be used as SSID. Array support would add a fair bit of code to sh, and even then it will remain fairly limited and clumsy. For example, an array can only be returned from a function by passing the name of an existing array to place the result in and using eval, extending setvar in some strange way or implementing another ksh93 extension, namerefs. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: sysrc -- a sysctl(8)-like utility for managing /etc/rc.conf et. al.
On Sat, Oct 09, 2010 at 03:39:44PM -0700, Devin Teske wrote: On Oct 9, 2010, at 1:25 PM, Garrett Cooper wrote: [...] The second usage (: function) aids in finding the function declaration among the usages. See, in Perl, I can simply search for sub preceding the function name. In C, I tend to split the return type from the function name and ensure that the function name always starts in column-1 so I can search for ^funcname to go to the declaration opposed to the usages/references. In BASH, `function' is a valid keyword and you can say function funcname ( ) BLOCK but unfortunately in good ol' bourne shell, function is not an understood keyword, ... but really liking this keyword, I decided to make use of it in bourne shell by-way-of simply making it a non-executed-expression (preceded it with : and terminated it with ;). I think function f() { ... } is a despicable bashism. In POSIX, an empty pair of parentheses is an unambiguous indication of a function definition; although FreeBSD sh also accepts () as a subshell containing an empty command, there is no reason to use that. In ksh, function f { ... } is different from f() { ... }, so there is a reason to use it, and function f() { ... } does not exist. I originally had been programming in tests for '!' and 'in', but in POSIX bourne-shell, they aren't defined (though understood) in the keyword table (so type(1) balks in bourne-shell while csh and bash do respond to '!' and 'in' queries). Since you've pointed out command(1)... I now have a way of checking '!'. Though unfortunately, command -v, like type(1), also does not like in (in bourne-shell at least). The type builtin in the original Bourne shell does not know keywords. The type builtin in FreeBSD sh knows them, but unlike most shells in is not a keyword -- it is only recognized at the appropriate places in a case or for command (this is probably a POSIX violation and may change in the future). x$fmt != x ? It seems like it could be simplified to: if [ $# -gt 0 ] then local fmt=$1 shift 1 eprintf $fmt\n $@ fi I never understood why people don't trust the tools they are using... `[' is very very similar (if not identical) to test(1) [ ... ] is the same thing as [ -n ... ] or test -n ... [ ! ... ] is the same things as [ -z ... ] or test -z ... I'll never understand why people have to throw an extra letter in there and then compare it to that letter. If the variable expands to nothing, go ahead and let it. I've traced every possible expansion of variables when used in the following manner: [ $VAR ] ... and it never fails. If $VAR is anything but null, the entire expression will evaluate to true. Right, except in very old implementations if VAR is -t, but those should be extinct and are not worth coding for. However, in some fairly recent implementations [ $A = $B ] does not work as expected for all values of A and B. For example, FreeBSD 7.0 returns true for A='(' and B=')' because it applies the POSIX rules in the wrong order (certainly, [ '(' $X ')' ] is true for most non-empty values of X, but this does not apply if X is an operator). # depend $name [ $dependency ... ] # # Add a dependency. Die with error if dependency is not met. # : dependency checks performed after depend-function declaration : function ; depend ( ) # $name [ $dependency ... ] { local by=$1 arg shift 1 for arg in $@; do local d Wouldn't it be better to declare this outside of the loop (I'm not sure how optimal it is to place it inside the loop)? I'm assuming you mean the local d statement. There's no restriction that says you have to put your variable declarations at the beginning of a block (like in C -- even if only within a superficial block { in the middle of nowhere } ... like that). Meanwhile, in Perl, it's quite a difference to scope it to the loop rather than the block. So, it all depends on whichever _looks_ nicer to you ^_^ Although this works, I do not recommend it. It makes the impression that the variable is scoped to the do block, which is not the case as there is only function scope. Also, FreeBSD sh will register another local variable record if you make the same variable local again (but it will reset the variable to the correct value later). I'd say that you have bigger fish to try if your shell lacks the needed lexicon to parse built-ins like for, do, local, etc :)... local might be worth checking for as the original Bourne shell and ksh93 don't know it. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Why I can't trace linux process's childs with truss?
On Fri, Sep 10, 2010 at 12:07:05PM -0700, Yuri wrote: I am trying to get the log of all system calls that skype makes with truss -f /usr/local/share/skype/skype For some reason the resulting log only has the leading process calls and nothing from it's 8 childs. Truss doesn't show any 'cloned' processes. Is this a bug in truss that it doesn't follow 'cloned' processes? Is there any workaround or other way I can debug skype? strace doesn't work on amd64. I am primarily interested why it can't read /dev/video0 device, created by webcamd. Try using ktrace instead of truss. You will need devel/linux_kdump from ports to decode the resulting ktrace.out. Alternatively, if you're familiar with dtrace, you could try that. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: ar(1) format_decimal failure is fatal?
On Sat, Aug 28, 2010 at 07:08:34PM -0400, Benjamin Kaduk wrote: I've been working on fixing the OpenAFS network filesystem client for FreeBSD, and it's at the point where I want to use the lazy man's filesystem stress test: buildworld. However, quite early in the process, I get the following error: stage 1.1: legacy release compatibility shims [...] === tools/build (obj,includes,depend,all,install) [...] building static egacy library ar: fatal: Numeric user ID too large *** Error code 70 This error appears to be coming from lib/libarchive/archive_write_set_format_ar.c , which seems to only have provisions for outputting a user ID in AR_uid_size = 6 columns. Now, AFS user IDs are not tied to unix IDs; there is a separate protection server whose pts IDs are used for access control; these pts IDs are tied to kerberos authentication. In this case, I was authenticated as daemon/freebuild.mit@athena.mit.edu, with pts ID 33554737, which needs 8 columns to display. It looks like this macro was so defined in version 1.1 of that file, with commit message 'ar' format support for libarchive, contributed by Kai Wang.. This doesn't make it terribly clear whether the 'ar' format mandates this length, or if it is an implementation decision, so I get to ask: what reasoning (if any) was behind this choice? Would anything break if it was bumped up to a larger size? Are there other options for a workaround in my AFS environment? I wonder if the uid/gid fields are useful at all for ar archives. Ar archives are usually not extracted, and when they are, the current user's values seem good enough. The uid/gid also prevent exactly reproducible builds (together with the timestamp). -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: tiny patch to prevent head from closing pipes
On Sat, Aug 28, 2010 at 06:42:29PM +0400, Anonymous wrote: Alexander Best arun...@freebsd.org writes: i just had subversion complain about a broken pipe while piping its output through awk straight to head [1]. i decided to add a switch to head which will tell it to never close a pipe unless the input has stopped [2]. You can do same with sh(1), e.g. $ svn log | (IFS=; while read li; do [ $((i+=1)) -le 10 ] echo $li; done) versus $ svn log | (IFS=; while read li [ $((i+=1)) -le 10 ]; do echo $li; done) ... svn: Write error: Broken pipe Even easier (and avoiding mangling through read) is svn log | { head; cat /dev/null; } Beware that the cat command will not necessarily get all of the input. head may buffer reads and it is unable to push the extra data back onto the pipe. But I think subversion should I think Subversion should not print an error message for broken pipe on its stdout. there's probably a much more efficient way of discarding the input without closing the pipe unless the input ceased. it's just a 5 minute hack in order to see if people find the idea useful or not. ;) Can you give an example of usefulness that does not involve subversion? -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Help with some makefile hackery
On Tue, Jun 22, 2010 at 11:18:43PM -0700, Patrick Mahan wrote: src-kern-tools: cd src; ./machine-kernel-toolchain.sh 21 | tee logfile The pipeline will return the status of 'tee' which is almost always 0. The exit status of the build script is ignored. A simple fix is store the status in a file and refer to that afterwards. Another fix uses a separate file descriptor to pass through the status, like { st=$( { { ./kernel-toolchain.sh 21 3- 4-; echo $? 3; } | tee logfile 4 } 31) } 41 but this is fairly complicated. The issue can be sidestepped entirely by sending the output to a file only; a developer can use tail -f to follow the build if necessary. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: libc symbol versioning difficulties with iconv integration
On Fri, Jun 04, 2010 at 12:58:34PM +0200, Gabor Kovesdan wrote: I'm trying to integrate the result of my last SoC work to the base system but I'm facing some difficulties with libc symbol versioning. I placed the iconv code into an iconv subdirectory inside src/lib/libc and I added a Makefile and a symbol map, just like another parts of libc do but when I try to compile this stuff, I get this error in the linking phase: building shared library libc.so.7 /usr/bin/ld: libc.so.7: undefined versioned symbol namefts_o...@fbsd_1.0 /usr/bin/ld: failed to set dynamic section sizes: Bad value *** Error code 1 I have no idea what's going wrong because I did everything exactly in the same way as another components do. I don't know why does it break at fts_open(), which is unrelated to iconv, not even used in the iconv code. If I just unhook the iconv part fromt he build, everything goes fine. Any ideas? Patch is here: http://kovesdan.org/patches/iconv-libc.diff There is a .include bsd.lib.mk in iconv/Makefile.inc, what happens if you take that out? -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: /dev/null zero inside chroot for make release
On Sat, May 15, 2010 at 09:12:04AM +0200, Tjado Mäcke wrote: Am 13.05.2010 19:44, schrieb Julian H. Stacey: Hi Hackers, Problem with /dev/null /dev/zero inside a chroot: I wanted to build a release from inside a chroot What sort of null zero should be in chroot ? man mknod ... deprecated ... Should I be running a devfs (I'm not currently) Or a jail ? (I dont really want that level of encapsulation ). http://www.ijs.si/software/amavisd/README.chroot mknod dev/nullc 2 2 # FreeBSD This is deprecated as of FreeBSD 5.0 and does not work at all as of FreeBSD 6.0, see mknod(8). Only device nodes in devfs can be used to access devices. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: /dev/null zero inside chroot for make release
On Thu, May 13, 2010 at 07:44:58PM +0200, Julian H. Stacey wrote: Problem with /dev/null /dev/zero inside a chroot: I wanted to build a release from inside a chroot What sort of null zero should be in chroot ? man mknod ... deprecated ... Should I be running a devfs (I'm not currently) Or a jail ? (I dont really want that level of encapsulation ). Mount devfs in your chroot. Then use devfs(8) to limit what's visible in that particular devfs, if you want. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [patch] [RFC] pathchk quiet flag
On Mon, Apr 26, 2010 at 08:57:31PM +0300, Eitan Adler wrote: This path adds a -q flag to stop pathchk from outputting error messages but still return an error code. Why do you need this; can't you do something like pathconf -- $pathname 2/dev/null -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
libc NLS, NFS mounted /usr/local, DHCP, no default route causes hangs
The changes to use NLS for strerror sometimes cause one of my virtual machines to deadlock. This virtual machine runs 9-CURRENT, acquires its IP address via DHCP (virtualbox host-only networking), has no default route and has /usr/local and /usr/home NFS mounted. When the DHCP lease expires such as by resetting the date after a VM saverestore, one of the route(8) commands executed by dhclient-script(8) fails and calls strerror(3). Following the default NLSPATH, catopen(3) looks in /usr/share/nls first; because the catalog is not there it then tries in /usr/local/share/nls which deadlocks because the network is not available. I currently use the attached patch which returns failure on any attempt to open a catalog for language C, but I think this is not correct. Am I using a configuration that is not supposed to work (NFS mounted /usr/local with DHCP in particular), or should this be fixed in some other way? -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: libc NLS, NFS mounted /usr/local, DHCP, no default route causes hangs
On Fri, Apr 23, 2010 at 04:17:36PM +0200, Jilles Tjoelker wrote: The changes to use NLS for strerror sometimes cause one of my virtual machines to deadlock. This virtual machine runs 9-CURRENT, acquires its IP address via DHCP (virtualbox host-only networking), has no default route and has /usr/local and /usr/home NFS mounted. When the DHCP lease expires such as by resetting the date after a VM saverestore, one of the route(8) commands executed by dhclient-script(8) fails and calls strerror(3). Following the default NLSPATH, catopen(3) looks in /usr/share/nls first; because the catalog is not there it then tries in /usr/local/share/nls which deadlocks because the network is not available. I currently use the attached patch which returns failure on any attempt to open a catalog for language C, but I think this is not correct. Am I using a configuration that is not supposed to work (NFS mounted /usr/local with DHCP in particular), or should this be fixed in some other way? The patch: Index: lib/libc/nls/msgcat.c === --- lib/libc/nls/msgcat.c (revision 206760) +++ lib/libc/nls/msgcat.c (working copy) @@ -138,6 +138,9 @@ lang = C; } + if (strcmp(lang, C) == 0) + NLRETERR(ENOENT); + /* Try to get it from the cache first */ RLOCK(NLERR); SLIST_FOREACH(np, cache, list) { -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: System() returning ECHILD error on FreeBSD 7.2
On Wed, Feb 10, 2010 at 12:44:57PM +0530, Naveen Gujje wrote: [SIGCHLD handler that calls waitpid()] And, in some other part of the code, we call system() to add an ethernet interface. This system() call is returning -1 with errno set to ECHILD, though the passed command is executed successfully. I have noticed that, the problem is observed only after we register SigChildHandler. If I have a simple statement like system(ls) before and after the call to signal(SIGCHLD, SigChildHandler), the call before setting signal handler succeeds without errors and the call after setting signal handler returns -1 with errno set to ECHILD. Here, I believe that within the system() call, the child exited before the parent got a chance to call _wait4 and thus resulted in ECHILD error. But, for the child to exit without notifying the parent, SIGCHLD has to be set to SIG_IGN in the parent and this is not the case, because we are already setting it to SigChildHandler. If I set SIGCHLD to SIG_DFL before calling system() then i don't see this problem. I would like to know how setting SIGCHLD to SIG_DFL or SigChildHanlder is making the difference. I think your process is multi-threaded. In a single-threaded process, system()'s signal masking will ensure it will reap the zombie, leaving the signal handler with nothing (in fact, as of FreeBSD 7.0 it will not be called at all unless there are other child processes). In a multi-threaded process, each thread has its own signal mask and system() can only affect its own thread's signal mask. If another thread has SIGCHLD unblocked, the signal handler will race with system() trying to call waitpid() first. Possible fixes are using siginfo_t information to only waitpid() child processes you know about, setting up the signal masks so the bad situation does not occur (note that the signal mask is inherited across pthread_create()) and calling fork/execve and managing the child process exit yourself. Note that POSIX does not require system() to be thread-safe. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: ps time field jumps backward
On Fri, Feb 05, 2010 at 05:45:01PM -0500, Linda Messerschmidt wrote: On Fri, Feb 5, 2010 at 4:28 PM, Dan Nelson dnel...@allantgroup.com wrote: Ideally, top and ps would total up all the per-thread CPU counts when displaying the per-process numbers, but it doesn't seem to. It does seem to total them: $ ps axHo pid,lwp,time,wchan,comm | awk '$1 == 1647' 1647 100401 0:00.63 select mysqld 1647 100466 0:11.08 sigwai mysqld 1647 100521 0:00.00 ucond mysqld $ ps axo pid,lwp,time,wchan,comm | awk '$1 == 1647' 1647 100521 0:11.71 ucond mysqld But you put me on the right track. I ran both side by side for awhile, and found that ps/top only sums up those threads that haven't already exited. I.e., once a thread has exited, it's as if its usage never happened from the perspective of ps and top's total calculation. That seems like undesirable behavior, particularly if it conceals CPU-churning behavior by short-lived threads, but possibly very hard to change. :( I wonder if the system accounting records are more accurate? It should not be very hard to fix the time field, because the rusage struct is correct. This can be seen in the ^T status line and, after the process has terminated, time, times, ps S. In the kernel code, sys/kern/kern_proc.c, it seems that fill_kinfo_proc_only() puts in the correct ki_runtime using p-p_rux.rux_runtime, but fill_kinfo_aggregate() later overwrites this using the sum of all threads' td-td_runtime. Removing the bogus calculation from fill_kinfo_aggregate() fixes ps(1)'s time field, but not the %cpu field, nor anything in top(1). The latter is because top(1) always requests information about threads and does the same wrong calculation as fill_kinfo_aggregate() internally. Fixing the %cpu field needs changes to the scheduler. The schedulers have functions to propagate CPU usage from terminating child processes and threads, but this seems to affect scheduling decisions only and not the %cpu field. Note that the CPU usage is always charged to the most recently created thread in the process, not necessarily the thread that called or will call fork(), pthread_create(), waitpid() or pthread_join(). If the thread charged to could be selected better, it could be useful to add in the cpu time as well. Index: sys/kern/kern_proc.c === --- sys/kern/kern_proc.c(revision 203549) +++ sys/kern/kern_proc.c(working copy) @@ -676,11 +676,9 @@ kp-ki_estcpu = 0; kp-ki_pctcpu = 0; - kp-ki_runtime = 0; FOREACH_THREAD_IN_PROC(p, td) { thread_lock(td); kp-ki_pctcpu += sched_pctcpu(td); - kp-ki_runtime += cputick2usec(td-td_runtime); kp-ki_estcpu += td-td_estcpu; thread_unlock(td); } Test program that starts threads that waste about 1 second of cpu time each. #include pthread.h #include stdio.h #include unistd.h #include err.h void * threadfunc(void *unused) { int error; error = pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL); if (error) errc(1, error, pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS)); for (;;) ; } int main(int argc, char *argv[]) { int error; pthread_t td; for (;;) { error = pthread_create(td, NULL, threadfunc, NULL); if (error != 0) errc(1, error, pthread_create); sleep(1); error = pthread_cancel(td); if (error != 0) errc(1, error, pthread_cancel); error = pthread_join(td, NULL); if (error != 0) errc(1, error, pthread_join); } return 0; } -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Unique process id (not pid) and accounting daemon
On Mon, Jan 25, 2010 at 02:33:35AM +0300, cronfy wrote: I am trying to create an accounting daemon that would be more precise than usual BSD system accounting. It should read the whole process tree from time to time (say, every 10 seconds) and log changes in usage of CPU, I/O operations and memory per process. After daemon notices process exit, it should read /var/account/acct to get a last portion of accounting data and make a last entry for the process. Also daemon should read /var/account/acct to find information about processes that had been running between taking process tree snapshots. There is a problem: it is not always possible to link a process in a process tree against matching process in an accounting file. Only command name, user/group id and start time will match, but: * start time may change (i. e. after ntpdate); * command name saved in /var/account/acct is 15 characters max (AC_COMM_LEN in sys/sys/acct.h), while command name in the process tree is 19 characters max (MAXCOMLEN in sys/sys/param.h). To ensure that process in the process tree and process in the accounting file are the same, I want to add unique process identifier (uint64_t) to 'proc' struct in sys/sys/proc.h and increment it for every process fork. I see it is possible to do this just before sx_sunlock() in fork1() in sys/kern/kern_fork.c. I'll have to add saving of this identifier in kern_acct.c, of course. This way I will be extremely easy to remember a process in the process tree and find a matching one in the accounting file after it finishes. Am I looking in a right direction or should I try some other way? Thanks in advance. Have you looked at audit(4)? -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Greetings... a patch I would like your comments on...
On Fri, Jan 22, 2010 at 07:55:47AM -0800, Randall Stewart wrote: On Jan 22, 2010, at 7:33 AM, Ivan Voras wrote: On 01/22/10 16:10, Ed Schouten wrote: * Ivan Vorasivo...@freebsd.org wrote: This is a good and useful addition! I think Windows has implemented a generalization of this (called wait objects or something like that), which effectively allows a select()- (or in this case kqueue())-like syscall to wait on both file descriptors and condvars (as well as probably other MS-style objects). It's useful for multiplexing events for dissimilar sources. NtWaitForSingleObject(), NtWaitForMultipleObjects(), etc. :-) Yes, I was thinking about WaitForMultipleObjects() - I sometimes wished I had it in FreeBSD :) I think the hackers@ side of the thread is missing the original link to the patch file offered for review, so here it is: http://people.freebsd.org/~rrs/kque_umtx.patch Cool. My kqueue-fu level is too low to be really useful here but from what I've read it looks like a logical and even reasonably clean way of doing it. thanks it made sense to me ;-) If I read the comment at filt_umtxattach() correctly, in the best case you would need an extension to the kevent structure to add more fields like data udata (for passing values back and forth between userland and kernel). I agree with this - it would be very convenient for some future purposes (like file modification notification) if the kernel filter could both accept and return a struct of data from/to the userland. Yeah, more arguments inside the kevent would allow me to add the COND_CV_WAIT* where a lock and condition are passed in as well... But I was hesitant to add more than was already there since doing so would cause ABI ripples that I did not want to face. I don't think passing the lock is needed. Traditional way (error handling omitted): pthread_mutex_lock(M); while (!P) pthread_cond_wait(C, M); do_work(); pthread_mutex_unlock(M); The thread must be registered as a waiter before unlocking the mutex, otherwise a wakeup could be lost. Possible kqueue way (error handling omitted): kq = kqueue(); pthread_mutex_lock(M); while (!P) { pthread_cond_kqueue_register_wait_np(C, kq); pthread_mutex_unlock(M); nevents = kevent(kq, NULL, 0, events, 1, NULL); ... handle other events ... pthread_mutex_lock(M); } do_work(); pthread_mutex_unlock(M); close(kq); To avoid lost wakeup, the kqueue must be registered as a waiter before unlocking the mutex. Once it has been registered, it is safe to unlock the mutex as the kqueue will store the fact that the condition has been signalled. Hence, pthread_cond_kqueue_register_wait_np() or however it will be named does not need the mutex explicitly. kevent() needs to return some sort of identifier of C, possibly via kevent.udata. In that case the hack with fflags and data remains needed. Registering multiple condvars in one function call could be nasty, as it requires all the mutexes to be locked simultaneously. On the other hand if you do want registration and waiting in the same function, like kqueue can do with file descriptors, then it is needed to pass all the locks so they can be unlocked after registering and before waiting. I think this results in a rather complicated API though. Adding a mutex to a kqueue probably means that mutexes are used wrongly: they should not be locked for so long that it is desirable to wait for one together with other objects (pthread_mutex_timedlock() is meant as a fail-safe). Semaphores could also be waited for via kqueue, probably via a version of sem_trywait() that registers the kqueue as a waiter. Once this is signalled, the thread can call the function again to obtain the semaphore or register again if another thread obtained the semaphore first. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [PATCH] linprocfs dofilesystems
On Fri, Jan 08, 2010 at 06:19:29PM +0100, Fernando Apesteguía wrote: This patch implements the filesystems file in the linux proc fs. I have used it for some time without seeing any problems. Let me know in case this is useful. Concrete examples of Linux apps that start working or work better with this change will be helpful. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: linprocfs Input/output error
On Fri, Jan 01, 2010 at 06:45:33PM +0100, Fernando Apesteguía wrote: Hi all, I post here cause I didn't get any answers in freebsd-emulation. In 8.0-RELEASE-p1 if I try to execute this: cat /compat/linux/proc/cpuinfo ~/cpuinfo.txt I get cat: /compat/linux/proc/cpuinfo: Input/output error truss shows the read system call returns ERR#5. It is the same with other files from linprocfs. cat /compat/linux/proc/[any_file] works fine What am I missing? You are not missing anything, this is a bug. A while ago, cat(1) was changed to use larger buffers when writing to regular files on machines with sufficient RAM, usually a multiple of MAXPHYS. On the other hand, pseudofs (the general framework for filesystems such as procfs, linprocfs and linsysfs) in src/sys/fs/pseudofs/pseudofs_vnops.c pfs_read() fails any read over MAXPHYS + 1 with EIO. This limit probably has to do with the allocation of a buffer of that size using sbuf_new(9) (and so, malloc(9)). Some sort of limit seems appropriate, but MAXPHYS seems unrelated, and if the request is too long it should perhaps just truncate it. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
[PATCH] fts(3): allow FTS_LOGICAL without FTS_NOCHDIR
fts(3) currently automatically enables FTS_NOCHDIR if FTS_LOGICAL (-L in various utilities) is used, according to a comment because symlinks are too hard. However, the effect of fts_instr=FTS_FOLLOW seems quite similar to FTS_LOGICAL in chdir mode. Using one more bit in fts_flags, I made FTS_LOGICAL work with chdir. This has several advantages: * it runs faster * it eliminates the PATH_MAX restriction on deep directory trees * find -L ... -type l -delete can be made to work without ugliness (although it is of course still insecure if run over directory trees where untrusted users have write access) Question: why are the values for fts_flags in the public header file fts.h, even though they are supposedly private? In case the mailing list eats the patch, it is also available here: http://www.stack.nl/~jilles/unix/fts-logical-chdir.patch -- Jilles Tjoelker Index: include/fts.h === --- include/fts.h (revision 201269) +++ include/fts.h (working copy) @@ -106,6 +106,7 @@ #define FTS_DONTCHDIR 0x01 /* don't chdir .. to the parent */ #define FTS_SYMFOLLOW 0x02 /* followed a symlink to get here */ #define FTS_ISW 0x04 /* this is a whiteout object */ +#define FTS_DIRSYM 0x08 /* this is a symlink to a directory */ unsigned fts_flags; /* private flags for FTSENT structure */ #define FTS_AGAIN 1 /* read node again */ Index: lib/libc/gen/fts.c === --- lib/libc/gen/fts.c (revision 201269) +++ lib/libc/gen/fts.c (working copy) @@ -141,10 +141,6 @@ /* Shush, GCC. */ tmp = NULL; - /* Logical walks turn on NOCHDIR; symbolic links are too hard. */ - if (ISSET(FTS_LOGICAL)) - SET(FTS_NOCHDIR); - /* * Start out with 1K of path space, and enough, in any case, * to hold the user's paths. @@ -355,8 +351,10 @@ /* If skipped or crossed mount point, do post-order visit. */ if (instr == FTS_SKIP || (ISSET(FTS_XDEV) p-fts_dev != sp-fts_dev)) { - if (p-fts_flags FTS_SYMFOLLOW) + if (p-fts_flags FTS_SYMFOLLOW) { (void)_close(p-fts_symfd); +p-fts_symfd = -1; + } if (sp-fts_child) { fts_lfree(sp-fts_child); sp-fts_child = NULL; @@ -385,6 +383,14 @@ * FTS_STOP or the fts_info field of the node. */ if (sp-fts_child != NULL) { + if (p-fts_flags FTS_DIRSYM + !(p-fts_flags FTS_SYMFOLLOW)) { +if ((p-fts_symfd = _open(., O_RDONLY, 0)) 0) { + p-fts_errno = errno; + p-fts_info = FTS_ERR; +} else + p-fts_flags |= FTS_SYMFOLLOW; + } if (fts_safe_changedir(sp, p, -1, p-fts_accpath)) { p-fts_errno = errno; p-fts_flags |= FTS_DONTCHDIR; @@ -674,8 +680,8 @@ nlinks = 0; /* Be quiet about nostat, GCC. */ nostat = 0; - } else if (ISSET(FTS_NOSTAT) ISSET(FTS_PHYSICAL)) { - if (fts_ufslinks(sp, cur)) + } else if (ISSET(FTS_NOSTAT)) { + if (ISSET(FTS_PHYSICAL) fts_ufslinks(sp, cur)) nlinks = cur-fts_nlink - (ISSET(FTS_SEEDOT) ? 0 : 2); else nlinks = -1; @@ -707,7 +713,19 @@ */ cderrno = 0; if (nlinks || type == BREAD) { - if (fts_safe_changedir(sp, cur, dirfd(dirp), NULL)) { + if (cur-fts_flags FTS_DIRSYM + !(cur-fts_flags FTS_SYMFOLLOW)) { + if ((cur-fts_symfd = _open(., O_RDONLY, 0)) 0) { +if (nlinks type == BREAD) + cur-fts_errno = errno; +cur-fts_flags |= FTS_DONTCHDIR; +descend = 0; +cderrno = errno; + } else +cur-fts_flags |= FTS_SYMFOLLOW; + } + if (!(cur-fts_flags FTS_DONTCHDIR) + fts_safe_changedir(sp, cur, dirfd(dirp), NULL)) { if (nlinks type == BREAD) cur-fts_errno = errno; cur-fts_flags |= FTS_DONTCHDIR; @@ -796,7 +814,8 @@ } else if (nlinks == 0 #ifdef DT_DIR || (nostat - dp-d_type != DT_DIR dp-d_type != DT_UNKNOWN) + dp-d_type != DT_DIR dp-d_type != DT_UNKNOWN + (dp-d_type != DT_LNK || ISSET(FTS_PHYSICAL))) #endif ) { p-fts_accpath = @@ -883,7 +902,7 @@ FTSENT *t; dev_t dev; ino_t ino; - struct stat *sbp, sb; + struct stat *sbp, sb, sb2; int saved_errno; /* If user needs stat info, stat buffer already allocated. */ @@ -923,6 +942,20 @@ if (S_ISDIR(sbp-st_mode)) { /* + * Check if this is actually a symlink to a directory. + * If so, record this fact so we save a file descriptor + * to get back instead of using ... + */ + if (ISSET(FTS_LOGICAL) !follow) { + if (lstat(p-fts_accpath, sb2)) { +p-fts_errno = errno; +goto err; + } + if (S_ISLNK(sb2.st_mode)) +p-fts_flags |= FTS_DIRSYM; + } + + /* * Set the device/inode. Used to find cycles and check for * crossing mount points. Also remember the link count, used * in fts_build to limit the number of stat calls. It is ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
why does _PATH_STDPATH contain the current directory?
/usr/include/paths.h has: /* All standard utilities path. */ #define _PATH_STDPATH /usr/bin:/bin:/usr/sbin:/sbin: The current directory appears to have been added accidentally years ago. Can I go ahead and take it out (the colon)? The paths for rescue already do not have the current directory. The main use for _PATH_STDPATH is to find all standard utilities, such as for the POSIX-specified confstr(_CS_PATH), 'getconf PATH' and 'command -p'. Some utilities also use it instead of _PATH_DEFPATH for root, but these values tend to be overridden by /etc/login.conf and/or shell startup files. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Workaround for ntop as daemon, is it ok?
On Fri, Nov 27, 2009 at 09:19:11AM +0100, Henner Morten Kruse wrote: I have just set up an ntop server based on 8.0-RELEASE. FreeBSD ntop 8.0-RELEASE FreeBSD 8.0-RELEASE #0: Sat Nov 21 15:48:17 UTC 2009 r...@almeida.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386 After installing ntop 1.3.10 and all dependencies from the ports ntop did work, but when running ntop as a daemon I got permanent and repeating warning messages: [warn] kevent: Bad file descriptor [...] I found out that ntop forks another thread for the daemon and kills the initial one. The problem with this behaviour is that the kqueue is started by the initial thread and the daemon thread doesn't use the same file descriptors. So the kqueue is lost. That's a process, not a thread. [...] When I change the fork() in line 186 to rfork(RFPROC) everything works and I get no more warning messages and procstat reports an existing kqueue for the daemonized ntop: So my question to this is: Is my workaround ok or could this cause any problems? And what is the cause of these warnings? Is it a bug or incapability in the kqueue implementation or is it caused by bad code in ntop? Because it refers to other file descriptors, a kqueue is only really meaningful in the file descriptor table it was created in (this could be avoided for events that do not refer to file descriptors, and I think Solaris's ports system does that). As described in the kqueue(2) man page, calling rfork(2) without RFFDG will allow sharing the kqueue between the two processes. As long as the parent process exits quickly, or no tricks with file descriptor numbers are done, this is pretty safe. Another fix is to create the kqueue in the child process. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: CFR: Exceedingly minor fixes to libc
On Fri, Nov 13, 2009 at 12:18:49AM -0500, Garrett Wollman wrote: If you have a moment, please take a look at the following patch. It contains some very minor fixes to various parts of libc which were found by the clang static analyzer. They fall into a few categories: 1) Bug fixes in very rare situations (mostly error-handling code that has probably never been executed). 2) Dead store elimination. 3) Elimination of unused variables. (Or in a few cases, making use of them.) Some minor style problems were also fixed in the vicinity. There should be no functional changes except in very unusual conditions. This looks mostly ok, with a few exceptions. [...] Index: gen/getpwent.c === --- gen/getpwent.c(revision 199242) +++ gen/getpwent.c(working copy) @@ -257,22 +257,19 @@ [...] @@ -1876,7 +1871,6 @@ syslog(LOG_ERR, getpwent memory allocation failure); *errnop = ENOMEM; - rv = NS_UNAVAIL; break; } st-compat = COMPAT_MODE_NAME; I think this change hides the wrongness in the handling of malloc errors while leaving it as broken as it is. [...] Index: net/getservent.c === --- net/getservent.c (revision 199242) +++ net/getservent.c (working copy) @@ -476,11 +476,11 @@ struct nis_state *st; int rv; - enum nss_lookup_type how; char *name; char *proto; int port; + enum nss_lookup_type how; struct servent *serv; char *buffer; size_t bufsize; @@ -491,7 +491,8 @@ name = NULL; proto = NULL; - how = (enum nss_lookup_type)mdata; + + how = (intptr_t)mdata; switch (how) { case nss_lt_name: name = va_arg(ap, char *); In what way is this an improvement? Index: posix1e/acl_delete_entry.c === --- posix1e/acl_delete_entry.c(revision 199242) +++ posix1e/acl_delete_entry.c(working copy) @@ -89,20 +89,20 @@ return (-1); } - if ((acl-ats_acl.acl_cnt 1) || - (acl-ats_acl.acl_cnt ACL_MAX_ENTRIES)) { + if ((acl_int-acl_cnt 1) || + (acl_int-acl_cnt ACL_MAX_ENTRIES)) { errno = EINVAL; return (-1); } If you are changing this code anyway, consider fixing a style bug (extraneous parentheses) here. - for (i = 0; i acl-ats_acl.acl_cnt;) { - if (_entry_matches((acl-ats_acl.acl_entry[i]), entry_d)) { + for (i = 0; i acl_int-acl_cnt;) { + if (_entry_matches((acl_int-acl_entry[i]), entry_d)) { /* ...shift the remaining entries... */ - for (j = i; j acl-ats_acl.acl_cnt - 1; ++j) - acl-ats_acl.acl_entry[j] = - acl-ats_acl.acl_entry[j+1]; + for (j = i; j acl_int-acl_cnt - 1; ++j) + acl_int-acl_entry[j] = + acl_int-acl_entry[j+1]; /* ...drop the count and zero the unused entry... */ - acl-ats_acl.acl_cnt--; - bzero(acl-ats_acl.acl_entry[j], + acl_int-acl_cnt--; + bzero(acl_int-acl_entry[j], sizeof(struct acl_entry)); acl-ats_cur_entry = 0; -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: [patch] add pwait utility
On Sat, Nov 07, 2009 at 03:01:36PM +0200, Kostik Belousov wrote: On Sat, Nov 07, 2009 at 11:28:32AM +0100, Gary Jennejohn wrote: On Fri, 6 Nov 2009 23:24:46 +0100 Jilles Tjoelker jil...@stack.nl wrote: I propose adding a small new utility to /usr/bin: pwait. Similar to the Solaris utility of the same name, it waits for any process to terminate. Why not /bin so it can be used before /usr is mounted? And it seems to make sense to add this functionality to pkill/pgrep binary, creating another hardlink to it. Hmm, pwait's syntax is incompatible: it takes pids (pkill says: use kill) and the -v option does something totally different. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: help needed to fix contrib/ee crash/exit when receiving SIGWINCH
On Fri, Oct 23, 2009 at 07:56:35PM +0400, Eygene Ryabinkin wrote: Hmm, we can transform this code to the following one: - errno = 0; do { in = wgetch(text_win); } while (errno == EINTR); if (in == -1) exit(0); - This won't help with FreeBSD's ncurses, but may be other variants will feel much better with such a event loop variant. That should be: - do in = wgetch(text_win); while (in == -1 errno == EINTR); if (in == -1) exit(0); - errno should only be checked after failed function calls or for functions where it is documented that errno should be used to check for error conditions (example: readdir). -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: semaphores between processes
On Thu, Oct 22, 2009 at 04:08:11PM -0400, Andrew Gallatin wrote: We then moved on to posix semaphores. Using sem_wait/sem_post with the sem_t residing in a shared page seems to work on all 3 platforms. However, the FreeBSD (7-stable) man page for sem_init(3) has this scary text regarding the pshared value: The sem_init() function initializes the unnamed semaphore pointed to by sem to have the value value. A non-zero value for pshared specifies a shared semaphore that can be used by multiple processes, which this implementation is not capable of. Is this text obsolete? Or is my test just getting lucky? They work, but only if the processes are related and do not exec and the parent process initializes the semaphores before forking. This is because sem_t is a pointer to a malloc'ed structure. For process-shared semaphores this really only contains an identifier of the kernel semaphore. This also means process-shared semaphores are slower than in-process semaphores (libthr implements those using atomics and umtx so that system calls are only needed if a thread needs to sleep or be awakened). This is documented in comments in the source code, but not in man pages or other documentation. Is there recommended way to do this? Apart from sysv semaphores, perhaps posix named semaphores (sem_open() etc). These require a 'kldload sem' on older versions though. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: sigwait - differences between Linux FreeBSD
On Thu, Oct 08, 2009 at 01:02:09PM +0300, Kostik Belousov wrote: On Thu, Oct 08, 2009 at 11:53:21AM +1100, Stephen Hocking wrote: In my efforts to make the xrdp port more robust under FreeBSD, I have discovered that sigwait (kind of an analogue to select(2), but for signals rather than I/O) re-enables ignored signals in its list under Linux, but not FreeBSD. The sesman daemon uses SIGCHLD to clean up after a session has exited. Under Linux this works OK, under FreeSBD it doesn't. I have worked around it in a very hackish manner (define a dummy signal handler and enable it using signal, which means that the sigwait call can then be unblocked by it), but am wondering if anyone else has run across the same problem, and if so, if they fixed it in an elegant manner. Also, does anyone know the correct semantics of sigwait under this situation? ports@ is the wrong list to discuss the issue in the base system. Solaris 10 sigwait(2) manpage says the following: If sigwait() is called on an ignored signal, then the occurrence of the signal will be ignored, unless sigaction() changes the disposition. We have the same behaviour as Solaris, ingored signals are not queued or recorded regardeless of the presence of sigwaiting thread. POSIX permits both behaviours here: a blocked and ignored signal may or may not be discarded immediately on generation. Making this depend on whether there is a sigwaiting thread seems broken, and I don't think Linux does that. I think your very hackish approach is correct: set up a dummy signal handler after blocking the signal. Additionally, POSIX requires applications to set the SA_SIGINFO flag if they want queuing. This applies even if the signals are blocked and received using sigwaitinfo(2) or sigtimedwait(2). The SA_SIGINFO flag can only be set by setting a handler using sigaction(2). (Note, this does not mean that all signals are queued if SA_SIGINFO is set. It means that signals may not be queued if SA_SIGINFO is not set.) -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Problem in bin/sh stripping the * character through ${expansion%}
On Fri, Aug 07, 2009 at 03:26:50AM +0400, Eygene Ryabinkin wrote: Thu, Aug 06, 2009 at 11:15:12AM -0700, Doug Barton wrote: I came across this problem during a recent portmaster update. When trying to strip off the * character using variable expansion in bin/sh it doesn't work. Other special characters do work if they are properly escaped. The attached mini-script clearly shows the problem: $ sh sh-strip-problem var before stripping: foo\* var after stripping: foo\* var before stripping: foo\$ var after stripping: foo\ According to the sh(1), it is not a problem. Namely, - \* being unquoted at all will produce a lone '*'; - '*' when treated as the smallest pattern, will result in a stripping of a zero-length string -- it is the smallest pattern in the case of '*' that matches anything. That is indeed an explanation why it works that way, but I think it is wrong. Generally, the shell command language avoids unnecessary levels of quoting. In the POSIX spec, Shell Command Language, note the part about ${x#*} (pattern) and ${x#*} (literal asterisk). Also compare with case $something in \*) echo asterisk;; esac which matches a literal asterisk. Two PRs already exist for aspects of stripping: bin/57554 (double quotes) and bin/117748 (trying to match pattern matching characters literally). In order to strip the trailing star you should use - var=${var%[*]} - This gives you the pattern of '[*]' that is properly treated as the single star -- it's a weird way to escape the star in the patterns. This is indeed a good workaround. -- Jilles Tjoelker ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org