Re: svn commit: r237660 - head/lib/libc/gen
On Thu, Jun 28, 2012 at 4:00 PM, David Xu listlog2...@gmail.com wrote: On 2012/6/28 16:49, David Xu wrote: On 2012/6/28 15:53, Konstantin Belousov wrote: On Thu, Jun 28, 2012 at 10:53:03AM +0800, David Xu wrote: On 2012/6/28 10:32, Attilio Rao wrote: 2012/6/28, David Xulistlog2...@gmail.com: On 2012/6/28 10:21, Attilio Rao wrote: 2012/6/28, David Xulistlog2...@gmail.com: On 2012/6/28 4:32, Konstantin Belousov wrote: Author: kib Date: Wed Jun 27 20:32:45 2012 New Revision: 237660 URL: http://svn.freebsd.org/changeset/base/237660 Log: Optimize the handling of SC_NPROCESSORS_CONF, by using auxv AT_NCPU value if present. MFC after: 1 week Modified: head/lib/libc/gen/sysconf.c Modified: head/lib/libc/gen/sysconf.c == --- head/lib/libc/gen/sysconf.c Wed Jun 27 20:24:25 2012 (r237659) +++ head/lib/libc/gen/sysconf.c Wed Jun 27 20:32:45 2012 (r237660) @@ -42,6 +42,7 @@ __FBSDID($FreeBSD$); #includesys/resource.h #includesys/socket.h +#includeelf.h #includeerrno.h #includelimits.h #includepaths.h @@ -51,6 +52,7 @@ __FBSDID($FreeBSD$); #include ../stdlib/atexit.h #include tzfile.h /* from ../../../contrib/tzcode/stdtime */ +#include libc_private.h #define _PATH_ZONEINFO TZDIR /* from tzfile.h */ @@ -585,6 +587,8 @@ yesno: case _SC_NPROCESSORS_CONF: case _SC_NPROCESSORS_ONLN: + if (_elf_aux_info(AT_NCPUS,value, sizeof(value)) == 0) + return ((long)value); mib[0] = CTL_HW; mib[1] = HW_NCPU; break; Will this make controlling the number of CPU online or CPU hotplug be impossible on FreeBSD ? If I think about hotplug CPUs I can think of other 1000 problems/races/bad situations to be fixed before this one, really. These are problems only in kernel, but kib's change is about ABI between userland and kernel, I hope we don't introduce an ABI which is not extendable road stone. I'm not entirely sure I see the ABI breakage here. It is not breakage, it is the ABI thinks number of online cpu is fixed, obviously, it is not the case in future unless FreeBSD won't support dynamic number of online cpus. If the AT_NCPUS becames unconvenient and not correct at some point we can just fix sysconf() to not look into the aux vector anymoe. If you already know this will be a problem, why do you introduce it and later need to fix it ? Please note that AT_NCPUS is already exported nowadays. I think this is instead a clever optimization to avoid the sysctl() (usual way to retrieve the number of CPUs). But why don't you cache it in libc ? following code is enough: static int online_cpu; if (online_cpu == 0) online_cpu = sysctl return online_cpu; Thread did evolved somewhat while I was AFK. First, please note that the ABI which I designed there is fixable: if kernel does not export AT_NCPUS at all, then auxv correctly handles the situation returning an error, and libc falls back to sysctl(2). Do we really want to bypass sysctl and instead passing all info via auxv vector ? I found the sysconf() is a bunch of switch-case, which is already slow, before _SC_NPROCESSES_ONLN, it has already a quite number of case branches, and in your code, it calls _elf_aux_info() which also has some switch-cases branch, if you cache smp_cpus in libc, the call for _elf_aux_info is not needed, and you don't need code in kernel to passing it either, in any case, the code to call sysctl is still needed, so why don't we just use sysctl instead and cache the result in libc ? this at least can generate small code and a bit faster after first call to sysconf(_SC_NPROCESSES_ONLN). And as a side note, I think we should not put non-critical code into fork/exec path, these two functions are rather critical path for any UNIX like system, anything slowing down these two functions will affect overall performance, so we should not waste cpu cycle trying to push data to user mode via auxv or other ways and yet the data is not used by user code in most time, such as the number of online cpu. And in rtld-elf or libc, we should not waste too much time before executing main(). My motivation for extending auxv vector and to develop auxv.c was exactly to shave around dozen of syscalls from the application startup sequence. If you look at the ktrace output of the binary startup on RELENG_8 libc, you should note exactly the sysctls to request ncpus, pagesizes, canary and so on. In RELENG_9 and HEAD, there are no sysctls in the trace, because the data is already pre-filled by kernel for libc consumption. The sysconf(3) commit you are commenting on was caused by jemalloc(3) initialization starting using _sysconf(3) to query ncpu (for older version, which used sysctl directly, I direct auxv call). So HEAD has temporary +1 sysctl syscall done
Re: svn commit: r230583 - head/sys/kern
On Sun, Jan 29, 2012 at 05:39:04PM -0500, David Schultz wrote: On Sun, Jan 29, 2012, Kostik Belousov wrote: On Sat, Jan 28, 2012 at 07:12:25PM -0500, David Schultz wrote: On Sat, Jan 28, 2012, Kostik Belousov wrote: On Fri, Jan 27, 2012 at 02:42:21PM -0500, David Schultz wrote: On Fri, Jan 27, 2012, Kostik Belousov wrote: On Fri, Jan 27, 2012 at 07:50:30PM +1100, Bruce Evans wrote: On Thu, 26 Jan 2012, Gleb Smirnoff wrote: On Thu, Jan 26, 2012 at 11:53:57PM +1100, Bruce Evans wrote: B @@ -1552,6 +1552,12 @@ aio_aqueue(struct thread *td, struct aio B return (error); B } B B + /* XXX: aio_nbytes is later casted to signed types. */ B + if ((int)aiocbe-uaiocb.aio_nbytes 0) { B B This should avoid implementation-defined behaviour by checking if B B (uncast)aiocbe-uaiocb.aio_nbytes INT_MAX. Is the attached patch okay? Yes. It now matches the style used for read^Wsys_read() and friends. This used to have to fit the count in int uio_resid. uio_resid now has type ssize_t, but for some reason the old INT_MAX limits remain. Well, I can revive the patch. I still think it is good to get rid of the limit. The correct limit on the maximum size of a single read/write is SSIZE_MAX, but FreeBSD uses INT_MAX. It's not safe to raise the limit yet, though, because of bugs in several filesystems. For example, FFS copies uio_resid into a local variable of type int. I have some old patches that fix some of these issues for FFS and cd9660, but surely there are more places I didn't notice. Absolutely agree. http://people.freebsd.org/~kib/misc/uio_resid.5.patch Nice. You found a lot more than I've got in my tree, and you even fixed the return values. There are at least a few more places to fix. For instance, cd9660 and the NFS client pass uio_resid or iov_len to min(), which operates on ints. (Incidentally, C11 generics ought to make it possible to write type-generic min() and max() functions.) Thank you, http://people.freebsd.org/~kib/misc/uio_resid.6.patch changed them to MIN(). This looks good to me. I tried to think of other places that you might have missed, and the only one that occurred to me is the Might ? I think this is a blatant understate. pipe code. sys_pipe.c has an `int orig_resid' and lots of bogus casts of iov_len and uio_resid to type u_int. Some look harmless, although it appears that writing a multiple of 2^32 bytes might result in pipe_build_write_buffer() allocating a 0-length buffer. My only reservation is that raising the limit could unmask a kernel buffer overflow if we missed something, but I guess we have to cross that bridge some day anyway. Yes, and it is an obvious reason why I am chicken to commit this for so long time. One more place, if this is reasonable to count as 'one' place, are the cdevsw methods. devfs passes uio down to the drivers. http://people.freebsd.org/~kib/misc/uio_resid.7.patch contains the sys_pipe.c changes. pgp5AQrkknHLA.pgp Description: PGP signature
Re: svn commit: r230583 - head/sys/kern
On Fri, Jan 27, 2012 at 02:42:21PM -0500, David Schultz wrote: On Fri, Jan 27, 2012, Kostik Belousov wrote: On Fri, Jan 27, 2012 at 07:50:30PM +1100, Bruce Evans wrote: On Thu, 26 Jan 2012, Gleb Smirnoff wrote: On Thu, Jan 26, 2012 at 11:53:57PM +1100, Bruce Evans wrote: B @@ -1552,6 +1552,12 @@ aio_aqueue(struct thread *td, struct aio B return (error); B } B B + /* XXX: aio_nbytes is later casted to signed types. */ B + if ((int)aiocbe-uaiocb.aio_nbytes 0) { B B This should avoid implementation-defined behaviour by checking if B B (uncast)aiocbe-uaiocb.aio_nbytes INT_MAX. Is the attached patch okay? Yes. It now matches the style used for read^Wsys_read() and friends. This used to have to fit the count in int uio_resid. uio_resid now has type ssize_t, but for some reason the old INT_MAX limits remain. Well, I can revive the patch. I still think it is good to get rid of the limit. The correct limit on the maximum size of a single read/write is SSIZE_MAX, but FreeBSD uses INT_MAX. It's not safe to raise the limit yet, though, because of bugs in several filesystems. For example, FFS copies uio_resid into a local variable of type int. I have some old patches that fix some of these issues for FFS and cd9660, but surely there are more places I didn't notice. Absolutely agree. http://people.freebsd.org/~kib/misc/uio_resid.5.patch By the way, PR 147226 is about this. pgpBbROiWgYr5.pgp Description: PGP signature
Re: svn commit: r230583 - head/sys/kern
On Sat, Jan 28, 2012 at 07:12:25PM -0500, David Schultz wrote: On Sat, Jan 28, 2012, Kostik Belousov wrote: On Fri, Jan 27, 2012 at 02:42:21PM -0500, David Schultz wrote: On Fri, Jan 27, 2012, Kostik Belousov wrote: On Fri, Jan 27, 2012 at 07:50:30PM +1100, Bruce Evans wrote: On Thu, 26 Jan 2012, Gleb Smirnoff wrote: On Thu, Jan 26, 2012 at 11:53:57PM +1100, Bruce Evans wrote: B @@ -1552,6 +1552,12 @@ aio_aqueue(struct thread *td, struct aio B return (error); B } B B + /* XXX: aio_nbytes is later casted to signed types. */ B + if ((int)aiocbe-uaiocb.aio_nbytes 0) { B B This should avoid implementation-defined behaviour by checking if B B (uncast)aiocbe-uaiocb.aio_nbytes INT_MAX. Is the attached patch okay? Yes. It now matches the style used for read^Wsys_read() and friends. This used to have to fit the count in int uio_resid. uio_resid now has type ssize_t, but for some reason the old INT_MAX limits remain. Well, I can revive the patch. I still think it is good to get rid of the limit. The correct limit on the maximum size of a single read/write is SSIZE_MAX, but FreeBSD uses INT_MAX. It's not safe to raise the limit yet, though, because of bugs in several filesystems. For example, FFS copies uio_resid into a local variable of type int. I have some old patches that fix some of these issues for FFS and cd9660, but surely there are more places I didn't notice. Absolutely agree. http://people.freebsd.org/~kib/misc/uio_resid.5.patch Nice. You found a lot more than I've got in my tree, and you even fixed the return values. There are at least a few more places to fix. For instance, cd9660 and the NFS client pass uio_resid or iov_len to min(), which operates on ints. (Incidentally, C11 generics ought to make it possible to write type-generic min() and max() functions.) Thank you, http://people.freebsd.org/~kib/misc/uio_resid.6.patch changed them to MIN(). pgpIqAbNH9ETG.pgp Description: PGP signature
Re: svn commit: r230583 - head/sys/kern
On Fri, Jan 27, 2012 at 07:50:30PM +1100, Bruce Evans wrote: On Thu, 26 Jan 2012, Gleb Smirnoff wrote: On Thu, Jan 26, 2012 at 11:53:57PM +1100, Bruce Evans wrote: B @@ -1552,6 +1552,12 @@ aio_aqueue(struct thread *td, struct aio B return (error); B } B B + /* XXX: aio_nbytes is later casted to signed types. */ B + if ((int)aiocbe-uaiocb.aio_nbytes 0) { B B This should avoid implementation-defined behaviour by checking if B B (uncast)aiocbe-uaiocb.aio_nbytes INT_MAX. Is the attached patch okay? Yes. It now matches the style used for read^Wsys_read() and friends. This used to have to fit the count in int uio_resid. uio_resid now has type ssize_t, but for some reason the old INT_MAX limits remain. Well, I can revive the patch. I still think it is good to get rid of the limit. pgpBLt2BDnrfg.pgp Description: PGP signature
Re: svn commit: r230252 - head/sys/fs/tmpfs
On Mon, Jan 23, 2012 at 11:05:42PM +0200, Mikolaj Golub wrote: On Mon, 23 Jan 2012 17:34:57 +0200 Jaakko Heinonen wrote: JH On 2012-01-22, Mikolaj Golub wrote: Also, may be we should allow remounting ro (and may be some othe options) for tmpfs? JH Yes, the patch below does that. I suspect that flipping the MNT_RDONLY JH flag may be enough for tmpfs but I am not sure. I see two issues with this patch: 1) 'mount -u -o rw /mnt' does not upgrade to rw, although it returns success. 2) if you have a file open for write, after remounting ro you still can write to the file. The expected behaviour: remount fails with EBUSY if force option is not specified; after remounting with force writing to the fail fails with EIO. I think when remounting ro you should call vflush(), something like below: flags = WRITECLOSE; if (mp-mnt_flag MNT_FORCE) flags |= FORCECLOSE; error = vflush(mp, 0, flags, curthread); if (error != 0) return (error); MNT_ILOCK(mp); mp-mnt_flag |= MNT_RDONLY; MNT_IUNLOCK(mp); and when upgrading to rw reset MNT_RDONLY flag: MNT_ILOCK(mp); mp-mnt_flag = ~MNT_RDONLY; MNT_IUNLOCK(mp); I have no idea if something else is needed for tmpfs. Yes, this is needed but not enough. Since other writeable opens may happen during the vflush(), you might still end up with the mount point which is not safe to set MNT_RDONLY flag. FFS suspends the writes on the mp while doing remount. Other filesystems like tmpfs and nfs could also perform suspend during remounts and unmounts, but the complications are the atime and deferred inactivations. See ufs/ffs/ffs_snapshot.c:process_deferred_inactive(), the handling of IN_LAZYACCESS and ufs_inactive(). pgpEFP461j04d.pgp Description: PGP signature
Re: svn commit: r230354 - head/usr.sbin/makefs
On Mon, Jan 23, 2012 at 01:25:04PM +0100, Dimitry Andric wrote: On 2012-01-23 05:28, Hiroki Sato wrote: ... I don't think it is needed. The makefs utility is a special case because it will probably diverge from the upstream to support FreeBSD-specific feature in the future (this is one of the reasons why it is not in contrib/). It didn't happen so far, however. By the way, does gcc46 no longer allow unused code? Generally speaking, I think it is enough to clean up unused code only when we actually change the code. The warnings are not about unused code, but about variables which are assigned, but then never referenced anymore. Sometimes this is just a cosmetic issue, and the compiler will hopefully optimize out the unreferenced variable(s). In other situations there are real bugs. In any case, gcc 4.5 and 4.6 just enable more warnings by default, and when using -Wall; in particular the variable set but not used one. This warning should really be suppressed when WARNS is set to a low level, such as with makefs (it has WARNS?=2). Attached diff makes it so, for gcc45 and gcc46. It uses the same approach as I added earlier for clang. It can also be easily extended to other warnings. There is a typo in the second or condition, should it be gcc46 both times ? Anyway, the reason to answer this message is two ask the for seemingly unreasonable approach of matching compiler type/version based on the driver name. This completely precludes anybody from using gcc installed not from the ports tree. Could the tests performed based on the driver version information instead of name ? pgphXwGNGuNjj.pgp Description: PGP signature
Re: svn commit: r230354 - head/usr.sbin/makefs
On Mon, Jan 23, 2012 at 03:13:07PM +0100, Dimitry Andric wrote: On 2012-01-23 13:31, Kostik Belousov wrote: On Mon, Jan 23, 2012 at 01:25:04PM +0100, Dimitry Andric wrote: ... There is a typo in the second or condition, should it be gcc46 both times ? Ah, sorry, that was a copy/paste error. Anyway, the reason to answer this message is two ask the for seemingly unreasonable approach of matching compiler type/version based on the driver name. This completely precludes anybody from using gcc installed not from the ports tree. Yes, that is indeed the big problem with this approach. Although I think the number of people that hand-build gcc, and not use the ports will be rather low. :) Could the tests performed based on the driver version information instead of name ? Probably, but then you would have to run ${CC} --version plus some sed/awk'ing from bsd.sys.mk. That could add quite some extra forking during buildworld. It may be easier to add a COMPILER_FLAVOR (just an example name) setting in make.conf, which can be set independently of CC, CXX and so on, to tell which specific variant of gcc, clang etc you want to use. The processing of that setting could happen in either sys.mk or bsd.sys.mk, or be separated out to a bsd.compiler.mk, for instance. In each .mk file or Makefile that needs it, the ${CC:T:M:foo} == foo comparisons can then be replaced with ${COMPILER_FLAVOR} == foo. I agree that even such palliative measures are more useful and provides better future-proof of the build system then current name matching. pgpo3NMA6Nbj3.pgp Description: PGP signature
Re: svn commit: r230191 - in head: lib/libc/arm/gen sys/arm/include
On Sun, Jan 15, 2012 at 11:11:43PM -0500, David Schultz wrote: On Mon, Jan 16, 2012, David Schultz wrote: Author: das Date: Mon Jan 16 04:08:29 2012 New Revision: 230191 URL: http://svn.freebsd.org/changeset/base/230191 Log: Implement FLT_ROUNDS for arm. Some (all?) arm FPUs lack support for dynamic rounding modes, but FPUless chips that use softfloat can support it because everything is emulated anyway. (We presently have incomplete support for hardware FPUs.) Submitted by: Ian Lepore Incidentally, all of gcc's hooks into softfloat should probably be in the public symbol namespace instead of FBSDprivate. The compiler generates references to them, so we cannot claim that they are internal, unsupported interfaces. I assume that moving them will not break the ABI because FreeBSDprivate includes FBSD_X, but I haven't tested this. Any objections to moving them? Affects arm and mips. Move will break the ABI. Namespace inheritance is ignored when searching the symbol match. On the other hand. FBSDprivate_1.0 is explicitely created to be changed, so removal of the symbols from this namespace if fine from the POV of the project policy. Another argument is that both MIPS and ARM are the second-tier architectures, and again, project policy allows ABI breakage. pgpZTnEVlhKjv.pgp Description: PGP signature
Re: svn commit: r229986 - head/lib/libutil
On Wed, Jan 11, 2012 at 10:33:41PM +, Guy Helmer wrote: Author: ghelmer Date: Wed Jan 11 22:33:41 2012 New Revision: 229986 URL: http://svn.freebsd.org/changeset/base/229986 Log: Fix namespace issues with prototype parameter names. Add missing prototype parameter names. Requested by bde. Modified: head/lib/libutil/libutil.h Modified: head/lib/libutil/libutil.h == --- head/lib/libutil/libutil.hWed Jan 11 22:12:45 2012 (r229985) +++ head/lib/libutil/libutil.hWed Jan 11 22:33:41 2012 (r229986) @@ -93,7 +93,7 @@ struct termios; struct winsize; __BEGIN_DECLS -char *auth_getval(const char *name); +char *auth_getval(const char *_name); The _[a-z].* names are still in the app namespace. Only _[A-Z].* and __[a-z].* are reserved for the implementation. pgpFfR0d7M7V3.pgp Description: PGP signature
Re: svn commit: r229828 - in head/sys: kern ufs/ufs
On Mon, Jan 09, 2012 at 12:55:06AM -0800, Doug Barton wrote: On 01/08/2012 15:06, Konstantin Belousov wrote: Author: kib Date: Sun Jan 8 23:06:53 2012 New Revision: 229828 URL: http://svn.freebsd.org/changeset/base/229828 Log: Avoid LOR between vfs_busy() lock and covered vnode lock on quotaon(). Does this mean that if we turn witness back on the ever-present VFS LORs will no longer be there, or does this only affect calls to quotaon()? vfs_busy() lock is not accounted for by witness, so there shall be no changes in the witness reports. The scenario affected only fired when somebody did quotaon and unmount proceeded in parallel. pgp5PXwBEyZRX.pgp Description: PGP signature
Re: svn commit: r229887 - in head/sys: conf dev/random modules/random
On Mon, Jan 09, 2012 at 11:20:30PM +, Jung-uk Kim wrote: Author: jkim Date: Mon Jan 9 23:20:30 2012 New Revision: 229887 URL: http://svn.freebsd.org/changeset/base/229887 Log: Enable hardware RNG for VIA Nano processors. PR: kern/163974 Can we, please, have VIA RNG enabled under some option ? It is currently eating 512 bytes on 99.9% of machines for FPU save area. Shortly it will take even more on AVX-capable machines. pgppLHf3hp6Nf.pgp Description: PGP signature
Re: svn: head/sys/netinet
On Fri, Dec 30, 2011 at 04:25:09PM -0800, Maxim Sobolev wrote: On 12/30/2011 4:17 PM, Maxim Sobolev wrote: M Won't this break whole lot of third-party software, which expects M FreeBSD to be slightly different in this regards? Just curious. Yes it does. And until FreeBSD 10.0-RELEASE there is time to fix this software (at least in ports). The MFC to stable/9 of r226105 was back out. Well, I am just curious how critical it is to get it resolved and is there any way to avoid ABI breakage. Software compiled for 9.x won't run on 10.x even when fitted with the proper compat libs, as far as I can tell and not all software can be easily recompiled. P.S. It should be trivial to put some COMPAT_8/COMPAT_9 shims based on the version of the ELF image (i.e. detect if the binary is than FreeBSD 10. What exactly do you mean by 'version of the ELF image' ? ABI note tag ? What do you propose to do if older call comes from dso, or a library statically linked in the main binary ? pgpUH5eBDQ8i7.pgp Description: PGP signature
Re: svn commit: r228843 - head/contrib/telnet/libtelnet head/crypto/heimdal/appl/telnet/libtelnet head/include head/lib/libc/gen head/lib/libc/iconv head/lib/libc/include head/lib/libc/net head/libexe
On Fri, Dec 23, 2011 at 12:06:44PM -0500, Alexander Kabaev wrote: On Fri, 23 Dec 2011 11:22:34 -0500 John Baldwin j...@freebsd.org wrote: On Friday, December 23, 2011 10:58:46 am John Baldwin wrote: On Friday, December 23, 2011 10:00:38 am Colin Percival wrote: Author: cperciva Date: Fri Dec 23 15:00:37 2011 New Revision: 228843 URL: http://svn.freebsd.org/changeset/base/228843 Log: Fix a problem whereby a corrupt DNS record can cause named to crash. [11:06] Add an API for alerting internal libc routines to the presence of unsafe paths post-chroot, and use it in ftpd. [11:07] Eh, the whole libc_dlopen() thing looks like a gross hack (and who came up with that weird symbol name for a public API). Is it really even needed given the other fix to have ftpd drop privilege before execing a helper program? I guess the main reason I don't like it is it doesn't do anything to address the more general problem. I would have expected instead something to restrict dlopen() entirely including from other libraries than just libc in certain circumstances. At the very least if we feel that the libc_dlopen() thing is a temporary band-aid, we should move the new symbols into the private namespace so we can remove them once the better fix is in rather than being required to support them forever. libc_dlopen() is not exposed. The __FreeBSD_libc_enter_restricted_mode() is, and its name is ugly exactly to note the ugly intent. I do not see how the symbol can go int FBSDprivate_1.0 when it was supposed to be used by third-party applications. Putting this hack into rtld itself IMO has to wide consequences. For libc, we can enumerate the points that stop work after the call, but for the generic applications the consequences are undefined. -- John Baldwin Pardon for not catching that when I had a chance to influence the outcome, but I would like to voice my support to tucking the ugliness into private version namespace. -- Alexander Kabaev pgpw9brvjo58g.pgp Description: PGP signature
Re: svn commit: r228843 - head/contrib/telnet/libtelnet head/crypto/heimdal/appl/telnet/libtelnet head/include head/lib/libc/gen head/lib/libc/iconv head/lib/libc/include head/lib/libc/net head/libexe
On Fri, Dec 23, 2011 at 01:20:34PM -0500, Alexander Kabaev wrote: On Fri, 23 Dec 2011 19:51:43 +0200 Kostik Belousov kostik...@gmail.com wrote: On Fri, Dec 23, 2011 at 12:06:44PM -0500, Alexander Kabaev wrote: On Fri, 23 Dec 2011 11:22:34 -0500 John Baldwin j...@freebsd.org wrote: On Friday, December 23, 2011 10:58:46 am John Baldwin wrote: On Friday, December 23, 2011 10:00:38 am Colin Percival wrote: Author: cperciva Date: Fri Dec 23 15:00:37 2011 New Revision: 228843 URL: http://svn.freebsd.org/changeset/base/228843 Log: Fix a problem whereby a corrupt DNS record can cause named to crash. [11:06] Add an API for alerting internal libc routines to the presence of unsafe paths post-chroot, and use it in ftpd. [11:07] Eh, the whole libc_dlopen() thing looks like a gross hack (and who came up with that weird symbol name for a public API). Is it really even needed given the other fix to have ftpd drop privilege before execing a helper program? I guess the main reason I don't like it is it doesn't do anything to address the more general problem. I would have expected instead something to restrict dlopen() entirely including from other libraries than just libc in certain circumstances. At the very least if we feel that the libc_dlopen() thing is a temporary band-aid, we should move the new symbols into the private namespace so we can remove them once the better fix is in rather than being required to support them forever. libc_dlopen() is not exposed. The __FreeBSD_libc_enter_restricted_mode() is, and its name is ugly exactly to note the ugly intent. I do not see how the symbol can go int FBSDprivate_1.0 when it was supposed to be used by third-party applications. Putting this hack into rtld itself IMO has to wide consequences. For libc, we can enumerate the points that stop work after the call, but for the generic applications the consequences are undefined. -- John Baldwin Pardon for not catching that when I had a chance to influence the outcome, but I would like to voice my support to tucking the ugliness into private version namespace. -- Alexander Kabaev Putting symbol into official namespace implies that we are willing to provide and maintain it forever, which I do not think was the intent for the function in question. FBSD_PRIVATE says nothing about who should and should not link to it, only the level of API stability one has to expect in the end. If/when we have better security mechanisms (capsicum?) available to users by default, this should be ripped out with extreme prejudice. The API is offered as a solution to third-parties. Telling them to use the API that is known to be broken in future is wrong step for the project, IMO. I doubt that proftpd will 'go capsicum'. pgpPX5kdHDlGt.pgp Description: PGP signature
Re: svn commit: r228616 - head/usr.bin/tar
On Sat, Dec 17, 2011 at 01:36:51AM +, Dimitry Andric wrote: Author: dim Date: Sat Dec 17 01:36:50 2011 New Revision: 228616 URL: http://svn.freebsd.org/changeset/base/228616 Log: In usr.bin/tar/tree.c, if you really want to poke to NULL, you must use volatile, otherwise the indirection will not be emitted. MFC after: 1 week Modified: head/usr.bin/tar/tree.c Modified: head/usr.bin/tar/tree.c == --- head/usr.bin/tar/tree.c Sat Dec 17 01:29:46 2011(r228615) +++ head/usr.bin/tar/tree.c Sat Dec 17 01:36:50 2011(r228616) @@ -315,7 +315,7 @@ tree_next(struct tree *t) const char *msg = Unable to continue traversing directory hierarchy after a fatal error.; write(2, msg, strlen(msg)); - *(int *)0 = 1; /* Deliberate SEGV; NULL pointer dereference. */ + *(volatile int *)0 = 1; /* Deliberate SEGV; NULL pointer dereference. */ exit(1); /* In case the SEGV didn't work. */ } Why this hack is used instead of abort(3) or abort2(2) ? pgpbz6mmAicxU.pgp Description: PGP signature
Re: svn commit: r228663 - head/usr.sbin/lpr/filters
On Sat, Dec 17, 2011 at 09:37:22PM +, Dimitry Andric wrote: Author: dim Date: Sat Dec 17 21:37:21 2011 New Revision: 228663 URL: http://svn.freebsd.org/changeset/base/228663 Log: In usr.sbin/lpr/filters/lpf.c, use a less obtuse way of clearing the buffer, that also avoids warnings. MFC after: 1 week Modified: head/usr.sbin/lpr/filters/lpf.c Modified: head/usr.sbin/lpr/filters/lpf.c == --- head/usr.sbin/lpr/filters/lpf.c Sat Dec 17 20:53:06 2011 (r228662) +++ head/usr.sbin/lpr/filters/lpf.c Sat Dec 17 21:37:21 2011 (r228663) @@ -55,6 +55,7 @@ __FBSDID($FreeBSD$); #include unistd.h #include stdlib.h #include stdio.h +#include string.h #define MAXWIDTH 132 #define MAXREP10 @@ -115,7 +116,7 @@ main(int argc, char *argv[]) acctfile = cp; } - for (cp = buf[0], limit = buf[MAXREP]; cp limit; *cp++ = ' '); + memset(buf, ' ', sizeof(buf)); done = 0; buf is two-dimensional array. This change looks wrong. while (!done) { pgpR71ytY8TYv.pgp Description: PGP signature
Re: svn commit: r228509 - in head: share/man/man9 sys/kern sys/sys
+ switch (rv) { + case KERN_INVALID_ADDRESS: + case KERN_NO_SPACE: + return (ENOMEM); + case KERN_PROTECTION_FAILURE: + return (EACCES); + default: + return (EINVAL); + } You can replace this fragment with the call to vm_mmap_to_errno(). pgp1uUD1pDFAy.pgp Description: PGP signature
Re: svn commit: r228433 - in head/sys: kern security/mac
On Mon, Dec 12, 2011 at 10:05:13AM +, Andriy Gapon wrote: Author: avg Date: Mon Dec 12 10:05:13 2011 New Revision: 228433 URL: http://svn.freebsd.org/changeset/base/228433 Log: put sys/systm.h at its proper place or add it if missing Reported by:lstewart, tinderbox Pointyhat to: avg, attilio MFC after: 1 week MFC with: r228430 Modified: head/sys/kern/kern_sx.c head/sys/kern/vfs_cache.c head/sys/security/mac/mac_framework.c head/sys/security/mac/mac_priv.c It means that previously sx.h did not required systm.h and now it does ? Might be, you should move SCHEDULER_STOPPED and stop_scheduler declarations into sys/lock.h ? pgpS91RBA95na.pgp Description: PGP signature
Re: svn commit: r228435 - in head/libexec/rtld-elf: . amd64 arm i386 ia64 mips powerpc powerpc64 sparc64
On Mon, Dec 12, 2011 at 11:03:15AM +, Konstantin Belousov wrote: Author: kib Date: Mon Dec 12 11:03:14 2011 New Revision: 228435 URL: http://svn.freebsd.org/changeset/base/228435 Log: Add support for STT_GNU_IFUNC and R_MACHINE_IRELATIVE GNU extensions to rtld on 386 and amd64. This adds runtime bits neccessary for the use of the dispatch functions from the dynamically-linked executables and shared libraries. To allow use of external references from the dispatch function, resolution of the R_MACHINE_IRESOLVE relocations in PLT is postponed until GOT entries for PLT are prepared, and normal resolution of the GOT entries is finished. Similar to how it is done by GNU, IRELATIVE relocations are resolved in advance, instead of normal lazy handling for PLT. Move the init_pltgot() call before the relocations for the object are processed. MFC after: 3 weeks An example use of the facilities is provided at http://vger.kernel.org/~davem/cgi-bin/blog.cgi/2010/02/07 Inner working is described by http://www.airs.com/blog/archives/403 To use this feature on FreeBSD, you need patched gas from recent binutils. Patch is available at http://people.freebsd.org/~kib/misc/gas_ifunc_freebsd.1.patch. The 4.6 gcc should be configured with --enable-gnu-indirect-function option to turn on the 'ifunc' function attributes, no patch for compiler is required. Only dynamically linked executables and shared objects are supported, support for static binaries requires changes for csu. Since in-tree toolchain cannot handle dispatch, and I am only interested in the dynamic linking case, I did not modified csu. If we ever need dispatch for system libraries (cases like CPU-optimized string functions, or math) and have a capable toolchain, I promise to add support for static binaries. I handled just x86oids. If maintainer of missed architecture wants to implement dispatch, I am willing to help (as usual). Does anybody on toolchain@ know how to submit the gas change to maintainers for inclusion into mainline, or better, can submit it himself ? I assume that no copyright assignment is required for this trivial flip of settings. pgpat6xmd7Cpf.pgp Description: PGP signature
Re: svn commit: r228435 - in head/libexec/rtld-elf: . amd64 arm i386 ia64 mips powerpc powerpc64 sparc64
On Mon, Dec 12, 2011 at 06:17:09PM +0100, Joerg Sonnenberger wrote: On Mon, Dec 12, 2011 at 11:03:15AM +, Konstantin Belousov wrote: To allow use of external references from the dispatch function, resolution of the R_MACHINE_IRESOLVE relocations in PLT is postponed until GOT entries for PLT are prepared, and normal resolution of the GOT entries is finished. Similar to how it is done by GNU, IRELATIVE relocations are resolved in advance, instead of normal lazy handling for PLT. Are you sure that you didn't introduce major locking issues with this? What do you mean, exactly ? The dispatcher function is called under the bind lock, yes. pgpwVZZkn5llo.pgp Description: PGP signature
Re: svn commit: r228435 - in head/libexec/rtld-elf: . amd64 arm i386 ia64 mips powerpc powerpc64 sparc64
On Mon, Dec 12, 2011 at 07:21:32PM +0200, Kostik Belousov wrote: On Mon, Dec 12, 2011 at 06:17:09PM +0100, Joerg Sonnenberger wrote: On Mon, Dec 12, 2011 at 11:03:15AM +, Konstantin Belousov wrote: To allow use of external references from the dispatch function, resolution of the R_MACHINE_IRESOLVE relocations in PLT is postponed until GOT entries for PLT are prepared, and normal resolution of the GOT entries is finished. Similar to how it is done by GNU, IRELATIVE relocations are resolved in advance, instead of normal lazy handling for PLT. Are you sure that you didn't introduce major locking issues with this? What do you mean, exactly ? The dispatcher function is called under the bind lock, yes. To describe it in more details, _rtld_bind() read-locks the bind lock, and possible plt resolution from the dispatcher would also acquire bind lock in read mode, which is the supported operation. plt is explicitely designed to allow safe multithreaded updates, so the shared lock do not cause problems. What does not work with rtld locks is read lock acquisition after the write lock. Even more, it works for single-threaded locks, but fails for thr_rtld.c. After thinking more about your question, I see how this exact sequence can happen. If we dlopened the shared object that contains IRELATIVE or jump slot which target is STT_GNU_IFUNC, then possible recursive plt resolve from the dispatcher would cause it. Note that I do not even consider implementing support for dl*(3) calls from a dispatcher. The solution is to postpone the resolution for irelative/ifunc right before initializers are called, and drop bind lock around calls to dispatcher. The following patch is the implementation of the idea. It is somewhat bigger then I hoped because I decided to use initlist to iterate over the objects instead of the -next, due to drop of the bind lock in iteration. diff --git a/libexec/rtld-elf/amd64/reloc.c b/libexec/rtld-elf/amd64/reloc.c index 5ae8493..3b00987 100644 --- a/libexec/rtld-elf/amd64/reloc.c +++ b/libexec/rtld-elf/amd64/reloc.c @@ -413,6 +413,8 @@ reloc_iresolve(Obj_Entry *obj, RtldLockState *lockstate) const Elf_Rela *relalim; const Elf_Rela *rela; +if (!obj-irelative) + return (0); relalim = (const Elf_Rela *)((char *)obj-pltrela + obj-pltrelasize); for (rela = obj-pltrela; rela relalim; rela++) { Elf_Addr *where, target, *ptr; @@ -424,11 +426,14 @@ reloc_iresolve(Obj_Entry *obj, RtldLockState *lockstate) case R_X86_64_IRELATIVE: ptr = (Elf_Addr *)(obj-relocbase + rela-r_addend); where = (Elf_Addr *)(obj-relocbase + rela-r_offset); + lock_release(rtld_bind_lock, lockstate); target = ((Elf_Addr (*)(void))ptr)(); + wlock_acquire(rtld_bind_lock, lockstate); *where = target; break; } } +obj-irelative = false; return (0); } @@ -455,13 +460,15 @@ reloc_gnu_ifunc(Obj_Entry *obj, RtldLockState *lockstate) return (-1); if (ELF_ST_TYPE(def-st_info) != STT_GNU_IFUNC) continue; + lock_release(rtld_bind_lock, lockstate); target = (Elf_Addr)rtld_resolve_ifunc(defobj, def); + wlock_acquire(rtld_bind_lock, lockstate); reloc_jmpslot(where, target, defobj, obj, (const Elf_Rel *)rela); break; } } obj-gnu_ifunc = false; -return 0; +return (0); } void diff --git a/libexec/rtld-elf/i386/reloc.c b/libexec/rtld-elf/i386/reloc.c index 5f11106..30292ca 100644 --- a/libexec/rtld-elf/i386/reloc.c +++ b/libexec/rtld-elf/i386/reloc.c @@ -371,16 +371,21 @@ reloc_iresolve(Obj_Entry *obj, RtldLockState *lockstate) const Elf_Rel *rel; Elf_Addr *where, target; +if (!obj-irelative) + return (0); rellim = (const Elf_Rel *)((char *)obj-pltrel + obj-pltrelsize); for (rel = obj-pltrel; rel rellim; rel++) { switch (ELF_R_TYPE(rel-r_info)) { case R_386_IRELATIVE: where = (Elf_Addr *)(obj-relocbase + rel-r_offset); + lock_release(rtld_bind_lock, lockstate); target = ((Elf_Addr (*)(void))(*where))(); + wlock_acquire(rtld_bind_lock, lockstate); *where = target; break; } } +obj-irelative = false; return (0); } @@ -407,7 +412,9 @@ reloc_gnu_ifunc(Obj_Entry *obj, RtldLockState *lockstate) return (-1); if (ELF_ST_TYPE(def-st_info) != STT_GNU_IFUNC) continue; + lock_release(rtld_bind_lock, lockstate); target = (Elf_Addr)rtld_resolve_ifunc(defobj, def); + wlock_acquire(rtld_bind_lock, lockstate); reloc_jmpslot(where, target, defobj, obj, rel); break; } diff --git a/libexec/rtld-elf/rtld.c b/libexec/rtld-elf/rtld.c index 17cd67c..f91ebf7 100644 --- a/libexec/rtld-elf/rtld.c +++ b/libexec/rtld-elf/rtld.c @@ -116,6 +116,8 @@ static void
Re: svn commit: r228322 - in head: include lib/libc/stdlib sys/sys
On Wed, Dec 07, 2011 at 03:25:48PM +, David Chisnall wrote: Author: theraven Date: Wed Dec 7 15:25:48 2011 New Revision: 228322 URL: http://svn.freebsd.org/changeset/base/228322 Log: Implement quick_exit() / at_quick_exit() from C++11 / C1x. Also add a __noreturn macro and modify the other exiting functions to use it. The __noreturn macro, unlike __dead2, must be used BEFORE the function. This is in line with the C and C++ specifications that place _Noreturn (c1x) and [[noreturn]] (C++11) in front of the functions. As with __dead2, this macro falls back to using the GCC attribute. Unfortunately, clang currently sets the same value for the C version macro in C99 and C1x modes, so these functions are hidden by default. At some point before 10.0, I need to go through the headers and clean up the C1x / C++11 visibility. Reviewed by:brooks (mentor) And, was it approved ? +#include stdlib.h +#include pthread.h + +/** + * Linked list of quick exit handlers. This is simpler than the atexit() + * version, because it is not required to support C++ destructors or + * DSO-specific cleanups. + */ +struct quick_exit_handler { + struct quick_exit_handler *next; + void (*cleanup)(void); +}; + +__attribute((weak)) +void _ZSt9terminatev(void); Why do you need this ? You are not calling terminate() anyway, and added an explicit comment. + +/** + * Lock protecting the handlers list. + */ +static pthread_mutex_t atexit_mutex = PTHREAD_MUTEX_INITIALIZER; +/** + * Stack of cleanup handlers. These will be invoked in reverse order when + */ +static struct quick_exit_handler *handlers; + +int +at_quick_exit(void (*func)(void)) +{ + struct quick_exit_handler *h = malloc(sizeof(struct quick_exit_handler)); You are making initialization at the declaration place, which is not recommended by style. + + if (0 == h) { Why 0 and not NULL ? Later, you use NULL. The {} are not needed. + return 1; This shall be return (1); + } + h-cleanup = func; + pthread_mutex_lock(atexit_mutex); Note that libc code is careful to only call pthread mutex functions if __isthreaded variable is set. Also note that libc uses mangled names to allow the application interposition of the functions. E.g. ANSI C code is allowed to have pthread_mutex_lock() function defined. + h-next = handlers; + handlers = h; + pthread_mutex_unlock(atexit_mutex); + return 0; And this shall be return (0); +} + +void quick_exit(int status) The function name shall start at the column 0. +{ + /* + * XXX: The C++ spec requires us to call std::terminate if there is an + * exception here. + */ + for (struct quick_exit_handler *h = handlers ; NULL != h ; h = h-next) This fragment violates so many style requirements that I probably fail to enumerate them all. The h declaration shall go at the start of function, and not at the for statement. The opening '{' shall be placed on the line of 'for'. More, the {} bracing is not needed there. No space is needed before ';', three times. + { + h-cleanup(); + } + _Exit(status); +} pgp8Ulmq2oDRm.pgp Description: PGP signature
Re: svn commit: r228269 - head/lib/libc/locale
On Mon, Dec 05, 2011 at 12:00:48AM +, Jilles Tjoelker wrote: Author: jilles Date: Mon Dec 5 00:00:47 2011 New Revision: 228269 URL: http://svn.freebsd.org/changeset/base/228269 Log: libc: Eliminate 13 relative relocations in wctype(). This reminds me the following change I had intended to do for quite some time. The hack for openssl is due to buggy assembler, which exactly the case I want to avoid for the base code. commit 3fdba61936a011b768845a8336ad2529e77e8ddb Author: Kostik Belousov kostik@sirion Date: Mon Dec 5 13:01:48 2011 +0200 Fail the build when text relocations are generated for dso. diff --git a/secure/lib/libcrypto/Makefile b/secure/lib/libcrypto/Makefile index 0a1704c..73f5cb7 100644 --- a/secure/lib/libcrypto/Makefile +++ b/secure/lib/libcrypto/Makefile @@ -7,6 +7,7 @@ SUBDIR= engines LIB= crypto SHLIB_MAJOR= 6 +ALLOW_SHARED_TEXTREL= NO_LINT= diff --git a/share/mk/bsd.lib.mk b/share/mk/bsd.lib.mk index 1e43921..40632de 100644 --- a/share/mk/bsd.lib.mk +++ b/share/mk/bsd.lib.mk @@ -167,6 +167,11 @@ SOBJS+=${OBJS:.o=.So} .if defined(SHLIB_NAME) _LIBS+=${SHLIB_NAME} +SOLINKOPTS=-shared -Wl,-x -Wl,--fatal-warnings +.if !defined(ALLOW_SHARED_TEXTREL) +SOLINKOPTS+= -Wl,--warn-shared-textrel +.endif + .if target(beforelinking) ${SHLIB_NAME}: ${SOBJS} beforelinking .else @@ -178,11 +183,11 @@ ${SHLIB_NAME}: ${SOBJS} @ln -fs ${.TARGET} ${SHLIB_LINK} .endif .if !defined(NM) - @${CC} ${LDFLAGS} ${SSP_CFLAGS} -shared -Wl,-x \ + @${CC} ${LDFLAGS} ${SSP_CFLAGS} ${SOLINKOPTS} \ -o ${.TARGET} -Wl,-soname,${SONAME} \ `lorder ${SOBJS} | tsort -q` ${LDADD} .else - @${CC} ${LDFLAGS} ${SSP_CFLAGS} -shared -Wl,-x \ + @${CC} ${LDFLAGS} ${SSP_CFLAGS} ${SOLINKOPTS} \ -o ${.TARGET} -Wl,-soname,${SONAME} \ `NM='${NM}' lorder ${SOBJS} | tsort -q` ${LDADD} .endif pgpwAFCOJxhZ6.pgp Description: PGP signature
Re: svn commit: r226042 - in head/sys: kern sys
On Wed, Nov 30, 2011 at 10:01:12PM +0100, Giovanni Trematerra wrote: On Wed, Oct 5, 2011 at 6:56 PM, Konstantin Belousov k...@freebsd.org wrote: Author: kib Date: Wed Oct 5 16:56:06 2011 New Revision: 226042 URL: http://svn.freebsd.org/changeset/base/226042 Log: Supply unique (st_dev, st_ino) value pair for the fstat(2) done on the pipes. Reviewed by: jhb, Peter Jeremy peterjeremy acm org MFC after: 2 weeks Modified: head/sys/kern/sys_pipe.c head/sys/sys/pipe.h Hi Konstantin, unfortunately your commit introduces a performance penalty of about 3% documented below in a real workload. I guess that fstat(2) on the pipes is seldom used so we could just be lazy and alloc a new unr number inside pipe_stat instead of during pipe creation. In that case if an application doesn't need fstat(2) on the pipes it won't be charged the cost of alloc_unr/free_unr for the fake inode number. The patch I propose, furthermore, fix a panic in the case alloc_unr failed to allocate a new unr number and return -1. This is because ino_t is unsigned and the test pipe_ino 0 into pipeclose would be true, calling then free_unr when it hasn't to. The proposed patch was tested with a regression test code that you can find here http://www.trematerra.net/patches/pipe-fstat.c Feel free to add it under tools/regression/pipe/ Here the proposed patch: http://www.trematerra.net/patches/lazy_inoalloc.diff Here the report of the benchmark: Configuration 10.0-CURRENT updated to r22807. kern.smp.disabled=1 in /boot/loader.conf kernel config GENERIC without debugging options. The first result of any test was discarded and not reported here. here the result of three executions of # make -s buildkernel note that I managed to compile the same identical source files for all the tests. r22807 with r226042 reverted (time in seconds) 527, 527, 527 r22807 (time in seconds) 544, 544, 544 r22807M with the proposed patch (time in seconds) 527, 528, 528 ministat output: x r22807 with r226042 reverted + r22807 * r22807M with the proposed patch +--+ |+* x| |** x| ||__A_M| A| +--+ N Min MaxMedian AvgStddev x 3 544 544 544 544 0 + 3 527 527 527 527 0 Difference at 95.0% confidence -17 +/- 0 -3.125% +/- 0% (Student's t, pooled s = 0) * 3 527 528 528 527.70.57735027 Difference at 95.0% confidence -16. +/- 0.925333 -3.00245% +/- 0.170098% (Student's t, pooled s = 0.408248) -- Gianni Thank you for looking at this. I committed the test, and the fix for the call to free_unr(-1). Regarding the lazy allocation of the inode number, I agree with the idea, but have some reservations against the implementation. If several threads call fstat(2) on the same pipe which inode is not yet initialized, then I see a race in the patch. The easiest workaround is to cover the inode allocation with the pipe lock. Also, I find the return of ENOMEM from fstat(2) somewhat questionable. The error code is not documented as allowed for the syscall. I prefer to not fail the fstat(2) if lazy allocation failed, but return some fake value for inode instead. Updated patch is below. diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c index 2b6eb66..19fc98f 100644 --- a/sys/kern/sys_pipe.c +++ b/sys/kern/sys_pipe.c @@ -569,12 +569,7 @@ pipe_create(pipe, backing) /* If we're not backing this pipe, no need to do anything. */ error = 0; } - if (error == 0) { - pipe-pipe_ino = alloc_unr(pipeino_unr); - if (pipe-pipe_ino == -1) - /* pipeclose will clear allocated kva */ - error = ENOMEM; - } + pipe-pipe_ino = -1; return (error); } @@ -1398,16 +1393,40 @@ pipe_stat(fp, ub, active_cred, td) struct ucred *active_cred; struct thread *td; { - struct pipe *pipe = fp-f_data; + struct pipe *pipe; + int new_unr; #ifdef MAC int error; +#endif + pipe = fp-f_data; PIPE_LOCK(pipe); +#ifdef MAC error = mac_pipe_check_stat(active_cred, pipe-pipe_pair); - PIPE_UNLOCK(pipe); - if (error) + if (error) { + PIPE_UNLOCK(pipe); return (error); + } #endif + /* +* Lazily allocate an inode number for the pipe.
Re: svn commit: r227873 - head/usr.bin/procstat
On Sat, Nov 26, 2011 at 06:43:01PM +0200, Mikolaj Golub wrote: On Thu, 24 Nov 2011 09:12:35 +0200 Mikolaj Golub wrote: MG On Wed, 23 Nov 2011 11:10:47 -0800 m...@freebsd.org wrote: printf( AT_IGNORE=0x%lu, - (unsigned long)aux-a_un.a_val); + (unsigned long)auxv[i].a_un.a_val); m I didn't see this before, but this gives very misleading output. The m 0x prefix implies the output will be hex, but it's printed as decimal, m here and below. MG Oh. Thanks! Will fix. m I don't know if there's a style preference for 0x%lx versus %#lx, m though, or a preference for a different type for the print (uintmax_t, m for example). There is probably a preference for using u_long rather m than unsigned long, since it's shorter. MG It looks like both 0x%lx and %#lx are widely spread in our source, I like %#lx MG a little more. It looks for me that (u_long) will be ok for now, so the chage MG for the printf above would be: MG printf( AT_IGNORE=%#lx, (u_long)auxv[i].a_un.a_val); MG Anyway, printing all values in the same format was considered by me as a MG temporary solution. I am going to have format depending on a_type, so e.g. MG AT_PAGESZ will be 4096, not 0x1000. MG Also, now they are printed as one string for a process: MG PID COMM AUXV MG 2520 firefox-bin AT_PHDR=0x400040 AT_PHENT=0x38 AT_PHNUM=0x7 AT_PAGESZ=0x1000 ... MG I am considering changing this too, to something like below: MG PID COMM AUXVVALUE MG 2520 firefox-bin AT_PHDR 0x400040 MG 2520 firefox-bin AT_PHENT56 MG 2520 firefox-bin AT_PHNUM7 MG ... I am going to commit this patch if nobody has any other suggestions. The typical output: in138:~% procstat -x 2008 PID COMM AUXV VALUE 2008 nginxAT_PHDR 0x400040 2008 nginxAT_PHENT 56 2008 nginxAT_PHNUM 8 2008 nginxAT_PAGESZ4096 2008 nginxAT_FLAGS 0 2008 nginxAT_ENTRY 0x40de00 2008 nginxAT_BASE 0x800689000 2008 nginxAT_EXECPATH 0x7fffefca 2008 nginxAT_OSRELDATE 101 2008 nginxAT_CANARY0x7fffef8a 2008 nginxAT_CANARYLEN 64 2008 nginxAT_NCPUS 2 2008 nginxAT_PAGESIZES 0x7fffef72 2008 nginxAT_PAGESIZESLEN 24 2008 nginxAT_STACKPROT VM_PROT_ALL I like this output much better. The only thing I am unsure of is the pretty-printing of AT_STACKPROT. Might be, change it to EXECUTABLE/NONEXECUTABLE printout. pgpd9Uh5m0Dwm.pgp Description: PGP signature
Re: svn commit: r227873 - head/usr.bin/procstat
On Sat, Nov 26, 2011 at 06:13:14PM +, Robert N. M. Watson wrote: On 26 Nov 2011, at 17:48, Kostik Belousov wrote: in138:~% procstat -x 2008 PID COMM AUXV VALUE 2008 nginxAT_PHDR 0x400040 2008 nginxAT_PHENT 56 2008 nginxAT_PHNUM 8 2008 nginxAT_PAGESZ4096 2008 nginxAT_FLAGS 0 2008 nginxAT_ENTRY 0x40de00 2008 nginxAT_BASE 0x800689000 2008 nginxAT_EXECPATH 0x7fffefca 2008 nginxAT_OSRELDATE 101 2008 nginxAT_CANARY0x7fffef8a 2008 nginxAT_CANARYLEN 64 2008 nginxAT_NCPUS 2 2008 nginxAT_PAGESIZES 0x7fffef72 2008 nginxAT_PAGESIZESLEN 24 2008 nginxAT_STACKPROT VM_PROT_ALL I like this output much better. The only thing I am unsure of is the pretty-printing of AT_STACKPROT. Might be, change it to EXECUTABLE/NONEXECUTABLE printout. On a related note, I wouldn't mind if we stripped AT_ and lower-cased the rest of the field name to make it slightly easier on the eyes. :-) It would then need some explanation what the names mean. For my, AT_SOMETHING has a unique meaning, while something does not. pgpa1GVaXFALN.pgp Description: PGP signature
Re: svn commit: r227798 - in head: . lib/libpam lib/libpam/modules
On Tue, Nov 22, 2011 at 10:38:02PM +0100, Dag-Erling Sm??rgrav wrote: Dag-Erling Sm??rgrav d...@des.no writes: Jilles Tjoelker jil...@stack.nl writes: Although this will work, I think it trades the quality of the binaries for a cleaner build system. It is better to pass all libraries to ld(1) even though a .so may have unresolved references: the NEEDED entry serves as an additional protection against version mismatches and symbol versioning (if you ever add it) requires ld(1) to have access to the .so containing the definition so it knows the version name to store in the output file. These are plugins. The names and prototypes of the functions they export were set in stone 15 years ago. Sorry, that was a bit unclear. What I was trying to say is that symbol versioning is not and will never be an issue. Does plugin depend on the library which loaded it ? In other words, does it reference the symbols from the library ? If yes, then plugin _must_ have a dependency on the library recorded as DT_NEEDED. pgpmOkX1uKdJk.pgp Description: PGP signature
Re: svn commit: r227784 - head/sys/kern
On Mon, Nov 21, 2011 at 10:36:57AM +, Sergey Kandaurov wrote: Author: pluknet Date: Mon Nov 21 10:36:57 2011 New Revision: 227784 URL: http://svn.freebsd.org/changeset/base/227784 Log: Use the acquired reference to the vmspace instead of direct dereferencing of p-p_vmspace like it is done in sysctl_kern_proc_vmmap(). Modified: head/sys/kern/kern_proc.c Modified: head/sys/kern/kern_proc.c == --- head/sys/kern/kern_proc.c Mon Nov 21 08:12:36 2011(r227783) +++ head/sys/kern/kern_proc.c Mon Nov 21 10:36:57 2011(r227784) @@ -1528,7 +1528,7 @@ sysctl_kern_proc_ovmmap(SYSCTL_HANDLER_A } kve = malloc(sizeof(*kve), M_TEMP, M_WAITOK); - map = p-p_vmspace-vm_map;/* XXXRW: More locking required? */ + map = vm-vm_map; /* XXXRW: More locking required? */ It makes sense to remove the XXXRW comment, from both places. vm_map_lock_read(map); for (entry = map-header.next; entry != map-header; entry = entry-next) { pgpbIdxD8JSPK.pgp Description: PGP signature
Re: svn commit: r227491 - head/sbin/md5
On Sun, Nov 13, 2011 at 05:07:43PM +, Eitan Adler wrote: Author: eadler (ports committer) Date: Sun Nov 13 17:07:43 2011 New Revision: 227491 URL: http://svn.freebsd.org/changeset/base/227491 Log: - new sentence should start on new line. PR: bin/146541 Submitted by: bjk Approved by:bjk Modified: head/sbin/md5/md5.1 Modified: head/sbin/md5/md5.1 == --- head/sbin/md5/md5.1 Sun Nov 13 17:07:26 2011(r227490) +++ head/sbin/md5/md5.1 Sun Nov 13 17:07:43 2011(r227491) @@ -78,8 +78,8 @@ The hexadecimal checksum of each file li after the options are processed. .Bl -tag -width indent .It Fl c Ar string -Compare files to this md5 string. (Note that this option is not yet useful -if multiple files are specified.) +Compare files to this md5 string. +(Note that this option is not yet useful if multiple files are specified.) I doubt that you compare files to md5 hash. Rather, you compare the md5 hash of file with provided string. .It Fl s Ar string Print a checksum of the given .Ar string . pgp9EZNwhB0sH.pgp Description: PGP signature
Re: svn commit: r227454 - head/sbin/newfs_msdos
On Fri, Nov 11, 2011 at 08:31:48PM +, Xin LI wrote: Author: delphij Date: Fri Nov 11 20:31:48 2011 New Revision: 227454 URL: http://svn.freebsd.org/changeset/base/227454 Log: Use __packed to prevent alignment from taking place, which otherwise may change the on-disk format in an incompatible way. Without this change, msdosfs created on FreeBSD/arm would not be mountable. PR: bin/162486 Submitted by: Ian Lepore freebsd damnhippie dyndns org Reported by:Mattia Rossi mrossi at swin.edu.au MFC after: 3 days I suspect there is much more places with the similar issues. FAT structure definitions were not centralized, AFAIR. pgpcp1w5iC2oR.pgp Description: PGP signature
Re: svn commit: r227394 - in head/sys: amd64/amd64 i386/i386
On Thu, Nov 10, 2011 at 07:23:23AM +0100, Pawel Jakub Dawidek wrote: On Wed, Nov 09, 2011 at 07:31:28PM +0200, Kostik Belousov wrote: On Wed, Nov 09, 2011 at 05:25:43PM +, Konstantin Belousov wrote: Author: kib Date: Wed Nov 9 17:25:43 2011 New Revision: 227394 URL: http://svn.freebsd.org/changeset/base/227394 Log: Stopped process may legitimately have some threads sleeping and not suspended, if the sleep is uninterruptible. Even more, stopped process might have some threads still running in the kernel mode, or inhibited due to wait on blockable locks. I was unable to design an expression that would assert that such thread will be stopped at the kernel-user boundary. The assertion itself is useful and catched several bugs, but theoretically can cause false positives. If any report of the fired assert for kernel-mode thread is provided, I will remove the assertions. If in five years some change will make it to fire and you won't be around, nobody will remember this e-mail of yours and someone will spend time on looking for a bug that doesn't exist. If we know it can cause false positives, I'd be in favour of removing it or changing it into a warning or at the very least moving this information into commit log. Yes, I do agree that removing the assertion is the most straightforward solution, but on the assertion already catched more real bugs then caused headaches (my headache, BTW). Either I change the third or condition to plain P_SHOULDSTOP(), or create a way to assert exactly what I need. pgpIde0g4IF4e.pgp Description: PGP signature
Re: svn commit: r227394 - in head/sys: amd64/amd64 i386/i386
On Wed, Nov 09, 2011 at 05:25:43PM +, Konstantin Belousov wrote: Author: kib Date: Wed Nov 9 17:25:43 2011 New Revision: 227394 URL: http://svn.freebsd.org/changeset/base/227394 Log: Stopped process may legitimately have some threads sleeping and not suspended, if the sleep is uninterruptible. Even more, stopped process might have some threads still running in the kernel mode, or inhibited due to wait on blockable locks. I was unable to design an expression that would assert that such thread will be stopped at the kernel-user boundary. The assertion itself is useful and catched several bugs, but theoretically can cause false positives. If any report of the fired assert for kernel-mode thread is provided, I will remove the assertions. Reported and tested by: pho MFC after: 1 week Modified: head/sys/amd64/amd64/machdep.c head/sys/i386/i386/machdep.c Modified: head/sys/amd64/amd64/machdep.c == --- head/sys/amd64/amd64/machdep.cWed Nov 9 17:15:51 2011 (r227393) +++ head/sys/amd64/amd64/machdep.cWed Nov 9 17:25:43 2011 (r227394) @@ -2047,7 +2047,8 @@ int fill_fpregs(struct thread *td, struct fpreg *fpregs) { - KASSERT(td == curthread || TD_IS_SUSPENDED(td), + KASSERT(td == curthread || TD_IS_SUSPENDED(td) || + (P_SHOULDSTOP(td-td_proc) TD_IS_SLEEPING(td)), (not suspended thread %p, td)); fpugetregs(td); fill_fpregs_xmm(td-td_pcb-pcb_user_save, fpregs); Modified: head/sys/i386/i386/machdep.c == --- head/sys/i386/i386/machdep.c Wed Nov 9 17:15:51 2011 (r227393) +++ head/sys/i386/i386/machdep.c Wed Nov 9 17:25:43 2011 (r227394) @@ -3299,7 +3299,8 @@ int fill_fpregs(struct thread *td, struct fpreg *fpregs) { - KASSERT(td == curthread || TD_IS_SUSPENDED(td), + KASSERT(td == curthread || TD_IS_SUSPENDED(td) || + (P_SHOULDSTOP(td-td_proc) TD_IS_SLEEPING(td)), (not suspended thread %p, td)); #ifdef DEV_NPX npxgetregs(td); pgpX8ruvrkMKU.pgp Description: PGP signature
Re: svn commit: r225950 - in head: sbin/camcontrol share/examples/scsi_target share/misc sys/cam sys/cam/scsi sys/dev/ciss sys/dev/firewire sys/dev/iir sys/dev/iscsi/initiator sys/dev/isp sys/dev/mly
On Mon, Oct 03, 2011 at 08:32:56PM +, Kenneth D. Merry wrote: Author: ken Date: Mon Oct 3 20:32:55 2011 New Revision: 225950 URL: http://svn.freebsd.org/changeset/base/225950 Before this commit, scsi_extract_sense() was static inline. Since now it is made not-inlined, the consumers of it that previously not depended on cam.ko need an explicit MODULE_DEPEND() for cam. E.g., on HEAD and stable/9, mfi is no longer loadable as module, showing errors link_elf_obj: symbol scsi_extract_sense undefined linker_load_file: Unsupported file type on load attempt. pgpBpJkbjxZM8.pgp Description: PGP signature
Re: svn commit: r226687 - head/sys/fs/nullfs
On Mon, Oct 24, 2011 at 04:22:45PM +0200, Roman Divacky wrote: On Mon, Oct 24, 2011 at 01:53:32PM +, Konstantin Belousov wrote: Author: kib Date: Mon Oct 24 13:53:32 2011 New Revision: 226687 URL: http://svn.freebsd.org/changeset/base/226687 Log: The only possible error return from null_nodeget() is due to insmntque1 failure (the getnewvnode cannot return an error). In this case, the null_insmntque_dtr() already unlocked the reclaimed vnode, so VOP_UNLOCK() in the nullfs_mount() after null_nodeget() failure is wrong. Tested by:pho MFC after:1 week Modified: head/sys/fs/nullfs/null_vfsops.c Modified: head/sys/fs/nullfs/null_vfsops.c == --- head/sys/fs/nullfs/null_vfsops.cMon Oct 24 13:48:13 2011 (r226686) +++ head/sys/fs/nullfs/null_vfsops.cMon Oct 24 13:53:32 2011 (r226687) @@ -157,7 +157,6 @@ nullfs_mount(struct mount *mp) * Make sure the node alias worked */ if (error) { - VOP_UNLOCK(vp, 0); Maybe you want to assert that it's indeed unlocked at this point? No, I don't, since the vnode pointer is already invalid at this point. The vput() call in dtr released the vnode reference owned by current thread. pgpoXsF1E0Zsp.pgp Description: PGP signature
Re: svn commit: r226606 - in head/lib/libc: amd64 amd64/gen arm arm/gen gen i386 i386/gen ia64 ia64/gen mips mips/gen powerpc powerpc/gen powerpc64 powerpc64/gen sparc64 sparc64/gen
On Fri, Oct 21, 2011 at 06:40:36AM +, David Schultz wrote: Author: das Date: Fri Oct 21 06:40:36 2011 New Revision: 226606 URL: http://svn.freebsd.org/changeset/base/226606 Log: Replace a proliferation of buggy MD implementations of modf() with a working MI one. The MI one only needs to be overridden on machines with non-IEEE754 arithmetic. (The last supported one was the VAX.) It can also be overridden if someone comes up with a faster one that actually passes the regression tests -- but this is harder than it sounds. Added: head/lib/libc/gen/modf.c - copied, changed from r226410, head/lib/msun/src/s_modf.c Deleted: head/lib/libc/amd64/gen/modf.S head/lib/libc/arm/gen/modf.c head/lib/libc/i386/gen/modf.S head/lib/libc/ia64/gen/modf.c head/lib/libc/mips/gen/modf.S head/lib/libc/mips/gen/modf.c head/lib/libc/powerpc/gen/modf.c head/lib/libc/powerpc64/gen/modf.c head/lib/libc/sparc64/gen/modf.S Modified: head/lib/libc/amd64/Symbol.map head/lib/libc/amd64/gen/Makefile.inc head/lib/libc/arm/Symbol.map head/lib/libc/arm/gen/Makefile.inc head/lib/libc/gen/Makefile.inc head/lib/libc/gen/Symbol.map head/lib/libc/i386/Symbol.map head/lib/libc/i386/gen/Makefile.inc head/lib/libc/ia64/Symbol.map head/lib/libc/ia64/gen/Makefile.inc head/lib/libc/mips/Symbol.map head/lib/libc/mips/gen/Makefile.inc head/lib/libc/powerpc/Symbol.map head/lib/libc/powerpc/gen/Makefile.inc head/lib/libc/powerpc64/Symbol.map head/lib/libc/powerpc64/gen/Makefile.inc head/lib/libc/sparc64/Symbol.map head/lib/libc/sparc64/gen/Makefile.inc Modified: head/lib/libc/amd64/Symbol.map == --- head/lib/libc/amd64/Symbol.mapFri Oct 21 06:36:40 2011 (r226605) +++ head/lib/libc/amd64/Symbol.mapFri Oct 21 06:40:36 2011 (r226606) @@ -26,7 +26,6 @@ FBSD_1.0 { __infinity; __nan; makecontext; - modf; rfork_thread; setjmp; longjmp; You cannot do this, you just completely broke the ABI. The symbols must not be removed from the versioned library. pgpHVabtNEOTi.pgp Description: PGP signature
Re: svn commit: r226606 - in head/lib/libc: amd64 amd64/gen arm arm/gen gen i386 i386/gen ia64 ia64/gen mips mips/gen powerpc powerpc/gen powerpc64 powerpc64/gen sparc64 sparc64/gen
On Fri, Oct 21, 2011 at 08:22:42AM -0400, John Baldwin wrote: On Friday, October 21, 2011 4:12:01 am Kostik Belousov wrote: On Fri, Oct 21, 2011 at 06:40:36AM +, David Schultz wrote: Author: das Date: Fri Oct 21 06:40:36 2011 New Revision: 226606 URL: http://svn.freebsd.org/changeset/base/226606 Log: Replace a proliferation of buggy MD implementations of modf() with a working MI one. The MI one only needs to be overridden on machines with non-IEEE754 arithmetic. (The last supported one was the VAX.) It can also be overridden if someone comes up with a faster one that actually passes the regression tests -- but this is harder than it sounds. Added: head/lib/libc/gen/modf.c - copied, changed from r226410, head/lib/msun/src/s_modf.c Deleted: head/lib/libc/amd64/gen/modf.S head/lib/libc/arm/gen/modf.c head/lib/libc/i386/gen/modf.S head/lib/libc/ia64/gen/modf.c head/lib/libc/mips/gen/modf.S head/lib/libc/mips/gen/modf.c head/lib/libc/powerpc/gen/modf.c head/lib/libc/powerpc64/gen/modf.c head/lib/libc/sparc64/gen/modf.S Modified: head/lib/libc/amd64/Symbol.map head/lib/libc/amd64/gen/Makefile.inc head/lib/libc/arm/Symbol.map head/lib/libc/arm/gen/Makefile.inc head/lib/libc/gen/Makefile.inc head/lib/libc/gen/Symbol.map head/lib/libc/i386/Symbol.map head/lib/libc/i386/gen/Makefile.inc head/lib/libc/ia64/Symbol.map head/lib/libc/ia64/gen/Makefile.inc head/lib/libc/mips/Symbol.map head/lib/libc/mips/gen/Makefile.inc head/lib/libc/powerpc/Symbol.map head/lib/libc/powerpc/gen/Makefile.inc head/lib/libc/powerpc64/Symbol.map head/lib/libc/powerpc64/gen/Makefile.inc head/lib/libc/sparc64/Symbol.map head/lib/libc/sparc64/gen/Makefile.inc Modified: head/lib/libc/amd64/Symbol.map == --- head/lib/libc/amd64/Symbol.mapFri Oct 21 06:36:40 2011 (r226605) +++ head/lib/libc/amd64/Symbol.mapFri Oct 21 06:40:36 2011 (r226606) @@ -26,7 +26,6 @@ FBSD_1.0 { __infinity; __nan; makecontext; - modf; rfork_thread; setjmp; longjmp; You cannot do this, you just completely broke the ABI. The symbols must not be removed from the versioned library. He just moved it to the MI Symbol.map, he didn't remove it: Ah, sorry. Modified: head/lib/libc/gen/Symbol.map == --- head/lib/libc/gen/Symbol.mapFri Oct 21 06:36:40 2011 (r226605) +++ head/lib/libc/gen/Symbol.mapFri Oct 21 06:40:36 2011 (r226606) @@ -213,6 +213,7 @@ FBSD_1.0 { ldexp; lockf; lrand48; + modf; mrand48; nftw; nice; -- John Baldwin pgpNKqURcbxtu.pgp Description: PGP signature
Re: svn commit: r226580 - stable/7/libexec/tftpd
On Thu, Oct 20, 2011 at 07:23:21PM +, Sean Bruno wrote: Author: sbruno Date: Thu Oct 20 19:23:21 2011 New Revision: 226580 URL: http://svn.freebsd.org/changeset/base/226580 Log: MFC r224536: Confirmed behavior of a Cisco 6509 in production. In the old TFTP server, there was an undocumented behavior where the block counter would rollover to 0 if a file larger than 65535 blocks was transferred. With the default block size of 512 octets per block, this is a file size of approximately 32 megabytes. The new TFTP server code would report an error and stop transferring the file if a file was larger than 65535 blocks. This patch restores the old TFTP server's behavior to the new TFTP server code. If a TFTP client transfers a file larger than 65535 blocks, and does *not* specify the rollover option, then automatically rollover the block counter to 0 every time we reach 65535 blocks. This restores interoperability with the FreeBSD 6 TFTP client. Without this change, if a FreeBSD 6 TFTP client tried to retrieve a file larger than 65535 blocks from a FreeBSD 9 TFTP server , the transfer would fail. The same file could be retrieved successfully if the same FreeBSD 6 TFTP client was used against a FreeBSD 6 TFTP server. Approved by: re (kib) It was not. It looks like people just think that 'Approved by: kib' is some voodo that must be appended for no reason. Today it is second time happened for the commit that I did not approved. It does not matter much for the branch opened for the general commits. Still, I would prefer that people do not falsely accuse me of wrongdoing :). Tested by: Pawan Gupta pawang at juniper dot net, Obtained from: Juniper Networks FreeBSD7 Reviewed by: Yahoo Inc. Submitted by: If someone else sent in the change. Reviewed by: If someone else reviewed your modification. Approved by: If you needed approval for this commit. Obtained from: If the change is from a third party. MFC after: N [day[s]|week[s]|month[s]]. Request a reminder email. Security: Vulnerability reference (one per line) or description. Empty fields above will be automatically removed. Mtftpd/tftp-transfer.c Modified: stable/7/libexec/tftpd/tftp-transfer.c Modified: stable/7/libexec/tftpd/tftp-transfer.c == --- stable/7/libexec/tftpd/tftp-transfer.cThu Oct 20 19:16:52 2011 (r226579) +++ stable/7/libexec/tftpd/tftp-transfer.cThu Oct 20 19:23:21 2011 (r226580) @@ -129,14 +129,16 @@ tftp_send(int peer, uint16_t *block, str (*block)++; if (oldblock *block) { if (options[OPT_ROLLOVER].o_request == NULL) { - tftp_log(LOG_ERR, - Block rollover but not allowed.); - send_error(peer, EBADOP); - gettimeofday((ts-tstop), NULL); - return; + /* + * rollover option not specified in + * tftp client. Default to rolling block + * counter to 0. + */ + *block = 0; + } else { + *block = atoi(options[OPT_ROLLOVER].o_request); } - *block = atoi(options[OPT_ROLLOVER].o_request); ts-rollovers++; } gettimeofday((ts-tstop), NULL); @@ -196,14 +198,16 @@ tftp_receive(int peer, uint16_t *block, (*block)++; if (oldblock *block) { if (options[OPT_ROLLOVER].o_request == NULL) { - tftp_log(LOG_ERR, - Block rollover but not allowed.); - send_error(peer, EBADOP); - gettimeofday((ts-tstop), NULL); - return; + /* + * rollover option not specified in + * tftp client. Default to rolling block + * counter to 0. + */ + *block = 0; + } else { + *block = atoi(options[OPT_ROLLOVER].o_request); } - *block = atoi(options[OPT_ROLLOVER].o_request); ts-rollovers++; } pgpmuKDOCbkHd.pgp Description: PGP signature
Re: svn commit: r226343 - head/sys/vm
On Fri, Oct 14, 2011 at 02:35:15PM -0700, Marcel Moolenaar wrote: On Oct 14, 2011, at 11:24 AM, Kostik Belousov wrote: After more thought about the issue, I do not agree with you. Elf specification says about the PF_R flag that only read permission for the memory image of the segment is required, but read and execute is allowed. The ELF specification does not contain CPU specifics. Those are always covered by processor supplements. Since this is very i386 specific behaviour we're talking about, it's a non sequitur to use the generic ELF specification to argue a point in this respect. This is not quite accurate. The ELF specification does contain the CPU-specific bits for i386. But, the interpretation of the PF_R/PF_X flags in the program header is described under the OS-specific section, and not under the CPU-specific. That is, I am sure that the situation shall be interpreted as the bug in the program. I want to commit the following refinement: The patch is good. Thanks, committed. diff --git a/sys/kern/imgact_elf.c b/sys/kern/imgact_elf.c index 669c652..9970386 100644 --- a/sys/kern/imgact_elf.c +++ b/sys/kern/imgact_elf.c @@ -118,11 +118,24 @@ static int elf_legacy_coredump = 0; SYSCTL_INT(_debug, OID_AUTO, __elfN(legacy_coredump), CTLFLAG_RW, elf_legacy_coredump, 0, ); -static int __elfN(nxstack) = 0; +static int __elfN(nxstack) = +#if defined(__amd64__) || defined(__powerpc__) /* both 64 and 32 bit */ + 1; +#else + 0; +#endif SYSCTL_INT(__CONCAT(_kern_elf, __ELF_WORD_SIZE), OID_AUTO, nxstack, CTLFLAG_RW, __elfN(nxstack), 0, __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) : enable non-executable stack); Please do not commit this change with the rest. It's not to the point. In fact, if you're changing it to the above, please add ia64 to the list as well. Yes, sure. The inclusion of the chunk was a mistake. pgp76vlqCofCj.pgp Description: PGP signature
Re: svn commit: r226343 - head/sys/vm
On Thu, Oct 13, 2011 at 06:50:30PM -0400, David Schultz wrote: On Thu, Oct 13, 2011, Marcel Moolenaar wrote: On Oct 13, 2011, at 2:07 PM, John Baldwin wrote: That's really besides the point. ABI changes are made deliberately and ABIs must be well-documented for anyone to adhere to it. You can't post hoc wave your hand and say that at some unspecified time in the past the ABI changed: at what precise time does supported by hardware mean and how does that tie to a major FreeBSD version? Point in case: the JDK 1.4.x still works on FreeBSD 9.x (i386), so the ABI really hasn't changed at all in that respect. I think if you booted a FreeBSD 9.x i386 PAE kernel you'd find that the jdk did not work. That will be true for any i386 PAE kernel back to when PG_NX support was introduced. That's bad. After more thought about the issue, I do not agree with you. Elf specification says about the PF_R flag that only read permission for the memory image of the segment is required, but read and execute is allowed. In other words, it is a bug in the old jre, which is further confirmed by the fact that later jres run with non-executable head. Recent binutils support a PT_GNU_HEAP flag in the ELF header that controls whether heap allocations are executable by default. In Linux, the flag can be set using an ld option or the execstack(8) command. That seems like a better way to go than breaking old JVMs or disabling the security feature. No, recent binutils do not support PT_GNU_HEAD. git grep -e PT_GNU_HEAD on the up to date checkout of binutils head gives zero matches. There are indeed PT_GNU_HEAP patches floating around from some hardening projects. There is indeed PT_GNU_STACK, which we do support. I want to commit the following refinement: diff --git a/sys/compat/freebsd32/freebsd32_misc.c b/sys/compat/freebsd32/freebsd32_misc.c index 6638ec8..fc2932b 100644 --- a/sys/compat/freebsd32/freebsd32_misc.c +++ b/sys/compat/freebsd32/freebsd32_misc.c @@ -445,7 +445,7 @@ freebsd32_mprotect(struct thread *td, struct freebsd32_mprotect_args *uap) ap.len = uap-len; ap.prot = uap-prot; #if defined(__amd64__) || defined(__ia64__) - if (ap.prot PROT_READ) + if (i386_read_exec (ap.prot PROT_READ) != 0) ap.prot |= PROT_EXEC; #endif return (sys_mprotect(td, ap)); @@ -536,7 +536,7 @@ freebsd32_mmap(struct thread *td, struct freebsd32_mmap_args *uap) #endif #if defined(__amd64__) || defined(__ia64__) - if (prot PROT_READ) + if (i386_read_exec (prot PROT_READ)) prot |= PROT_EXEC; #endif diff --git a/sys/kern/imgact_elf.c b/sys/kern/imgact_elf.c index 669c652..9970386 100644 --- a/sys/kern/imgact_elf.c +++ b/sys/kern/imgact_elf.c @@ -118,11 +118,24 @@ static int elf_legacy_coredump = 0; SYSCTL_INT(_debug, OID_AUTO, __elfN(legacy_coredump), CTLFLAG_RW, elf_legacy_coredump, 0, ); -static int __elfN(nxstack) = 0; +static int __elfN(nxstack) = +#if defined(__amd64__) || defined(__powerpc__) /* both 64 and 32 bit */ + 1; +#else + 0; +#endif SYSCTL_INT(__CONCAT(_kern_elf, __ELF_WORD_SIZE), OID_AUTO, nxstack, CTLFLAG_RW, __elfN(nxstack), 0, __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) : enable non-executable stack); +#if __ELF_WORD_SIZE == 32 +#if defined(__amd64__) || defined(__ia64__) +int i386_read_exec = 0; +SYSCTL_INT(_kern_elf32, OID_AUTO, read_exec, CTLFLAG_RW, i386_read_exec, 0, +enable execution from readable segments); +#endif +#endif + static Elf_Brandinfo *elf_brand_list[MAX_BRANDS]; #definetrunc_page_ps(va, ps) ((va) ~(ps - 1)) @@ -1666,7 +1679,7 @@ __elfN(trans_prot)(Elf_Word flags) prot |= VM_PROT_READ; #if __ELF_WORD_SIZE == 32 #if defined(__amd64__) || defined(__ia64__) - if (flags PF_R) + if (i386_read_exec (flags PF_R)) prot |= VM_PROT_EXECUTE; #endif #endif diff --git a/sys/sys/sysent.h b/sys/sys/sysent.h index 6a4b485..3694ceb 100644 --- a/sys/sys/sysent.h +++ b/sys/sys/sysent.h @@ -151,6 +152,10 @@ extern struct sysentvec null_sysvec; extern struct sysent sysent[]; extern const char *syscallnames[]; +#if defined(__amd64__) || defined(__ia64__) +extern int i386_read_exec; +#endif + #defineNO_SYSCALL (-1) struct module; diff --git a/sys/vm/vm_unix.c b/sys/vm/vm_unix.c index d4ea3b7..253ab77 100644 --- a/sys/vm/vm_unix.c +++ b/sys/vm/vm_unix.c @@ -141,7 +141,7 @@ sys_obreak(td, uap) prot = VM_PROT_RW; #ifdef COMPAT_FREEBSD32 #if defined(__amd64__) || defined(__ia64__) - if (SV_PROC_FLAG(td-td_proc, SV_ILP32)) + if (i386_read_exec SV_PROC_FLAG(td-td_proc, SV_ILP32)) prot |= VM_PROT_EXECUTE; #endif #endif pgpe8kquMiTxV.pgp Description: PGP signature
Re: svn commit: r226343 - head/sys/vm
On Fri, Oct 14, 2011 at 11:30:47AM -0700, Garrett Cooper wrote: On Fri, Oct 14, 2011 at 11:24 AM, Kostik Belousov kostik...@gmail.com wrote: On Thu, Oct 13, 2011 at 06:50:30PM -0400, David Schultz wrote: On Thu, Oct 13, 2011, Marcel Moolenaar wrote: On Oct 13, 2011, at 2:07 PM, John Baldwin wrote: That's really besides the point. ABI changes are made deliberately and ABIs must be well-documented for anyone to adhere to it. You can't post hoc wave your hand and say that at some unspecified time in the past the ABI changed: at what precise time does supported by hardware mean and how does that tie to a major FreeBSD version? Point in case: the JDK 1.4.x still works on FreeBSD 9.x (i386), so the ABI really hasn't changed at all in that respect. I think if you booted a FreeBSD 9.x i386 PAE kernel you'd find that the jdk did not work. That will be true for any i386 PAE kernel back to when PG_NX support was introduced. That's bad. After more thought about the issue, I do not agree with you. Elf specification says about the PF_R flag that only read permission for the memory image of the segment is required, but read and execute is allowed. In other words, it is a bug in the old jre, which is further confirmed by the fact that later jres run with non-executable head. Recent binutils support a PT_GNU_HEAP flag in the ELF header that controls whether heap allocations are executable by default. In Linux, the flag can be set using an ld option or the execstack(8) command. That seems like a better way to go than breaking old JVMs or disabling the security feature. No, recent binutils do not support PT_GNU_HEAD. git grep -e PT_GNU_HEAD on the up to date checkout of binutils head gives zero matches. There are indeed PT_GNU_HEAP patches floating around from some hardening projects. There is indeed PT_GNU_STACK, which we do support. I want to commit the following refinement: diff --git a/sys/compat/freebsd32/freebsd32_misc.c b/sys/compat/freebsd32/freebsd32_misc.c index 6638ec8..fc2932b 100644 --- a/sys/compat/freebsd32/freebsd32_misc.c +++ b/sys/compat/freebsd32/freebsd32_misc.c @@ -445,7 +445,7 @@ freebsd32_mprotect(struct thread *td, struct freebsd32_mprotect_args *uap) ap.len = uap-len; ap.prot = uap-prot; #if defined(__amd64__) || defined(__ia64__) - if (ap.prot PROT_READ) + if (i386_read_exec (ap.prot PROT_READ) != 0) ap.prot |= PROT_EXEC; #endif return (sys_mprotect(td, ap)); @@ -536,7 +536,7 @@ freebsd32_mmap(struct thread *td, struct freebsd32_mmap_args *uap) #endif #if defined(__amd64__) || defined(__ia64__) - if (prot PROT_READ) + if (i386_read_exec (prot PROT_READ)) prot |= PROT_EXEC; #endif diff --git a/sys/kern/imgact_elf.c b/sys/kern/imgact_elf.c index 669c652..9970386 100644 --- a/sys/kern/imgact_elf.c +++ b/sys/kern/imgact_elf.c @@ -118,11 +118,24 @@ static int elf_legacy_coredump = 0; SYSCTL_INT(_debug, OID_AUTO, __elfN(legacy_coredump), CTLFLAG_RW, elf_legacy_coredump, 0, ); -static int __elfN(nxstack) = 0; +static int __elfN(nxstack) = +#if defined(__amd64__) || defined(__powerpc__) /* both 64 and 32 bit */ + 1; +#else + 0; +#endif SYSCTL_INT(__CONCAT(_kern_elf, __ELF_WORD_SIZE), OID_AUTO, nxstack, CTLFLAG_RW, __elfN(nxstack), 0, __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) : enable non-executable stack); +#if __ELF_WORD_SIZE == 32 +#if defined(__amd64__) || defined(__ia64__) +int i386_read_exec = 0; +SYSCTL_INT(_kern_elf32, OID_AUTO, read_exec, CTLFLAG_RW, i386_read_exec, 0, + enable execution from readable segments); +#endif +#endif + static Elf_Brandinfo *elf_brand_list[MAX_BRANDS]; #define trunc_page_ps(va, ps) ((va) ~(ps - 1)) @@ -1666,7 +1679,7 @@ __elfN(trans_prot)(Elf_Word flags) prot |= VM_PROT_READ; #if __ELF_WORD_SIZE == 32 #if defined(__amd64__) || defined(__ia64__) - if (flags PF_R) + if (i386_read_exec (flags PF_R)) prot |= VM_PROT_EXECUTE; #endif #endif diff --git a/sys/sys/sysent.h b/sys/sys/sysent.h index 6a4b485..3694ceb 100644 --- a/sys/sys/sysent.h +++ b/sys/sys/sysent.h @@ -151,6 +152,10 @@ extern struct sysentvec null_sysvec; extern struct sysent sysent[]; extern const char *syscallnames[]; +#if defined(__amd64__) || defined(__ia64__) +extern int i386_read_exec; +#endif + #define NO_SYSCALL (-1) struct module; diff --git a/sys/vm/vm_unix.c b/sys/vm/vm_unix.c index d4ea3b7..253ab77 100644 --- a/sys/vm/vm_unix.c +++ b/sys/vm/vm_unix.c @@ -141,7 +141,7 @@ sys_obreak(td, uap) prot = VM_PROT_RW; #ifdef COMPAT_FREEBSD32 #if defined(__amd64__) || defined(__ia64__
Re: svn commit: r226343 - head/sys/vm
On Thu, Oct 13, 2011 at 04:20:10PM +, Marcel Moolenaar wrote: Author: marcel Date: Thu Oct 13 16:20:10 2011 New Revision: 226343 URL: http://svn.freebsd.org/changeset/base/226343 Log: In sys_obreak() and when compiling for amd64 or ia64, when the process is ILP32 (i.e. i386) grant execute permissions by default. The JDK 1.4.x depends on being able to execute from the heap on i386. Modified: head/sys/vm/vm_unix.c Modified: head/sys/vm/vm_unix.c == --- head/sys/vm/vm_unix.c Thu Oct 13 16:16:46 2011(r226342) +++ head/sys/vm/vm_unix.c Thu Oct 13 16:20:10 2011(r226343) @@ -36,6 +36,8 @@ * @(#)vm_unix.c 8.1 (Berkeley) 6/11/93 */ +#include opt_compat.h + /* * Traditional sbrk/grow interface to VM */ @@ -49,6 +51,7 @@ __FBSDID($FreeBSD$); #include sys/proc.h #include sys/racct.h #include sys/resourcevar.h +#include sys/sysent.h #include sys/sysproto.h #include sys/systm.h @@ -75,7 +78,7 @@ sys_obreak(td, uap) struct vmspace *vm = td-td_proc-p_vmspace; vm_offset_t new, old, base; rlim_t datalim, vmemlim; - int rv; + int prot, rv; int error = 0; boolean_t do_map_wirefuture; @@ -135,8 +138,15 @@ sys_obreak(td, uap) } PROC_UNLOCK(td-td_proc); #endif + prot = VM_PROT_RW; +#ifdef COMPAT_FREEBSD32 +#if defined(__amd64__) || defined(__ia64__) + if (SV_PROC_FLAG(td-td_proc, SV_ILP32)) + prot |= VM_PROT_EXECUTE; +#endif +#endif rv = vm_map_insert(vm-vm_map, NULL, 0, old, new, - VM_PROT_RW, VM_PROT_ALL, 0); + prot, VM_PROT_ALL, 0); if (rv != KERN_SUCCESS) { #ifdef RACCT PROC_LOCK(td-td_proc); The two commits removed NX support for .data/.bss for 32bit binaries on amd64. This is too unfortunate. Can we claim that only old binaries need this hack ? If yes, could you, please, conditionalize the hack on curproc-p_osrel being, say, 4.x ? pgpfx2jzvjatr.pgp Description: PGP signature
Re: svn commit: r226343 - head/sys/vm
On Thu, Oct 13, 2011 at 11:30:12AM -0700, Marcel Moolenaar wrote: On Oct 13, 2011, at 11:20 AM, Kostik Belousov wrote: The two commits removed NX support for .data/.bss for 32bit binaries Thon amd64. is is too unfortunate. Can we claim that only old Thbinaries need this hack ? I don't know. When did our ABI for i386 change to have NX by default? I think it changed de-facto when NX appears to be supported by hardware. In other words, PF_R-PF_X was always considered a coincident, and not a promise. I think we can claim that the moment FreeBSD gained PAE support, it happen. I would suggest to add a sysctl, say kern.elf32.readable_nx, with the values: 0 - strictly follow segment permissions 1 - PF_R implies PF_X if p_osrel 60 2 - PF_R always implied PF_X for 32bit binaries. pgpIE0P8zqOrm.pgp Description: PGP signature
Re: svn commit: r226122 - head/share/mk
On Sat, Oct 08, 2011 at 08:02:43AM -0700, Stanislav Sedov wrote: On Oct 8, 2011, at 12:33 AM, Ed Schouten e...@80386.nl wrote: Hi, * Stanislav Sedov s...@freebsd.org, 20111008 01:43: - ${WRKSRC} might be missing when the autotools fixup is running. Account for this. Maybe we should simplify this a bit? Sounds good! I didn't think about that way of ignoring the error code. Can you, please, also add a check for uname -r being indeed 10.0-CURRENT, before applying the hack ? Also, the hack does not fix perl, I have to use the following patch. diff --git a/hints/freebsd.sh b/hints/freebsd.sh index c661fe8..caf93eb 100644 --- a/hints/freebsd.sh +++ b/hints/freebsd.sh @@ -110,11 +110,11 @@ esac case $osvers in 0.*|1.0*) ;; -1*|2*) cccdlflags='-DPIC -fpic' +1.*|2.*) cccdlflags='-DPIC -fpic' lddlflags=-Bshareable $lddlflags ;; -3*|4*|5*|6*) +3.*|4.*|5.*|6.*) objformat=`/usr/bin/objformat` if [ x$objformat = xaout ]; then if [ -e /usr/lib/aout ]; then @@ -140,7 +140,7 @@ case $osvers in esac case $osvers in -0*|1*|2*|3*) ;; +0*|1.*|2.*|3.*) ;; *) ccflags=${ccflags} -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H @@ -261,7 +261,7 @@ EOM esac case $osvers in -[1-4]*) +[1-4].*) set `echo X $libswanted | sed -e 's/ c / c_r /'` shift libswanted=$* pgpmo2CXWmlMU.pgp Description: PGP signature
Re: svn commit: r226122 - head/share/mk
On Sat, Oct 08, 2011 at 12:25:16PM -0700, Doug Barton wrote: On 10/08/2011 08:23, Kostik Belousov wrote: On Sat, Oct 08, 2011 at 08:02:43AM -0700, Stanislav Sedov wrote: On Oct 8, 2011, at 12:33 AM, Ed Schouten e...@80386.nl wrote: Hi, * Stanislav Sedov s...@freebsd.org, 20111008 01:43: - ${WRKSRC} might be missing when the autotools fixup is running. Account for this. Maybe we should simplify this a bit? Sounds good! I didn't think about that way of ignoring the error code. Can you, please, also add a check for uname -r being indeed 10.0-CURRENT, before applying the hack ? The change is only in HEAD, why would that be necessary? UNAME_r is better, i.e. the hack broke some port for me (I do not remember which, right now). Anyway, it is reverted. pgpxTcytI4q1H.pgp Description: PGP signature
Re: svn commit: r225894 - in head/sys: kern sys
On Sat, Oct 01, 2011 at 10:18:55AM +, Konstantin Belousov wrote: Author: kib Date: Sat Oct 1 10:18:55 2011 New Revision: 225894 URL: http://svn.freebsd.org/changeset/base/225894 Log: The sigwait(3) function shall not return EINTR, according to the POSIX/SUSvN. The sigwait(2) syscall does return EINTR, and libc.so.7 contains the wrapper sigwait(3) which hides EINTR from callers. The r212405 will be merged to stable/8 shortly. EINTR return is used by libthr to handle required cancellation point in the sigwait(3). To help the binaries linked against pre-libc.so.7, i.e. RELENG_6 and earlier, to have right ABI for sigwait(3), transform EINTR return from sigwait(2) into ERESTART. This leaves the static binaries linked against libc.a from a system with libc.so.N, where N 7 or N == 7 and does not contain r212405, with the broken sigwait(). More intrusive change is to allocate new syscall number for sigwait(2), and change old sigwait(2) to never return EINTR. Then, the static binaries linked on HEAD and stable/9 prior to introduction of the new sigwait syscall numbers will have broken cancellation at sigwait. I had this done, but sort of agreement we reached is to go with less intrusive commit you see. And before you ask, the story started from the real user report of a broken binary-only program that is not prepared to handle EINTR there. pgpVn0lUwyt0V.pgp Description: PGP signature
Re: svn commit: r225898 - stable/8/lib/libc/sys
On Sat, Oct 01, 2011 at 11:05:34AM -0400, Ben Kaduk wrote: On Sat, Oct 1, 2011 at 8:35 AM, Konstantin Belousov k...@freebsd.org wrote: Author: kib Date: Sat Oct 1 12:35:09 2011 New Revision: 225898 URL: http://svn.freebsd.org/changeset/base/225898 Log: MFC r225172: Clarify the behaviour of sigwait() on signal interruption, and note the difference between sigwait() and sigtimedwait()/sigwaitinfo(). Modified: stable/8/lib/libc/sys/sigwait.2 stable/8/lib/libc/sys/sigwaitinfo.2 Directory Properties: stable/8/lib/libc/ (props changed) stable/8/lib/libc/stdtime/ (props changed) Modified: stable/8/lib/libc/sys/sigwait.2 == --- stable/8/lib/libc/sys/sigwait.2 Sat Oct 1 12:19:48 2011 (r225897) +++ stable/8/lib/libc/sys/sigwait.2 Sat Oct 1 12:35:09 2011 (r225898) @@ -27,7 +27,7 @@ .\ .\ $FreeBSD$ .\ -.Dd November 11, 2005 +.Dd August 24, 2011 .Dt SIGWAIT 2 .Os .Sh NAME @@ -82,6 +82,14 @@ selected, it will be the lowest numbered The selection order between realtime and non-realtime signals, or between multiple pending non-realtime signals, is unspecified. +.Sh IMPLEMENTATION NOTES +The +.Fn sigwait +function is implemented as a wrapper around the +.Fn __sys_sigwait I thought the sys_sigwait rename was only on head/stable-9? I merged the wrapper sigwait(3) several commits earlier, see r225896. pgpGsqEep9v2K.pgp Description: PGP signature
Re: svn commit: r225821 - head/sys/dev/ath
On Wed, Sep 28, 2011 at 03:07:52AM +, Adrian Chadd wrote: Author: adrian Date: Wed Sep 28 03:07:51 2011 New Revision: 225821 URL: http://svn.freebsd.org/changeset/base/225821 Log: Fix lock order to be correcter. Nothing else locks these two queues (cabq, avp mcastq), but it should be consistent and correct. Modified: head/sys/dev/ath/if_ath.c Modified: head/sys/dev/ath/if_ath.c == --- head/sys/dev/ath/if_ath.c Wed Sep 28 03:05:04 2011(r225820) +++ head/sys/dev/ath/if_ath.c Wed Sep 28 03:07:51 2011(r225821) @@ -2620,8 +2620,8 @@ ath_beacon_generate(struct ath_softc *sc } /* NB: gated by beacon so safe to start here */ ath_hal_txstart(ah, cabq-axq_qnum); - ATH_TXQ_UNLOCK(cabq); ATH_TXQ_UNLOCK(avp-av_mcastq); + ATH_TXQ_UNLOCK(cabq); } return bf; } Does the unlock order matter ? Unlock cannot block. I do not object against the change, rather, I am nit-picking against the commit message. pgpuuYLtFeh95.pgp Description: PGP signature
Re: svn commit: r225458 - in stable/8/sys: dev/usb dev/usb/quirk dev/usb/storage sys
On Sat, Sep 10, 2011 at 11:21:00AM +, Bjoern A. Zeeb wrote: On Sep 10, 2011, at 10:54 AM, Robert Watson wrote: On Fri, 9 Sep 2011, Hans Petter Selasky wrote: Log: MFC r225350 and r225400: This patch adds automatic detection of USB mass storage devices which does not support the no synchronize cache SCSI command. The __FreeBSD_version version macro has been bumped and external kernel modules needs to be recompiled after this patch. PR:usb/160299 For most other classes of hardware device driver framework KPIs -- especially things like PCI bus attachment, busdma, CAM, ifnet, and GEOM frameworks, our MFC rules would strictly disallow this sort of change, on the grounds that it is our KBI policy that we not break common classes of third-party device drivers (i.e., require them to be recompiled). My suspicion is that we should be applying the same rules to the USB framework -- however, I don't know if we have any third-party USB device drivers? (If we do, then this change should not have been MFC'd.) We do have FreeBSD consumers with private USB drivers, yes. It seems that most of the damage can be mitigated by placing the added fields at the end of the structures, at least for stable/8. pgpvvs3upGz2g.pgp Description: PGP signature
Re: Request for patch approval (Re: svn commit: r225458 - in stable/8/sys: dev/usb dev/usb/quirk dev/usb/storage sys)
On Sat, Sep 10, 2011 at 03:40:16PM +0200, Hans Petter Selasky wrote: Right -- exactly my point. If this change breaks third-party compiled USB device drivers, then our current approach to device driver KBIs does not allow it to be MFC'd in this form. Are there ways you can reformulate the change to avoid breaking those drivers? Sometimes this can be done by adding new symbols, rather than replacing currently symbols, although mileage varies. Hi, Here is my proposal: Implement test for automatic quirks in function which has access to the USB device structure. This decouples the structure change in struct usbd_lookup_info. The only structure which needs change is struct usb_device. In 9-current this structure will be kept as is. In 8-stable the new element will be moved to the end of the structure like suggested, and then there shouldn't be any problems. Please find patches attached. --HPS Commit message: Refactor auto-quirk solution so that we break as few external drivers as possible. PR: usb/160299 Approved by:re (kib) Suggested by: rwatson MFC after: 0 days First, can you, please, regenerate the diff for stable/8 against the code before r225458 ? I want to read diff to see ABI change, assuming r225458 was not done at all. Second, you cannot decrement __FreeBSD_version. In fact, you shall increment it once more in the patch for stable/8. pgpgHivdRGLaj.pgp Description: PGP signature
Re: Request for patch approval (Re: svn commit: r225458 - in stable/8/sys: dev/usb dev/usb/quirk dev/usb/storage sys)
On Sat, Sep 10, 2011 at 04:39:37PM +0200, Hans Petter Selasky wrote: On Saturday 10 September 2011 16:28:12 Kostik Belousov wrote: On Sat, Sep 10, 2011 at 03:40:16PM +0200, Hans Petter Selasky wrote: Right -- exactly my point. If this change breaks third-party compiled USB device drivers, then our current approach to device driver KBIs does not allow it to be MFC'd in this form. Are there ways you can reformulate the change to avoid breaking those drivers? Sometimes this can be done by adding new symbols, rather than replacing currently symbols, although mileage varies. Hi, Here is my proposal: Implement test for automatic quirks in function which has access to the USB device structure. This decouples the structure change in struct usbd_lookup_info. The only structure which needs change is struct usb_device. In 9-current this structure will be kept as is. In 8-stable the new element will be moved to the end of the structure like suggested, and then there shouldn't be any problems. Please find patches attached. --HPS Commit message: Refactor auto-quirk solution so that we break as few external drivers as possible. PR: usb/160299 Approved by:re (kib) Suggested by: rwatson MFC after: 0 days First, can you, please, regenerate the diff for stable/8 against the code before r225458 ? I want to read diff to see ABI change, assuming r225458 was not done at all. Second, you cannot decrement __FreeBSD_version. In fact, you shall increment it once more in the patch for stable/8. Ok. Please find attached output from: svn diff -r 225457 sys/ For me, it looks fine. Thank you. pgp1PevhJ8zAc.pgp Description: PGP signature
Re: svn commit: r224712 - head/sys/kern
On Mon, Aug 08, 2011 at 02:02:08PM +, Martin Matuska wrote: Author: mm Date: Mon Aug 8 14:02:08 2011 New Revision: 224712 URL: http://svn.freebsd.org/changeset/base/224712 Log: Revert r224655 and r224614 because vn_fullpath* does not always work on nullfs mounts. Change shall be reconsidered after 9.0 is released. Requested by: re (kib) Approved by:re (kib) Thank you. pgplsFkUhCjvg.pgp Description: PGP signature
Re: svn commit: r224496 - head/sys/cam
On Fri, Jul 29, 2011 at 08:30:28PM +, Alexander Motin wrote: Author: mav Date: Fri Jul 29 20:30:28 2011 New Revision: 224496 URL: http://svn.freebsd.org/changeset/base/224496 Log: In some cases failed SATA disks may report their presence, but don't respond to any commands. I've found that because of multiple command retries, each of which cause 30s timeout, bus reset and another retry or requeue for many commands, it may take ages to eventually drop the failed device. The odd thing is that those retries continue even after XPT considered device as dead and invalidated it. This patch makes cam_periph_error() to block any command retries after periph was marked as invalid. With that patch all activity completes in 1-2 minutes, just after several timeouts, required to consider device death. This should make ZFS, gmirror, graid, etc. operation more robust. Reviewed by:mjacob@ on scsi@ Approved by:re (kib) Modified: head/sys/cam/cam_periph.c Amusingly, this commit makes my test machine to not boot. This is Ibex Peak PCH, with two SATA disks on the channels 0 and 1. It seems that geom thread 100012 owns GEOM topology lock, while sleeping in adaclose-cam_periph_getccb() : db bt 100012 Tracing pid 12 tid 100012 td 0xfe00028a2000 sched_switch() at 0x8034a0c7 = sched_switch+0x157 mi_switch() at 0x803291fb = mi_switch+0x2eb sleepq_switch() at 0x803631f3 = sleepq_switch+0x123 sleepq_wait() at 0x80363eed = sleepq_wait+0x4d _sleep() at 0x80329b59 = _sleep+0x3b9 cam_periph_getccb() at 0x817ffc50 = cam_periph_getccb+0xa0 adaclose() at 0x8182c484 = adaclose+0xc4 g_disk_access() at 0x802bea74 = g_disk_access+0x1e4 g_access() at 0x802c519a = g_access+0x1ba g_dev_attrchanged() at 0x802bd1f6 = g_dev_attrchanged+0x96 g_dev_taste() at 0x802bd574 = g_dev_taste+0x284 g_new_provider_event() at 0x802c4ecd = g_new_provider_event+0xad g_run_events() at 0x802c0750 = g_run_events+0x250 fork_exit() at 0x802f0d99 = fork_exit+0x189 fork_trampoline() at 0x804ee3be = fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xff800025fd00, rbp = 0 --- (gdb) list *cam_periph_getccb+0xa0 0x1c50 is in cam_periph_getccb (/usr/home/kostik/work/build/bsd/DEV/src/sys/modules/cam/../../cam/cam_periph.c:883). 882 883 while (SLIST_FIRST(periph-ccb_list) == NULL) { 884 if (periph-immediate_priority priority) Reverting the rev. or not loading ahci.ko allows machine to boot. pgplNNdHYGW37.pgp Description: PGP signature
Re: svn commit: r223884 - head/sys/sys
On Sat, Jul 09, 2011 at 05:36:28PM +0200, Ed Schouten wrote: Hi Kostik, * Konstantin Belousov k...@freebsd.org, 20110709 16:29: +static __inline uint16_t +bitcount16(uint32_t x) Shouldn't we use uint16_t for the argument here? Not sure. uint32_t type of argument avoids repromotion, allowing to do the full-register calculation on both 32 and 64 bit architectures. The function correctly handles non-zero upper half-word on its own. When I saw the code, I thought by myself, this could be done more efficiently: | static __inline uint16_t | bitcount16(uint16_t x) | { | | x = (x 0x) + ((x 1) 0x); | x = (x 0x) + ((x 2) 0x); | x *= 0x; | return (x 12); | } But some testing revealed it works for all inputs, except 65536. d'oh! -- Ed Schouten e...@80386.nl WWW: http://80386.nl/ pgpd5LFH8j3WY.pgp Description: PGP signature
Re: svn commit: r223885 - head/sys/dev/pci
On Sat, Jul 09, 2011 at 12:56:10PM -0600, Warner Losh wrote: Should't this be called 'pci_find_first_class()? Also, can you add a man page for it, or at least document what it is supposed to be used for in the code? Warenr Sure, I will add a man page. Regarding the name, the intended use is to look up some unique devices on the hierarchy, like HOST-PCI bridge, or (used in the driver which requires this function) PCI-ISA bridge. It is known that device belonging to the supplied class is unique in the whole PCI buses hierarchy on the machine where the driver is attaching. 'First' there probably cannot be given any strong sense, since it would be a first device in some unspecified order. On Jul 9, 2011, at 8:30 AM, Konstantin Belousov wrote: Author: kib Date: Sat Jul 9 14:30:13 2011 New Revision: 223885 URL: http://svn.freebsd.org/changeset/base/223885 Log: Implement pci_find_class(9), the function to find a pci device by its class. Sponsored by: The FreeBSD Foundation Reviewed by: jhb MFC after: 1 week Modified: head/sys/dev/pci/pci.c head/sys/dev/pci/pcivar.h Modified: head/sys/dev/pci/pci.c == --- head/sys/dev/pci/pci.c Sat Jul 9 14:29:23 2011(r223884) +++ head/sys/dev/pci/pci.c Sat Jul 9 14:30:13 2011(r223885) @@ -347,6 +347,21 @@ pci_find_device(uint16_t vendor, uint16_ return (NULL); } +device_t +pci_find_class(uint8_t class, uint8_t subclass) +{ + struct pci_devinfo *dinfo; + + STAILQ_FOREACH(dinfo, pci_devq, pci_links) { + if (dinfo-cfg.baseclass == class + dinfo-cfg.subclass == subclass) { + return (dinfo-cfg.dev); + } + } + + return (NULL); +} + static int pci_printf(pcicfgregs *cfg, const char *fmt, ...) { Modified: head/sys/dev/pci/pcivar.h == --- head/sys/dev/pci/pcivar.h Sat Jul 9 14:29:23 2011 (r223884) +++ head/sys/dev/pci/pcivar.h Sat Jul 9 14:30:13 2011 (r223885) @@ -457,6 +457,7 @@ pci_msix_count(device_t dev) device_t pci_find_bsf(uint8_t, uint8_t, uint8_t); device_t pci_find_dbsf(uint32_t, uint8_t, uint8_t, uint8_t); device_t pci_find_device(uint16_t, uint16_t); +device_t pci_find_class(uint8_t class, uint8_t subclass); /* Can be used by drivers to manage the MSI-X table. */ int pci_pending_msix(device_t dev, u_int index); pgpELibwqfSPy.pgp Description: PGP signature
Re: svn commit: r223262 - in head: cddl/contrib/opensolaris/lib/libdtrace/common contrib/binutils/bfd contrib/binutils/gas contrib/binutils/gas/config contrib/binutils/ld contrib/binutils/opcodes cont
On Sun, Jun 19, 2011 at 10:38:44AM +0200, Simon L. B. Nielsen wrote: On 18 Jun 2011, at 22:48, Jilles Tjoelker wrote: On Sat, Jun 18, 2011 at 01:56:33PM +, Ben Laurie wrote: Modified: head/cddl/contrib/opensolaris/lib/libdtrace/common/dt_subr.c == --- head/cddl/contrib/opensolaris/lib/libdtrace/common/dt_subr.c Sat Jun 18 13:54:36 2011(r223261) +++ head/cddl/contrib/opensolaris/lib/libdtrace/common/dt_subr.c Sat Jun 18 13:56:33 2011(r223262) @@ -45,6 +45,7 @@ #include assert.h #include libgen.h #include limits.h +#include stdint.h #include dt_impl.h @@ -811,15 +812,14 @@ dt_basename(char *str) ulong_t dt_popc(ulong_t x) { -#ifdef _ILP32 +#if defined(_ILP32) x = x - ((x 1) 0xUL); x = (x 0xUL) + ((x 2) 0xUL); x = (x + (x 4)) 0x0F0F0F0FUL; x = x + (x 8); x = x + (x 16); return (x 0x3F); -#endif -#ifdef _LP64 +#elif defined(_LP64) x = x - ((x 1) 0xULL); x = (x 0xULL) + ((x 2) 0xULL); x = (x + (x 4)) 0x0F0F0F0F0F0F0F0FULL; @@ -827,6 +827,8 @@ dt_popc(ulong_t x) x = x + (x 16); x = x + (x 32); return (x 0x7F); +#else +# warning need td_popc() implementation #endif } This commit uncovers breakage that had been present for a while. If I compile this on stable/8 i386 for head i386, _ILP32 is not defined and the warning is hit, breaking the build. Apparently, the code had been broken for a while but I do not use dtrace so I would not have noticed. The tinderboxes have now also noticed the problem so it is not something weird about my system. What would be the correct (at least temporary) fix? Replace the _ILP32 / _LP64 checks with specific architecture checks ? I can see that on i386 _ILP32 isn't actually defined - but on amd64 _LP64 is... and it still breaks. [simon@zaphod:~] ssh ref9-i386.freebsd.org 'cpp -dM /dev/null | grep LP' [simon@zaphod:~] ssh ref9-amd64.freebsd.org 'cpp -dM /dev/null | grep LP' #define __LP64__ 1 #define _LP64 1 [simon@zaphod:~] ssh ref8-amd64.freebsd.org 'cpp -dM /dev/null | grep LP' #define __LP64__ 1 #define _LP64 1 I must be missing something... It breaks on amd64 while building compat32 libraries. The build step uses i386-targeted compiler, which lacks the ILP32 definition. pgpyfjWRdzjJK.pgp Description: PGP signature
Re: svn commit: r223262 - in head: cddl/contrib/opensolaris/lib/libdtrace/common contrib/binutils/bfd contrib/binutils/gas contrib/binutils/gas/config contrib/binutils/ld contrib/binutils/opcodes cont
Modified: head/sys/sys/param.h == --- head/sys/sys/param.h Sat Jun 18 13:54:36 2011(r223261) +++ head/sys/sys/param.h Sat Jun 18 13:56:33 2011(r223262) @@ -319,4 +319,10 @@ __END_DECLS #define member2struct(s, m, x) \ ((struct s *)(void *)((char *)(x) - offsetof(struct s, m))) +/* + * Access a variable length array that has been declared as a fixed + * length array. + */ +#define __PAST_END(array, offset) (((typeof(*(array)) *)(array))[offset]) + #endif /* _SYS_PARAM_H_ */ The typeof there should be __typeof, most likely. pgpkW0Q6BZ4zQ.pgp Description: PGP signature
Re: cvs commit: src Makefile.inc1 src/lib/libc Makefile src/lib/libc_r Makefile src/lib/libpthread Makefile pthread.map src/lib/libpthread/thread thr_private.h src/lib/librt Makefile src/lib/libthr Ma
On Thu, Jun 16, 2011 at 11:27:38PM +0200, Jilles Tjoelker wrote: On Sun, Jun 12, 2011 at 09:38:42PM +, Bjoern A. Zeeb wrote: http://svnweb.freebsd.org/base?view=revisionrevision=169524 I figured WITHOUT_SYMVER= hs been useless since 201001. I am no longer able to do build worlds with WITHOUT_SYMVER= set in src.conf on a system with symbol versioning. I'd love someone to fix that and allow us to build libraries without all the historic stuff in them. If we cannot get it back working our libraries will grow bigger and bigger forever. If one is building images for clean-state systems that will never run anything older than the current CURRENT build, there is no need for the extra size. Contrary to what people think, memory and direct attached storage can still be expensive in some environments. Anyone who understands the system can come up with patches to fix this? I think disabling symver completely is too much: it implies a new mutually incompatible set of binaries. What should be done instead is allowing to compile out the compatibility functions. This means all __sym_compat() directives and all functions referred to by them. A simple approach would use a yes/no #ifdef, while a more sophisticated approach would allow choosing the oldest version to retain compatibility with, for example freebsd7_semctl() in lib/libc/gen/semctl.c would be compiled in iff compatibility with FreeBSD 7.x was requested. With just symver, a binary for FreeBSD version for which compatibility was not compiled in will abort if and when it attempts to use a function that was changed in a later version. Preventing which behaviour was explicitely the goal of designing the symver mechanism at the first place. And, disabling symver does not removes the compat cruft from the system, it only makes the compat code not accessible. pgpmdJ7rWr5oZ.pgp Description: PGP signature
Re: svn commit: r223029 - head/usr.sbin/makefs/ffs
On Mon, Jun 13, 2011 at 01:04:00AM +, Dimitry Andric wrote: Author: dim Date: Mon Jun 13 01:04:00 2011 New Revision: 223029 URL: http://svn.freebsd.org/changeset/base/223029 Log: Apparently makefs needs a few more system headers to compile during buildworld. Modified: head/usr.sbin/makefs/ffs/ffs_bswap.c head/usr.sbin/makefs/ffs/ffs_subr.c Modified: head/usr.sbin/makefs/ffs/ffs_bswap.c == --- head/usr.sbin/makefs/ffs/ffs_bswap.c Mon Jun 13 00:55:29 2011 (r223028) +++ head/usr.sbin/makefs/ffs/ffs_bswap.c Mon Jun 13 01:04:00 2011 (r223029) @@ -35,6 +35,8 @@ __FBSDID($FreeBSD$); #include sys/param.h #include sys/queue.h +#include sys/lock.h +#include sys/lockmgr.h #if defined(_KERNEL) #include sys/systm.h #endif Modified: head/usr.sbin/makefs/ffs/ffs_subr.c == --- head/usr.sbin/makefs/ffs/ffs_subr.c Mon Jun 13 00:55:29 2011 (r223028) +++ head/usr.sbin/makefs/ffs/ffs_subr.c Mon Jun 13 01:04:00 2011 (r223029) @@ -35,6 +35,9 @@ __FBSDID($FreeBSD$); #include sys/param.h +#include sys/queue.h +#include sys/lock.h +#include sys/lockmgr.h #include ufs/ufs/dinode.h #include ufs/ffs/fs.h I suspect it is easier and less controversial to put the structure under #ifdef _KERNEL braces instead. pgp0cy3EPbh0T.pgp Description: PGP signature
Re: svn commit: r223061 - head/sys/kern
On Mon, Jun 13, 2011 at 09:21:02PM +, Justin T. Gibbs wrote: Author: gibbs Date: Mon Jun 13 21:21:02 2011 New Revision: 223061 URL: http://svn.freebsd.org/changeset/base/223061 Log: Fix a couple of race conditions in devstat(9) initialization. In devstat_new_entry(), there is no need to initialize the queue and the mutex in this function. There are ways to do static initialization on both, so use STAILQ_HEAD_INITIALIZER and MTX_SYSINIT to initialize the queue and the mutex. In devstat_alloc(), use an atomic test and set routine to guard making our entry in /dev. Using just a plain static variable creates a race condition on multiprocessor machines. If you attempt to create a second entry in devfs, the kernel will panic. Devfs returns an error if MAKEDEV_CHECKNAME flag is supplied and attempt is made to create the existing node. The static guard is still useful, since make_dev() call is costly, but you can remove the atomic, since the race should be of limited scope. pgp7NTBL2jlKg.pgp Description: PGP signature
Re: svn commit: r222274 - stable/8/sys/kern
On Wed, May 25, 2011 at 12:11:29PM +0200, Oliver Pinter wrote: MFC to 7-STABLE? Somebody need to test it for 7 (I do not expect any failures, but I also prefer to not commit untested changes). The testing should include destroying some devfs nodes, e.g. by loading and unloading a driver that creates and destroys them. On 5/25/11, Konstantin Belousov k...@freebsd.org wrote: Author: kib Date: Wed May 25 03:25:14 2011 New Revision: 74 URL: http://svn.freebsd.org/changeset/base/74 Log: MFC r222086: The protection against the race with dev_rel(), introduced in r163328, should be extended to cover destroy_devl() calls for the children of the destroyed dev. Modified: stable/8/sys/kern/kern_conf.c Directory Properties: stable/8/sys/ (props changed) stable/8/sys/amd64/include/xen/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) stable/8/sys/contrib/dev/acpica/ (props changed) stable/8/sys/contrib/pf/ (props changed) Modified: stable/8/sys/kern/kern_conf.c == --- stable/8/sys/kern/kern_conf.c Wed May 25 01:04:12 2011 (r73) +++ stable/8/sys/kern/kern_conf.c Wed May 25 03:25:14 2011 (r74) @@ -885,6 +885,8 @@ destroy_devl(struct cdev *dev) /* Remove name marking */ dev-si_flags = ~SI_NAMED; + dev-si_refcount++; /* Avoid race with dev_rel() */ + /* If we are a child, remove us from the parents list */ if (dev-si_flags SI_CHILD) { LIST_REMOVE(dev, si_siblings); @@ -901,7 +903,6 @@ destroy_devl(struct cdev *dev) dev-si_flags = ~SI_CLONELIST; } - dev-si_refcount++; /* Avoid race with dev_rel() */ csw = dev-si_devsw; dev-si_devsw = NULL; /* already NULL for SI_ALIAS */ while (csw != NULL csw-d_purge != NULL dev-si_threadcount) { ___ svn-src-sta...@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-stable To unsubscribe, send any mail to svn-src-stable-unsubscr...@freebsd.org pgpDGF0sgjEYW.pgp Description: PGP signature
Re: svn commit: r222274 - stable/8/sys/kern
On Wed, May 25, 2011 at 02:07:10PM +0200, Oliver Pinter wrote: this or likely this script is enough for test? ---8--- #!/bin/csh @ a = 100 while ( $a ) foreach i ( umass cdce foo bar ) kldload $i end foreach i ( umass cdce foo bar ) kldunload $i end @ a-- end ---8--- Only if the unload of any of the listed modules caused destruction of some devfs node. May be, the easiest for 7 is to create some md(4) device and then destroy it. On 5/25/11, Kostik Belousov kostik...@gmail.com wrote: On Wed, May 25, 2011 at 12:11:29PM +0200, Oliver Pinter wrote: MFC to 7-STABLE? Somebody need to test it for 7 (I do not expect any failures, but I also prefer to not commit untested changes). The testing should include destroying some devfs nodes, e.g. by loading and unloading a driver that creates and destroys them. pgpll43Wpc0yQ.pgp Description: PGP signature
Re: svn commit: r221993 - in head/sys: kern sys
On Tue, May 17, 2011 at 01:36:11PM +, Poul-Henning Kamp wrote: In message 20110516211954.gj48...@deviant.kiev.zoral.com.ua, Kostik Belousov writes: struct sbuf is exposed to the libsubf.so consumers. I think that libsbuf.so version shall be bumped (since no symver compat can be provided, due to lack of versioning for libsbuf). The bump was also needed after the r212367. Lets do one for two changes. Ok, I'm done, Question: Do we bumb the libsbuf version, or do we start using symbol-versioning, and if so, how is that done ? We do bump the libsbuf version, regardless of the symbol versioning. If symver is introduced, then it can be combined with the bump. Please see http://people.freebsd.org/~deischen/symver/freebsd_versioning.txt for the document describing the rules. The exemplary use of the versioning for newly introduced library is r221931. pgpENv4dqIfJv.pgp Description: PGP signature
Re: svn commit: r221993 - in head/sys: kern sys
On Mon, May 16, 2011 at 04:18:40PM +, Poul-Henning Kamp wrote: Author: phk Date: Mon May 16 16:18:40 2011 New Revision: 221993 URL: http://svn.freebsd.org/changeset/base/221993 Log: Change the length quantities of sbufs to be ssize_t rather than int. Constify a couple of arguments. Modified: head/sys/sys/sbuf.h == --- head/sys/sys/sbuf.h Mon May 16 15:59:50 2011(r221992) +++ head/sys/sys/sbuf.h Mon May 16 16:18:40 2011(r221993) @@ -44,8 +44,8 @@ struct sbuf { sbuf_drain_func *s_drain_func; /* drain function */ void*s_drain_arg; /* user-supplied drain argument */ int s_error; /* current error code */ - int s_size;/* size of storage buffer */ - int s_len; /* current length of string */ + ssize_t s_size;/* size of storage buffer */ + ssize_t s_len; /* current length of string */ struct sbuf is exposed to the libsubf.so consumers. I think that libsbuf.so version shall be bumped (since no symver compat can be provided, due to lack of versioning for libsbuf). The bump was also needed after the r212367. Lets do one for two changes. pgpkH3lSrCXsw.pgp Description: PGP signature
Re: svn commit: r221971 - head/sys/dev/glxiic
On Sun, May 15, 2011 at 07:04:08PM +, Henrik Brix Andersen wrote: Author: brix (ports committer) Date: Sun May 15 19:04:08 2011 New Revision: 221971 URL: http://svn.freebsd.org/changeset/base/221971 Log: Fix breakage on pc98 by redefining DEBUG(). Why this was built on pc98 at all ? Are there any pc98 machines with Geode ? pgppR3MyvgT2l.pgp Description: PGP signature
Re: svn commit: r221807 - in head: lib lib/libprocstat lib/libprocstat/zfs lib/libutil sys/kern sys/sys usr.bin/fstat usr.bin/fstat/zfs usr.bin/procstat
On Sat, May 14, 2011 at 02:15:23AM -0700, Stanislav Sedov wrote: On Thu, 12 May 2011 14:26:40 +0300 Kostik Belousov kostik...@gmail.com mentioned: The newly introduced fuser(1) utility also uses this library and able to operate via sysctl and kvm backends. The library is by no means complete (e.g. KVM backend is missing vnode name resolution routines, and there're no manpages for the library itself) so I plan to improve it further. I'm commiting it so it will get wider exposure and review. We won't be able to MFC this work as it relies on changes in HEAD, which was introduced some time ago, that break kernel ABI. OTOH we may be able to merge the library with KVM backend if we really need it there. Could you, please, point out the mentioned ABI changes ? Hi, Kostik! It seems I was wrong and that change was actually introduced in 8.0. It thus makes MFC possible. I don't have immediate plans for this though, but I might do it if there's enough demand and it demonstrates to be stable enough in HEAD. I did not worried not about inability to merge the libprocstat. It made me very uncomfortable to see the claim of the ABI skew in the area which I supposed to have stable ABI. Thanks for the clarification. Sorry for being misleading! pgpFvpUpDbAyS.pgp Description: PGP signature
Re: svn commit: r221855 - in head/sys: amd64/include arm/include dev/md dev/null i386/include ia64/include mips/include powerpc/include sparc64/include sun4v/include sys vm
On Fri, May 13, 2011 at 07:35:01PM +, Matthew D Fleming wrote: Author: mdf Date: Fri May 13 19:35:01 2011 New Revision: 221855 URL: http://svn.freebsd.org/changeset/base/221855 Log: Move the ZERO_REGION_SIZE to a machine-dependent file, as on many architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware. Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment). Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit. Requested by: alc MFC after: 1 week MFC with: r221853 Wasting 2MB even on amd64 is not quite good, IMHO. pgpbx9vrgN6I2.pgp Description: PGP signature
Re: svn commit: r221855 - in head/sys: amd64/include arm/include dev/md dev/null i386/include ia64/include mips/include powerpc/include sparc64/include sun4v/include sys vm
On Fri, May 13, 2011 at 02:45:47PM -0500, Alan Cox wrote: On 5/13/2011 2:40 PM, Kostik Belousov wrote: On Fri, May 13, 2011 at 07:35:01PM +, Matthew D Fleming wrote: Author: mdf Date: Fri May 13 19:35:01 2011 New Revision: 221855 URL: http://svn.freebsd.org/changeset/base/221855 Log: Move the ZERO_REGION_SIZE to a machine-dependent file, as on many architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware. Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment). Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit. Requested by:alc MFC after: 1 week MFC with:r221853 Wasting 2MB even on amd64 is not quite good, IMHO. It's virtual address space, of which we have 512GB on amd64, and not physical memory. Only one page of physical memory is used. I should have read the patch more careful. Sorry. pgp77AjMlMoB8.pgp Description: PGP signature
Re: svn commit: r221807 - in head: lib lib/libprocstat lib/libprocstat/zfs lib/libutil sys/kern sys/sys usr.bin/fstat usr.bin/fstat/zfs usr.bin/procstat
On Thu, May 12, 2011 at 10:11:39AM +, Stanislav Sedov wrote: Author: stas Date: Thu May 12 10:11:39 2011 New Revision: 221807 URL: http://svn.freebsd.org/changeset/base/221807 Log: - Commit work from libprocstat project. These patches add support for runtime file and processes information retrieval from the running kernel via sysctl in the form of new library, libprocstat. The library also supports KVM backend for analyzing memory crash dumps. Both procstat(1) and fstat(1) utilities have been modified to take advantage of the library (as the bonus point the fstat(1) utility no longer need superuser privileges to operate), and the procstat(1) utility is now able to display information from memory dumps as well. The newly introduced fuser(1) utility also uses this library and able to operate via sysctl and kvm backends. The library is by no means complete (e.g. KVM backend is missing vnode name resolution routines, and there're no manpages for the library itself) so I plan to improve it further. I'm commiting it so it will get wider exposure and review. We won't be able to MFC this work as it relies on changes in HEAD, which was introduced some time ago, that break kernel ABI. OTOH we may be able to merge the library with KVM backend if we really need it there. Could you, please, point out the mentioned ABI changes ? pgpRibUlsxOy2.pgp Description: PGP signature
Re: svn commit: r221645 - head/sys/geom/part
On Sun, May 08, 2011 at 11:20:27AM +, Andrey V. Elsukov wrote: Author: ae Date: Sun May 8 11:20:27 2011 New Revision: 221645 URL: http://svn.freebsd.org/changeset/base/221645 Log: Limit number of sectors that can be addressed. MFC after: 1 week Modified: head/sys/geom/part/g_part_pc98.c Modified: head/sys/geom/part/g_part_pc98.c == --- head/sys/geom/part/g_part_pc98.c Sun May 8 11:16:17 2011 (r221644) +++ head/sys/geom/part/g_part_pc98.c Sun May 8 11:20:27 2011 (r221645) @@ -261,7 +261,7 @@ g_part_pc98_create(struct g_part_table * cyl = basetable-gpt_heads * basetable-gpt_sectors; - msize = MIN(pp-mediasize / SECSIZE, 0x); + msize = MIN(pp-mediasize / SECSIZE, UINT_MAX); Shouldn't this be UINT32_MAX (consistently) ? pgpfJpbFL9n4K.pgp Description: PGP signature
Re: svn commit: r221597 - head/sys/kern
On Sat, May 07, 2011 at 11:10:58AM +, Jaakko Heinonen wrote: Author: jh Date: Sat May 7 11:10:58 2011 New Revision: 221597 URL: http://svn.freebsd.org/changeset/base/221597 Log: Add WITNESS_WARN() to getenv() to explicitly note that the function may sleep. This helps to expose bugs when the requested environment variable doesn't exist. Modified: head/sys/kern/kern_environment.c Modified: head/sys/kern/kern_environment.c == --- head/sys/kern/kern_environment.c Sat May 7 11:05:16 2011 (r221596) +++ head/sys/kern/kern_environment.c Sat May 7 11:10:58 2011 (r221597) @@ -310,6 +310,7 @@ getenv(const char *name) int len; if (dynamic_kenv) { + WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, getenv); mtx_lock(kenv_lock); cp = _getenv_dynamic(name, NULL); if (cp != NULL) { This might be somewhat excessive. Since malloc() warns or panics anyway, what about moving the WITNESS_WARN into not found branch ? pgpFX1B841gkn.pgp Description: PGP signature
Re: svn commit: r221597 - head/sys/kern
On Sat, May 07, 2011 at 03:20:48PM +0300, Jaakko Heinonen wrote: On 2011-05-07, Kostik Belousov wrote: @@ -310,6 +310,7 @@ getenv(const char *name) int len; if (dynamic_kenv) { + WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, getenv); mtx_lock(kenv_lock); cp = _getenv_dynamic(name, NULL); if (cp != NULL) { This might be somewhat excessive. Since malloc() warns or panics anyway, what about moving the WITNESS_WARN into not found branch ? Is this better? Most likely, at least this is exactly what I meant. %%% Index: sys/kern/kern_environment.c === --- sys/kern/kern_environment.c (revision 221597) +++ sys/kern/kern_environment.c (working copy) @@ -310,7 +310,6 @@ getenv(const char *name) int len; if (dynamic_kenv) { - WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, getenv); mtx_lock(kenv_lock); cp = _getenv_dynamic(name, NULL); if (cp != NULL) { @@ -322,6 +321,8 @@ getenv(const char *name) } else { mtx_unlock(kenv_lock); ret = NULL; + WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, + getenv); } } else ret = _getenv_static(name); %%% -- Jaakko pgpWR7GzI7XGA.pgp Description: PGP signature
Re: svn commit: r221402 - in vendor/tre: . dist dist/doc dist/lib dist/lib/.deps dist/m4 dist/po dist/python dist/src dist/src/.deps dist/tests dist/tests/.deps dist/tests/agrep dist/utils dist/win32
On Tue, May 03, 2011 at 07:33:10PM +, Gabor Kovesdan wrote: Author: gabor Date: Tue May 3 19:33:09 2011 New Revision: 221402 URL: http://svn.freebsd.org/changeset/base/221402 Log: - Vendor import of TRE 0.8.0 What is this ? Why we do need it in base ? vendor/tre/dist/lib/.deps/ vendor/tre/dist/lib/.deps/regcomp.Plo vendor/tre/dist/lib/.deps/regerror.Plo vendor/tre/dist/lib/.deps/regexec.Plo vendor/tre/dist/lib/.deps/tre-ast.Plo vendor/tre/dist/lib/.deps/tre-compile.Plo vendor/tre/dist/lib/.deps/tre-match-approx.Plo vendor/tre/dist/lib/.deps/tre-match-backtrack.Plo vendor/tre/dist/lib/.deps/tre-match-parallel.Plo vendor/tre/dist/lib/.deps/tre-mem.Plo vendor/tre/dist/lib/.deps/tre-parse.Plo vendor/tre/dist/lib/.deps/tre-stack.Plo I am sure that all .deps dirs are garbage. pgpEg0AtyKOMr.pgp Description: PGP signature
Re: svn commit: r221166 - in head/sys: fs/ext2fs modules/ext2fs
On Thu, Apr 28, 2011 at 02:27:17PM +, John Baldwin wrote: Author: jhb Date: Thu Apr 28 14:27:17 2011 New Revision: 221166 URL: http://svn.freebsd.org/changeset/base/221166 Log: Sync with several changes in UFS/FFS: - 77115: Implement support for O_DIRECT. - 98425: Fix a performance issue introduced in 70131 that was causing reads before writes even when writing full blocks. - 98658: Rename the BALLOC flags from B_* to BA_* to avoid confusion with the struct buf B_ flags. - 100344: Merge the BA_ and IO_ flags so so that they may both be used in the same flags word. This merger is possible by assigning the IO_ flags to the low sixteen bits and the BA_ flags the high sixteen bits. - 105422: Fix a file-rewrite performance case. - 129545: Implement IO_INVAL in VOP_WRITE() by marking the buffer as no cache. - Readd the DOINGASYNC() macro and use it to control asynchronous writes. Change i-node updates to honor DOINGASYNC() instead of always being synchronous. - Use a PRIV_VFS_RETAINSUGID check instead of checking cr_uid against 0 directly when deciding whether or not to clear suid and sgid bits. Submitted by: Pedro F. Giffuni giffunip at yahoo @@ -141,10 +162,42 @@ READ(ap) if (error) break; - bqrelse(bp); + if ((ioflag (IO_VMIO|IO_DIRECT)) +(LIST_FIRST(bp-b_dep) == NULL)) { + /* + * If there are no dependencies, and it's VMIO, There is no dependencies for ext2fs, the FFS comments talks about SU. Also, I am unsure what the resulting semantic of O_DIRECTIO for ext2 is ? UFS tries to eliminate any use of buffer cache for O_DIRECTIO case, up to the mapping of user pages into pbuf to perform the actual i/o. In ext2 case, it seems we will just destroy the buffers after using them for i/o. Is it useful ? pgpSEzWFoHUWS.pgp Description: PGP signature
Re: svn commit: r221101 - in head/sys/geom: . concat journal mirror raid3 shsec stripe virstor
On Wed, Apr 27, 2011 at 12:10:26AM +, Alexander Motin wrote: Author: mav Date: Wed Apr 27 00:10:26 2011 New Revision: 221101 URL: http://svn.freebsd.org/changeset/base/221101 Log: Implement relaxed comparision for hardcoded provider names to make it ignore adX/adaY difference in both directions to simplify migration to the CAM-based ATA or back. Modified: head/sys/geom/concat/g_concat.c head/sys/geom/geom.h head/sys/geom/geom_subr.c head/sys/geom/journal/g_journal.c head/sys/geom/mirror/g_mirror.c head/sys/geom/raid3/g_raid3.c head/sys/geom/shsec/g_shsec.c head/sys/geom/stripe/g_stripe.c head/sys/geom/virstor/g_virstor.c Modified: head/sys/geom/concat/g_concat.c == --- head/sys/geom/concat/g_concat.c Tue Apr 26 23:00:32 2011 (r221100) +++ head/sys/geom/concat/g_concat.c Wed Apr 27 00:10:26 2011 (r221101) @@ -678,7 +678,8 @@ g_concat_taste(struct g_class *mp, struc if (md.md_version 4) md.md_provsize = pp-mediasize; - if (md.md_provider[0] != '\0' strcmp(md.md_provider, pp-name) != 0) + if (md.md_provider[0] != '\0' + !g_compare_names(md.md_provider, pp-name)) return (NULL); if (md.md_provsize != pp-mediasize) return (NULL); Modified: head/sys/geom/geom.h == --- head/sys/geom/geom.h Tue Apr 26 23:00:32 2011(r221100) +++ head/sys/geom/geom.h Wed Apr 27 00:10:26 2011(r221101) @@ -238,6 +238,7 @@ void g_waitidlelock(void); /* geom_subr.c */ int g_access(struct g_consumer *cp, int nread, int nwrite, int nexcl); int g_attach(struct g_consumer *cp, struct g_provider *pp); +int g_compare_names(const char *namea, const char *nameb); void g_destroy_consumer(struct g_consumer *cp); void g_destroy_geom(struct g_geom *pp); void g_destroy_provider(struct g_provider *pp); Modified: head/sys/geom/geom_subr.c == --- head/sys/geom/geom_subr.c Tue Apr 26 23:00:32 2011(r221100) +++ head/sys/geom/geom_subr.c Wed Apr 27 00:10:26 2011(r221101) @@ -1017,6 +1017,43 @@ g_getattr__(const char *attr, struct g_c return (0); } +static int +g_get_device_prefix_len(const char *name) +{ + int len; + + if (strncmp(name, ada, 3) == 0) + len = 3; + else if (strncmp(name, ad, 2) == 0) + len = 2; + else + return (0); + if (name[len] '0' || name[len] '9') + return (0); + do { + len++; + } while (name[len] = '0' name[len] = '9'); + return (len); +} + +int +g_compare_names(const char *namea, const char *nameb) +{ + int deva, devb; + + if (strcmp(namea, nameb) == 0) + return (1); + deva = g_get_device_prefix_len(namea); + if (deva == 0) + return (0); + devb = g_get_device_prefix_len(nameb); + if (devb == 0) + return (0); + if (strcmp(namea + deva, nameb + devb) == 0) + return (1); + return (0); +} + This is most likely my misunderstanding of things. Can we have a legitimate situation where both ada* and ad* devices coexist on the same system ? E.g. on-board AHCI and legacy PATA controller ? Wouldn't this hack then wreak the havoc ? Can the hack be put under the control of some tunable ? pgpLM6VNsOmPP.pgp Description: PGP signature
Re: svn commit: r221059 - in head/sys: kern sys
On Tue, Apr 26, 2011 at 01:44:00PM +0200, Hans Petter Selasky wrote: On Tuesday 26 April 2011 13:39:56 Konstantin Belousov wrote: + pending = !!callout_stop(timeout_task-c); pending = (callout_stop(timeout_task-c) != 0); ? This line is about conversion from a boolean value to {0, 1} value set. If !! construct does not look stylish, then wouldn't we need to go with pending = (callout_stop(timeout_task-c) != 0) ? 1 : 0; instead ? Feel free to adjust whatever variant you prefer and commit it. pgpNKu48OtZcN.pgp Description: PGP signature
Re: svn commit: r220982 - in head: . sys/amd64/conf sys/arm/conf sys/conf sys/i386/conf sys/ia64/conf sys/mips/conf sys/mips/malta sys/pc98/conf sys/powerpc/conf sys/sparc64/conf sys/sun4v/conf
On Sun, Apr 24, 2011 at 08:58:58AM +, Alexander Motin wrote: Author: mav Date: Sun Apr 24 08:58:58 2011 New Revision: 220982 URL: http://svn.freebsd.org/changeset/base/220982 Log: Switch the GENERIC kernels for all architectures to the new CAM-based ATA stack. It means that all legacy ATA drivers are disabled and replaced by respective CAM drivers. If you are using ATA device names in /etc/fstab or other places, make sure to update them respectively (adX - adaY, acdX - cdY, afdX - daY, astX - saY, where 'Y's are the sequential numbers for each type in order of detection, unless configured otherwise with tunables, see cam(4)). ataraid(4) functionality is now supported by the RAID GEOM class. To use it you can load geom_raid kernel module and use graid(8) tool for management. Instead of /dev/arX device names, use /dev/raid/rX. How many raid implementations, in particular, stripe and mirror ? Is gstripe/gmirror merged, or planned to be merged with geom_raid ? Does it make sense to merge if not yet ? pgpYTb3HXy10m.pgp Description: PGP signature
Re: svn commit: r220906 - head/sys/fs/nfsclient
On Wed, Apr 20, 2011 at 11:25:18PM +, Rick Macklem wrote: Author: rmacklem Date: Wed Apr 20 23:25:18 2011 New Revision: 220906 URL: http://svn.freebsd.org/changeset/base/220906 Log: Add a check for VI_DOOMED at the beginning of nfscl_request() so that it won't try and use vp-v_mount to do an RPC during a forced dismount. There needs to be at least one more kernel commit, plus a change to the umount(8) command before forced dismounts will work for the experimental NFS client. MFC after: 2 weeks Modified: head/sys/fs/nfsclient/nfs_clport.c Modified: head/sys/fs/nfsclient/nfs_clport.c == --- head/sys/fs/nfsclient/nfs_clport.cWed Apr 20 23:20:00 2011 (r220905) +++ head/sys/fs/nfsclient/nfs_clport.cWed Apr 20 23:25:18 2011 (r220906) @@ -819,6 +819,8 @@ nfscl_request(struct nfsrv_descript *nd, int ret, vers; struct nfsmount *nmp; + if ((vp-v_iflag VI_DOOMED) != 0) + return (EPERM); nmp = VFSTONFS(vp-v_mount); if (nd-nd_flag ND_NFSV4) vers = NFS_VER4; Is vnode lock held at this point ? If yes, I suggest to add ASSERT_VOP_{E,}LOCKED() assertion both to enforce the invariant, and to document the state to readers of the code. pgpeYv3PWyMf9.pgp Description: PGP signature
Re: svn commit: r220761 - head/sys/fs/nfsclient
On Sun, Apr 17, 2011 at 11:04:04PM +, Rick Macklem wrote: Author: rmacklem Date: Sun Apr 17 23:04:03 2011 New Revision: 220761 URL: http://svn.freebsd.org/changeset/base/220761 Log: Add checks for MNTK_UNMOUNTF at the beginning of three functions, so that threads don't get stuck in them during a forced dismount. nfs_sync/VFS_SYNC() needs this, since it is called by dounmount() before VFS_UNMOUNT(). The nfscl_nget() case makes sure that a thread doing an VOP_OPEN() or VOP_ADVLOCK() call doesn't get blocked before attempting the RPC. Attempting RPCs don't block, since they all fail once a forced dismount is in progress. The third one at the beginning of nfsrpc_close() is done so threads don't get blocked while doing VOP_INACTIVE() as the vnodes are cleared out. With these three changes plus a change to the umount(1) command so that it doesn't do sync() for the forced case seem to make forced dismounts work for the experimental NFS client. MFC after: 2 weeks Modified: head/sys/fs/nfsclient/nfs_clrpcops.c head/sys/fs/nfsclient/nfs_clstate.c head/sys/fs/nfsclient/nfs_clvfsops.c Modified: head/sys/fs/nfsclient/nfs_clrpcops.c == --- head/sys/fs/nfsclient/nfs_clrpcops.c Sun Apr 17 22:31:36 2011 (r220760) +++ head/sys/fs/nfsclient/nfs_clrpcops.c Sun Apr 17 23:04:03 2011 (r220761) @@ -567,6 +567,11 @@ nfsrpc_close(vnode_t vp, int doclose, NF if (vnode_vtype(vp) != VREG) return (0); + + /* For forced unmounts, just return. */ + if ((vp-v_mount-mnt_kern_flag MNTK_UNMOUNTF) != 0) + return (0); + Is there anything that would prevent the MNTK_UNMOUNTF flag from being set immediately after the test returned success ? Usually, the tests for MNTK_UNMOUNTF are bugs. I coould see how terminating the RPCs would be useful for forced unmounts, but do not think that preventing entry into close would help. if (doclose) error = nfscl_doclose(vp, clp, p); else pgp9Tkr81BCJh.pgp Description: PGP signature
Re: svn commit: r220791 - in head: lib/libc/sys sys/compat/freebsd32 sys/kern sys/sys
On Mon, Apr 18, 2011 at 04:32:22PM +, Matthew D Fleming wrote: Author: mdf Date: Mon Apr 18 16:32:22 2011 New Revision: 220791 URL: http://svn.freebsd.org/changeset/base/220791 Log: Add the posix_fallocate(2) syscall. The default implementation in vop_stdallocate() is filesystem agnostic and will run as slow as a read/write loop in userspace; however, it serves to correctly implement the functionality for filesystems that do not implement a VOP_ALLOCATE. Note that __FreeBSD_version was already bumped today to 900036 for any ports which would like to use this function. Also reserve space in the syscall table for posix_fadvise(2). Reviewed by:-arch (previous version) Thank you for taking the remarks into consideration. +int +vop_stdallocate(struct vop_allocate_args *ap) +{ +#ifdef __notyet__ + struct statfs sfs; +#endif + struct iovec aiov; + struct vattr vattr, *vap; + struct uio auio; + off_t len, cur, offset; + uint8_t *buf; + struct thread *td; + struct vnode *vp; + size_t iosize; + int error, locked; + + buf = NULL; + error = 0; + locked = 1; + td = curthread; + vap = vattr; + vp = ap-a_vp; + len = ap-a_len; + offset = ap-a_offset; + + error = VOP_GETATTR(vp, vap, td-td_ucred); + if (error != 0) + goto out; + iosize = vap-va_blocksize; + if (iosize == 0) + iosize = BLKDEV_IOSIZE; + if (iosize MAXPHYS) + iosize = MAXPHYS; + buf = malloc(iosize, M_TEMP, M_WAITOK); + +#ifdef __notyet__ + /* + * Check if the filesystem sets f_maxfilesize; if not use + * VOP_SETATTR to perform the check. + */ + error = VFS_STATFS(vp-v_mount, sfs, td); + if (error != 0) + goto out; + if (sfs.f_maxfilesize) { + if (offset sfs.f_maxfilesize || len sfs.f_maxfilesize || + offset + len sfs.f_maxfilesize) { + error = EFBIG; + goto out; + } + } else +#endif + if (offset + len vap-va_size) { + VATTR_NULL(vap); + vap-va_size = offset + len; + error = VOP_SETATTR(vp, vap, td-td_ucred); + if (error != 0) + goto out; + } I still do not see a reason to do VOP_SETATTR() there. VOP_WRITE() will do auto-extend as needed. Also, see below. + + while (len 0) { + if (should_yield()) { + VOP_UNLOCK(vp, 0); + locked = 0; + kern_yield(-1); Please note that, despite dropping the vnode lock, the snapshot creation is still blocked at this point, due to previous vn_start_write(). If doing vn_finished_write() there, then bwillwrite() before next iteration is desired. + error = vn_lock(vp, LK_EXCLUSIVE); + if (error != 0) + break; + locked = 1; + error = VOP_GETATTR(vp, vap, td-td_ucred); + if (error != 0) + break; + } + + /* + * Read and write back anything below the nominal file + * size. There's currently no way outside the filesystem + * to know whether this area is sparse or not. + */ + cur = iosize; + if ((offset % iosize) != 0) + cur -= (offset % iosize); + if (cur len) + cur = len; + if (offset vap-va_size) { + aiov.iov_base = buf; + aiov.iov_len = cur; + auio.uio_iov = aiov; + auio.uio_iovcnt = 1; + auio.uio_offset = offset; + auio.uio_resid = cur; + auio.uio_segflg = UIO_SYSSPACE; + auio.uio_rw = UIO_READ; + auio.uio_td = td; + error = VOP_READ(vp, auio, 0, td-td_ucred); + if (error != 0) + break; + if (auio.uio_resid 0) { + bzero(buf + cur - auio.uio_resid, + auio.uio_resid); + } + } else { + bzero(buf, cur); + } Wouldn't VOP_SETATTR() at the start of the function mostly prevent this bzero from executing ? + + aiov.iov_base = buf; + aiov.iov_len = cur; + auio.uio_iov = aiov; + auio.uio_iovcnt = 1; + auio.uio_offset = offset; + auio.uio_resid = cur; + auio.uio_segflg = UIO_SYSSPACE; + auio.uio_rw = UIO_WRITE; + auio.uio_td = td; + +
Re: svn commit: r220791 - in head: lib/libc/sys sys/compat/freebsd32 sys/kern sys/sys
On Mon, Apr 18, 2011 at 12:49:38PM -0700, m...@freebsd.org wrote: On Mon, Apr 18, 2011 at 12:28 PM, Kostik Belousov kostik...@gmail.com wrote: On Mon, Apr 18, 2011 at 04:32:22PM +, Matthew D Fleming wrote: Author: mdf Date: Mon Apr 18 16:32:22 2011 New Revision: 220791 URL: http://svn.freebsd.org/changeset/base/220791 Log: Add the posix_fallocate(2) syscall. The default implementation in vop_stdallocate() is filesystem agnostic and will run as slow as a read/write loop in userspace; however, it serves to correctly implement the functionality for filesystems that do not implement a VOP_ALLOCATE. Note that __FreeBSD_version was already bumped today to 900036 for any ports which would like to use this function. Also reserve space in the syscall table for posix_fadvise(2). +#ifdef __notyet__ + /* + * Check if the filesystem sets f_maxfilesize; if not use + * VOP_SETATTR to perform the check. + */ + error = VFS_STATFS(vp-v_mount, sfs, td); + if (error != 0) + goto out; + if (sfs.f_maxfilesize) { + if (offset sfs.f_maxfilesize || len sfs.f_maxfilesize || + offset + len sfs.f_maxfilesize) { + error = EFBIG; + goto out; + } + } else +#endif + if (offset + len vap-va_size) { + VATTR_NULL(vap); + vap-va_size = offset + len; + error = VOP_SETATTR(vp, vap, td-td_ucred); + if (error != 0) + goto out; + } I still do not see a reason to do VOP_SETATTR() there. VOP_WRITE() will do auto-extend as needed. Also, see below. The need is, as commented, to return EFBIG when the new file size will be larger than the FS supports. Without this code, passing in something like posix_fallocate(fd, 0, OFF_MAX) will run the filesystem out of space. Handling max file size and not overflowing the fs are different things. VOP_WRITE() will handle file size on its own too. I see no problem with exhausting free space if this is what user asked for. + + while (len 0) { + if (should_yield()) { + VOP_UNLOCK(vp, 0); + locked = 0; + kern_yield(-1); Please note that, despite dropping the vnode lock, the snapshot creation is still blocked at this point, due to previous vn_start_write(). If doing vn_finished_write() there, then bwillwrite() before next iteration is desired. + error = vn_lock(vp, LK_EXCLUSIVE); + if (error != 0) + break; + locked = 1; + error = VOP_GETATTR(vp, vap, td-td_ucred); + if (error != 0) + break; + } + + /* + * Read and write back anything below the nominal file + * size. There's currently no way outside the filesystem + * to know whether this area is sparse or not. + */ + cur = iosize; + if ((offset % iosize) != 0) + cur -= (offset % iosize); + if (cur len) + cur = len; + if (offset vap-va_size) { + aiov.iov_base = buf; + aiov.iov_len = cur; + auio.uio_iov = aiov; + auio.uio_iovcnt = 1; + auio.uio_offset = offset; + auio.uio_resid = cur; + auio.uio_segflg = UIO_SYSSPACE; + auio.uio_rw = UIO_READ; + auio.uio_td = td; + error = VOP_READ(vp, auio, 0, td-td_ucred); + if (error != 0) + break; + if (auio.uio_resid 0) { + bzero(buf + cur - auio.uio_resid, + auio.uio_resid); + } + } else { + bzero(buf, cur); + } Wouldn't VOP_SETATTR() at the start of the function mostly prevent this bzero from executing ? Yes. If struct statfs had a member indicating the file system's max file size, then the extend wouldn't be necessary. We have that feature locally, but it's only implemented for ufs and our custom file system, and it requires an ABI change so it's a bit of work to upstream. And as with most of those things, it's hard to find the time to upstream it outside of work hours. I estimated what it would take to do the optimized implementation for UFS, and I think that the following change would allow to lessen the code duplication much. What if the vnode lock drop and looping be handled by the syscall, instead
Re: svn commit: r218836 - in stable/8/sys: amd64/amd64 amd64/ia32 i386/i386 kern
On Sat, Apr 16, 2011 at 11:04:38PM +0200, Oliver Pinter wrote: MFC this for 7-STABLE? I do not reject the proposal, but I am quite curious why ? I do not have any machine running 7, so please test the patch below. It compiled for me on amd64. Property changes on: . ___ Modified: svn:mergeinfo Merged /head/sys:r218327 Index: kern/kern_context.c === --- kern/kern_context.c (revision 220730) +++ kern/kern_context.c (working copy) @@ -71,6 +71,7 @@ PROC_LOCK(td-td_proc); uc.uc_sigmask = td-td_sigmask; PROC_UNLOCK(td-td_proc); + bzero(uc.__spare__, sizeof(uc.__spare__)); ret = copyout(uc, uap-ucp, UC_COPY_SIZE); } return (ret); @@ -109,6 +110,7 @@ ret = EINVAL; else { get_mcontext(td, uc.uc_mcontext, GET_MC_CLEAR_RET); + bzero(uc.__spare__, sizeof(uc.__spare__)); PROC_LOCK(td-td_proc); uc.uc_sigmask = td-td_sigmask; PROC_UNLOCK(td-td_proc); Index: i386/i386/machdep.c === --- i386/i386/machdep.c (revision 220730) +++ i386/i386/machdep.c (working copy) @@ -342,12 +342,14 @@ /* Build the argument list for the signal handler. */ sf.sf_signum = sig; sf.sf_scp = (register_t)fp-sf_siginfo.si_sc; + bzero(sf.sf_siginfo, sizeof(sf.sf_siginfo)); if (SIGISMEMBER(psp-ps_siginfo, sig)) { /* Signal handler installed with SA_SIGINFO. */ sf.sf_arg2 = (register_t)fp-sf_siginfo; sf.sf_siginfo.si_signo = sig; sf.sf_siginfo.si_code = ksi-ksi_code; sf.sf_ahu.sf_action = (__osiginfohandler_t *)catcher; + sf.sf_addr = 0; } else { /* Old FreeBSD-style arguments. */ sf.sf_arg2 = ksi-ksi_code; @@ -461,6 +463,11 @@ sf.sf_uc.uc_mcontext.mc_onstack = (oonstack) ? 1 : 0; sf.sf_uc.uc_mcontext.mc_gs = rgs(); bcopy(regs, sf.sf_uc.uc_mcontext.mc_fs, sizeof(*regs)); + bzero(sf.sf_uc.uc_mcontext.mc_fpregs, + sizeof(sf.sf_uc.uc_mcontext.mc_fpregs)); + bzero(sf.sf_uc.uc_mcontext.__spare__, + sizeof(sf.sf_uc.uc_mcontext.__spare__)); + bzero(sf.sf_uc.__spare__, sizeof(sf.sf_uc.__spare__)); /* Allocate space for the signal handler context. */ if ((td-td_pflags TDP_ALTSTACK) != 0 !oonstack @@ -480,6 +487,7 @@ /* Build the argument list for the signal handler. */ sf.sf_signum = sig; sf.sf_ucontext = (register_t)sfp-sf_uc; + bzero(sf.sf_si, sizeof(sf.sf_si)); if (SIGISMEMBER(psp-ps_siginfo, sig)) { /* Signal handler installed with SA_SIGINFO. */ sf.sf_siginfo = (register_t)sfp-sf_si; @@ -596,6 +604,11 @@ sf.sf_uc.uc_mcontext.mc_len = sizeof(sf.sf_uc.uc_mcontext); /* magic */ get_fpcontext(td, sf.sf_uc.uc_mcontext); fpstate_drop(td); + bzero(sf.sf_uc.uc_mcontext.mc_spare1, + sizeof(sf.sf_uc.uc_mcontext.mc_spare1)); + bzero(sf.sf_uc.uc_mcontext.mc_spare2, + sizeof(sf.sf_uc.uc_mcontext.mc_spare2)); + bzero(sf.sf_uc.__spare__, sizeof(sf.sf_uc.__spare__)); /* Allocate space for the signal handler context. */ if ((td-td_pflags TDP_ALTSTACK) != 0 !oonstack @@ -617,6 +630,7 @@ /* Build the argument list for the signal handler. */ sf.sf_signum = sig; sf.sf_ucontext = (register_t)sfp-sf_uc; + bzero(sf.sf_si, sizeof(sf.sf_si)); if (SIGISMEMBER(psp-ps_siginfo, sig)) { /* Signal handler installed with SA_SIGINFO. */ sf.sf_siginfo = (register_t)sfp-sf_si; @@ -2716,6 +2730,8 @@ mcp-mc_ss = tp-tf_ss; mcp-mc_len = sizeof(*mcp); get_fpcontext(td, mcp); + bzero(mcp-mc_spare1, sizeof(mcp-mc_spare1)); + bzero(mcp-mc_spare2, sizeof(mcp-mc_spare2)); return (0); } @@ -2763,6 +2779,7 @@ #ifndef DEV_NPX mcp-mc_fpformat = _MC_FPFMT_NODEV; mcp-mc_ownedfp = _MC_FPOWNED_NONE; + bzero(mcp-mc_fpstate, sizeof(mcp-mc_fpstate)); #else union savefpu *addr; Property changes on: contrib/pf ___ Modified: svn:mergeinfo Merged /head/sys/contrib/pf:r218327 Property changes on: contrib/dev/acpica ___ Modified: svn:mergeinfo Merged /head/sys/contrib/dev/acpica:r218327 Property changes on: cddl/contrib/opensolaris ___ Modified: svn:mergeinfo Merged /head/sys/cddl/contrib/opensolaris:r218327 Index: amd64/amd64/machdep.c
Re: svn commit: r220526 - head/sys/kern
On Thu, Apr 14, 2011 at 05:13:28PM -0400, John Baldwin wrote: On Sunday, April 10, 2011 1:07:03 pm Konstantin Belousov wrote: Author: kib Date: Sun Apr 10 17:07:02 2011 New Revision: 220526 URL: http://svn.freebsd.org/changeset/base/220526 Log: Some callers of proc_reparent() already have the parent process locked. Detect the situation and avoid process lock recursion. Reported by: Fabian Keil freebsd-listen fabiankeil de Modified: head/sys/kern/kern_exit.c Can we instead assert it is always held and fix callers that don't? Using locked variables is messy and I'd rather avoid it when possible. We already require the caller to hold other locks for this operation. I agree that this is ugly, and proper fix probably would be something else. E.g. struct proc could grow another field that holds a pointer to the ucred it is accounted for, and locked with some global lock. pgp2mHLtz2SNv.pgp Description: PGP signature
Re: svn commit: r220526 - head/sys/kern
On Fri, Apr 15, 2011 at 12:46:18PM -0400, Attilio Rao wrote: 2011/4/15 Kostik Belousov kostik...@gmail.com: On Thu, Apr 14, 2011 at 05:13:28PM -0400, John Baldwin wrote: On Sunday, April 10, 2011 1:07:03 pm Konstantin Belousov wrote: Author: kib Date: Sun Apr 10 17:07:02 2011 New Revision: 220526 URL: http://svn.freebsd.org/changeset/base/220526 Log: Some callers of proc_reparent() already have the parent process locked. Detect the situation and avoid process lock recursion. Reported by: Fabian Keil freebsd-listen fabiankeil de Modified: head/sys/kern/kern_exit.c Can we instead assert it is always held and fix callers that don't? Using locked variables is messy and I'd rather avoid it when possible. We already require the caller to hold other locks for this operation. I agree that this is ugly, and proper fix probably would be something else. E.g. struct proc could grow another field that holds a pointer to the ucred it is accounted for, and locked with some global lock. As you already hold allproc_lock the process can't be distructed, then as I already pointed out to Tomasz, it should alright to just bump the refcount for cred and pass down, I guess. I do not see how allproc_lock is useful there, unless setuid(2) and other syscalls, which change the process credentials, are protected by the same lock. The issue there is in accounting for wrong container. You want to avoid a race between dereferencing stale p_ucred and the process moving to another container. pgpEYstB29LCc.pgp Description: PGP signature
Re: svn commit: r220387 - head/sys/vm
On Wed, Apr 06, 2011 at 07:04:47PM +0200, Edward Tomasz Napiera?a wrote: Wiadomo?? napisana przez Garrett Cooper w dniu 2011-04-06, o godz. 18:57: On Wed, Apr 6, 2011 at 9:27 AM, Edward Tomasz Napierala tr...@freebsd.org wrote: Author: trasz Date: Wed Apr 6 16:27:04 2011 New Revision: 220387 URL: http://svn.freebsd.org/changeset/base/220387 Log: In vm_daemon(), do not skip processes stopped with SIGSTOP. Did you run this by anyone else before you committed the change? The whole racct patchset was reviewed by kib@, and I seem to remember that he said this might cause problems. However, I didn't encounter any problems with this, neither did any person testing the patchset. So, what's wrong with this? I remember that I disliked the whole approach of handling RSS limits, and still hold the same opinion. I said something about honoring the limit at the time of page allocation or page-in, and not `offline' as it is committed, by periodic scans by daemon. I do not think that r220387 is much problematic. It performs forced pageout of usermode pages, that it. On the other hand, a commit message is not very accurate, thread inhibitor TDI_SUSPENDED might be set e.g. due to single-threading event, and then we could end up swapping out the vmspace that is going to be destroyed due to exec(2). pgp3tqMNi1PEJ.pgp Description: PGP signature
Re: svn commit: r220158 - in head/sys: compat/freebsd32 kern sys
On Thu, Mar 31, 2011 at 11:07:36AM +0400, Pan Tsu wrote: Konstantin Belousov k...@freebsd.org writes: Author: kib Date: Wed Mar 30 14:46:12 2011 New Revision: 220158 URL: http://svn.freebsd.org/changeset/base/220158 Log: Provide compat32 shims for kldstat(2). Requested and tested by: jpaetzel MFC after:1 week It breaks loading modules from loader(8) for me. The box panics early #13 0x804f2f2a in sysctl_find_oidname (name=0x80eec156 kstat, list=0x657a69735f796c) at /usr/src/sys/kern/kern_sysctl.c:106 #14 0x804f3e69 in sysctl_register_oid (oidp=0x80ef0a80) at /usr/src/sys/kern/kern_sysctl.c:149 #15 0x804cb83c in linker_file_register_sysctls (lf=Variable lf is not available. ) at /usr/src/sys/kern/kern_linker.c:301 #16 0x804cd112 in linker_preload (arg=Variable arg is not available. ) at /usr/src/sys/kern/kern_linker.c:1624 #17 0x80499ad7 in mi_startup () at /usr/src/sys/kern/init_main.c:258 #18 0x80299a3c in btext () at /usr/src/sys/amd64/amd64/locore.S:81 [...] What supports your claim that the referenced commit breaks early module loading ? You even did not provided the panic message, btw. Commit only changed the wrapper around kldstat(2) syscall, that cannot be used during bootstrap, and the changed code does not reference kldstat(9). The later cannot prove that the change did not break something, but due to its nature, it is unlikely. pgpkyPe8Pe5Ud.pgp Description: PGP signature
Re: svn commit: r220062 - head/sys/geom/gate
On Sun, Mar 27, 2011 at 07:56:55PM +, Mikolaj Golub wrote: Author: trociny Date: Sun Mar 27 19:56:55 2011 New Revision: 220062 URL: http://svn.freebsd.org/changeset/base/220062 Log: In g_gate_create() there is a window between when g_gate_softc is registered in g_gate_units array and when its sc_provider field is filled. If during this period g_gate_units is accessed by another thread that is checking for provider name collision the crash is possible. Fix this by adding sc_name field to struct g_gate_softc. In g_gate_create() when g_gate_softc is created but sc_provider is still not sc_name points to provider name stored in the local array. Approved by:pjd (mentor) Reported by:Freddie Cash fjwc...@gmail.com MFC after: 1 week Modified: head/sys/geom/gate/g_gate.c head/sys/geom/gate/g_gate.h Modified: head/sys/geom/gate/g_gate.c == --- head/sys/geom/gate/g_gate.c Sun Mar 27 19:29:18 2011 (r220061) +++ head/sys/geom/gate/g_gate.c Sun Mar 27 19:56:55 2011 (r220062) @@ -409,13 +409,14 @@ g_gate_create(struct g_gate_ctl_create * for (unit = 0; unit g_gate_maxunits; unit++) { if (g_gate_units[unit] == NULL) continue; - if (strcmp(name, g_gate_units[unit]-sc_provider-name) != 0) + if (strcmp(name, g_gate_units[unit]-sc_name) != 0) continue; mtx_unlock(g_gate_units_lock); mtx_destroy(sc-sc_queue_mtx); free(sc, M_GATE); return (EEXIST); } + sc-sc_name = name; g_gate_units[sc-sc_unit] = sc; g_gate_nunits++; mtx_unlock(g_gate_units_lock); @@ -434,6 +435,9 @@ g_gate_create(struct g_gate_ctl_create * sc-sc_provider = pp; g_error_provider(pp, 0); g_topology_unlock(); + mtx_lock(g_gate_units_lock); + sc-sc_name = sc-sc_provider-name; + mtx_unlock(g_gate_units_lock); I think you do not need a mutex locked around the single assignment. As I understand, sc_provider-name is constant ? pgpWiGCrhAZuT.pgp Description: PGP signature
Re: svn commit: r220062 - head/sys/geom/gate
On Sun, Mar 27, 2011 at 11:49:15PM +0300, Mikolaj Golub wrote: On Sun, 27 Mar 2011 23:08:04 +0300 Kostik Belousov wrote: KB On Sun, Mar 27, 2011 at 07:56:55PM +, Mikolaj Golub wrote: Author: trociny Date: Sun Mar 27 19:56:55 2011 New Revision: 220062 URL: http://svn.freebsd.org/changeset/base/220062 Log: In g_gate_create() there is a window between when g_gate_softc is registered in g_gate_units array and when its sc_provider field is filled. If during this period g_gate_units is accessed by another thread that is checking for provider name collision the crash is possible. Fix this by adding sc_name field to struct g_gate_softc. In g_gate_create() when g_gate_softc is created but sc_provider is still not sc_name points to provider name stored in the local array. Approved by:pjd (mentor) Reported by:Freddie Cash fjwc...@gmail.com MFC after:1 week Modified: head/sys/geom/gate/g_gate.c head/sys/geom/gate/g_gate.h Modified: head/sys/geom/gate/g_gate.c == --- head/sys/geom/gate/g_gate.cSun Mar 27 19:29:18 2011 (r220061) +++ head/sys/geom/gate/g_gate.cSun Mar 27 19:56:55 2011 (r220062) @@ -409,13 +409,14 @@ g_gate_create(struct g_gate_ctl_create * for (unit = 0; unit g_gate_maxunits; unit++) { if (g_gate_units[unit] == NULL) continue; -if (strcmp(name, g_gate_units[unit]-sc_provider-name) != 0) +if (strcmp(name, g_gate_units[unit]-sc_name) != 0) continue; mtx_unlock(g_gate_units_lock); mtx_destroy(sc-sc_queue_mtx); free(sc, M_GATE); return (EEXIST); } +sc-sc_name = name; g_gate_units[sc-sc_unit] = sc; g_gate_nunits++; mtx_unlock(g_gate_units_lock); @@ -434,6 +435,9 @@ g_gate_create(struct g_gate_ctl_create * sc-sc_provider = pp; g_error_provider(pp, 0); g_topology_unlock(); +mtx_lock(g_gate_units_lock); +sc-sc_name = sc-sc_provider-name; +mtx_unlock(g_gate_units_lock); KB I think you do not need a mutex locked around the single assignment. KB As I understand, sc_provider-name is constant ? Is the following scenario impossible? Thread A is looking for name collision and is accessing g_gate_units[unit]-sc_name of the unit that is being created by a thread B, so sc_name is pointing to thread B local buffer. At this time the thread B creates provider, does sc-sc_name = sc-sc_provider-name and returns from g_gate_create(). Thread A, if it is still working with g_gate_units[unit]-sc_name, is accessing invalid memory. Ok, name is local variable. Apparently, what you need is a barrier. It would be enough to do sc-sc_name = sc-sc_provider-name; mtx_lock(g_gate_units_lock); mtx_unlock(g_gate_units_lock); The change is fine as is. pgp0Ao5ItIhA4.pgp Description: PGP signature
Re: svn commit: r219727 - head/sys/vm
On Wed, Mar 23, 2011 at 09:45:37AM -0400, John Baldwin wrote: On Friday, March 18, 2011 4:40:34 am Julian Elischer wrote: On 3/17/11 11:47 PM, Edward Tomasz Napierala wrote: Author: trasz Date: Fri Mar 18 06:47:23 2011 New Revision: 219727 URL: http://svn.freebsd.org/changeset/base/219727 Log: In vm_daemon(), when iterating over all processes in the system, skip those which are not yet fully initialized (i.e. ones with p_state == PRS_NEW). Without it, we could panic in _thread_lock_flags(). Note that there may be other instances of FOREACH_PROC_IN_SYSTEM() that require similar fix. In the past each process was only put on the process list after it was fully set up. Did someone change that recently? that would be A Bad Thing (TM). Err, no, that has never been true. The reason it has to go on the list immediately is to reserve the PID against concurrent fork()s. Hmm, the locking of prs_state is a bit busted it seems. Both the PROC_LOCK() and PROC_SLOCK() are supposed to be held when it is written to, but PROC_LOCK() is missing in fork1() when moving the state to PRS_NORMAL. Also, this commit should check against PRS_NORMAL after acquiring the proc lock, not before. In the case of this commit, it does not matter much, I think. The reason is that all the check want is to make sure that there is at least one fully initialized thread linked into the process. pgpF9SRCR5EhI.pgp Description: PGP signature
Re: svn commit: r219712 - head/sys/ufs/ufs
On Fri, Mar 18, 2011 at 12:37:41AM +1100, Bruce Evans wrote: On Thu, 17 Mar 2011, Konstantin Belousov wrote: Log: Remove the #if defined(FFS) || defined(IFS) braces around the calls to ffs_snapgone(). ufs.ko module is not build with FFS define, causing snapshot inode number slots in superblock never be freed, as well as a reference on the snapshot vnode. IFS was removed several years ago, and UFS/FFS separation was not maintained for real. Reported, analyzed and tested by: Yamagi Burmeister lists yamagi org MFC after: 3 days This seems to leave FFS correctly unused. Most options for file systems are put in opt_dontuse.h to inhibit bugs like this (but I never figured out a way to generate a compile time error if an option in there is used). The FFS option is special. It should not be special. It was moved from opt_dontuse.h to opt_ffs_broken_fixme.h in 2001 to avoid a collateral bug: the FFS/UFS split doesn't work, but was used by ext2fs, and it is a layering violation for UFS to call ffs_snapgone() in FFS, and this broke the configuration with EXT2FS (and UFS) but not FFS; this was hacked around by the ifdefs. ext2fs was decoupled from UFS in 2002, so the ifdefs became unnecessary, but they remained to break the module. opt_ffs_broken_fixme.h also became unnecessary in 2002, but remains to generate history lessons :-). The ifdef on IFS was even more bogus, since IFS doesn't exist and has no vestiges in conf/*. I went ahead and implemented UFS_SNAPONE() and removal of opt_ffs_broken_fixme.h. diff --git a/sys/conf/options b/sys/conf/options index 7f92258..8207800 100644 --- a/sys/conf/options +++ b/sys/conf/options @@ -200,6 +200,7 @@ CD9660 opt_dontuse.h CODA opt_dontuse.h EXT2FS opt_dontuse.h FDESCFSopt_dontuse.h +FFSopt_dontuse.h HPFS opt_dontuse.h MSDOSFSopt_dontuse.h NTFS opt_dontuse.h @@ -217,9 +218,6 @@ UNIONFS opt_dontuse.h # Pseudofs debugging PSEUDOFS_TRACE opt_pseudofs.h -# Broken - ffs_snapshot() dependency from ufs_lookup() :-( -FFSopt_ffs_broken_fixme.h - # In-kernel GSS-API KGSSAPIopt_kgssapi.h KGSSAPI_DEBUG opt_kgssapi.h diff --git a/sys/modules/ufs/Makefile b/sys/modules/ufs/Makefile index 0fe7b4c..652856c 100644 --- a/sys/modules/ufs/Makefile +++ b/sys/modules/ufs/Makefile @@ -3,8 +3,7 @@ .PATH: ${.CURDIR}/../../ufs/ufs ${.CURDIR}/../../ufs/ffs KMOD= ufs -SRCS= opt_ddb.h opt_directio.h opt_ffs.h opt_ffs_broken_fixme.h \ - opt_quota.h opt_suiddir.h opt_ufs.h \ +SRCS= opt_ddb.h opt_directio.h opt_ffs.h opt_quota.h opt_suiddir.h opt_ufs.h \ vnode_if.h ufs_acl.c ufs_bmap.c ufs_dirhash.c ufs_extattr.c \ ufs_gjournal.c ufs_inode.c ufs_lookup.c ufs_quota.c ufs_vfsops.c \ ufs_vnops.c ffs_alloc.c ffs_balloc.c ffs_inode.c ffs_snapshot.c \ diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c index f578382..573c364 100644 --- a/sys/ufs/ffs/ffs_vfsops.c +++ b/sys/ufs/ffs/ffs_vfsops.c @@ -797,6 +797,7 @@ ffs_mountfs(devvp, mp, td) ump-um_vfree = ffs_vfree; ump-um_ifree = ffs_ifree; ump-um_rdonly = ffs_rdonly; + ump-um_snapgone = ffs_snapgone; mtx_init(UFS_MTX(ump), FFS, FFS Lock, MTX_DEF); bcopy(bp-b_data, ump-um_fs, (u_int)fs-fs_sbsize); if (fs-fs_sbsize SBLOCKSIZE) diff --git a/sys/ufs/ufs/ufs_lookup.c b/sys/ufs/ufs/ufs_lookup.c index d819f69..45ebea1 100644 --- a/sys/ufs/ufs/ufs_lookup.c +++ b/sys/ufs/ufs/ufs_lookup.c @@ -37,7 +37,6 @@ #include sys/cdefs.h __FBSDID($FreeBSD$); -#include opt_ffs_broken_fixme.h #include opt_ufs.h #include opt_quota.h @@ -1253,7 +1252,7 @@ out: * when last open reference goes away. */ if (ip != 0 (ip-i_flags SF_SNAPSHOT) != 0 ip-i_effnlink == 0) - ffs_snapgone(ip); + UFS_SNAPGONE(ip); return (error); } @@ -1316,7 +1315,7 @@ ufs_dirrewrite(dp, oip, newinum, newtype, isrmdir) * when last open reference goes away. */ if ((oip-i_flags SF_SNAPSHOT) != 0 oip-i_effnlink == 0) - ffs_snapgone(oip); + UFS_SNAPGONE(oip); return (error); } diff --git a/sys/ufs/ufs/ufsmount.h b/sys/ufs/ufs/ufsmount.h index b13db40..c2cfcfb 100644 --- a/sys/ufs/ufs/ufsmount.h +++ b/sys/ufs/ufs/ufsmount.h @@ -104,6 +104,7 @@ struct ufsmount { int (*um_vfree)(struct vnode *, ino_t, int); void(*um_ifree)(struct ufsmount *, struct inode *); int (*um_rdonly)(struct inode *); + void(*um_snapgone)(struct inode *); }; #define UFS_BALLOC(aa, bb, cc, dd, ee, ff) VFSTOUFS((aa)-v_mount)-um_balloc(aa, bb, cc, dd, ee, ff) @@ -114,6 +115,7 @@ struct ufsmount { #define UFS_VFREE(aa, bb, cc) VFSTOUFS((aa)-v_mount)-um_vfree(aa, bb, cc) #define UFS_IFREE(aa, bb) ((aa)-um_ifree(aa, bb)) #defineUFS_RDONLY(aa)
Re: svn commit: r219672 - in head: share/man/man9 sys/i386/include
On Wed, Mar 16, 2011 at 05:00:42PM +0300, Maxim Dounin wrote: Hello! On Wed, Mar 16, 2011 at 04:44:35PM +1100, Bruce Evans wrote: On Tue, 15 Mar 2011, Jung-uk Kim wrote: On Tuesday 15 March 2011 03:55 pm, Jung-uk Kim wrote: On Tuesday 15 March 2011 03:33 pm, Maxim Dounin wrote: Hello! On Tue, Mar 15, 2011 at 05:14:26PM +, Jung-uk Kim wrote: Author: jkim Date: Tue Mar 15 17:14:26 2011 New Revision: 219672 URL: http://svn.freebsd.org/changeset/base/219672 Log: Unconditionally use binuptime(9) for get_cyclecount(9) on i386. Since this function is almost exclusively used for random harvesting, there is no need for micro-optimization. Adjust the manual page accordingly. Note that on early boot only dummy timecounter available, and binuptime() has no entropy. As a result of this change random(9) won't have entropy on early boot on i386, and arc4random(9) as well. While there are no known major security problems associated with it - it at least makes stack protector easily bypasseable as it now (again after r198295) uses well-known stack guard instead of random one. And there may be other issues as well. Is dummy timecounter used for long enough to matter? I think completion of clock initialization is still bogusly late for histrotical reasons, but there is still a second or two between completion of timecounter initialization and user mode. The earlier stages of booting might take 20 seconds but should be faster, so they might not provided much more entropy from clocks. The problem is initial random number generator initialization (random(9) and acr4random(9)) and various early boot things which use random. Most notably it's stack protector (which has to be initialized as early as possible, and requires special handling of code which is run before it's initialized), but there are also other things to care about - AFAIR booting from nfs uses the same ports without entropy and so on. Right now the only entropy used at early boot are from get_cyclecount() call, which has at least some entropy on most platforms (notable exceptions are arm and i386 with i486 cpus). With dummy timecounter there are no entropy at early boot. After boot everything is eventually reseeded (random(9) at proc0_post(), arc4random(9) and /dev/random at userland startup scripts) and becomes safe. proc0_post() is not called after the boot, it is executed during the boot. Why is the randomness for the stack protector critical at that stage ? Most kernel threads are started after INTRINSIC_POST, at least this is what I see from looking at kernel.h. Might be, the ssp initialization should be moved after SI_SUB_INTRINSIC_POST ? This is definitely irrelevant for usermode ssp, since the first thread enters usermode long after the INTRINSIC_POST is done. I have entropy stuff mostly turned off and often don't even have enough entropy to run ed on /etc/fstab early. Don't know if I have clock entropy turned off. Hope you thought well before moving i386 to a set of platforms which have no early boot randomness at all. And you have good reason for doing it. I asked for moving all platforms to binuptime() so that the bugs could be seen by everyone :-). Didn't know about this bug. If you want to change get_cyclecount() to be alias to binuptime() - we may consider adding another machdep call to extract early entropy. I still not quite understand the reasons though. I consider binuptime() to be some (bad one) fallback for get_cyclecount() on platforms which has no hardware counter available. Moving all platforms to bad fallback looks strange. Maxim Dounin pgpPB6TXVCpfJ.pgp Description: PGP signature
Re: svn commit: r219646 - head/sys/x86/isa
On Tue, Mar 15, 2011 at 05:21:34PM -0400, Jung-uk Kim wrote: However, why do we need cheaper DELAY() when we trying to delay something with it? Busy-loop performing repeated access to the south bridge eats the bus capacity, that could be useful for other bus agent and other cores. I think this is the case with HPET, while TSC is core-local. pgp874pQWyLYv.pgp Description: PGP signature
Re: svn commit: r219578 - head/etc
On Sat, Mar 12, 2011 at 09:13:08PM +, Doug Barton wrote: Author: dougb Date: Sat Mar 12 21:13:08 2011 New Revision: 219578 URL: http://svn.freebsd.org/changeset/base/219578 Log: Use the allexport option in load_rc_config() in order to avoid having to repeatedly read the conf files. Depending on what is enabled the files are being read anywhere from 15, 30, or more times currently. By loading the values in the environment this is reduced to 1, with perhaps a couple more, again depending on what is enabled. The speed-up for boot and shutdown is negligible when rc.conf is on local disk, noticable when accessing files over NFS, and dramatic when pulling rc.conf values from a database. This change also includes a minor optimization to the conditional for $_rc_conf_loaded. Modified: head/etc/rc.subr Modified: head/etc/rc.subr == --- head/etc/rc.subr Sat Mar 12 20:36:52 2011(r219577) +++ head/etc/rc.subr Sat Mar 12 21:13:08 2011(r219578) @@ -998,9 +998,8 @@ load_rc_config() err 3 'USAGE: load_rc_config name' fi - if ${_rc_conf_loaded:-false}; then - : - else + if [ -z $_rc_conf_loaded ]; then + set -o allexport if [ -r /etc/defaults/rc.conf ]; then debug Sourcing /etc/defaults/rc.conf . /etc/defaults/rc.conf @@ -1010,6 +1009,7 @@ load_rc_config() . /etc/rc.conf fi _rc_conf_loaded=true + set +o allexport fi if [ -f /etc/rc.conf.d/$_name ]; then debug Sourcing /etc/rc.conf.d/${_name} As I read it, the change means that each process started by rc.d got approximately 32kB of non-shared garbage data in its environment ? pgp2CMEjazo56.pgp Description: PGP signature
Re: svn commit: r219526 - stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
On Fri, Mar 11, 2011 at 07:27:31PM +, Andriy Gapon wrote: Author: avg Date: Fri Mar 11 19:27:31 2011 New Revision: 219526 URL: http://svn.freebsd.org/changeset/base/219526 Log: use even larger stack size for ZFS txg_sync_thread While the stack size was larger than the default stack size on i386, it was smaller than the default stack size on amd64 and apparently that wasn't enough. So, bump the size to 4 pages. Upcoming ZFSv28 code uses 8 pages for this stack size. This is a direct commit to stable/8. PR: kern/154681 Discussed with: pjd Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c == --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Fri Mar 11 19:21:42 2011(r219525) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Fri Mar 11 19:27:31 2011(r219526) @@ -146,7 +146,7 @@ txg_sync_start(dsl_pool_t *dp) * 32-bit x86. This is due in part to nested pools and * scrub_visitbp() recursion. */ - tx-tx_sync_thread = thread_create(NULL, 1210, txg_sync_thread, + tx-tx_sync_thread = thread_create(NULL, 1610, txg_sync_thread, dp, 0, p0, TS_RUN, minclsyspri); mutex_exit(tx-tx_sync_lock); What about architectures where PAGE_SIZE != 4096 ? We have ptoa() macro. pgp6w5vokKhgQ.pgp Description: PGP signature
Re: svn commit: r219173 - in stable/8/sys: compat/linux kern
On Wed, Mar 02, 2011 at 09:53:13AM +, Alexander Leidinger wrote: Author: netchild Date: Wed Mar 2 09:53:13 2011 New Revision: 219173 URL: http://svn.freebsd.org/changeset/base/219173 Log: MFC r215664: By using the 32-bit Linux version of Sun's Java Development Kit 1.6 on FreeBSD (amd64), invocations of javac (or java) eventually end with the output of Killed and exit code 137. This is caused by: 1. After calling exec() in multithreaded linux program threads are not destroyed and continue running. They get killed after program being executed finishes. 2. linux_exit_group doesn't return correct exit code when called not from group leader. Which happens regularly using sun jvm. The submitters fix this in a similar way to how NetBSD handles this. I took the PRs away from dchagin, who seems to be out of touch of this since a while (no response from him). The patches committed here are from [2], with some little modifications from me to the style. PR:141439 [1], 144194 [2] Submitted by:Stefan Schmidt stefan.schm...@stadtbuch.de, gk Reviewed by:rdivacky (in april 2010) MFC r215675: Do not take the process lock. The assignment to u_short inside the properly aligned structure is atomic on all supported architectures, and the thread that should see side-effect of assignment is the same thread that does assignment. Use a more appropriate conditional to detect the linux ABI. Suggested by:kib See r215706, that was a followup to the commit. Without it, stable/8 is broken: /usr/bsd/src/sys/modules/linux/../../compat/linux/linux_emul.c: In function 'linux_proc_exec': /usr/bsd/src/sys/modules/linux/../../compat/linux/linux_emul.c:265: error: dereferencing pointer to incomplete type /usr/bsd/src/sys/modules/linux/../../compat/linux/linux_emul.c:265: error: 'SV_ABI_MASK' undeclared (first use in this function) pgpXUolqsyTCh.pgp Description: PGP signature
Re: svn commit: r218966 - head/sys/vm
On Sun, Feb 27, 2011 at 11:12:03PM +1100, Bruce Evans wrote: On Sun, 27 Feb 2011, Bruce Cran wrote: On Thu, 2011-02-24 at 10:36 +1100, Bruce Evans wrote: I would cast operand(s) in the expression as necessary to prevent overflow of subexpressions. vm_pindex_t would work, but I prefer to use a type related to the subexpressions. Not sure what that is. Maybe just uintmax_t for safety (even that is not safe if the subexpressions have large values). So: (uintmax_t)swap_bcount * SWAP_META_PAGES * n / mumble. Following Alan's suggestion, I've attached an updated patch which uses a cast to u_long and returns long. I thought he only meant to return long. The units being returned are PAGE_SIZE smaller than the units in the expression. But I don't know exacty how large swp_bcount can be. If everything fits in a single object, then vm_size_t = u_int on 32-bit machines is enough. I think that return type should be left as vm_offset_t. The calculation basically returns some quantity of type vm_offset_t, divided by PAGE_SIZE. This validates the new return type. The calculation could use vm_offset_t, or uintmax_t cast, probably vm_offset_t is preferred for the same reason. pgpu5aew5KUQ3.pgp Description: PGP signature
Re: svn commit: r218967 - head/sys/kern
On Wed, Feb 23, 2011 at 01:57:34PM +, Alexander Best wrote: On Wed Feb 23 11, Kostik Belousov wrote: On Wed, Feb 23, 2011 at 12:56:25PM +, John Baldwin wrote: Author: jhb Date: Wed Feb 23 12:56:25 2011 New Revision: 218967 URL: http://svn.freebsd.org/changeset/base/218967 Log: Fix off-by-one error in check against max_threads_per_proc. Submitted by: arundel MFC after: 1 week Modified: head/sys/kern/kern_thr.c Modified: head/sys/kern/kern_thr.c == --- head/sys/kern/kern_thr.c Wed Feb 23 10:28:37 2011 (r218966) +++ head/sys/kern/kern_thr.c Wed Feb 23 12:56:25 2011 (r218967) @@ -153,7 +153,7 @@ create_thread(struct thread *td, mcontex p = td-td_proc; /* Have race condition but it is cheap. */ - if (p-p_numthreads = max_threads_per_proc) { + if (p-p_numthreads max_threads_per_proc) { ++max_threads_hits; return (EPROCLIM); } I do not think there was off by one error. The create_thread() function is called to create new thread, and before the process thread counter is incremented in thread_link(). The old test tried to not allow more then max_threads_per_proc threads in a process, now it allows to create max_threads_per_proc. doesn't the semantics of the term maximum imply that it's own value is also valid? if a sign says maximum weight 2000kg, does that mean that a weight of 2000kg is invalid and the highest valid weight is 1999,999..kg? cheers. alex My guess is that the reference to mentioned pthread_vfork_test failed because reporter set kern.threads.max_threads_per_proc to 100. The test actually tries to create 101 threads, 1 main + 100 new. so the main process counts as 1 thread and for each pthread_create invokation the thread number gets bumped up? so with a process doing a single pthread_create() that would imply this process is having a thread count of 2? Exactly. The main thread is the same as all others (almost). pgpkGGk4py7Uj.pgp Description: PGP signature
Re: svn commit: r218959 - head/usr.sbin/pc-sysinstall/backend-query
On Tue, Feb 22, 2011 at 07:18:56PM +, Josh Paetzel wrote: Author: jpaetzel Date: Tue Feb 22 19:18:56 2011 New Revision: 218959 URL: http://svn.freebsd.org/changeset/base/218959 Log: Better method for grabbing disk name, dmesg may produce mangled output. Approved by:kib (mentor, implicit) Modified: head/usr.sbin/pc-sysinstall/backend-query/disk-list.sh Modified: head/usr.sbin/pc-sysinstall/backend-query/disk-list.sh == --- head/usr.sbin/pc-sysinstall/backend-query/disk-list.shTue Feb 22 19:05:42 2011(r218958) +++ head/usr.sbin/pc-sysinstall/backend-query/disk-list.shTue Feb 22 19:18:56 2011(r218959) @@ -74,7 +74,7 @@ do fi # Check the dmesg output for some more info about this device Shouldn't the comment be updated ? - NEWLINE=$(dmesg | sed -n s/^$DEV: .*\(.*\).*$/ \1/p | head -n 1) + NEWLINE=$(camcontrol identify $DEV | grep device model | tr -s ' ' | sed 's |device model ||g') if [ -z $NEWLINE ]; then NEWLINE= Unknown Device fi pgpoEMmecISSG.pgp Description: PGP signature
Re: svn commit: r218960 - head/usr.sbin/pc-sysinstall/backend
On Tue, Feb 22, 2011 at 07:37:12PM +, Josh Paetzel wrote: Author: jpaetzel Date: Tue Feb 22 19:37:12 2011 New Revision: 218960 URL: http://svn.freebsd.org/changeset/base/218960 Log: Added patch-functions-upgrade which should fix some kernel panics doing upgrades and uninstalling linux compat ports. This is probably wrong way to fix kernel panics. Did you show the panics with backtrace to anybody on emulation@ ? Submitted by: Joerg-Christian Boehme jo...@chaosdorf.de Approved by:kib (mentor, implicit) Modified: head/usr.sbin/pc-sysinstall/backend/functions-upgrade.sh Modified: head/usr.sbin/pc-sysinstall/backend/functions-upgrade.sh == --- head/usr.sbin/pc-sysinstall/backend/functions-upgrade.sh Tue Feb 22 19:18:56 2011(r218959) +++ head/usr.sbin/pc-sysinstall/backend/functions-upgrade.sh Tue Feb 22 19:37:12 2011(r218960) @@ -58,7 +58,7 @@ mount_target_slice() zfs mount -a # Mount all the fstab goodies on disk - chroot ${FSMNT} /sbin/mount -a ${LOGOUT} 2${LOGOUT} + chroot ${FSMNT} /sbin/mount -a -t nolinprocfs ${LOGOUT} 2${LOGOUT There is a typo/syntax error at the very end of the line. chroot ${FSMNT} umount /proc /dev/null 2/dev/null chroot ${FSMNT} umount /compat/linux/proc /dev/null 2/dev/null @@ -79,7 +79,7 @@ mount_target_slice() if [ $INSTALLTYPE != FreeBSD ] then echo_log Removing old packages, this may take a while... Please wait... -echo '#/bin/sh +echo '#!/bin/sh for i in `pkg_info -E \*` do echo Uninstalling package: ${i} pgpzLisjlBKV6.pgp Description: PGP signature
Re: svn commit: r218822 - in head: . contrib/binutils contrib/binutils/bfd contrib/binutils/bfd/doc contrib/binutils/bfd/po contrib/binutils/binutils contrib/binutils/binutils/doc contrib/binutils/bin
On Sun, Feb 20, 2011 at 02:01:44PM -0800, Steve Kargl wrote: On Fri, Feb 18, 2011 at 08:54:13PM +, Dimitry Andric wrote: Author: dim Date: Fri Feb 18 20:54:12 2011 New Revision: 218822 URL: http://svn.freebsd.org/changeset/base/218822 Log: Merge binutils 2.17.50 to head. This brings a number of improvements to x86 CPU support, better support for powerpc64, some new directives, and many other things. Bump __FreeBSD_version, and add a note to UPDATING. Thanks to the many people that have helped to test this. This commit appears to be causing laptop:kargl[205] ~/work/bin/gfortran -o z a.f /usr/bin/ld: error in /usr/home/kargl/work/lib/gcc/i386-unknown-freebsd9.0/4.6.0/crtend.o(.eh_frame); no .eh_frame_hdr table will be created. and laptop:kargl[208] ~/work/bin/gcc -o z h.c /usr/bin/ld: error in /usr/home/kargl/work/lib/gcc/i386-unknown-freebsd9.0/4.6.0/crtend.o(.eh_frame); no .eh_frame_hdr table will be created. Note an executable is created and it appears to work. laptop:kargl[213] ./z Hello world Unfortunately, one can no longer run the GCC testsuite because the testsuite is not expecting the /usr/bin/ld message. So, is there an patch that needs to been sent upstream to GCC to fix crtend.o? Or. can you fix /usr/bin/ld to not emit this seemingly harmless message? I suspect this is the same issue as was fixed by r209294 in FreeBSD svn repo. Most likely, similar approach would help gcc. pgpPhdQcKsf4j.pgp Description: PGP signature