Re: general Darwin imports (was Re: Darwin cmd import?)
On Wed, 2 Jun 2004, Michael W. Lucas wrote:

On Sat, May 29, 2004 at 07:55:21PM -0400, Robert Watson wrote:

The FreeBSD Core Team took a look at the APSL a while back, and decided that similar to LGPL/GPL, it was an acceptable license for use in userspace for stand-alone tools, but that similar protections to LGPL/GPL would be required for kernel code (not built by default, carefully marked, etc). That said, Apple tends to release only code they've heavily rewritten or created from scratch under APSL; code they modify tends to remain under the existing license (CMU, BSD, etc). Generally they're careful to label the license on the download page.

I'm writing an article about Apple's licensing and returning code to the community, but if you want to become a committer, read this:

Apple has made a lot of improvements to various FreeBSD utilities, and re-released them under the original licensing. This provides an excellent source of patches.

People may gripe about Apple not returning stuff to the open source community. The truth is, they have. They aren't responsible for converting what they return into a format we can use, but they haven't deliberately obfuscated their code. Sorting out the diffs would be a pain, but not horribly difficult.

According to Jordan Hubbard, the best source of low-hanging fruit is their modified libc. They've had people work out all sorts of bugs, clean up functions, performance improvements, etc. Libc changes require extensive testing. They also have wide-reaching benefits. It's still BSDL'd, so we can take back whatever we want.

If you want a commit bit, go and pick some of this fruit and send-pr it.

I would also add that Apple has worked hard to improve their interaction on the open source licensing front. APSLv2 is a dramatic improvement over APSLv1. They've also been working internally to improve their ability to return changes under non-APSL licenses, and recently released several new components in the new Darwin drop under the Berkeley license at my request. There are some areas where I don't think we'll see any license movement (HFS+, for one thing), but there are other areas where (at least from the outside) it appears Apple recognizes the benefit of widespread use of the code, community participation, etc. And I'm happy for us to prove Apple right by adopting their pieces in sensible ways, improving them, and pointing them at the improvements. :-)

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Darwin cmd import?
On Fri, 28 May 2004, Cyrille Lefevre wrote:

regarding the APSL (http://www.opensource.apple.com/apsl/), do you think it is possible to import some darwin commands w/ mods. for instance, I'm thinking of decomment and relpath from bootstrap_cmds, sadc and sar from system_cmds, and maybe some others in the future. also, how about importing NetBSD shlock ? and CMU md (a sort of mkdep in C) ?

PS : decomment and relpath only need some mods while sadc would need a large amount of mods, don't know about sar.

The FreeBSD Core Team took a look at the APSL a while back, and decided that similar to LGPL/GPL, it was an acceptable license for use in userspace for stand-alone tools, but that similar protections to LGPL/GPL would be required for kernel code (not built by default, carefully marked, etc). That said, Apple tends to release only code they've heavily rewritten or created from scratch under APSL; code they modify tends to remain under the existing license (CMU, BSD, etc). Generally they're careful to label the license on the download page.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: implications of SMP kernel on UP
On Thu, 1 Apr 2004, Bjoern A. Zeeb wrote:

what are the implications of running an SMP enabled kernel on a UP machine ? I first thought of things like:

- performance (most likely not worth the discussion ?)
- additional locking problematic ?
- ... ?

Or asked the other way round: why would I want to disable SMP on a kernel that is going to run on a UP machine ?

I've observed substantial performance overhead from enabling SMP on UP boxes. However, increasing numbers of UP boxes ship with HTT, blurring the picture a little. There are at least two issues associated with enabling SMP on UP boxes:

- First, we use the IO APIC, which has caused compatibility problems with some systems (likely actually ACPI problems?), as well as a slightly higher cost to interrupt handling.
- Second, we use locked operations for locks, increasing their cost.

I've spent a bit of time trimming some gratuitous locking from the system call path, so it should actually be a bit better than it was previously, but the upshot is that if you want optimal performance on UP, you should compile out both apic and SMP.

Peter and I have had conversations about creating HAL modules that plug and play locking operations, optimized copies, and so on for the kernel, and improving the run-time pluggability of SMP (et al), but haven't made any progress.

It's worth noting, FYI, that we always compile modules with locked atomic operations so that one module will work on UP and SMP...

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
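To illustrate the second point about locked operations, here is a minimal userland sketch, not the kernel's own code. It contrasts a lock acquire that must be atomic across CPUs with one that can use plain loads and stores on a single CPU. The names smp_try_acquire and up_try_acquire are hypothetical, and the UP variant assumes nothing can preempt it between the test and the store.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * SMP-style acquire: the compare-and-swap has to be atomic with respect
 * to other CPUs, which on x86 typically means a lock-prefixed
 * instruction and its extra cost.
 */
bool smp_try_acquire(atomic_int *lock)
{
    int expected = 0;   /* 0 = free, 1 = held */

    return atomic_compare_exchange_strong(lock, &expected, 1);
}

/*
 * UP-style acquire (assumption: a single CPU and no preemption between
 * the test and the store), so ordinary memory accesses are enough.
 */
bool up_try_acquire(int *lock)
{
    if (*lock != 0)
        return false;
    *lock = 1;
    return true;
}
```

Both functions have the same observable behaviour on one CPU; the difference is only in what the compiler must emit, which is where the per-lock overhead described above comes from.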
Re: UT2004?
On Wed, 31 Mar 2004, Kris Kennaway wrote:

On Thu, Apr 01, 2004 at 03:26:27PM +0930, Daniel O'Connor wrote:

Has anyone got it installed under FreeBSD? I got the demo to run and install pretty well (for some reason I can't play it in KDE, I have to drop back to twm otherwise my system hangs), but the full game doesn't install :(

I have tried both the DVD edition and the 6 CD version.. It doesn't appear to detect that I have mounted a new disk and so I can't get past installing the first disk's worth of stuff.

I run the installer like so:

    sudo /compat/linux/bin/sh /cdrom/linux-installer.sh

and pick /usr/local/ut2004 as the place to install it.

I have ktrace'd it and when I click 'Yes' on the CDROM prompt it only seems to try and open fstab and mtab. It ends up with a FreeBSD fstab and /compat/linux/etc/mtab which is a zero length file.

Is it expecting /compat/linux/etc/mtab to be updated somehow when you mount the new disk?

linprocfs exports an mtab file from the kernel that's appropriate for use as a substitute, I believe. You can try symlinking etc/mtab to procfs/mtab in the linux namespace.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: implications of SMP kernel on UP
On Thu, 1 Apr 2004, Thierry Herbelot wrote:

On Thursday 01 April 2004 09:10, Bjoern A. Zeeb wrote:

Hi, what are the implications on running an SMP enabled kernel on a UP machine ? I first thought of things like:

- performance (most likely not worth the discussion ?)

I got an improvement with a factor of ten between an SMP and a UP kernel on a HTT-enabled P4/2.6GHz/800MHz FSB on network transfers (with gigabit Ethernet boards: SMP gives about 6MB/s for FTP transfer rate, and UP gives up to 75MB/s).

So: as long as the network stack is not fully locked (this is coming - perhaps for 5.3), a server should definitely run a UP kernel.

I would instead phrase this as "A kernel-bound network server may benefit from running a UP server." For compute-bound tasks, running SMP has pretty dramatic effects :-). It's also worth pointing out that in many existing configurations, even with Giant over the network stack, we already see performance benefits running 5.x with SMP over 4.x with SMP.

BTW, look for network locking patches coming to the arch@ mailing list in the next couple of days to try out (subject to limitations).

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: Question regarding shell user creation at login time
On Mon, 29 Mar 2004, Ganbold wrote:

Hi, I traced sshd using ktrace and it says:

    ..
    10198 new CALL setuid(0)
    10198 new RET setuid -1 errno 1 Operation not permitted
    10198 new CALL execve(0x80485d0,0xbfbfed8c,0xbfbfed94)
    10198 new NAMI /home/new/new.pl
    10198 new RET execve -1 errno 13 Permission denied
    10198 new CALL exit(0x)
    .

Don't you mean to be running /home/new/new instead? new.pl isn't world readable/executable.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research

My C program is:

    #include <unistd.h>

    main(ac, av)
    char **av;
    {
        setuid(0);
        execv("/home/new/new.pl", av);
    }

Directory:

    public# ls -la ~new
    total 46
    drwxr-xr-x 2 root wheel   512 Mar 29 09:10 .
    drwxr-xr-x 8 root wheel   512 Mar 25 15:28 ..
    -r--r-     1 root new     767 Mar 24 17:43 .cshrc
    -r--r-     1 root new     248 Mar 26 12:32 .login
    -r--r-     1 root new     158 Mar 24 17:43 .login_conf
    -r--r-     1 root new     373 Mar 24 17:43 .mail_aliases
    -r--r-     1 root new     331 Mar 24 17:43 .mailrc
    -r--r-     1 root new     797 Mar 24 17:43 .profile
    -r--r-     1 root new     276 Mar 24 17:43 .rhosts
    -r--r-     1 root new     975 Mar 24 17:43 .shrc
    -rwsr-x--- 1 root new    4651 Mar 26 08:47 new
    --         1 root wheel    94 Mar 26 08:47 new.c
    -r-x--     1 root wheel 15430 Mar 25 15:16 new.pl
    -rw-r--r-- 1 root wheel    52 Mar 25 16:52 new.sh

Can somebody tell me the reason why it failed?

Thanks in advance,
Ganbold
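The trace above shows both setuid(0) and execve() failing, while the wrapper ignores both return values. As a hedged sketch (run_as_root is a hypothetical name, not part of the original program), a wrapper that checks each call reports the failing step directly, without needing ktrace:

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: switch to uid 0, then exec `path`.
 * Does not return on success. Returns -1 after reporting which
 * step failed (for example EPERM from setuid, EACCES from execv).
 */
int run_as_root(const char *path, char *const argv[])
{
    if (setuid(0) == -1) {
        perror("setuid");
        return -1;
    }
    execv(path, argv);
    /* execv() only returns on failure. */
    perror("execv");
    return -1;
}
```

A main() built on this would call run_as_root("/home/new/new.pl", av) and exit non-zero if it returns.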
Re: usermode linux on BSD?
On Wed, 10 Mar 2004, David Gilbert wrote:

Has anyone made an attempt to run usermode linux on FreeBSD? Is the issue-list long?

There was a neat paper at BSDCon 2003 discussing running usermode FreeBSD on Linux, and it talked about what would be necessary to make usermode FreeBSD run on FreeBSD. You can find the paper off the USENIX web site, or perhaps via Google. I think it was a relatively small set of changes.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research

Dave.

--
|David Gilbert, Independent Contractor. | Two things can only be     |
|Mail: [EMAIL PROTECTED]                | equal if and only if they  |
|http://daveg.ca                        | are precisely opposite.    |
=GLO
Re: a serious error in sched_ule.c?
On 9 Mar 2004, Bin Ren wrote:

Hi, all: I've been reading sched_ule.c and seem to find a serious error: in 'sched_slice()':

     * Rationale:
     * KSEs in interactive ksegs get the minimum slice so that we
     * quickly notice if it abuses its advantage.

Then, there is:

    if (!SCHED_INTERACTIVE(kg)) {
        .
        .
    } else
        ke->ke_slice = SCHED_SLICE_INTERACTIVE;

Then, at the beginning of the file, there is:

    #define SCHED_SLICE_INTERACTIVE (slice_max)

(slice_max) for interactive KSEs

Either this is a serious mistake or I'm seriously missing something here.

I believe this is a synchronization error between the comment and the code. The code was changed to provide a maximum slice to interactive applications because non-CPU intensive X11 applications will be marked as interactive, but redraws get interrupted in a short slice. When the change went in to increase the time slice I saw an observable improvement in the redraws of X11 apps under load.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: Strange problem with vnodes and sockets
On Sun, 7 Mar 2004, Kiss Tibor wrote:

I want to create a small kernel module which logs the socket operations. So in my module I have a socket structure, and I want to know which process (thread) owns that. I try to solve this problem this way:

Sockets, as with files, can be referenced by more than one process at a time. While there is only one process that has created any given socket, references to the socket can be inherited by processes forked from it, as well as passed using UNIX domain sockets. As such, there really isn't a notion of owner. so_cred is a cached reference to the process credential of the process that created the socket...

So how can the v_type be 2048? v_type is an enum (vnode.h) with 10 options:

    enum vtype { VNON, VREG, VDIR, VBLK, VCHR, VLNK, VSOCK, VFIFO, VBAD };

And the real problem is: why doesn't that code find any VSOCK type vnode in the active process list? And how can I find the proc struct for a socket? :)

VSOCK vnodes are rendezvous points for UNIX domain socket communication, not the actual communication vehicles themselves. Very few UNIX domain sockets are used in normal operation, but you might take a look at /var/run/log, and the file descriptors that reference various sockets to the log subsystem.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
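The point that a socket has no single owner can be seen from userland. The following is a small sketch (inherited_socket_demo is a hypothetical name): the parent creates a UNIX domain socket pair, forks, and the child writes on a descriptor it only inherited, so two processes hold references to the same socket at once.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 0 if the parent receives the message written by the child
 * on an inherited socket, -1 on any failure.
 */
int inherited_socket_demo(void)
{
    int sv[2];
    char buf[16] = {0};
    const char msg[] = "child";

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: did not create the socket, but holds a reference. */
        close(sv[0]);
        ssize_t w = write(sv[1], msg, sizeof(msg));
        _exit(w == (ssize_t)sizeof(msg) ? 0 : 1);
    }

    /* Parent: drop its copy of sv[1]; the child's reference remains. */
    close(sv[1]);
    ssize_t n = read(sv[0], buf, sizeof(buf) - 1);
    close(sv[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (n != (ssize_t)sizeof(msg) || strcmp(buf, msg) != 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A logging module that records "the" owning process would therefore have to pick one of several referencing processes, or record the creator's credential as so_cred does.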
Re: Standard sbc and pcm support in GENERIC kernel?
On Wed, 3 Mar 2004, Randy Pratt wrote:

On Wed, 3 Mar 2004 17:03:40 +0100 (CET) you wrote:

I've been on the question list for some time, and I have noticed that many people do not know how to get sound support up and running in FreeBSD 5.X. I know that re-compiling the kernel is easy enough, but there are still people not willing to do so, as I have noticed on the list. Therefore I thought it might be an idea to put sound support in the GENERIC kernel configuration, so that newbies will no longer find themselves stuck with that.

I think I've read more than one time about problems fitting the installation on the 1.44M floppies. Definitely a bikeshed discussion but adding to the documentation regarding kldload or a knob in sysinstall to turn on all sound modules is preferable to adding to the kernel.

Actually, pcm was in GENERIC for a while, but was removed because it caused hangs on boot with a common line of Dell Latitude notebooks at the time. The problem is likely now fixed, and I'd certainly not object to it being in GENERIC as long as there are no similar widespread issues now.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: Looking for static analysis tool to generate call graphs
On Wed, 3 Mar 2004, Zajcev Evgeny wrote:

Robert Watson [EMAIL PROTECTED] writes:

Well, using a scary combination of grep, awk, a long list of "omit this" regexps, and prcc from cflow, I got the following:

http://www.watson.org/~robert/freebsd/20040302-sockets.ps

Actually it looks kind of a mess. Maybe use dot's clustering or ranking to organize the callgraph a little?

Part of that is because things are somewhat convoluted :-). I've applied some of your suggestions, as well as a bit more noise-trimming and clarification, to:

http://www.watson.org/~robert/freebsd/20040303-sockets.ps

Hopefully this is somewhat of an improvement. I also added some more of the socketvar.h macros to my hinted edges -- apparently prcc doesn't really do much with our large macros, and so I was missing some edges.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: Looking for static analysis tool to generate call graphs
On Wed, 3 Mar 2004, Dag-Erling Smørgrav wrote:

Robert Watson [EMAIL PROTECTED] writes:

Well, using a scary combination of grep, awk, a long list of "omit this" regexps, and prcc from cflow, I got the following:

http://www.watson.org/~robert/freebsd/20040302-sockets.ps

Duck and cover.

Hmm, is there any way you can try to group functions with similar names together? For instance, functions whose names match /^fd.*/ call mostly each other, and the graph would be a lot cleaner if they were placed close together.

In the most recent revision, I've tried to assign the same rank and color to certain classes of functions:

- System Calls (accept, bind, close, connect, dup, ...)
- Protocol Switch (pru_accept, pru_attach, pru_bind, pr_ctloutput, ...)
- File Descriptor Switch (fo_read, fo_write, fo_poll, ...)
- Socket File Descriptor Functions (soo_read, soo_write, ...)

In addition, I assigned the same color to certain classes of functions:

- Almost System Calls (kern_bind, kern_connect, accept1, ...)
- Protocol Upcalls to Socket Layer (soisdisconnected, ...)

I'm going to experiment with grouping later today.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: Looking for static analysis tool to generate call graphs
On Mon, 1 Mar 2004, Robert Watson wrote:

I'd like to generate static call graphs from sections of src/sys/kern, src/sys/net, and src/sys/netinet, and ideally, get an output that looks pretty when printed to a (perhaps large) piece of paper. It doesn't need to be able to handle function pointer magic in structures (vnode operations, socket operations, file descriptor operations, sysinits, etc); I just want a fairly high-level graph to get a feel for particular chunks of code spanning a couple of C files. Anyone have any recommendations? Preferably something that can actually parse the variant of C we use in our kernel :-).

Well, using a scary combination of grep, awk, a long list of "omit this" regexps, and prcc from cflow, I got the following:

http://www.watson.org/~robert/freebsd/20040302-sockets.ps

Duck and cover.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Looking for static analysis tool to generate call graphs
I'd like to generate static call graphs from sections of src/sys/kern, src/sys/net, and src/sys/netinet, and ideally, get an output that looks pretty when printed to a (perhaps large) piece of paper. It doesn't need to be able to handle function pointer magic in structures (vnode operations, socket operations, file descriptor operations, sysinits, etc); I just want a fairly high-level graph to get a feel for particular chunks of code spanning a couple of C files. Anyone have any recommendations? Preferably something that can actually parse the variant of C we use in our kernel :-).

Thanks,

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
RE: em0, polling performance, P4 2.8ghz FSB 800mhz
On Sun, 29 Feb 2004, Mike Silbersack wrote:

On Sat, 28 Feb 2004, Don Bowman wrote:

this would only allow 2 concurrent TCP sessions per unique source address. Depends on the syn flood you are expecting to experience. You could also use dummynet to shape syn traffic to a fixed level I suppose.

Does that really help? If so, we need to optimize the syncache. :(

Given that we have syncookie support, the other thing we could consider doing under high syn load is simply to drop the syncache from the loop entirely. The syncache provides us with the ability to gracefully degrade as the syn rate goes up, but the FIFO cache bucket overflow handling means we pay the cost of syncache entry allocation even in the high load situation. It might be interesting to measure when syncache overflow is taking place, and simply drop it from the loop under a rate known to exceed the syncache capacity, then re-enable it again once the rate drops. This would remove a memory allocation, queue walking, and in the case of an SMP system, locking, from the syn handling path.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: Accessing sysctls from kernel
On Thu, 26 Feb 2004, Bruce M Simpson wrote:

On Thu, Feb 26, 2004 at 02:10:40PM +0100, Ivan Voras wrote:

In sys/sys/sysctl.h I see function kernel_sysctlbyname() that looks (to me) to be intended for accessing sysctl values from kernel, but for its first parameter it requires a struct thread *td. What should I pass to it? (I'm calling it from inside a screensaver module)

You could try lying about which thread you are, when you aren't in a userland thread:

    Cscope tag: kernel_sysctlbyname
    #  line  filename / context / line
    1   728  /sys/dev/vinum/vinumio.c vinum_scandisk
             error = kernel_sysctlbyname(&thread0, "kern.disks", NULL,
    2   741  /sys/dev/vinum/vinumio.c vinum_scandisk
             kernel_sysctlbyname(&thread0, "kern.disks", devicename,
    3   305  /sys/i386/i386/elan-mmcr.c init_AMD_Elan_sc520
             i = kernel_sysctlbyname(&thread0, "machdep.i8254_freq",

FWIW, the thread exists in the context of a sysctl for several reasons -- one is to provide access to the requesting process's address space, another is the credential authorizing the change. While there are calls kernel_sysctl() and kernel_sysctlbyname(), those are generally intended for consumption on behalf of a user process. My general preference would be to offer an in-kernel API to manage whatever service is being accessed if it's being done in the kernel on behalf of the kernel, rather than trying to force the access through the current sysctl MIB. That way you don't find unnecessary references to thread0, etc, which have some dubious locking properties, as well as abuse of credentials, etc, that may have unexpected side effects with less traditional security models.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: 5.2.1 make installword trashed my system
On Thu, 26 Feb 2004, db wrote:

I've been running 5.2 on my laptop (i386 Acer) for some time now. A few hours ago I downloaded the 5.2.1-release source, ran buildworld, buildkernel, installkernel, but after a few minutes of installworld my system froze. Now when I try to boot I get:

Ouch.

    Mounting root from ufs: /dev/ad0s1a
    WARNING: / was not properly dismounted
    exec /sbin/init: error 8
    /sbin/sysctl: 1: Syntax error: ; unexpected
    /sbin/sysctl: 1: Syntax error: ; unexpected
    Thu Feb 26 16:24:26 CET 2004
    Feb 26 16:24:26 init: can't exec getty '/usr/libexec/getty' for port /dev/ttyv0: No such file or directory

and so on

So the question is: Now what? I can boot in single user mode and get a shell, but very few programs work and I can't mount anything?

It seems there might be three fairly straight-forward choices, not sure which are options for you:

(1) Download the 5.2.1 ISO, and do a binary update, which will simply slap down the 5.2.1 binaries over whatever is on the disk (a pretty blend of 5.2 and 5.2.1, no doubt).

(2) In general, the 5.2 and 5.2.1 binaries are about the same in userspace -- most changes were to the kernel, and no ABIs were changed. So it sounds like maybe init got toasted, a few shared libraries, etc. Boot to single-user mode using /rescue/init and/or /rescue/sh, and manually update binaries from your object tree until you can successfully kick off an installworld. You can use /rescue/cp to do most of this. I'd start by copying /usr/obj/usr/src/sbin/init to /sbin/init, and hitting a couple of the key libraries (libc, libutil, for example). As soon as you can boot normally to single user mode and use the existing tools, restart installworld. By looking at the dates in /bin, /sbin, etc, you can probably figure out where it gave up, and what only got partially completed.

(3) If you can NFS boot the system, perhaps using PXE, you can do an installworld over the network. This is generally easy if you already have a PXE setup, but otherwise hard as you have to figure out PXE.

None of this addresses the hang you saw -- once you've gotten the system up and running properly, if you are still experiencing the hang, we should see if we can figure out what it is that's hanging. However, that will be very hard to do in the partially updated configuration, so I think the best bet is to try and get the update finished.

Good luck!

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: use after free bugs
On Fri, 20 Feb 2004, John Baldwin wrote:

On Thursday 19 February 2004 08:43 pm, Ted Unangst wrote:

Hi. These are some bugs found by Coverity in a static analysis run on the FreeBSD kernel. All these are use after free bugs.

Thanks for the excellent bug reports!

I wonder if the same approach relating to memory allocation and free checking via static analysis could be applied to locking and unlocking of locks? I.e.:

- We don't release locks more than once.
- We don't forget to unlock.
- We hold a lock before accessing certain fields (defined by annotation) of a structure.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: malloc backed md/mfs filesystem swapped?
On Fri, 13 Feb 2004, Andrew J Caines wrote:

After Ring the various FMs including, but not limited to, mdmfs(8), mdconfig(8) malloc(9), I am unclear whether or not the memory used by md of type MD_MALLOC is kernel memory which will not be swapped.

On the same subject, does the MD_SWAP backed device simply use swappable userland VM or does it specifically use a piece of the (presumably) disk backed swap partition?

FYI, the relevant fstab entries for a malloc backed disk having a UFS2 with softupdates and async would look like:

Malloc-backed md devices will be backed by unpageable kernel address space, and doing this with anything but a very small virtual disk will result in a kernel panic once the pages are allocated and the rest of the kernel runs out of address space and memory. Swap-backed md devices will be backed by pageable memory, but I'm not sure what the practical limits (if any) are for address space concerns.

In general, I use malloc-backed disks only for diskless systems, and then, only in a sparing way. If you have swap available, you pretty much always want to use swap-backing for memory disks -- if there's room in memory they will run as fast as malloc-backed, but you don't have to be as worried about the "Oh shoot, I'm out of room" case. I use a pretty large swap-backed file system for /tmp on almost all of my production systems, since swap is cheap, and most of the time so is memory.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research

    md  /tmp      mfs  rw,-M,-s128m,async  2  0
    md  /var/run  mfs  rw,-M,-s1m,async    2  0

-Andrew-

--
 ___
| -Andrew J. Caines-  Unix Systems Engineer  [EMAIL PROTECTED]         |
| They that can give up essential liberty to obtain a little temporary |
| safety deserve neither liberty nor safety - Benjamin Franklin, 1759  |
Re: malloc backed md/mfs filesystem swapped?
On Sat, 14 Feb 2004, Colin Percival wrote:

At 00:56 14/02/2004, Robert Watson wrote:

If you have swap available, you pretty much always want to use swap-backing for memory disks -- if there's room in memory they will run as fast as malloc-backed, but you don't have to be as worried about the "Oh shoot, I'm out of room" case.

Actually, there is one consideration: swap-backed memory disks have a sector size equal to the machine page size. This will result in some inflation in memory usage, and can confuse programs which expect a sector size of 512 bytes (for example, dd, which I plan on fixing but I haven't gotten around to yet).

One such application is Vinum, actually, which does not like using swap-backed storage nodes, although maybe I fixed that.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
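The inflation from page-sized sectors is plain rounding arithmetic. As a rough sketch (bytes_occupied is a hypothetical helper, and the 4096-byte figure in the note below assumes an i386-style page size):

```c
#include <stddef.h>

/*
 * Space taken by `bytes` of data when storage is handed out in whole
 * sectors: round up to the next multiple of sector_size.
 */
size_t bytes_occupied(size_t bytes, size_t sector_size)
{
    size_t sectors = (bytes + sector_size - 1) / sector_size;

    return sectors * sector_size;
}
```

Under that assumption, 600 bytes of data occupy 1024 bytes with 512-byte sectors but a full 4096 bytes with page-sized sectors, which is the overhead being described for small allocations.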
Re: kernel threads
On Wed, 28 Jan 2004, Julian Elischer wrote:

the KSE stuff requires too much assistance from the Userland Thread scheduler. HOWEVER it is possible that kthreads may one day be implemented as multiple threads of a single kernel process.. (but not yet)

John has been talking about doing this for a while -- clustering the kernel threads into a smaller number of kernel processes or a single kernel process. This is the approach Darwin takes as well, FWIW -- they have a kernel_task in which all the various kernel threads hang out, which avoids the overhead of full processes, as well as the emotional baggage. I think I saw John put it on his TODO list in Perforce, so maybe it's coming soon :-).

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research
Re: FreeBSD Status Report for Oct-Dec 2003
On Thu, 29 Jan 2004, Danny Braniss wrote:

thanks! with so much garbage/software/noise around it's difficult to see the gems. and hearing from first hand is very important. true also that google hit it first, but you provided the missing link.

If you want to peruse the FreeBSD perforce server, you can visit:

http://perforce.freebsd.org/

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Senior Research Scientist, McAfee Research

danny

www.perforce.com

Simply put, Perforce is a source control management tool that is very oriented towards easily managing multiple development streams and easily integrating changes between them. Whereas branching in CVS is expensive and hard to manage, Perforce makes it very, very easy. So it's an ideal tool for managing lots of parallel projects that may or may not be related.

Scott
Re: kernel threads
On Tue, 27 Jan 2004, Renaud Molla Wanadoo wrote:

> I'm trying to use the kthread library under 5.2-RELEASE but can't compile my program (which actually only tries to create a thread). I've read that there is now KSE to create kernel threads, but I am wondering if it could be used within the kernel code.

I'm left a little unclear by your message what it is you're trying to do.

In traditional parlance, a kernel thread is a thread executing kernel code in the kernel. These are created using the kthread(9) API, which is available both to kernel modules and code compiled directly into the kernel. You can see examples of kthread use (both compiled in and in modules) by grepping in the src/sys/kern and src/sys/dev/* trees. The only real caveat here that I know of is that you need to grab the Giant lock if your thread will use it, since kthreads don't start holding Giant, and that if you call kthread_exit(), you will need to grab Giant before that.

A use of "kernel thread" popularized by Linux is the idea of userspace threads that are backed by a kernel schedulable thread, as opposed to multiple userspace threads being mapped into a single thread making up a single process. In FreeBSD 5.x, the libc_r library provides multiple user threads multiplexed onto a single kernel-visible thread/process. libkse and libthr provide M:N and 1:1 models. By linking your application against libkse or libthr and using the pthreads API, you will automatically get parallelism and latency improvements over libc_r.

Hope this helps.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
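For reference, a minimal sketch of the userspace side described above: ordinary pthreads code, which stays the same whichever thread library it is linked against. The names `sum_in_parallel`, `sum_slice` and `struct slice` are made up for illustration, and the example only assumes a POSIX threads implementation; it says nothing about how any particular library schedules the threads.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* One contiguous slice of the input array, summed by one thread. */
struct slice {
	const int *base;
	size_t len;
	long sum;
};

static void *
sum_slice(void *arg)
{
	struct slice *s = arg;
	size_t i;

	s->sum = 0;
	for (i = 0; i < s->len; i++)
		s->sum += s->base[i];
	return (NULL);
}

/*
 * Sum 'len' ints using 'nworkers' threads.  Returns 0 on success and stores
 * the total in *total; returns -1 on bad arguments or if a thread could not
 * be created or joined.
 */
int
sum_in_parallel(const int *data, size_t len, int nworkers, long *total)
{
	pthread_t tid[MAX_WORKERS];
	struct slice sl[MAX_WORKERS];
	size_t chunk, off;
	int i, started, error;

	if (nworkers < 1 || nworkers > MAX_WORKERS || total == NULL)
		return (-1);

	chunk = len / nworkers;
	off = 0;
	error = 0;
	started = 0;
	for (i = 0; i < nworkers; i++) {
		sl[i].base = data + off;
		/* The last worker also takes the remainder. */
		sl[i].len = (i == nworkers - 1) ? len - off : chunk;
		sl[i].sum = 0;
		off += sl[i].len;
		if (pthread_create(&tid[i], NULL, sum_slice, &sl[i]) != 0) {
			error = -1;
			break;
		}
		started++;
	}

	/* Join only the threads that were actually started. */
	*total = 0;
	for (i = 0; i < started; i++) {
		if (pthread_join(tid[i], NULL) != 0)
			error = -1;
		else
			*total += sl[i].sum;
	}
	return (error);
}
```

Whether the workers actually run in parallel depends on the library chosen at link time, which is the point made in the message: with a single kernel-visible thread they are only interleaved, while a 1:1 or M:N library can spread them across CPUs.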
Re: XL driver checksum producing corrupted but checksum-correct packets
On Sun, 25 Jan 2004, Mike Silbersack wrote:

> On Sat, 24 Jan 2004, Robert Watson wrote:
>
> > To pick up the corrupted packet on the machine where the corruption is occurring, you might want to try hooking up the UDP checksum drop case to BPF_MTAP() for a special BPF device or rule, or have it spit them into a raw socket (probably easier).
>
> He said that the packet's checksum passes, but it is corrupt, so this won't work.

I may have misread: my reading was that the if_xl card marks the packet as having passed the checksum test, but if you let the OS do the checksum, the checksum fails. I.e., either the hardware checksumming is broken, or the data is corrupted between when the hardware does the checksum, and it reaches the OS buffer. As such, Sam's patch works because it tells the OS to ignore the checksum results from the hardware (although it doesn't disable the checking of checksums), causing the OS to recalculate the checksums and drop the packets rather than accepting them. The goal of the change I suggested would be to also do the checksums in the OS as well, which allows you to detect the bad packets, but instead of dropping them, funnel them aside for later analysis.

However, if I've misread, sorry for the confusion!

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
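To make the "recompute it in the OS" idea concrete, here is a small userland sketch of the RFC 1071 Internet checksum and a check that decides whether a buffer still matches the checksum that travelled with it. This is not the kernel's in_cksum() and not Sam's patch; `inet_cksum` and `cksum_ok` are illustrative names, and a real UDP check would also cover the pseudo-header.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071 Internet checksum over 'len' bytes: the one's-complement of the
 * one's-complement sum of the data taken as big-endian 16-bit words.
 * Intended for packet-sized buffers (the 32-bit accumulator is not guarded
 * against overflow on very large inputs).
 */
uint16_t
inet_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)			/* odd trailing byte, padded with zero */
		sum += (uint32_t)buf[len - 1] << 8;
	while (sum >> 16)		/* fold carries back into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/*
 * Hypothetical receive-side check: returns 1 if the data still matches the
 * checksum carried with it, 0 if it should be set aside for analysis.
 */
int
cksum_ok(const uint8_t *buf, size_t len, uint16_t carried)
{
	return (inet_cksum(buf, len) == carried);
}
```

If the data is damaged after the card has already declared the checksum good, a software pass like this is the only place the mismatch shows up, which is why recomputing in the OS both hides the symptom (bad packets are dropped) and offers a hook for capturing them.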
Re: XL driver checksum producing corrupted but checksum-correct packets
On Fri, 23 Jan 2004, Matthew Dillon wrote:

> I tracked down an occasional buildworld failure on DragonFly to my XL driver, which is synchronized to 4.x's XL driver.

It would be very helpful if you could do the following:

(1) See if you can reproduce this using something other than NFS -- perhaps netperf using UDP_STREAM or the like, between that machine and another machine. This would give us a more reproducible workload than builds, and hopefully one that is less sensitive to things like context switching, etc.

(2) See if you can reproduce this with a stock 4.9-RELEASE kernel (or 4-STABLE). While the drivers are similar between 4.x and DFBSD, there are actually quite a few structural changes in the DFBSD version. Maybe it would make sense to try backing out the local DFBSD changes to the base FreeBSD version, even if not trying a completely FreeBSD system, to see if they are the cause. It's difficult to diff the two because of reorganization and style changes.

> [EMAIL PROTECTED]:6:0: class=0x02 card=0x764610b7 chip=0x764610b7 rev=0x30 hdr=0x00

Does this card have a product name, or is it one of those chips embedded in a motherboard without a separate name? I took a look through the xl cards/chips on my various machines, and was unable to find anything that had remotely the same card or chip ID. I did some high-volume packet flows between them with hardware checksumming disabled and didn't see any corrupted UDP packets, but the workloads I'm using sound pretty different. Knowing it could be reproduced using a simpler workload (and the specifics) would be good.

FYI, I checked the Linux driver for these cards, and didn't see mention of any quirks for the particular chips/card you're using. The only thing of note in the Linux driver was the following:

    /* Check the PCI latency value.  On the 3c590 series the latency timer
       must be set to the maximum value to avoid data corruption that occurs
       when the timer expires during a transfer.  This bug exists the Vortex
       chip only. */
    if (pdev) {
        u8 pci_latency;
        u8 new_latency = (drv_flags & IS_VORTEX) ? 248 : 32;

        pci_read_config_byte(pdev, PCI_LATENCY_TIMER, &pci_latency);
        if (pci_latency < new_latency) {
            printk(KERN_INFO "%s: Overriding PCI latency timer (CFLT) setting of %d, new value is %d.\n",
                   dev->name, pci_latency, new_latency);
            pci_write_config_byte(pdev, PCI_LATENCY_TIMER, new_latency);
        }
    }

The rate at which you have failures sounds like it could be a similar issue, however -- an occasional collision between a timer and DMA. NFS is often a mix of small RPCs handling lookups and attributes, and larger RPCs carrying data. Using netperf or a related tool might help you identify if one of those is more likely to cause the failure.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: XL driver checksum producing corrupted but checksum-correct packets
On Sat, 24 Jan 2004, Max Laier wrote:

> On Saturday 24 January 2004 17:06, Robert Watson wrote:
> > On Fri, 23 Jan 2004, Matthew Dillon wrote:
> > > I tracked down an occasional buildworld failure on DragonFly to my XL driver, which is synchronized to 4.x's XL driver.
>
> FYI: This was reproduced on OpenBSD as well (w/ ftp and scp):
> http://marc.theaimsgroup.com/?l=openbsd-tech&m=107494884327698&w=2

Two thoughts on other things to try, with that in mind:

(1) Linux on the same hardware, see if whatever set of XL workarounds they have addresses this specific problem.

(2) Try the NDIS driver with the NDIS-u-lator on FreeBSD 5.x and see if that also has the problem.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: XL driver checksum producing corrupted but checksum-correct packets
On Sat, 24 Jan 2004, Luigi Rizzo wrote:

> On Sat, Jan 24, 2004 at 01:38:37PM -0500, Robert Watson wrote:
> ...
> > (2) Try the NDIS driver with the NDIS-u-lator on FreeBSD 5.x and see if that also has the problem.
>
> but going this way you have no idea on what the driver does, including enabling hw checksums. This looks like a useless test at least for the purpose of finding out what is going wrong

Actually, I'm more curious about whether it's a known errata/misbehavior for the card that 3Com's drivers work around, or not. The problem could well be completely unrelated to hardware checksumming per se -- the corruption might well be taking place as the buffer is moved from the card's buffer to the operating system managed buffer. If the NDIS driver doesn't illustrate the same problem, it tells us that by frobbing appropriately, this problem can be worked around. It also tells us that by looking a bit harder at what the driver is doing (i.e., how it frobs the hardware), we can learn something about the appropriate workaround. If it's a delay/timing issue, it's less likely we can learn something, but if the NDIS driver is simply disabling hardware checksumming for specific chipsets, that's something we should be able to figure out.

On the other hand, if the NDIS driver shows the exact same problem, this might not be an issue known to the vendor.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: XL driver checksum producing corrupted but checksum-correct packets
On Sat, 24 Jan 2004, Luigi Rizzo wrote:

> On Sat, Jan 24, 2004 at 02:12:12PM -0500, Robert Watson wrote:
> ...
> > > but going this way you have no idea on what the driver does, including enabling hw checksums. This looks like a useless test at least for the purpose of finding out what is going wrong
> >
> > Actually, I'm more curious about whether it's a known errata/misbehavior for the card that 3Com's drivers work around, or not. The problem could well be completely unrelated to hardware checksumming per se -- the corruption might well be taking place as the buffer is moved from the card's buffer to the operating system managed buffer. If the NDIS driver doesn't illustrate the same problem, it tells us that by frobbing appropriately, this problem can be worked around. It also tells us that by looking a bit harder at what the driver is doing (i.e., how it frobs the hardware), we can learn something about the appropriate workaround.
>
> yes, but how would you know that, short of reverse engineering the driver, or tracing I/O accesses to the hardware ? It really looks like an overkill effort... I'd rather just try to debug the issue working on an open source driver, or dump the hardware altogether and replace it with something known to work...

My understanding is that NDIS drivers rely on the HAL provided by NT to perform hardware access, so you can generate I/O traces with relative ease. Decoding and following the HAL traces during card setup is probably relatively straightforward, since presumably most of the I/O transactions will match the documented services of the card. It might be useful to add some KTR support to Bill's NDIS pieces for this very purpose, if there's interest.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: XL driver checksum producing corrupted but checksum-correct packets
On Sat, 24 Jan 2004, Matthew Dillon wrote:

> Well, I tried to tcpdump a session. I managed to hit the error three times but in all three cases the tcpdump on the server dropped the particular packet I was looking for. I'm only able to get a 70% retention rate in the tcpdump output on the server... it's just trying to record too much for the machine to handle at the rate the NFS requests are coming in.

To pick up the corrupted packet on the machine where the corruption is occurring, you might want to try hooking up the UDP checksum drop case to BPF_MTAP() for a special BPF device or rule, or have it spit them into a raw socket (probably easier). Problem is, the context switching does in BPF, so if you can get another machine onto the segment without it being excessively switched (perhaps on a monitor port), using a third machine to grab the on-the-wire packets might work best. That way you can compare pre-corruption and post-corruption.

> I'm going to give up trying to characterize the corruption for now. It could very well be the PCI latency timer as previously discussed but I can't test that right now.

If it is the problem, it may be easier to do this and see if it works than to track down the packet :-). Good luck...

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: GEOM + Vinum
On Wed, 21 Jan 2004, Lukas Ertl wrote:

> I step in. I complained bitterly about the rip-it-off-plans.
>
> s/plans/proposal/
>
> I'm currently not able to help out coding, but I would gladly supply remote console access to a box suitable for vinum testing. (Including access to a local cvsup-/cvs-server, backup space etc.)
>
> Thanks for these offers! FWIW, there's now a new mailing list, freebsd-geom@, and we should move this thread over there.

I'm really glad someone has picked up on this. No one wants to see Vinum users left behind, it's simply been a question of finding someone to bring Vinum forward :-).

(And once I remembered I had to use MailMan to subscribe to freebsd-geom and not Majordomo, I was much happier :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: shutdown -p now
On Wed, 21 Jan 2004, Liam Foy wrote:

> shutdown -p now is dependent upon hardware, and I am 100% sure my hardware supports this; yet it still does not work. Must I have anything added to my kernel configuration or anything?

What version of FreeBSD are you using? Do you have ACPI enabled, if on 5.x?

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: ip_input - chksum - why is it done so early in ip_input?
On Sat, 17 Jan 2004, Andre Oppermann wrote:

> Besides that I'd like to add that FreeBSD has the fastest forwarding engine I've seen on any free OS. It's in my opinion a very suitable OS for routing/forwarding. We are working on it to make it even faster. If you are using 5.2 or -current you get the first step of it by enabling net.inet.ip.fastforwarding. This is a newly written fast path for packet forwarding. (Do not do this on 4.9 because that is the old ip_flow code).

You can also enable debug.mpsafenet, which disables holding the Giant lock over the forwarding path for supported ethernet drivers. Unfortunately, this option can't be used with KAME IPSEC or IPv6 yet, but can be used with FAST_IPSEC.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: 5.1-5.2
On Thu, 15 Jan 2004, Matt Freitag wrote:

> Building 5.2-RELEASE from 5.1-RELEASE-p10 w/ipf+ipfw+ipfw6+dummynet, 5.1 Compiled fine with this setup. I need ipfilter as it's doing my source routing for ipv6 (multiple transits) since ip6fw doesn't support fwd. (I just use ip6fw for filtering, and ipf for forwarding to the correct interface according to source) Am I just being stupid here somehow?

IPFILTER now relies on the PFIL_HOOKS kernel option; this is something that is somewhat poorly documented, and we should add it to the errata I suspect. If you add "options PFIL_HOOKS" to your kernel config, it should work. Moving to PFIL_HOOKS for all the funky IP input/output features is a goal for 5.3 (in fact, I believe Sam has it almost entirely done in one of his development branches), and should both simplify the input/output paths, and also simplify locking for the IP stack. So the change is for a good cause :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research

> snip
>
> cc -c -O -pipe -mcpu=pentiumpro -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -fformat-extensions -std=c99 -g -nostdinc -I- -I. -I../../.. -I../../../contrib/dev/acpica -I../../../contrib/ipfilter -I../../../contrib/dev/ath -I../../../contrib/dev/ath/freebsd -I../../../contrib/ngatm -D_KERNEL -include opt_global.h -fno-common -finline-limit=15000 -fno-strict-aliasing -mno-align-long-strings -mpreferred-stack-boundary=2 -ffreestanding -Werror ../../../contrib/ipfilter/netinet/ip_fil.c
> ../../../contrib/ipfilter/netinet/ip_fil.c: In function `fr_check_wrapper':
> ../../../contrib/ipfilter/netinet/ip_fil.c:319: `PFIL_OUT' undeclared (first use in this function)
> ../../../contrib/ipfilter/netinet/ip_fil.c:319: (Each undeclared identifier is reported only once
> ../../../contrib/ipfilter/netinet/ip_fil.c:319: for each function it appears in.)
> ../../../contrib/ipfilter/netinet/ip_fil.c: In function `fr_check_wrapper6':
> ../../../contrib/ipfilter/netinet/ip_fil.c:329: `PFIL_OUT' undeclared (first use in this function)
> cc1: warnings being treated as errors
> ../../../contrib/ipfilter/netinet/ip_fil.c: In function `iplattach':
> ../../../contrib/ipfilter/netinet/ip_fil.c:376: warning: unused variable `ph_inet'
> ../../../contrib/ipfilter/netinet/ip_fil.c:378: warning: unused variable `ph_inet6'
> machine/in_cksum.h: At top level:
> ../../../contrib/ipfilter/netinet/ip_fil.c:317: warning: `fr_check_wrapper' defined but not used
> ../../../contrib/ipfilter/netinet/ip_fil.c:327: warning: `fr_check_wrapper6' defined but not used
> *** Error code 1
>
> Stop in /usr/src/sys/i386/compile/funk.
>
> snip
>
> -mpf
Re: 5.1-5.2
On Thu, 15 Jan 2004, Eric Masson wrote:

> "Robert" == Robert Watson [EMAIL PROTECTED] writes:
>
> Robert> Moving to PFIL_HOOKS for all the funky IP input/output
>
> Will all available packet filters, including ipfw rely on PFIL_HOOKS or not ?

Yes; we intend to make it so that ipfw will also rely on PFIL_HOOKS to integrate with the IP stack, greatly reducing the quantity of #ifdef FOO in ip_input() and ip_output().

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: p_[usi]ticks from userland without kvm and procfs?
On Wed, 14 Jan 2004, Ryan Beasley wrote:

> I'm poring over some code that uses the p_[usi]ticks counters inside of struct proc. This is fine under 4.x where kinfo_proc includes a copy of proc, but is broken under 5.x since a commit 3 years ago that reorganized kinfo_proc.
>
> So, outside of kvm and procfs, is there any user-kernel interface for getting to struct proc or just those counters? (getrusage is kinda close except one can't lookup info about another process. :|. )

libkvm uses two back-ends to retrieve information from the kernel: it can either retrieve it using sysctl() on a live kernel, or using kvm access on /dev/kmem or a core file. Generally, using sysctl() is preferred for a live kernel, as it requires no special privilege, and also lets the kernel decide what data is revealed to the user application (i.e., hide processes owned by other users).

The kernel function that generally exports process information to userspace is sysctl_out_proc() in src/sys/kern/kern_proc.c, which calls fill_kinfo_proc() or fill_kinfo_thread(), depending on a flag passed to sysctl. Those fields are now part of the thread definition as opposed to the proc definition, and don't appear in the externalized structure in -current (that I can tell). A lot of process accounting and measurement changed with the introduction of M:N threads (KSE), and some of the details haven't yet been sorted out as part of the dust settling. It could well be that the fields are not currently maintained properly, and that the functionality in the kernel needs to be fixed to measure them again properly.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
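For comparison, this is roughly what the getrusage() route mentioned in the question looks like. It is only a sketch: it reports user and system CPU time for the calling process, not the raw p_[usi]ticks counters, and it cannot look at another process, which is exactly the limitation raised above. The function name `self_cpu_usec` is invented for the example.

```c
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Fill in the user and system CPU time, in microseconds, consumed so far by
 * the calling process.  Returns 0 on success, -1 if getrusage() fails.
 */
int
self_cpu_usec(long long *user_usec, long long *sys_usec)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return (-1);
	*user_usec = (long long)ru.ru_utime.tv_sec * 1000000 +
	    ru.ru_utime.tv_usec;
	*sys_usec = (long long)ru.ru_stime.tv_sec * 1000000 +
	    ru.ru_stime.tv_usec;
	return (0);
}
```

Anything that needs the figures for a different process still has to go through the sysctl() or kvm back-ends described in the reply.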
Re: Filesystem marker.
On Wed, 14 Jan 2004, David Gilbert wrote:

> Is there a set of bytes at some offset in a block that is common to any instance of a BSD ufs filesystem?
>
> I ask because recently my home machine erased its fdisk block _and_ the bsdlabel with it. It certainly didn't have time to erase the whole disk, but I'm having trouble guessing where the partitions are.
>
> /usr/ports/sysutils/gpart will look for partitions on a disk ... but it only knows to look for bsd disklabels ... not bsd filesystems.
>
> Ideally, I'd like to make a bsd filesystem module for gpart with some pointers from the group.

I ported the OpenBSD version of their scan_ffs to FreeBSD. However, it only speaks UFS1:

    http://www.watson.org/~robert/freebsd/scan_ffs_freebsd4/

It might also require tweaking to even build on -CURRENT, as I haven't lost any file systems recently enough to have needed to test. One of the nice things about this tool is that it can generate output that can then be fed into disklabel to write the disklabel you need back to disk.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
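On the original question of a recognizable byte pattern: a UFS1 superblock carries the magic number 0x011954 in its fs_magic field. The sketch below scans a disk image for that value, assuming the usual layout (superblock 8192 bytes into the file system, fs_magic 1372 bytes into the superblock, little-endian on-disk order as written by i386). It is not scan_ffs; `find_ufs1` and the constants' names are chosen for this example only.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE	512
#define SBLOCK_UFS1	8192		/* superblock offset within the file system */
#define FS_MAGIC_OFF	1372		/* offset of fs_magic within the superblock */
#define FS_UFS1_MAGIC	0x011954u

/* Read a 32-bit little-endian value (assumes the disk was written on i386). */
static uint32_t
le32(const uint8_t *p)
{
	return ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/*
 * Scan 'img' (len bytes, beginning on a sector boundary) for a candidate
 * UFS1 superblock, trying each sector from 'from' onward as the start of a
 * file system.  Returns the sector number at which the file system would
 * begin, or -1 if no candidate is found.
 */
long
find_ufs1(const uint8_t *img, size_t len, long from)
{
	const size_t need = SBLOCK_UFS1 + FS_MAGIC_OFF + 4;
	size_t off;

	if (from < 0)
		return (-1);
	for (off = (size_t)from * SECTOR_SIZE; off + need <= len;
	    off += SECTOR_SIZE) {
		if (le32(img + off + SBLOCK_UFS1 + FS_MAGIC_OFF) ==
		    FS_UFS1_MAGIC)
			return ((long)(off / SECTOR_SIZE));
	}
	return (-1);
}
```

A hit is only a candidate: backup superblocks elsewhere in the file system carry the same magic, so a recovery tool has to sanity-check other superblock fields before trusting the offset.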
Re: Gratuitous ARP and the em driver
On -1 xxx -1, Nielsen wrote:

> When I change IP addresses on my 'em' gigabit NIC, ARP isn't sent properly. This appears to be the problem in the following bug report, however I'm using the 'fixed' version of the em driver (in FreeBSD 4.9).
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=54488
>
> Does anyone have any tips on how to get around this? I'm building new systems with gigabit ethernet support and this problem keeps cropping up. I have a failover system, and when moving an IP alias between machines, the em NIC driver doesn't properly send out gratuitous ARP, resulting in the IP being inaccessible.
>
> - The problem does not occur when plugged into a 100BaseTX switch
> - FreeBSD 4.9p1 / em version 1.7.16
> - Tried various gigabit switches.
> - One other odd thing is that when configuring the NIC (ifconfig) the machine locks up for several seconds.

If you run tcpdump on the machine to sniff the interface in question looking for arp packets, does tcpdump see the gratuitous arp? I'm guessing that it does, and the lack of sending the arp is a result of delays in negotiating on the wire. Does this problem turn up only the first time you raise the interface, or every time you change the IP address on the interface?

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: Future of RAIDFrame and Vinum (was: Future of RAIDFrame)
On Mon, 12 Jan 2004, Mark Linimon wrote:

> > If nothing happens, vinum is going to break even more very soon.
>
> No ... if you do a commit that changes the code assumptions upon which vinum was built, vinum will break. vinum is not going to magically break by itself.
>
> This gets back to a problem with the FreeBSD development model: people who commit changes that break things in other parts of the system do not automatically get assigned the responsibility to fix them. Now, there's no way to impose something like that requirement on a cooperative anarchy, so I am not playing the "let's reorganize" card -- I think most of us would agree that "that dog won't hunt" as we say down around these parts. But, in the real world of software engineering, He Who Breaketh It, Must Fixeth It.

Well, actually, it's not quite that way. The reality is that every major component in a system needs someone with expertise in it, who is willing and available to maintain the system across infrastructural changes. Vinum has made it over smaller API bumps without a maintainer: the move to devfs, etc. However, to make it speak GEOM requires someone highly familiar with Vinum, and with the time available to do it.

If we want to enhance the architecture of FreeBSD for improvements in performance, stability, and long-term maintainability, there will necessarily be structural changes that require a distributed update of the system. FreeBSD is of sufficient size that no one person will be able to make this sort of change alone, which is one of the important reasons to have a software maintenance model that reflects that. Our notion of software maintenance could certainly use some further evolution, but I think there are some existing intuitions. Vinum is a highly complex software module, and *must* have an active maintainer in order to survive structural operating system changes.

Greg has recently posted to arch@ saying "What's the future of Vinum", indicating an intent to continue to enhance Vinum, and received a number of e-mails regarding how to adapt Vinum for GEOM. I sent him an e-mail in which I laid out a spectrum of possibilities, ranging from the minimalist to a complete transformation of Vinum into a GEOM module. Greg has indicated he plans to work on Vinum further, so I think the best we can do is provide support and encouragement. The minimalist approach appears to be viable (although there are some risks), and someone highly familiar with Vinum (such as Greg) can probably make the changes in short order. He's currently at a conference, but my hope is that when he gets back, he'll evaluate some of the approaches we've described, and pick one.

I think the right strategy is to follow the minimalist approach now (adopt the disk(9) API, rather than having Vinum generate character devices) so that swap works on Vinum again, and so that when UFS moves to speaking GEOM there's no loss of functionality. If we want to completely reimplement Vinum, we should do that separately so as to avoid loss of functionality during structural changes.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: diskless problems
On Sun, 11 Jan 2004, Danny Braniss wrote:

> while the subject is being revived, there are some changes/additions I made to libstand/bootp.c, it exports all the dhcp tags so that they are available to rc.diskless? or rc.d/initdiskless via kenv
>
> check out ftp://ftp.cs.huji.ac.il/users/danny/freebsd/diskless-boot/
>
> these are a bit dated, but the up-to-date stuff is actively being used here, so if there is some interest I could update it.

Sounds very interesting indeed. Could you:

(1) Update it to bootp.c:1.5; this just removed 'register', and it looks like you've already done that.

(2) Restore the original file style -- right now, it's a very hard to read diff because you use different tabbing, function prototypes, etc, so it's hard to isolate and read the changes. There's probably some room for style convergence (new function prototypes), but we have a long-standing tradition of committing style and functional changes separately so cvs diff is maximally useful between revisions. It looks like the main difference is that you use four space tabs, and the original file uses real tab characters.

Then if you could file a PR and drop me the PR number by e-mail, that would be great. I can do the style stuff if necessary, but I figured since you're much more familiar with the changes, it might not be a bad idea if you did it :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
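As background for what "exporting the dhcp tags" involves, here is a sketch of walking a DHCP options area in the standard tag/length/value layout (tag 0 is padding, tag 255 ends the list). It is not Danny's patch and makes no claim about how libstand or kenv name the results; `dhcp_find_option` is a hypothetical helper written for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define DHCP_OPT_PAD	0
#define DHCP_OPT_END	255

/*
 * Walk a DHCP options area (the bytes after the 4-byte magic cookie) and
 * return a pointer to the data of option 'code', storing its length in
 * *optlen.  Returns NULL if the option is absent or the area is malformed.
 */
const uint8_t *
dhcp_find_option(const uint8_t *opts, size_t len, uint8_t code,
    size_t *optlen)
{
	size_t i = 0;

	while (i < len) {
		uint8_t tag = opts[i++];
		size_t n;

		if (tag == DHCP_OPT_PAD)
			continue;		/* single-byte filler */
		if (tag == DHCP_OPT_END)
			break;
		if (i >= len)
			return (NULL);		/* truncated: no length byte */
		n = opts[i++];
		if (n > len - i)
			return (NULL);		/* data runs past the end */
		if (tag == code) {
			*optlen = n;
			return (opts + i);
		}
		i += n;
	}
	return (NULL);
}
```

A loader that wanted to publish every tag would run the same loop without the `tag == code` test and hand each (tag, data) pair to whatever environment mechanism it uses.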
Re: MD(4) cleanups and unload lesson.
On Sun, 11 Jan 2004, Pawel Jakub Dawidek wrote:

> With attached patch unloading md(4) module is possible. It also cleans up big part of code according to style(9).

Could you separate this into a functional diff and a style diff? There's a general preference to not combine them, as it means cvs diff between revisions isn't useful for identifying functional changes (i.e., reviewing for bugs when back-tracking, etc).

Thanks,

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: diskless problems
On Sat, 10 Jan 2004, Dag-Erling Smørgrav wrote:

> I'm trying to set up a VIA C3-based mini-ITX box for diskless boot using isc-dhcpd 3.0 from ports. The kernel and modules load fine, but isc-dhcpd doesn't seem to answer the kernel's DHCP discover message.
>
> The following is a tcpdump of the traffic the DHCP server sees. I've removed the timestamps for legibility. The DHCP server is on 10.0.0.6, the TFTP and NFS server is on 10.0.0.4, and the client is on 10.0.0.9.
>
>     0.0.0.0.68 > 255.255.255.255.67: xid:0x64c4603d secs:4 flags:0x8000 [|bootp]
>     0.0.0.0.68 > 255.255.255.255.67: xid:0x64c4603d secs:4 flags:0x8000 [|bootp]
>     arp who-has 10.0.0.4 tell 10.0.0.9
>     10.0.0.9.68 > 255.255.255.255.67: xid:0x3d60c464 file [|bootp]
>     10.0.0.6.67 > 10.0.0.9.68: xid:0x3d60c464 Y:10.0.0.9 S:10.0.0.4 file [|bootp] [tos 0x10]
>     10.0.0.9.68 > 255.255.255.255.67: xid:0x3d60c464 file [|bootp]
>     10.0.0.6.67 > 10.0.0.9.68: xid:0x3d60c464 Y:10.0.0.9 S:10.0.0.4 [|bootp] [tos 0x10]

Can you send tcpdump -e output?

> at this point the kernel boots and prints
>
>     Sending DHCP Discover packet from interface vr0 (00:40:63:c4:60:3d)

What kernel configuration are you using? Are there multiple ethernet devices in the system?

Normally if you're using pxeboot for diskless booting, there's no need for the kernel or userspace to use DHCP: they inherit the DHCP settings provided by the pxeboot loader using the kernel environment. When using PXE, there's no need for any special kernel options, etc, you should just be able to use GENERIC.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: diskless problems
On Sat, 10 Jan 2004, Dag-Erling Smørgrav wrote:

> Robert Watson [EMAIL PROTECTED] writes:
> > Can you send tcpdump -e output?
>
>     22:18:14.884745 0:40:63:c4:60:3d ff:ff:ff:ff:ff:ff 0800 590: 0.0.0.0.68 > 255.255.255.255.67: xid:0x64c4603d secs:4 flags:0x8000 [|bootp]
>     22:18:16.911162 0:40:63:c4:60:3d ff:ff:ff:ff:ff:ff 0800 590: 0.0.0.0.68 > 255.255.255.255.67: xid:0x64c4603d secs:4 flags:0x8000 [|bootp]
>     22:18:16.919251 0:40:63:c4:60:3d ff:ff:ff:ff:ff:ff 0806 60: arp who-has 10.0.0.4 tell 10.0.0.9
>     22:18:17.134219 0:40:63:c4:60:3d ff:ff:ff:ff:ff:ff 0800 590: 10.0.0.9.68 > 255.255.255.255.67: xid:0x3d60c464 file [|bootp]
>     22:18:17.135119 8:0:2b:86:88:55 0:40:63:c4:60:3d 0800 348: 10.0.0.6.67 > 10.0.0.9.68: xid:0x3d60c464 Y:10.0.0.9 S:10.0.0.4 file [|bootp] [tos 0x10]
>     22:18:17.135621 0:40:63:c4:60:3d ff:ff:ff:ff:ff:ff 0800 590: 10.0.0.9.68 > 255.255.255.255.67: xid:0x3d60c464 file [|bootp]
>     22:18:17.136477 8:0:2b:86:88:55 0:40:63:c4:60:3d 0800 348: 10.0.0.6.67 > 10.0.0.9.68: xid:0x3d60c464 Y:10.0.0.9 S:10.0.0.4 [|bootp] [tos 0x10]
>     22:18:38.239936 0:40:63:c4:60:3d ff:ff:ff:ff:ff:ff 0800 1502: 0.0.0.0.68 > 255.255.255.255.67: xid:0x0001 flags:0x8000 [|bootp] [ttl 1]
>
> > What kernel configuration are you using? Are there multiple ethernet devices in the system?
>
> I followed the advice from the diskless(8) man page. There's only one interface, and tcpdump clearly shows that the DHCP server receives a request but does not answer.

I was a bit surprised to see 'vr0', since PXE is almost always used with fxp drivers.

> > Normally if you're using pxeboot for diskless booting, there's no need for the kernel or userspace to use DHCP: they inherit the DHCP settings provided by the pxeboot loader using the kernel environment. When using PXE, there's no need for any special kernel options, etc, you should just be able to use GENERIC.
>
> I'll try again without the BOOTP options...

Yeah. Our PXE booting support isn't really the same as the traditional diskless booting environment. If we don't have a PXE manpage, we probably should have one, since it's actually pretty easy to use. I use PXE booting extensively in my test environment, and it makes life much, much easier. I'm sure we have some worked examples posted around, but if not, I can post the details of my configuration.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: diskless problems
On Sat, 10 Jan 2004, Dag-Erling Smørgrav wrote:

> Robert Watson [EMAIL PROTECTED] writes:
> > On Sat, 10 Jan 2004, Dag-Erling Smørgrav wrote:
> > > I'll try again without the BOOTP options...
> > Yeah. Our PXE booting support isn't really the same as the traditional diskless booting environment.
>
> It works fine without the BOOTP options...

Yeah, makes sense, although I sort of feel as though it should have worked either way. I've just committed some changes to the diskless(8) man page to indicate those options aren't needed with PXE.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: Future of RAIDFrame
On Sat, 10 Jan 2004, Scott Long wrote:

> I started RAIDframe three years ago with the hope of bringing a proven and extensible RAID stack to FreeBSD. Unfortunately, while it was made to work pretty well on 4.x, it has never been viable on 5.x; it never survived the introduction of GEOM and removal of the old disk layer. I'm coming to the conclusion that I really don't have the time to work on it in my spare time. Also, I've seen next to zero interest in it from others, except for the occasional reminder that it doesn't work.
>
> I still believe in it, and I still believe that it can be integrated into GEOM and become the all-singing-all-dancing raid engine for the OS. It will probably never be an LVM stack, but I've also always believed that LVM and RAID are related but separate layers. It can certainly build upon whatever LVM layer appears in GEOM. All it needs is one or two other people to share some of the work and testing with me.
>
> I have a Work-In-Progress for converting and integrating it into GEOM on my home Perforce server. It hasn't been touched in several months and I really don't see myself being able to finish it alone in the near future. Since it's been hanging over my head for so long, I'm very, very close to just removing it and moving on. If anyone has the interest AND time available to help out with keeping it, please let me know ASAP.

While I recognize the reality of time constraints and developers, I think it might not be a bad idea (regardless of the outcome here) to import the RAIDFrame bits into the FreeBSD Perforce server, so that it's available for reference should anyone pick this up now (or in the future).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: Where is FreeBSD going?
On Wed, 7 Jan 2004, Roman Neuhauser wrote:

> [1] has core@ considered subversion (devel/subversion)?

Everyone has their eyes wide open looking for a revision control alternative, but last time it was discussed in detail (a few months ago?) it seemed there still wasn't a viable alternative. On the src tree side, FreeBSD committers are making extensive use of a Perforce repository (which supports lightweight branching, etc, etc), but there's a strong desire to maintain the base system on a purely open source revision control system, and migrating your data is no lightweight proposition. Likewise, you really want to trust your data only to tried and true solutions, I think -- we want to build an OS, not a version control system, if at all possible :-). Subversion seems to be the current favorite to keep an eye on, but the public release seemed not to have realized the promise of the design (i.e., no three-way merges, etc).

You can peruse the FreeBSD Perforce repository via the web using http://perforce.FreeBSD.org/ -- it contains a lot of personal and small project sandboxes that might be of interest. For example, we do all the primary TrustedBSD development in Perforce before merging it to the main CVS repository.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
RE: switching between groups
On Wed, 7 Jan 2004, Adil Katchi wrote:

> Unfortunately, newgrp(1) would not work, because it calls setgroups, which for some weird reason, needs the caller to be a superuser. Isn't there a function that sets the groups (like setgroups) of the current process where you don't have to be a superuser? To maintain security, that function could just check that the groups being set by setgroups are a subset of the caller's set. Does a function like that already exist? If not, how come?

Groups are sometimes used for negative access control rights: i.e., permissions are set on a file so that users who should not be able to read the file are in a group, and the group rights are less than the 'other' rights. If users can drop arbitrary groups, they can leave the group excluding the rights. This problem is more or less pronounced with ACLs, depending on who you speak to: using negative rights is often a workaround for not having ACLs, but with ACLs, you can add more than one group to a file, and don't have to be a member of the group to add it...

It does strike me that newgrp(1) seems less than useful without the setuid bit...

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research

> Thanks,
> Adil
>
> -----Original Message-----
> From: Bruce M Simpson [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, January 06, 2004 1:12 PM
> To: Adil Katchi
> Cc: '[EMAIL PROTECTED]'
> Subject: Re: switching between groups
>
> On Tue, Jan 06, 2004 at 11:14:06AM -0500, Adil Katchi wrote:
> > I was just wondering if anyone has any ideas how it's possible for a user that belongs to multiple groups to somehow limit his or her own capabilities by using only one of the n groups that they belong to and be able to switch between these groups? For example, if userA belongs to groupA, groupB and groupC, can userA enter a mode that would force it to only belong to groupA (or groupB, or groupC)? UserA would be able to switch between these groups and back to normal (ie. belong to all groups).
>
> newgrp(1) could be hacked to do this fairly easily. Currently it preserves supplemental group memberships. An option to discard supplementals could be added. Or just call setgroups() with a no-op group-list vector and then setgid()/setegid() from within your application.
>
> BMS
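To make the subset proposal in this thread concrete, here is a minimal sketch of the check Adil describes. groups_are_subset() is a hypothetical helper written for illustration only; it is not an existing libc or kernel interface. It also does nothing about the negative-access-control objection raised in the reply: a subset-only policy would still let a user shed a group that exists to deny access.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not a real libc or kernel function): return true
 * when every gid in want[] also appears in have[], i.e. the requested
 * group list is a subset of the caller's current group list.
 */
bool
groups_are_subset(const gid_t *want, size_t nwant,
    const gid_t *have, size_t nhave)
{
	for (size_t i = 0; i < nwant; i++) {
		bool found = false;

		for (size_t j = 0; j < nhave; j++) {
			if (want[i] == have[j]) {
				found = true;
				break;
			}
		}
		if (!found)
			return (false);
	}
	/* An empty request is trivially a subset. */
	return (true);
}
```

A setgroups()-like call for unprivileged callers could run a check of this shape before accepting the new list; whether accepting it is safe is exactly the policy question discussed above.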
Re: Where is FreeBSD going?
On Tue, 6 Jan 2004, Paul Robinson wrote:

> And therein lies a problem. The only thing any of the committers cares about is what they think. Got a problem? Submit a patch. Don't like the way things are done? Submit a patch. Don't like how such-and-such a util works? Submit a patch.

While it's clearly the case that many people have met with the "submit a patch" response, that's probably more a property of time constraints from developers than a lack of desire to work with users to produce a system users want. Many FreeBSD developers find FreeBSD of particular appeal because it gives them a chance to produce a system they've always wanted to use: one that addresses the frustrations of many other systems out there. For example, a fair number of FreeBSD developers have their time funded by Internet Service Providers who appreciate the scalability, performance, and manageability of FreeBSD when deployed on tens of thousands of machines. They bring changes to FreeBSD regularly reflecting those needs.

Many FreeBSD developers do hang out in the public IRC channels and try to answer questions, hang out on questions@, stable@, etc. Sometimes, you post a question and get the answer "That doesn't work yet, but we're looking for a few good developers...", but frequently, you also get a patch and "If you could try this and see if it helps with your problem..." Obviously, the harder the question you ask, the more likely you'll get "We're looking for a few good developers..." :-).

The marketing department of Microsoft may be able to keep their less user-friendly developers from talking to their users, but many people would argue that one of the greatest benefits of open source is increasing that communication, even if it means the unwashed developers talk to real people once in a while. A great many developers pick FreeBSD to work on because they're quite aware of what users of other systems have to deal with, and want to produce a system people can use. But no one is paying the bills for hand-holding, so unless people step up to do the hand holding (thanks greatly to those who do!) it's not going to happen. We'd appreciate your help in making it happen, if that's something that strikes you as done wrong or poorly. As with any commercial software development enterprise, we also have limited resources, but unlike a commercial software development enterprise, we can help involve a much larger community in building and supporting a product.

> Personally, unless the madness around SMP, the 5- branch and various other bits are ironed out, I can see my next server deployment making use of DragonFly. At least they listen to people who don't submit patches due to the limitations of time/skill/whatever. No, I'm not a Matt fan - I like and respect most on -core and others. I just think 5- has got... well, it's all a bit out of hand really, isn't it?

The reality is that operating system development takes a lot of time, energy, and expertise. We can't pull a next generation operating system out of hats overnight -- it takes literally hundreds of man years of work to do. It's not something one, three, or even ten people can do alone. FreeBSD 5.x remains a work in progress, but has made a lot of progress in the right direction. I think what you think of as madness is a necessary step on the path of a major engineering project. I can't think of any major project I've seen where at some point, people haven't taken a pause for a breather saying "Oh my god -- what have we gotten ourselves into." On the other hand, I think referring to it as madness dismisses years of hard work by a great many competent and dedicated developers. A year ago, M:N threading was extremely far from productionability -- today, it's on the cusp of being there, with higher performance and increasingly high reliability. It's almost ready for 5-STABLE.
There's substantial on-going work on SMP, with a huge investment of time and energy into the network stack, VM system, VFS, process support, scheduling, etc. These are areas where the primary feedback today is going to be stability and performance, and believe me, we're listening. All the FreeBSD developers I correspond with regularly run FreeBSD 5 on their desktops, on their servers, in their appliances, etc, to make sure we keep shaking out problems. Many companies have production products based on 5.x, and their feedback (and contributions) have been valuable. We've also invested substantial efforts in areas like compiler toolchains, standards compliance, not to mention new features. 5.x is, at long last, starting to land; it will take about one more minor version number to get there, we believe, but it is in dramatically better shape than it was a year or two ago. As I said above: writing operating systems isn't a small task. Companies invest tens (hundreds) of millions of dollars writing and maintaining operating systems, and (net across developers, if you actually bill for the volunteer
Re: pciconf -lv - /dev/pci error
On Wed, 31 Dec 2003, William Michael Grim wrote:

> I have 5.1-RELEASE installed on my system, and I've never needed to do a pciconf -lv to probe the system before. However, I tried doing it earlier today after logging in through SSH and doing su - to become superuser. I received this error:
>
> [EMAIL PROTECTED] 09:12:42 root]# pciconf -lv
> pciconf: /dev/pci: Operation not permitted
> [EMAIL PROTECTED] 09:15:41 root]# ls -l /dev/pci
> crw-r--r-- 1 root wheel 251, 0 Nov 2 05:09 /dev/pci
>
> So, as you can see, the permissions are correct. Perhaps I don't have something compiled into my kernel? I can attach a dmesg and kernel config if it's necessary.

pciconf -lv appears to cause pciconf to open /dev/pci writable:

  731 pciconf CALL open(0x8049a55,0x2,0)
  731 pciconf NAMI /dev/pci
  731 pciconf RET open -1 errno 13 Permission denied

And, of course, it's not writable by non-root. The attached patch causes pciconf to open /dev/pci read-only when listing devices (apply to usr.sbin/pciconf/pciconf.c):

Index: pciconf.c
===
RCS file: /home/ncvs/src/usr.sbin/pciconf/pciconf.c,v
retrieving revision 1.19
diff -u -r1.19 pciconf.c
--- pciconf.c 20 Jun 2003 23:59:25 - 1.19
+++ pciconf.c 31 Dec 2003 17:58:45 -
@@ -165,7 +165,7 @@
 	if (verbose)
 		load_vendors();
 
-	fd = open(_PATH_DEVPCI, O_RDWR, 0);
+	fd = open(_PATH_DEVPCI, O_RDONLY, 0);
 	if (fd < 0)
 		err(1, "%s", _PATH_DEVPCI);

The pci_user.c code in the kernel requires that the caller hold a writable file descriptor for most of the ioctls; the exception is PCIOCGETCONF, which is the only ioctl pciconf's list_devs() uses. We can probably just go ahead and commit this patch, I think. The reason a check was added to the kernel pci ioctl code is that unaligned writes to /dev/pci can cause faults, I believe...

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
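The idea behind the patch, asking only for the access the operation needs, can be sketched in isolation. pci_open_flags() below is a hypothetical helper for illustration; it does not exist in pciconf.c, where the patch simply changes the flag passed to open() in the listing path.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of pciconf.c): pick open(2) flags for
 * /dev/pci. Listing devices only needs a readable descriptor, while
 * operations that touch configuration registers need a writable one.
 */
int
pci_open_flags(bool list_only)
{
	return (list_only ? O_RDONLY : O_RDWR);
}
```

A caller would pass the result as the second argument to open(); keeping the decision in one place makes it harder for a read-only code path to request write access by accident.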
Re: pciconf -lv - /dev/pci error
On Wed, 31 Dec 2003, John Baldwin wrote:

> History is in PR 32677. I do think your patch might be ok if it only applies to the -l case. If so, then it should probably be committed and MFC'd (along with the kernel pci_user.c change) so the PR can be closed.

Well, this patch changes only the user code for pciconf, which doesn't run with privilege, not the kernel code implementing the protections. pciconf appears only to require the PCIOCGETCONF ioctl to implement -l[v], and all this patch does is make it so pciconf asks for a read-only file descriptor for -l[v]. This patch doesn't fix pciconf with securelevels, since we still prevent acquiring an open file descriptor when the securelevel is greater than 0. I think a better answer would be to expose the PCI stuff using a sysctl mib rather than an ioctl, since file descriptors to /dev/pci are multi-purpose, and imply the ability to read/write the register space, etc.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: patch: portable dirhash
On Wed, 17 Dec 2003, Alexander Kabaev wrote:

> On Tue, 16 Dec 2003 22:12:08 -0500 (EST) Ted Unangst [EMAIL PROTECTED] wrote:
> > can somebody please review/commit this to freebsd? it is most of the differences to permit openbsd to use the code. it should not change the code in any functional way.
>
> I do not think there is any point in this code ever hitting FreeBSD CVS repository. Rather, OpenBSD should just take cleaned-out copy of this code and be done with it.

Well, it's true the #ifdef OpenBSD's probably don't help the readability of our code, but abstracting a step by using macros to wrap specific locking primitives is a widely used approach in the FreeBSD tree, especially where it's not clear a final locking strategy has been developed due to a lack of profiling. For example, in both the network code and process management code, we wrap mutexes/sxlocks in macros to avoid committing to either, and to make changing the strategy easier. I wouldn't object to our adopting the macro wrapping, which would have the side effect of helping the OpenBSD patch size a lot also, even leaving out the #ifdef's.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: patch: portable dirhash
On Wed, 17 Dec 2003, Robert Watson wrote:

> On Wed, 17 Dec 2003, Alexander Kabaev wrote:
> > On Tue, 16 Dec 2003 22:12:08 -0500 (EST) Ted Unangst [EMAIL PROTECTED] wrote:
> > > can somebody please review/commit this to freebsd? it is most of the differences to permit openbsd to use the code. it should not change the code in any functional way.
> >
> > I do not think there is any point in this code ever hitting FreeBSD CVS repository. Rather, OpenBSD should just take cleaned-out copy of this code and be done with it.
>
> Well, it's true the #ifdef OpenBSD's probably don't help the readability of our code, but abstracting a step by using macros to wrap specific locking primitives is a widely used approach in the FreeBSD tree, especially where it's not clear a final locking strategy has been developed due to a lack of profiling. For example, in both the network code and process management code, we wrap mutexes/sxlocks in macros to avoid committing to either, and to make changing the strategy easier. I wouldn't object to our adopting the macro wrapping, which would have the side effect of helping the OpenBSD patch size a lot also, even leaving out the #ifdef's.

That said, LOCK() is a terrible name for a macro. :-) If anything, it should be DIRHASH_LOCK() or the like.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
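As a rough illustration of the macro-wrapping approach and the naming point, here is a userspace sketch. Everything in it is a stand-in chosen for this example: struct toy_lock replaces a real kernel mutex or sx lock, and struct dirhash_demo, DIRHASH_LOCK() and DIRHASH_UNLOCK() are hypothetical names, not the actual dirhash code or the macros from Ted's patch.

```c
#include <assert.h>

/* Stand-in lock type; a real kernel would use a mutex or sx lock here. */
struct toy_lock {
	int held;
};

static inline void
toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* catch double acquisition in this toy */
	l->held = 1;
}

static inline void
toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Hypothetical structure protected by the wrapped lock. */
struct dirhash_demo {
	struct toy_lock	dh_lock;
	int		dh_refcount;
};

/*
 * Subsystem-prefixed wrappers: callers never name the primitive, so the
 * locking strategy can change (or differ per OS) in one place.
 */
#define	DIRHASH_LOCK(dh)	toy_lock_acquire(&(dh)->dh_lock)
#define	DIRHASH_UNLOCK(dh)	toy_lock_release(&(dh)->dh_lock)

/* Example consumer: bump a reference count under the lock. */
int
dirhash_demo_ref(struct dirhash_demo *dh)
{
	int n;

	DIRHASH_LOCK(dh);
	n = ++dh->dh_refcount;
	DIRHASH_UNLOCK(dh);
	return (n);
}
```

Swapping the primitive then means editing the two macro definitions, and the prefixed names avoid colliding with any other LOCK() in the tree.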
Re: general load balancing issues
On Mon, 15 Dec 2003, Matthew Seaman wrote:

> On Mon, Dec 15, 2003 at 12:46:52PM +0100, Bogdan TARU wrote:
> > Right now I am considering a setup with one common NFS repository for the configuration files, Apache binaries, Web content and temp directory for PHP, NFS resource which will be mounted on all the 'front' webservers. I am wondering, though, if I will be able (by having one common temp directory for PHP) to load-balance the domains involving sessions: will the sessions be lost when consecutive hits go to different webservers, or not?
>
> The canonical answer to this is to store the session data in the back-end database, so that it's accessible to all of your servers. See the PHP docs for session_set_save_handler(). There's an example of how to do this in the O'Reilly Platypus book Web Database Applications with PHP and MySQL, or contact me off list and I can send you some sample code. Probably a good idea to take this off-list anyhow, as it's not really [EMAIL PROTECTED] material.

Another approach I've seen is to avoid the use of state as much as possible, but when the user starts accessing a stateful service, to redirect them from the load balancer to one of the back end servers directly. This assumes that the majority of content generating load is static, of course (which may well not be the case because dynamic content generates much more load than static content in many installations).

Another approach is, if there is little state being used, to store the state in the client via URL lines or cookies. This can be especially effective if you use a keyed hash with expiry as part of the cookie or URL data so that you can trust the state.

When setting up load balancing with state, one of the hardest things is making sure the solution isn't slower than the original, and the details of the local installation are often relevant. If there are frequent state queries, going to a backend database can make things slower. If they're infrequent, and enough of the work can happen on the web server, it can make things a lot faster (and it's much easier to manage than many other solutions, since it just works).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: howto upgrade 4.8 to 4.9 without cdrom or floppy?
On Fri, 12 Dec 2003, paul van den bergen wrote:

> on freebsd-hackers, Alfred Perlstein posted a method that allows boot-disk-less installation... but it requires mdconfig, a 5.1 utility... is there a method to do this under 4.8? it seems to me that the job performed by md0 could be done with vn0 e.g.

If you're willing to build from the source tree, the buildworld/buildkernel/installkernel/reboot/installworld/mergemaster route is actually quite reliable. I just upgraded a 4.6 box to 4.9 a couple of days ago, remotely with no serial console, cdrom, or floppy, without a hitch.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research

> do
> # ls /dev/vn*
> if empty do
> # cd /dev
> # ./MAKEDEV vn0
> # ./MAKEDEV vn1
> # vnconfig vn0 /path/to/freebsd/4.9.iso
> # mount_cd9660 /dev/vn0c /path/to/freebsd4.9
> or however you access the freebsd install iso disk
> the point being to get access to the /floppies/boot.flp image on the cdrom
> # vnconfig vn1 /path/to/freebsd4.9/floppies/boot.flp
> # mkdir /bootfloppy
> # mount_mfs /dev/vn1c /bootfloppy/
> # cp /bootfloppy/kernel.gz /ikernel.gz
> # cp /bootfloppy/mfsroot.gz /mfsroot.gz
> then reboot as described...
>
> I am about to try this out... wish me luck!
>
> On Mon, 1 Dec 2003 07:18 pm, Alfred Perlstein wrote:
> > I have a mini-HOWTO here that could possibly be automated. Basically we're going to install FreeBSD over FreeBSD without a floppy, cdrom or pxe. This depends on a loader that's compatible with your kernel so if really weird lockups happen, you might not be compatible. Anyhow, here we go:
> >
> > Download the boot.flp from the release you want to install. Mount it like so:
> > mdconfig -a -t vnode -f boot.flp # should output something like 'md0'
> > mkdir -p /mnt
> > mount /dev/md0 /mnt
> >
> > Copy the yummy bits from the install image to your root:
> > cp /mnt/kernel.gz /ikernel.gz
> > cp /mnt/mfsroot.gz /mfsroot.gz
> >
> > Now reboot and interrupt the loader when it counts down the boot. Then type these commands into the loader:
> > unload kernel
> > load /ikernel
> > load -t mfs_root /mfsroot
> > set vfs.root.mountfrom boot
> >
> > Now cross your fingers once you wipe the partitions out to reinstall...
> >
> > It would be cool if this could be automated[1], perhaps by setting the boot partition to the swap partition and setting it up temporarily as a ufs filesystem and then... oh... well...
> >
> > [1] http://www.jerkcity.com/jerkcity1426.html
>
> --
> Dr Paul van den Bergen
> Centre for Advanced Internet Architectures
> caia.swin.edu.au
> [EMAIL PROTECTED]
> IM:bulwynkl2002
> And some run up hill and down dale, knapping the chucky stones to pieces wi' hammers, like so many road makers run daft. They say it is to see how the world was made. Sir Walter Scott, St. Ronan's Well 1824
Re: adding more ram
On Wed, 10 Dec 2003, Dan Nelson wrote:

> In the last episode (Dec 10), [EMAIL PROTECTED] said:
> > I have a server with 1GB of RAM and a swap partition of 2GB i will upgrade the memory server to 2GB so my questions are: should i fix the swap partition to have now 4GB of space ?
>
> Depends. Have you ever used up that 2gb of swap? If not, you'll probably never consume 4gb either :) If this is a database server, or something similar where a few processes allocate large amounts of memory, you don't need much swap anyway, since if any of those processes actually has to swap, you end up thrashing the system as it tries to swap 500mb processes in and out of memory. I really can't think of a system that would still perform well with 2 or 3GB of process space in swap. At the 2gb RAM point, you usually have a system where any swapping == bad news.

Actually, the thing I use swap for most now is to make sure I can allocate large temporary file systems without consuming excessive kernel address space. I.e., I'll often create a 512mb swap-backed md device for /tmp, and make sure I have enough swap to fully back it and everything else, even though the chances are I won't touch it in normal operation. I just don't want to run out in the event something does need it...

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: Sharing data between user space and kernel
On Sun, 7 Dec 2003, Anand Subramanian wrote:

> A look at the copyin() code in the kernel reveals that all the kernel needs to do to access the data (address space) of a user process is

Note that the copyin/copyout implementation is machine-dependent (MD) and so while this is true on i386, it may not be true on other systems. The fuword/suword/copyin/copyout/uio code is intentionally designed to avoid the assumption that userspace pointers are directly dereferenceable by kernel code. One important example of a situation where this difference has to be maintained is implementing 32-bit emulation on 64-bit platforms. On amd64, you can't just dereference a 32-bit pointer when the kernel is running in 64-bit mode.

> 1. Get the current thread, which I saw is done using the PCPU_GET macro. So I suppose this is always preserved upon a system call.
> 2. Set the segment register for the user process correctly.
>
> And magically, all the user process's data can now be accessed by the kernel directly. Is that correct? In the event of which, it would become really easy for a user process to allocate a chunk of memory and all a kernel module needs to do to implement shared memory is do the steps 1 and 2 and access the data. Of course there is the question that the user process is swapped out after the system call and some other thread starts running in between in which case curthread should point to some other thread and not the one that issued the system call. But then, isn't this what happens upon every system call normally, when the kernel does the steps 1 and 2 to obtain the data arguments which are passed to the system call. So this is hardly a problem. So, can shared memory be implemented this way instead of the more traditional pseudo-device way? Appreciate any comments on this (please do a CC to my email address, in case you choose to respond).

An additional issue is that user pages are pageable to disk, so may not be in memory. If you're holding any mutexes/etc in kernel when you touch one of those pages, the page fault has to be processed, and you risk (a) holding the locks for a long time, and (b) lock order problems. This is one reason why copyin()/copyout() have to be used very carefully, and this would apply also to any code replicating that functionality. If you take a look at the sysctl() code, you'll see that it wires userspace pages into memory to avoid the risk of sleeping.

What you probably want to do is actually allocate wired kernel pages and export them to userspace. Take a look at the GEOM gstat(8) implementation, which does exactly that. However, you have to make sure that if you ever decide to reuse that kernel memory for something else (i.e., free it back to the allocator), you've GC'd all userspace references to it.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: Reward for fixing keyboard support in FreeBSD, apply within
On Sun, 7 Dec 2003, Mathew Kanner wrote:

> On Dec 07, Mathew Kanner wrote:
> > The way I see it, FreeBSD needs serious hacking to have multiple concurrent keyboards support without serious hacking.
>
> ugh, you know what I mean.

With mouse support, we have a layer of indirection with moused that combines input from various mouse devices into a single event stream via /dev/sysmouse. While I don't think we want a keyboard daemon at this point, we might well need to add a similar abstraction in the kernel so that different keyboard sources can be combined. Just plugging in a USB mouse and having it just work is extremely beneficial, and I agree we need to be able to support the same with keyboards.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: IPFW and the IP stack
On Thu, 4 Dec 2003, Devon H.O'Dell wrote:

> This is obviously the most logical explanation. There's a good bit of questioning for PFIL_HOOKS to be enabled in generic to allow ipf to be loaded as a module as well. If this is the case, we'll have two firewalls that have their hooks compiled in by default allowing for them both to be loaded as modules. (Is this still scheduled for 5.2?) But at this point, there's no way to allow one to turn the IPFW hooks *off*. Is there a reason for this? Would it be beneficial (or possible) to hook ipfw into pfil(9)? This way, we could allow the modules to be loaded by default for both and also allow for the total absence of both in the kernel. Sorry if I've missed discussions on this and am being redundant.

Sam Leffler has done a substantial amount of work to push all of the various hacks (features?) behind PFIL_HOOKS, and I anticipate we'll ship PFIL_HOOKS enabled in GENERIC in 5.3 and use it to plug in most of these services. This also means packages like IPFilter and PF will work out of the box without a kernel recompile, not to mention offering substantial architectural cleanup.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
Re: ifconfig(8) refactoring -- YACC grammar now online
On Sun, 30 Nov 2003, Bruce M Simpson wrote:

> On Sun, Nov 30, 2003 at 01:12:42PM +0100, Andre Oppermann wrote:
> > What I've been thinking about a lot is to make the networking system and ifconfig sort of class-based like newbus and geom.
>
> Look at: http://people.freebsd.org/~bms/dump/nifconfig/nifconfig-design.txt
>
> There is a pending change to if_gre to enable it to be easily classified in this way; ifconfig would simply query the interface for its if_type. This is one way to do it without having to change struct ifnet. We could add a new field, but avoiding changing the ABI is a Good Thing.

if_type seems like it will work for high level classes of interfaces, but something more fine-grained will be required for interfaces that implement multiple classes or subclasses (i.e., 802 generally, and also 802.11b). Or likewise, tap interfaces might implement 802 generally, but also if_tap-specific primitives. Do we need to probe by-name for capabilities using interface ioctls, or return a list of implemented interfaces/classes to allow things to be a bit more multidimensional?

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Senior Research Scientist, McAfee Research
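One possible shape for the "list of implemented classes" idea, sketched as a bitmask so that a single interface can belong to several classes at once. All names here (the IFCLASS_* constants and if_implements()) are hypothetical; none of them are part of struct ifnet, if_type, or any existing ioctl.

```c
#include <stdbool.h>

/* Hypothetical class bits; an interface may set more than one. */
#define	IFCLASS_802	0x01u	/* generic 802-style link layer */
#define	IFCLASS_80211	0x02u	/* 802.11 wireless primitives */
#define	IFCLASS_TAP	0x04u	/* if_tap-specific primitives */
#define	IFCLASS_TUNNEL	0x08u	/* tunnel endpoints, e.g. gif */

/*
 * Hypothetical query: true when 'classes' (what the interface reports)
 * includes every bit in 'wanted' (what the tool needs to configure).
 */
bool
if_implements(unsigned int classes, unsigned int wanted)
{
	return ((classes & wanted) == wanted);
}
```

Under this sketch a tap interface would report IFCLASS_802 | IFCLASS_TAP, and a tool could ask about either class independently, which a single-valued if_type cannot express.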
Re: ifconfig(8) refactoring -- YACC grammar now online
On Sun, 30 Nov 2003, Bruce M Simpson wrote:

On Sun, Nov 30, 2003 at 02:20:50PM -0500, Robert Watson wrote:

if_type seems like it will work for high level classes of interfaces, but something more fine-grained will be required for interfaces that implement multiple classes or subclasses (i.e., 802 generally, and also 802.11b).

The idea just now is we look at if_media if we need to get specific with physical interfaces. tap would seem to be missing from my list, actually; I note it's used to provide VMware support in the absence of Netgraph, amongst other things.

if_tap is actually quite useful, and in the same general class of synthetic interfaces as if_tun. I've used both in building tunneling and topology-manipulation tools, as well as for debugging routing, etc. if_tap simulates an 802 device, and if_tun simulates a point-to-point device. VMware is the only application I know of using if_tap, although I have a fair amount of my own code that uses it. Userland ppp uses if_tun, as do some of the third party crypto tunneling tools.

Or likewise, tap interfaces might implement 802 generally, but also if_tap-specific primitives. Do we need to probe by-name for capabilities using interface ioctls, or return a list of implemented interfaces/classes to allow things to be a bit more multidimensional?

That might work well, actually -- I already added a MIB to rtsock to deal with our lack of reporting multicast group memberships, I see no reason not to add one to enumerate loaded interface classes. OTOH, for the 'could load kld' case, this falls down, until the instance is created, either through cloning or completing ifattach() for a physical interface -- but if CREATE is a separate operation this isn't a problem, it is a problem if we want to say something like this in one go:-

    ifconfig gif0 create tunnel 1.2.3.4 5.6.7.8 10.0.0.1 10.0.0.2

Then you do need a means for the ifconfig instance to ask gif0 if it speaks 'tunnel-ese' once it's loaded. I have to find an abstraction to comfortably deal with this stacking of properties/methods, simple polymorphism (a la Java 'implements interface') springs to mind.

I think that would be a reasonable approach, although it seems to me that both the inheritance and implements models might apply in looking at sets of protocol relationships. A tap interface is a synthetic interface, it implements synthetic interface controls, as well as implementing 802. However, it might be neat to hook up 802.11 to a tap-like interface sometime as well. Question: does 802.11 imply 802? If so, a notion of inheritance might be quite useful for driver implementors.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Senior Research Scientist, McAfee Research
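
To make the "implements" versus "inherits" distinction in the exchange above concrete, here is a minimal userspace sketch. Everything in it is hypothetical: the `if_class` identifiers, the `if_desc` structure, and the helper names do not exist in FreeBSD, and the parent table simply assumes that the answer to "does 802.11 imply 802?" is yes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class identifiers; none of these names exist in FreeBSD. */
enum if_class { IFC_NONE, IFC_SYNTHETIC, IFC_802, IFC_80211, IFC_TAP, IFC_COUNT };

/* Inheritance: each class has at most one parent (IFC_NONE means no parent). */
static const enum if_class ifc_parent[IFC_COUNT] = {
	[IFC_80211] = IFC_802,		/* assumption: 802.11 implies 802 */
	[IFC_TAP] = IFC_SYNTHETIC,	/* tap is a kind of synthetic interface */
};

/* "Implements": an interface lists every class it speaks, IFC_NONE-terminated. */
struct if_desc {
	const char *name;
	const enum if_class *classes;
};

/* Walk the parent chain of class c looking for 'want'. */
static int
ifc_is_a(enum if_class c, enum if_class want)
{
	for (; c != IFC_NONE; c = ifc_parent[c])
		if (c == want)
			return (1);
	return (0);
}

/* Does the interface speak 'want', directly or through inheritance? */
static int
if_speaks(const struct if_desc *ifp, enum if_class want)
{
	const enum if_class *c;

	for (c = ifp->classes; *c != IFC_NONE; c++)
		if (ifc_is_a(*c, want))
			return (1);
	return (0);
}

/* tap0 implements two unrelated classes; wlan0 gets 802 by inheritance. */
static const enum if_class tap_classes[] = { IFC_TAP, IFC_802, IFC_NONE };
static const enum if_class wlan_classes[] = { IFC_80211, IFC_NONE };
static const struct if_desc tap0 = { "tap0", tap_classes };
static const struct if_desc wlan0 = { "wlan0", wlan_classes };
```

With these tables, `if_speaks(&tap0, IFC_802)` and `if_speaks(&wlan0, IFC_802)` both hold, the first through the implements list and the second through the parent chain, which is the two-dimensional lookup the thread is asking about.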
Re: freebsd smp - linux up
On Wed, 26 Nov 2003, Anthony Schneider wrote:

sadly, all ktrace shows is ktrace launching vmware (from 'ktrace vmware', shows sh reading and executing, and then ends with the vmware fork). is there a special way to ktrace linux binaries that i'm not aware of?

ktrace should work fine, but you need to make sure you use the linux_kdump port so that the system call trace is interpreted correctly when converted to text. As DES points out, make sure you have the right flags to the ktrace command so tracing is inherited across fork and exec.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories

-Anthony.

On Tue, Nov 25, 2003 at 07:32:35PM +0100, Dag-Erling Smørgrav wrote:

Anthony Schneider [EMAIL PROTECTED] writes:

is there a way to have linux emulation report that its kernel is running on a UP system even though the freebsd box it's running on is SMP? i would like to get vmware running on my smp -current box, but vmmon_smp.ko is broken, and with vmmon_up.ko loaded i get a message about needing to be running on an smp linux kernel version 2.0 (2.2) or higher, even though linux emulation reports a 2.4 kernel.

It would be interesting to know exactly what it needs that we don't provide. I suspect it's something really trivial... do you see any messages in syslog about unimplemented syscalls? Could you get a ktrace or something?

DES
--
Dag-Erling Smørgrav - [EMAIL PROTECTED]
Re: Help request: problems with a 5.1 server and large numbers of ssh users.
On Wed, 19 Nov 2003, Len Sassaman wrote:

It is my intuition from this behavior that the sshd master process listening for connections is unable to spawn a new process to complete the authentication step, and thus the connection is being dropped. There is no information of use in dmesg, nor in the system logs. (I've cranked up LogLevel to DEBUG3 in sshd_config). I have a RedHat Linux server running the 2.4.18-3smp kernel on a dual Athlon MP 1800+ and 2048MB RAM that is known to handle 1000 users without issue -- so I have to believe the FreeBSD box, though not as beefy hardware-wise, should be able to do better than a few hundred users. I believe this to be some sort of resource limit issue, but I have addressed everything I could think of.

Hmm. Well, it certainly sounds like a resource limit to me, especially if it's a nice round number like 150 or 300. However, I'm also having a bit of trouble seeing, off the top of my head, which limit it might be. It sounds like you've got the ones I would think of. A quick skim of sshd.c suggests that it is pretty careful to document various failure modes in debugging output. There are one or two failures where it does not log, and they include the call to pipe() in the server loop -- if that fails, it bails without an error, which is a little surprising.

Could you post server debug output for the first connection to the server that fails? This would let us see how far it got... In particular, whether it did spawn a child process, etc.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
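
The silent pipe() failure mentioned above is the kind of case where logging errno would point straight at the exhausted limit. The following is only a sketch of that pattern, not sshd's actual code; `pipe_logged` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical wrapper (not taken from sshd): create a pipe and report
 * the reason on failure instead of bailing out silently.
 * Returns 0 on success, -1 on error.
 */
static int
pipe_logged(int fds[2], const char *what)
{
	if (pipe(fds) == -1) {
		/*
		 * EMFILE means the per-process descriptor limit was hit;
		 * ENFILE means the system-wide file table is full.
		 */
		fprintf(stderr, "%s: pipe: %s\n", what, strerror(errno));
		return (-1);
	}
	return (0);
}
```

A server loop that calls something like `pipe_logged(fds, "server_loop")` would at least leave one line in its debug output distinguishing a descriptor limit from a process limit.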
Re: Help request: problems with a 5.1 server and large numbers of ssh users.
On Thu, 20 Nov 2003, Ken Smith wrote:

On Thu, Nov 20, 2003 at 10:56:08AM -0500, Robert Watson wrote:

Hmm. Well, it certainly sounds like a resource limit to me, especially if it's a nice round number like 150 or 300.

One possibility might be running out of pseudo-terminals to support the login sessions. pty's are created as needed I think, and the code that handles it is in sys/kern/tty_pty.c. The limits on it appear to be 256 ptys:

I thought about that, but the submitter indicated that pty's were not being allocated. However, that would be a really good thing to verify, since the numbers come out right... I should really clean up and commit my pty cleanup at some point, as well as support for forkpty()/openpty()/etc that avoid the sort of code found below. Presumably that would be a 5.3 thing.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories

/*
 * This function creates and initializes a pts/ptc pair
 *
 * pts == /dev/tty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
 * ptc == /dev/pty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
 *
 * XXX: define and add mapping of upper minor bits to allow more
 *      than 256 ptys.
 */

I don't know if simply changing the:

    static char *names = "pqrsPQRS";

to something longer is all that would be required or if there are other factors involved.

--
Ken Smith [EMAIL PROTECTED]
- From there to here, from here to there, funny things are everywhere. - Theodore Geisel
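
The 256 figure follows directly from the naming scheme in the quoted comment: 8 bank letters times 32 unit digits. The sketch below reproduces that arithmetic in userspace; the helper name `pty_slave_name` and the macros are made up for illustration and are not the kernel's own code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bank letters and radix-32 unit digits, as listed in the quoted comment. */
static const char pty_banks[] = "pqrsPQRS";
static const char pty_units[] = "0123456789abcdefghijklmnopqrstuv";

#define PTY_PER_BANK	32
#define PTY_MAX		((int)(sizeof(pty_banks) - 1) * PTY_PER_BANK)	/* 8 * 32 */

/*
 * Hypothetical helper: format the slave-side name for pty number n
 * into buf. Returns 0 on success, -1 if n falls outside the name space.
 */
static int
pty_slave_name(int n, char *buf, size_t len)
{
	if (n < 0 || n >= PTY_MAX)
		return (-1);
	snprintf(buf, len, "tty%c%c", pty_banks[n / PTY_PER_BANK],
	    pty_units[n % PTY_PER_BANK]);
	return (0);
}
```

Under this scheme pty 0 is `ttyp0` and pty 255 is `ttySv`; number 256 has no name, so lengthening the bank string adds 32 names per letter, provided nothing else (such as the minor-number mapping the XXX comment mentions) caps the count first.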
Re: 4.9 KLDload error
On Fri, 7 Nov 2003, Jin Guojun [NCS] wrote:

A KLD module ncs_time_ctl.ko compiled on both 4.8 and 4.9 hosts can be loaded by kldload on any 4.8 machine. But neither .ko file can be loaded on a 4.9 machine. The error is:

    4.9 # kldload -v ./ncs_time_ctl.ko
    kldload: can't load ./ncs_time_ctl.ko: Exec format error

kldload should give more error information on what function it failed to load. Is this possibly a 4.9 bug in kldload, or has some KLD mechanism been changed in 4.9-RELEASE? Is there any way to analyze what is wrong in the 4.9 KLD system?

Unfortunately, the UNIX errno mechanism isn't very expressive. However, the kernel linker will send debugging output to the system console. Check dmesg and see if there's more information there. Typically, this error will be the result of a failure to link symbols in the module: either due to a symbol already present, or a missing dependency. To debug this further, look at the console output, and also compare the output of nm on the .ko built on 4.8 and 4.9 to see if its dependencies or exposed symbols have changed.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: sending messages, user process -- kernel module
On Fri, 7 Nov 2003, Jerry Toung wrote:

I am trying to do asynchronous send/receive between a user process that I am writing and a kernel module that I am also writing. I thought about implementing something similar to unix routing socket, but I will have to define a new domain and protosw. Beside that idea, what else would you suggest?

This is actually somewhat of a FAQ, since it comes up with relative frequency. I should dig up my most recent answer and forward that to you, but the quick answers off the top of my head are:

(1) One frequent answer is a pseudo-device -- for example, /dev/log buffers kernel log output for syslogd to pick up asynchronously. Arla and Coda both use pseudo-devices as a channel for local procedure calls to/from userspace to support their respective file systems using userspace cache managers.

(2) Have the kernel open a file system FIFO and have the process read from that FIFO. The client-side NFS locking code uses /var/run/lock to ship locking events to a userspace rpc.lockd. However, responses from rpc.lockd are then delivered to the kernel using a system call synchronously from the user process, as opposed to via a FIFO.

(3) The routing socket approach can work quite well, especially if you need multicast semantics for messages, not to mention well-defined APIs for managing buffer size, etc. Another instance of this approach is PF_KEY, used for IPsec key management. As you point out, it requires digging into other code and a fair amount of implementation overhead.

(4) You can have kernel code create and listen on sockets in existing domains, including UNIX domain sockets and TCP/IP sockets. The NFS client and server code both make use of sockets directly in the kernel for RPCs.

One of the particularly nice benefits of (2) and (4) is that it's easy to implement userspace test code, since the fifo/socket is just used as a rendezvous and doesn't care if the other end is in kernel or not. Likewise, the blocking/buffering/... semantics are quite well defined, which means you won't be tracking down wakeups, select semantics, thread behavior and synchronization, etc, as you might do in (1).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
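
The testing benefit claimed for option (2) can be sketched entirely in userspace: because the FIFO is only a rendezvous, an ordinary child process can stand in for the kernel side. The function names below (`fake_kernel_post`, `daemon_wait_event`) are hypothetical, and this is not how the NFS locking code is written; it only illustrates the shape of such a test harness.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Stand-in for the kernel side: a child process opens the FIFO for
 * writing and posts one event. Returns the child's pid, or -1 if
 * fork() failed.
 */
static pid_t
fake_kernel_post(const char *path, const char *event)
{
	pid_t pid = fork();

	if (pid == 0) {
		int fd = open(path, O_WRONLY);	/* blocks until a reader opens */
		if (fd == -1)
			_exit(1);
		ssize_t n = write(fd, event, strlen(event));
		close(fd);
		_exit(n == (ssize_t)strlen(event) ? 0 : 1);
	}
	return (pid);
}

/*
 * Daemon side: read one event from the FIFO into buf and NUL-terminate
 * it. Returns the number of bytes read, or -1 on error.
 */
static ssize_t
daemon_wait_event(const char *path, char *buf, size_t len)
{
	int fd = open(path, O_RDONLY);	/* blocks until a writer opens */
	ssize_t n;

	if (fd == -1)
		return (-1);
	n = read(fd, buf, len - 1);
	close(fd);
	if (n >= 0)
		buf[n] = '\0';
	return (n);
}
```

A test would create the FIFO with mkfifo(2), call `fake_kernel_post()` to start the writer, and then call `daemon_wait_event()` on the same path; swapping the child for real kernel code later should not require changing the daemon side.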
Re: Update: Debox sendfile modifications
On Wed, 5 Nov 2003, Igor Sysoev wrote:

As to worker kthreads I think it's better to queue aio operation as it was made in src/sys/kern/vfs_aio.c:aio_qphysio().

One of the things that worries me about the proposal to use kernel worker threads to perform the I/O is that this can place a fairly low upper bound on effective parallelism, unless the kernel threads themselves can issue the I/O's asynchronously. In the network stack itself, we are event and queue driven without blocking--if we can maintain the apparent semantics to the application, it would be very nice to be able to handle that at the socket layer itself. I.e., not waste a thread + stack per in-progress operation, and instead have a worker or two that simply propel operations up and down the stack (similar to geom_up and geom_down).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: Update: Debox sendfile modifications
On Wed, 5 Nov 2003, Igor Sysoev wrote:

On Wed, 5 Nov 2003, Robert Watson wrote:

On Wed, 5 Nov 2003, Igor Sysoev wrote:

As to worker kthreads I think it's better to queue aio operation as it was made in src/sys/kern/vfs_aio.c:aio_qphysio().

One of the things that worries me about the proposal to use kernel worker threads to perform the I/O is that this can place a fairly low upper bound on effective parallelism, unless the kernel threads themselves can issue the I/O's asynchronously. In the network stack itself, we are event and queue driven without blocking--if we can maintain the apparent semantics to the application, it would be very nice to be able to handle that at the socket layer itself. I.e., not waste a thread + stack per in-progress operation, and instead have a worker or two that simply propel operations up and down the stack (similar to geom_up and geom_down).

As far as I understand src/sys/kern/vfs_aio.c:aio_qphysio() (that handles AIO on raw disks) does not use kthreads and simply queues operations.

I think it sounds like we're actually agreeing with each other. Currently, AIO does use threads for non-character devices, so in the socket case it will be using a worker thread.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: Experimental FreeBSD and Linux kernel source cross reference web site
On Thu, 30 Oct 2003, Hiten Pandya wrote:

Thank you very very much! ;-) At last, someone got to it. I have been wanting to set up LXR for DragonFly for quite some time now, but did not have enough time on my hands to mess with it. Does it require any sort of patching for it to work on FreeBSD? I recall it requires MySQL and some other stuff..

I'm actually using an older version of the lxr software, 0.3.1, which doesn't make use of a back-end SQL database, rather, some simple db-based data stores and glimpse for searches. It was a lot easier to set up, once I fixed some rather critical bugs :-). I've gone ahead and dropped a snapshot of the DFBSD sys tree on fxr as well, and am currently cvsuping opendarwin source to drop a recent snapshot of xnu. I'm not sure if there are any DFBSD tags worth using other than HEAD, so I just used a timestamp for the checkout. The rearrangement of the DFBSD tree makes diffing between FreeBSD and DFBSD bits a little more difficult, but in many cases it's fairly feasible.

I've been trying to decide how to improve diffability between the FreeBSD and Darwin trees: most FreeBSD bits compare directly with xnu/bsd/..., not xnu/..., and lxr isn't very flexible about how it sets up diff comparisons. I've also noticed that lxr is currently unwilling to index macros as identifiers when they're generated at compile-time, which means (for example) that you have to use freetext searches to find vnode operation macro use. I'm not sure how much more time I'm willing to invest in further refining lxr itself, but I'll keep the source code snapshots up-to-date and bring in new sources of kernel source code as appropriate.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Experimental FreeBSD and Linux kernel source cross reference web site
In the past when browsing the Linux source code, I've made extensive use of the Linux Cross-Reference (LXR) hosted at lxr.linux.no. This web site provides a cross-referenced and searchable HTML interface to the Linux source code; you can perform freetext and identifier searches, check differences between revisions, etc. For FreeBSD, we provide a cvsweb interface that is extremely useful for tracking changes, but a little less useful for raw browsing when you're looking for use of an identifier. In the past, CMU's PDL (and possibly others) have provided FreeBSD cross-reference web pages, but I was unable to find one once that site went down. As such, I've experimentally set up the LXR software with access to several branches of the FreeBSD source code, as well as 2.4 and 2.6 Linux kernels at:

    http://fxr.watson.org/

This is experimental, but I've found it to be quite useful for my own work. I'm intermittently synchronizing the checked out snapshots to CVS. LXR is a useful piece of software, but not designed to handle multiple source code collections so well (i.e., currently isn't a good candidate for all of src). On the other hand, making the source code easier to search and browse is a very useful thing, so feel free to give it a spin :-). I'll probably keep tweaking and playing with the configuration, as well as put more revisions of the Linux source online, probably drop in an OpenBSD, NetBSD, or DFBSD snapshot or two, etc, soon also. I don't promise it will be there tomorrow, but if it proves useful and interesting, it probably will be :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: Experimental FreeBSD and Linux kernel source cross reference web site
FYI, lxr's C parsing code appears to dislike some of our C constructs. I haven't had a chance to dig in much yet, but this is a warning that there are some glitches (for example, kern_prot.c seems to be improperly parsed in RELENG_4). Also, the identifier database seems somewhat prone to corruption if aborted part way through processing; the identifier database for HEAD appears currently to be corrupted so I'm rebuilding it. So if an identifier search fails unexpectedly, or you notice that a C file is not highlighted with cross-reference links for important identifiers, that's probably why: try again in ten minutes.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: FreeBSD mail list etiquette
On Sat, 25 Oct 2003, Matthew Dillon wrote:

It's a lot easier lockup path than the direction 5.x is going, and a whole lot more maintainable IMHO because most of the coding doesn't have to worry about mutexes or LORs or anything like that.

You still have to be pretty careful, though, with relying on implicit synchronization, because while it works well deep in a subsystem, it can break down on subsystem boundaries. One of the challenges I've been bumping into recently when working with Darwin has been the split between their Giant kernel lock, and their network lock. To give a high level summary of the architecture, basically they have two Funnels, which behave similarly to the Giant lock in -STABLE/-CURRENT: when you block, the lock is released, allowing other threads to enter the kernel, and regained when the thread starts to execute again. They then have fine-grained locking for the Mach-derived components, such as memory allocation, VM, et al.

Deep in a particular subsystem -- say, the network stack -- all works fine. The problem is at the boundaries, where structures are shared between multiple compartments. I.e., process credentials are referenced by both halves of the Darwin BSD kernel code, and are insufficiently protected in the current implementation (they have a write lock, but no read lock, so it looks like it should be possible to get stale references with pointers accessed in a read form under two different locks). Similarly, there's the potential for serious problems at the surprisingly frequently occurring boundaries between the network subsystem and remainder of the kernel: file descriptor related code, fifos, BPF, et al. By making use of two large subsystem locks, they do simplify locking inside the subsystem, but it's based on a web of implicit assumptions and boundary synchronization that carries most of the risks of explicit locking.
It's also worth noting that there have been some serious bugs associated with a lack of explicit synchronization in the non-concurrent kernel model used in RELENG_4 (and a host of other early UNIX systems relying on a single kernel lock). These have to do with unexpected blocking deep in a function call stack, where it's not anticipated by a developer writing source code higher in the stack, resulting in race conditions. In the past, there have been a number of exploitable security vulnerabilities due to races opened up in low memory conditions, during paging, etc. One solution I was exploring was using the compiler to help track the potential for functions to block, similar to the const qualifier, combined with blocking/non-blocking assertions evaluated at compile-time. However, some of our current APIs (M_NOWAIT, M_WAITOK, et al) make that approach somewhat difficult to apply, and would have to be revised to use a compiler solution.

These potential weaknesses very much exist in an explicit model, but with explicit locking, we have a clearer notion of how to express assertions. In -CURRENT, we make use of thread-based serialization in a number of places to avoid explicit synchronization costs (such as in GEOM for processing work queues), and we should make more use of this practice. I'm particularly interested in the use of interface interrupt threads performing direct dispatch as a means to maintain interface ordering of packets coming in network interfaces while allowing parallelism in network processing (you'll find this in use in Sam's netperf branch currently).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
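
The message above describes a compile-time check for "may this function block?"; standard C offers no such qualifier, so the sketch below is only a runtime analogue of the same idea. The section-tracking helpers are hypothetical, and the two flag macros merely borrow the names of the kernel's M_NOWAIT and M_WAITOK for readability.

```c
#include <assert.h>

/* Illustrative flags named after the kernel's; defined locally for this sketch. */
#define M_NOWAIT	0x0001	/* caller cannot tolerate sleeping */
#define M_WAITOK	0x0002	/* caller is prepared to sleep */

/* Per-thread count of nested "must not sleep" sections (hypothetical). */
static _Thread_local int nosleep_depth;

/* Mark the start of code that assumes it will not be blocked. */
static void
nosleep_enter(void)
{
	nosleep_depth++;
}

/* Mark the end of such a section; sections must be properly nested. */
static void
nosleep_exit(void)
{
	assert(nosleep_depth > 0);
	nosleep_depth--;
}

/*
 * Returns 1 if a call carrying these flags is acceptable in the current
 * context, 0 if it could block inside a no-sleep section. A function
 * that may sleep would check this on entry.
 */
static int
sleep_check(int flags)
{
	return (!(flags & M_WAITOK) || nosleep_depth == 0);
}
```

The runtime form only catches a violation on a path that is actually executed, which is the gap the compile-time proposal was meant to close; it also shows why flag-driven APIs are awkward for static analysis, since whether a call blocks depends on a value rather than on the function's type.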
Synchronization philosophy (was: Re: FreeBSD mail list etiquette)
(Subject changed to reflect the fact that it contains useful technical content and banter, resulting in a hijacking of the thread; hope no one minds)

On Sat, 25 Oct 2003, Matthew Dillon wrote:

Yes. I'm not worried about BPF, and ucred is easy since it is already 95% of the way there, though messing with ucred's ref count will require a mutex or an atomic bus-locked instruction even in DragonFly! The route table is our big issue. TCP caches routes so we can still BGL the route table and achieve 85% of the scaleable performance so I am not going to worry about the route table initially.

An example with ucred would be to passively queue it to a particular cpu for action. Let's say instead of using an atomic bus-locked instruction to manipulate ucred's ref count, we instead send a passive IPI to the cpu 'owning' the ucred, and that ucred is otherwise read-only. A passive IPI, which I haven't implemented yet, is simply queueing an IPI message but not actually generating an interrupt on the target cpu unless the CPU-CPU software IPI message FIFO is full, so it doesn't actually waste any cpu cycles and multiple operations can be executed in-batch by the target. Passive IPIs can be used for things that do not require instantaneous action and both bumping and releasing ref counts can take advantage of it. I'm not saying that is how we will deal with ucred, but it is a definite option.

Actually, the problem isn't so much the data referenced by ucred, but the references themselves. Part of the issue in Darwin is that ucred references are always gained using the p_ucred pointer in the proc structure. The proc structure is read and dereferenced fairly deep in the network code (network funnel), and also in the remainder of the kernel (kernel funnel). In addition, there's a lock used to serialize writes to p->p_ucred, but not to protect against reads of stale data.
Shared structures, such as these, occur in pretty large quantity in BSD code, and will be a problem no matter what approach to synchronization is taken. Moving towards message passing helps to structure the code to avoid sharing of this sort, although it's not the only way to motivate that sort of change. I'm a big fan of the change in -CURRENT to use td->td_cred as a read-only thread-local credential reference and avoid synchronization on the credential reference--it nicely addresses the requirements for consistency in the referenced data for the read-only cases (which are the vast majority of uses of a credential).

There are a number of cases where moving towards a message passing philosophy would really clean up the synchronization and parallelism issues in FreeBSD: for example, even the relatively simple accounting file rotation would benefit from queue-like operation to serialize the accounting data/event stream and rotation events. Using locks and condition variables to perform serialization as is currently done in the accounting code is unwieldy and bug-prone. However, when moving to event/message queuing, you also have to be very careful with data ownership and referencing, as well as proper error-handling. With accounting, most scheduled vnode operations are asynchronous and have no need for substantial error handling (when a process performs execve(), regardless of whether accounting of that operation succeeds or fails, execve() continues). The start/stop operation, however, is intended to be synchronous. Happily, in the accounting case, all necessary error checking can be performed in advance of the handoff to the accounting thread from the user thread, but that won't always be the case...
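
The thread-local credential reference praised above can be sketched in userspace as follows. This is an illustration of the idea only, not the -CURRENT implementation: the structure and function names are made up, C11 atomics stand in for whatever protects the real reference count, and the process lock that would have to be held around both the sync and the replacement is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for ucred, proc, and thread. */
struct cred {
	atomic_uint ref;	/* number of holders */
	unsigned uid;		/* immutable once the cred is published */
};
struct proc_s { struct cred *p_ucred; };
struct thread_s { struct proc_s *td_proc; struct cred *td_cred; };

/* Allocate a cred with one reference owned by the caller; NULL on failure. */
static struct cred *
cred_alloc(unsigned uid)
{
	struct cred *c = malloc(sizeof(*c));

	if (c != NULL) {
		atomic_init(&c->ref, 1);
		c->uid = uid;
	}
	return (c);
}

static void
cred_hold(struct cred *c)
{
	atomic_fetch_add(&c->ref, 1);
}

/* Drop one reference; the last holder frees the cred. */
static void
cred_free(struct cred *c)
{
	if (atomic_fetch_sub(&c->ref, 1) == 1)
		free(c);
}

/*
 * At a well-defined point (say, system call entry) the thread refreshes
 * its private reference. A real implementation would hold the process
 * lock here so p_ucred cannot change underneath it.
 */
static void
thread_cred_sync(struct thread_s *td)
{
	struct cred *cur = td->td_proc->p_ucred;

	if (td->td_cred == cur)
		return;
	cred_hold(cur);
	if (td->td_cred != NULL)
		cred_free(td->td_cred);
	td->td_cred = cur;
}

/* Replace the process credential; the process takes over the caller's reference. */
static void
proc_setcred(struct proc_s *p, struct cred *newcred)
{
	struct cred *old = p->p_ucred;

	p->p_ucred = newcred;
	cred_free(old);
}
```

Between two calls to `thread_cred_sync()` the thread can read `td->td_cred` with no locking at all: the reference it holds keeps the old credential alive and unchanged even if `proc_setcred()` installs a new one in the meantime, which is the consistency property described for the read-only cases.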
One of the other risks that has worried me about this approach is that explicit locking has some nice benefits from the perspective of deadlocking and lock order management: monitoring for deadlocks and lock orders is a well-understood topic, and the tools for tracking deadlocks and wait conditions, as well as for performing locking and waiting safely, are mature. As with the older BSD sleeping interfaces, such as tsleep(), synchronous waits on messages are harder to mechanically track, and resulting deadlocks resemble resource deadlocks more than lock deadlocks... On the other hand, some forms of tracing may be made easier. I've had some pretty nasty experiences trying to track deadlocks between cooperating threads due to message waits, and found tools such as WITNESS much easier to work with.

In some work we're doing for one of our customers, we make extensive use of handoff between various submitting threads and a serializing kernel thread making use of thread-local storage to avoid explicit synchronization. Having dealt both with lower level lock/cv primitives for event passing, and message passing, I have to say I'm leaning far more towards the message passing. However, it benefits particularly from the approach due to its
Re: Is socket buffer locking as questionable as it seems?
On Sat, 4 Oct 2003, Brian Fundakowski Feldman wrote:

I keep getting these panics on my SMP box (no backtrace or DDB or crash dump of course, because panic() == hang to FreeBSD these days):

    panic: receive: m == 0 so->so_rcv.sb_cc == 52

From what I can tell, all sorts of socket-related calls are MP-safe and yet never even come close to locking the socket buffer. From what I can tell, the easiest way for this to occur would be sbrelease() being called from somewhere that it's supposed to, but doesn't, have sblock(). Has anyone seen these, or a place to start looking? Maybe a way to get panics to stop hanging the machine? TIA if anyone has some enlightenment.

The system calls are marked MPSAFE in the case of the socket calls because the grabbing of Giant has been pushed down into the system call, as opposed to Giant being grabbed by the system call code itself. Giant should be held across all the relevant socket-related events -- if you find a place where it's not, send some details :-). As you observe, there is currently no socket locking in the source tree, although I'm hopeful that will be remedied in the next couple of months. The lower levels of the IP stack can be run Giant-free at this point, although my local patches to run multiple input paths in parallel run into a panic due to insufficient locking in ip_forward() (bug report already filed with Sam).

One of the conclusions from the recent developer summit was that a big focus needs to be placed on interrupt processing latency and device driver improvements so that we get the benefits of finer-grained locking. Peter has picked up the task of doing a driver API sweep to provide better facilities for doing this.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: 4.8-stable kernel panic
If one of you has had a chance to test this properly, please go ahead and commit. I don't have remote -STABLE development boxes, so haven't been able to do any -STABLE merging since I went to BSDCon. I did get RE permission to MFC this change. FYI, I have a bunch more related changes in a patch that I can dig up once I'm caught up on work re-mail. There are a number of M_TRYWAIT scenarios where we don't test the return value -- some easier to fix than others.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories

On Mon, 15 Sep 2003, Maxim Konovalov wrote:

On Sun, 14 Sep 2003, 23:05-0500, Mike Silbersack wrote:

On Sun, 14 Sep 2003 [EMAIL PROTECTED] wrote:

Hello, It's been almost a month now since I posted the original message on the list and I'm wondering about the progress on resolving this problem. I still can reproduce the panics after cvs-supping to RELENG_4 ~ 23:00 EDT today. Thanks much.

Ooops, I forgot to follow up on this. Ok, a few questions:

1. Can you compile INVARIANTS and INVARIANT_SUPPORT into your kernel? That might help us track down the problem.

2. What does your network setup look like? Are you using divert sockets, is there ppp in action, etc.

I believe that I tried out your script at the time, and I couldn't find it to cause any problems on my system.

rwatson has fixed this panic in rev. 1.115 in -current:

revision 1.115
date: 2003/08/26 14:11:48; author: rwatson; state: Exp; lines: +2 -0
M_PREPEND() with an argument of M_TRYWAIT can fail, meaning the returned mbuf can be NULL. Check for NULL in rip_output() when prepending an IP header. This prevents mbuf exhaustion from causing a local kernel panic when sending raw IP packets.

PR: kern/55886
Reported by: Pawel Malachowski [EMAIL PROTECTED]
MFC after: 3 days

and haven't MFCed it yet.
Here is a patch for -stable:

Index: sys/netinet/raw_ip.c
===
RCS file: /home/ncvs/src/sys/netinet/raw_ip.c,v
retrieving revision 1.64.2.17
diff -u -r1.64.2.17 raw_ip.c
--- sys/netinet/raw_ip.c 9 Sep 2003 19:09:22 - 1.64.2.17
+++ sys/netinet/raw_ip.c 15 Sep 2003 04:21:59 -
@@ -257,6 +257,8 @@
 			return(EMSGSIZE);
 		}
 		M_PREPEND(m, sizeof(struct ip), M_WAIT);
+		if (m == NULL)
+			return(ENOBUFS);
 		ip = mtod(m, struct ip *);
 		ip->ip_tos = inp->inp_ip_tos;
 		ip->ip_off = 0;
%%%

--
Maxim Konovalov, [EMAIL PROTECTED], [EMAIL PROTECTED]
Reminder: BSDCon next week in San Mateo!
This is just a friendly reminder e-mail that the BSD Conference is taking place in San Mateo next week, and that if you're planning to attend and haven't yet registered, you might want to. Or, just turn up and register at the door. There's a really strong lineup of FreeBSD-related papers, especially relating to new features in the 5-CURRENT development line. I've attached a list of just some of the interesting things that will be going on there: they include a number of tutorials relating to development and administration, technical session presentations relating to the development of FreeBSD, development of products using FreeBSD, and the deployment of FreeBSD-based systems. And, as always, there will be a variety of invited talks, BoFs and work-in-progress sessions. USENIX has extended their early registration pricing, and also (I believe) has an online registration discount. Multi-employee discounts are also available for companies sending more than one employee. You can find out more about the location, schedule of events, etc, at: http://www.usenix.org/events/bsdcon03/ I look forward to seeing you there! Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories Several excellent tutorials including one on developing storage extensions using GEOM Keynote: Computing Fallacies (or, What Is the World Coming To?) 
Reasoning about SMP in FreeBSD
devd - A Device Configuration Daemon
ULE: A Modern Scheduler for FreeBSD
An Automated Binary Security Update System for FreeBSD
Building a High-performance Computing Cluster Using FreeBSD
build.sh: Cross-building NetBSD
Invited Talk: Long Range 802.11 WANs
BSD Status Reports
GBDE - GEOM Based Disk Encryption
Cryptographic Device Support for FreeBSD
Enhancements to the Fast Filesystem to Support Multi-Terabyte Storage Systems
Invited Talk: Social and Technical Implications of Nonproprietary Software
Running BSD Kernels as User Processes by Partial Emulation and Rewriting of Machine Instructions
A Digital Preservation Network Appliance Based on OpenBSD
Using FreeBSD to Render Realtime Localized Audio and Video
Work in Progress Reports (WiPs)
Tagging Data in the Network Stack: mbuf_tags
Fast IPSec: A High-Performance IPsec Implementation
The WHBA Project: Experiences deeply embedding NetBSD
Invited Talk: Post-Digital Possibilities
Re: Ugly Huge BSD Monster
On Mon, 1 Sep 2003, Denis Troshin wrote: Almost every package I install requires a few other packages. This 'idea of using dependent packages' turns FreeBSD (and other unix-systems) to an ugly monster. For example, I don't need Perl or Python but a few packages I install require them. Does exist a programming under unix without these dependencies? P.S. Under Windows it is possible to write not bad applications which depend just on libraries (KERNEL32, USER32, GDI32). And these libs exist on every base system!!! Is it possible in unix? Before I thought that unix programs very compact, but they are huge! You've already got a boatload of responses, but I figured I'd throw in mine: it depends on the application. If applications require a scripting language, by virtue of what they do or how they are written, well, you get a scripting language in the dependencies. To get a Windows-like environment on FreeBSD, you need to layer the X server and then a toolkit/windowing environment on top -- my personal leaning right now is to stick QT/KDE on top. Once you have those pieces in place, you have a lot of what you need to write general-purpose applications interacting with users, the network, multimedia, etc. If you look at some of the key UNIX software packages, however, you'll see that they tend not to have a lot of dependencies -- Apache, Postgres, MySQL, etc. These applications avoid dependencies through less reliance on scripting, GUI elements, etc. One of the upsides, and downsides, of the open source world is a strong dependence on scripting, and the resulting diversification of scripting languages and rapid prototyping tools. This occurs in the Windows world also, though -- if you rely on Java, you need the JVM. If you have TCL applications, you need the TCL environment as well. Many web sites running on Windows use Perl for CGI just as they do in UNIX, in which case you need Perl... 
One of the nice things about this package-oriented approach is that the dependencies are generally very explicit: you want to write a GUI app, so you need the GUI pieces. Your application requires a back-end database, so a database dependency is introduced. In Windows, you have a larger base but less ability to decompose as a result. I'm also a bit alarmed when I install a new application and pick up two new scripting languages along the way -- I tend to avoid installing applications that pull in scripting as a dependency. However, sometimes that's unavoidable. In Windows, I think you'll find applications depend on more in the way of libraries than you think, though... Upgrades to system DLLs when you build and install applications are not infrequent -- application vendors tend to quietly bundle all the dependent runtime components and install them.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: Minimalist FreeBSD 4.8
As has been mentioned, the FreeBSD source tree as shipped isn't configured for minimization without a fair amount of effort. However, there are a number of larger components, typically maintained by third parties, that are build-time removable, and are typically arguments to the build specified in make.conf. Here are the components people like to disable with relative frequency, found by grepping for '^#NO' in /usr/share/examples/etc/make.conf, plus a little trimming for entries that have to do with compile flags on -CURRENT:

#NO_CVS=true          # do not build CVS
#NO_CXX=true          # do not build C++ and friends
#NO_BIND=true         # do not build BIND
#NO_FORTRAN=true      # do not build g77 and related libraries
#NO_GDB=true          # do not build GDB
#NO_I4B=true          # do not build isdn4bsd package
#NO_IPFILTER=true     # do not build IP Filter package
#NO_KERBEROS=true     # do not build and install Kerberos 5 (KTH Heimdal)
#NO_LPR=true          # do not build lpr and related programs
#NO_MAILWRAPPER=true  # do not build the mailwrapper(8) MTA selector
#NO_MODULES=true      # do not build modules with the kernel
#NO_OBJC=true         # do not build Objective C support
#NO_OPENSSH=true      # do not build OpenSSH
#NO_OPENSSL=true      # do not build OpenSSL (implies NO_KERBEROS and
#NO_SENDMAIL=true     # do not build sendmail and related programs
#NO_SHAREDOCS=true    # do not build the 4.4BSD legacy docs
#NO_TCSH=true         # do not build and install /bin/csh (which is tcsh)
#NO_X=true            # do not compile in XWindows support (e.g. doscmd)
#NOCRYPT=true         # do not build any crypto code
#NOGAMES=true         # do not build games (games/ subdir)
#NOINFO=true          # do not make or install info files
#NOLIBC_R=true        # do not build libc_r (re-entrant version of libc)
#NOMAN=true           # do not build manual pages
#NOPROFILE=true       # Avoid compiling profiled libraries
#NOSHARE=true         # do not go into the share subdir

On 4.x-STABLE, the set is slightly different as Kerberos5 isn't built by default, UUCP is included in the source tree, etc.
I don't think we currently have a NO_GCC flag or NO_BINUTILS to avoid installing the compiler and related tools, but I imagine those would be fairly straight-forward to build. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Minimalist FreeBSD 4.8
On Wed, 27 Aug 2003, Brian Reichert wrote: On Wed, Aug 27, 2003 at 07:26:10AM +1000, John Birrell wrote: One way to do this initially is to install a full FreeBSD system on one disk partition and use a second partition for a trial install. FreeBSD's boot manager will let you boot into each. As I'm pursuing these matters as well, I've found that mucking with jails is faster, for a lot of bulk work. Starting/stopping a jail is _much_ quicker than reboots. (And it's a lot easier to reset a jail to a prior state.) This won't exercise the rc* scripts, but will let you quickly test for dependencies elsewhere.

Actually, I tend to boot my jails using the existing rc pieces -- I skip some of the hardware-esque things (network interface configuration, file system mounting), but do use the rc stuff to start daemons.

And whatever you find for dependencies, please document them somewhere; I still have a fantasy of 'deconstructing' FreeBSD into finer-grained packages...

One of the big problems with that process has been that people who've attempted it (perhaps rationally) get caught up in combining compartmentalization of the build and compartmentalization of the delivery. I.e., they sit there and try to figure out how to break out libraries, utilities, etc, and get caught up in building the end-all of package building infrastructure. Something a little lower-hanging would go a long way...

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: Kernel Panic on FreeBSD-5.1-p2 with SMP support
On Sun, 24 Aug 2003, Markus Paluschek wrote: I have a Compaq Proliant server with 2 Pentium Xeon IV 2.4GHz processors. After installing FreeBSD-5.1, upgrading to FreeBSD-5.1-p2 by cvsup and recompiling the kernel with SMP support, I downloaded the ircd-hybrid-7 sources and installed them on a user account. After running it and writing /restart my.ircd.server I'm getting a kernel panic; then the system locks and needs a reset. What can I do to fix that?

There's a pretty useful chapter in the FreeBSD Developer's Handbook on kernel debugging: the starting point for debugging a panic is to get a stack trace and post it. I would actually suggest updating to FreeBSD 5-CURRENT, since some pretty large bugfixes have gone into the tree since the release, and they may well have fixed the problem you're bumping into.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: GEOM Gate.
On Fri, 15 Aug 2003, Pawel Jakub Dawidek wrote: On Thu, Aug 14, 2003 at 09:48:57PM +0200, Attila Nagy wrote: + Bruce M Simpson wrote: + Whatever next? PCI-over-IP? + Collecting cheap on board serial lines to make a big terminal server + makes sense to me :) + + BTW, Pawel's stuff would be even more interesting if it would be + possible to mount the same filesystem on more than one machines. It'll be, but probably in read-write mode on one machine and read-only mode on rest machines, because you don't export file systems here, but disk devices. In order to do this, you need a file system capable of multi-node consistency, and a medium capable of supporting the consistency mechanisms. Since we can't handle mounting the same file system read-write and read-only in multiple places from the same block device without a likely panic, I expect much the same results with a distributed block device. Multiple read-only mounts should work OK, but you don't want to violate the assumptions of the read-only mounts by introducing a read-write mount. File systems can be written that do synchronization on using a protocol of some sort when talking to a common block device, but that will keep you busy for a while, I expect :-). That said, I think the geom gate stuff looks very cool :-). You might be able to run some interesting performance numbers comparing NFS and UFS over a remote block device. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Missing system call in linux emulation ( patch )
On Thu, 14 Aug 2003, Steven Hartland wrote: I've created a patch for the linux emulation which adds a dummy for the exit_group syscall along with defining all functions up to 252 this fixes a crash in the BattleField 1942 server. How do I go about getting this into the various FreeBSD streams so others can benifit. File a PR, please. BTW, I saw this on the OpenBSD source-changes list recently and remembered the missing system call thread here: CVSROOT:/cvs Module name:src Changes by: [EMAIL PROTECTED] 2003/08/14 12:34:15 Modified files: sys/compat/linux: syscalls.master Log message: add more syscalls. implement exit_group (which is actually an alias for sys_exit), needed for newer glibc's binaries. from marius aamodt eriksen marius at monkey dot org Assuming that this is the right approach to solving the problem, we could probably pull that change into our linux emulator. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: possible deadlocks?
On Wed, 6 Aug 2003, Ted Unangst wrote: My advisor Dawson Engler has written a deadlock detector, and we'd like some verification. They look like bugs, unless there is some other reason why two call chains cannot happen at the same time. Neat -- sounds like two good catches given the responses so far. Can we expect more such reports forthcoming? This kind of help will be invaluable in finishing up the fine-grained locking work. Alternatively, do you plan to post the software? Is this static or dynamic analysis? etc, etc? :-) Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Communications kernel - userland
On Tue, 22 Jul 2003, Adam Migus wrote: Perhaps I'm not understanding you right but I think Pawel's idea is cool. It seems to fulfill your requirements (except being network specific). I suppose if it were network specific we could optimize it for packet streams and if we made it complicated enough it would require quite an elaborate synchronization and notification mechanism. Is that closer to what you have in mind?

Well, the case I had particularly in mind was the rapid flow of packets from the kernel to the user process; Pawel's suggestion handles the flow of new data from the user process to the kernel well, and has substantial similarity to some of the IO Lite mechanisms I pointed at (and hopefully with many of the same performance benefits). In the kernel-to-userspace case, we want to avoid the copy of what is originally kernel-owned memory (from the mbuf allocator) to the user process memory. If you didn't care about stuff like confidentiality of kernel memory, etc, the simplest approach would be to actually map the mbuf memory (and possibly cluster) into userspace, and then notify the user process in some form of the new mapping. However, because mbufs and their meta-data aren't page aligned (etc, etc, etc), you really don't want to do it explicitly that way, I suspect.

By synchronization, I had in mind a mechanism by which the process and kernel would communicate about memory ownership in the shared memory space: I'm done with this packet, I'm done with these packets, I want to continue delivery of that packet, I modified this packet, I'm inserting a new packet here, I'm dropping this packet, all without extensive memory copying, and with a moderate amount of asynchrony (and possibly parallelism). In terms of functionality, it might be similar to some of the current services that forward between IPDIVERT in and out (such as natd), or between BPF pseudo-devices.
This sounds like something that likely exists in a few commercial products already, so my question to Terry was to whether he knew of any in the literature. IOLite is the closest I know of, as it supports the zero-copy page and memory ownership bits, although I don't know if they allowed it to handle packets, perhaps just datagrams and streams. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Communications kernel - userland
On Mon, 21 Jul 2003, Terry Lambert wrote: Robert Watson wrote: Of these approaches, my favorites are writing directly to a file, and using a pseudo-device, depending on the requirements. They have fairly well-defined security semantics (especially if you properly cache the open-time credentials in the file case). I don't really like the Fifo case as it has to re-look-up the fifo each time, and has some odd blocking semantics. Sockets, as I said, involve a lot of special casing, so unless you're already dealing with network code, you probably don't want to drag it into the mix. If you're creating big new infrastructure for a feature, I suppose you could also hook it up as a first class object at the file descriptor level, in the style of kqueue. If it's relatively minor event data, you could hook up a new kqueue event type. You could also just use a special-purpose system call or sysctl if you don't mind a lot of context switching and lack of buffering.

I like setting the PG_G bit on the page involved, which maps it into the address space of all processes. 8-).

For one of our research projects, here at NAI, we did a fair amount of userland network code prototyping. We started out with IPDIVERT, then pushed down to BPF using a partial network stack in userspace. We've found it's a lot easier on competent network developers who are unfamiliar with the FreeBSD kernel code, not to mention easier on debugging. We never got so far on that project as to do shared memory between the kernel and userspace, but I know that that's been done by at least a couple of companies at various points to reduce copying and context switch costs for userspace test frameworks. One of the things I'd really like to see is some decent "throw packets between kernel and userspace" primitive bits, such that the kernel has a useful and logical way to expose buffer data into directly mapped user pages, and an appropriate notification and management system to reuse memory, etc. Something that looks a bit like the relationship between kernel device drivers and devices when it comes to DMA management. Do you know if any such framework exists? (Specifically targeted at exposing network packets...) (Ideally not requiring privilege in the user process, nor involving nasty integrity or confidentiality problems :-)

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: Communications kernel - userland
On Mon, 21 Jul 2003, Pawel Jakub Dawidek wrote: For example syscall is marking some range with mark() function. For now on this range isn't accessable from userland. If process will try to write to this page, page is copied (copy-on-write). If this page will be modified by kernel it will be marked as MODIFIED. Now when syscall will call unmark() on this range we could get two scenarious: 1. Page is marked as MODIFIED (by kernel) so userland copy of this page (if it exists of course) is destroyed and this page will be putted in its place. This is replacement for copyin() and then copyout() or just copyout().. 2. Page isn't marked as MODIFIED, so kernel version of page is destroyed (is there is userland version). This is replacement for just copyin(). There could be other ways. Thread/process could be locked if it is trying to access memory marked with mark() function. And this, I think, don't hit performance, because this happends really rarely. So maybe it is better to lock thread for a moment instead of doplicating page, but I don't think so. This sounds a bit like some of the IO Lite stuff -- moving to a page-centric model for IO interfaces to avoid copy operations, in many cases able to share pages between applications, buffer cache, network buffers, etc. Take a look at: http://www.cs.princeton.edu/~vivek/ For some details. Some of the benefits of this approach are captured in the common case through sendfile(), in practice, but it's definitely worth a read. I guess what I had in mind was something more network-specific, with interfaces optimized for memory mapped network packet streams. In the simplest case, something like memory-mapping the BPF buffer from kernel space to userspace, with some sort of simple stream synchronization so that the user application could notify the kernel as to when it could reuse bits of the buffer, but avoiding copy operations and lots of context switching. 
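To make the "simple stream synchronization" idea a little more concrete, here is a rough single-process sketch of the bookkeeping such a shared buffer would need. Everything in it is hypothetical: struct pkt_ring and the ring_*() functions are invented for illustration, nothing like this exists in BPF today, and the memcpy() in ring_produce() merely stands in for the kernel placing a packet in the shared region. A real version shared between kernel and process would also need memory barriers and a wakeup mechanism, which are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SLOTS	4
#define SLOT_BYTES	64

/*
 * Hypothetical shared packet ring.  "head" is advanced only by the
 * producer (the kernel side in the text), "tail" only by the consumer
 * (the user process).  Slots in [tail, head) belong to the consumer;
 * everything else belongs to the producer and may be reused.
 */
struct pkt_ring {
	unsigned int head;		/* next slot the producer fills */
	unsigned int tail;		/* next slot the consumer reads */
	size_t len[RING_SLOTS];
	unsigned char slot[RING_SLOTS][SLOT_BYTES];
};

/* Producer: place one packet; returns 0, or -1 if no slot is free. */
static int
ring_produce(struct pkt_ring *r, const void *pkt, size_t len)
{
	unsigned int i;

	assert(len <= SLOT_BYTES);
	if (r->head - r->tail == RING_SLOTS)
		return (-1);		/* consumer has not released a slot */
	i = r->head % RING_SLOTS;
	memcpy(r->slot[i], pkt, len);
	r->len[i] = len;
	r->head++;
	return (0);
}

/* Consumer: look at the oldest packet in place, without copying. */
static const unsigned char *
ring_peek(const struct pkt_ring *r, size_t *lenp)
{
	unsigned int i;

	if (r->head == r->tail)
		return (NULL);		/* nothing pending */
	i = r->tail % RING_SLOTS;
	*lenp = r->len[i];
	return (r->slot[i]);
}

/* Consumer: "I'm done with this packet"; hands the slot back. */
static void
ring_release(struct pkt_ring *r)
{
	assert(r->head != r->tail);
	r->tail++;
}
```

The point of the split is that the consumer never copies packet data: it reads the slot in place and only advances tail, which is the notification that the producer may reuse that part of the buffer.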
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: Communications kernel - userland
On Sat, 19 Jul 2003, Pawel Jakub Dawidek wrote: Your choices are: - device, - sysctl, - syscall. There are actually a few other more obscure ways to push information from the kernel to userspace, depending on what you want to accomplish. Write directly to a file from the kernel. ktrace, system accounting, and ktr with alq all stream data directly to a file provided by an authorized user process. quotas and UFS1 extended attribute data are also written directly to a file. On other operating systems, audit implementations frequently take the same approach -- when the goal is long term storage of data in a user-accessible form, but you don't want to stream it through a user process live, this is usually the preference. Typically, when taking this approach, a special system call is used to notify the kernel of the target file to write to -- the file is created by the user process with appropriate protections. Often, but not always, the system call is non-blocking and simply returns once the file is hooked up as a target, and continues until another system call cancels delivery, or switches it to a new target. Stream it through a device node. If you need only one or a small number of processes to listen for events from the kernel, a common approach is a pseudo-device that acts like a file. For example, syslogd listens on /dev/klog for log events from the kernel; some audit implementations also take this approach. Our devd, usbd, and others similarly listen for system events that are exposed to user processes as data on a blocking pseudo-device. One nice thing about this approach is that you can combine it with select(), kqueue(), et al, to do centralized event management in the application. BPF also does this. Both Arla and Coda take this approach for LPC'ing to userspace to request events as a result of VFS operations by processes. Expose it using a special socket type. 
We expose routing data and network stack administrative controls as special reads, writes, and ioctls on various socket types. I'm not a big fan of this approach, as it special cases a lot of bits, and requires you to get caught up in socket semantics. However, one advantage of this approach is it makes the notion of multicast of events to multiple listeners easier to deal with, since each socket endpoint has automatic message buffering.

There are some other odd cases in use as well. The NFS locking code opens a specially named fifo (/var/run/lock) and writes messages to it, which are picked up by rpc.lockd. The lock daemon pushes events back into the kernel using a special system call. I don't really like this approach, as it has some odd semantics -- especially since it reopens the fifo for each operation, and there are credential/file system namespace inconsistencies.

Of these approaches, my favorites are writing directly to a file, and using a pseudo-device, depending on the requirements. They have fairly well-defined security semantics (especially if you properly cache the open-time credentials in the file case). I don't really like the fifo case as it has to re-look-up the fifo each time, and has some odd blocking semantics. Sockets, as I said, involve a lot of special casing, so unless you're already dealing with network code, you probably don't want to drag it into the mix. If you're creating big new infrastructure for a feature, I suppose you could also hook it up as a first class object at the file descriptor level, in the style of kqueue. If it's relatively minor event data, you could hook up a new kqueue event type. You could also just use a special-purpose system call or sysctl if you don't mind a lot of context switching and lack of buffering.
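As a small illustration of the pseudo-device case, the user side usually boils down to "wait until the descriptor is readable, then read a record". The sketch below does that with select(); event_read() is a made-up helper name, and it works on any descriptor, so whether fd came from opening a device node such as /dev/klog (which needs appropriate privilege) or from a pipe used for testing is up to the caller.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable, then read one chunk.
 * Returns the byte count from read(2), 0 on timeout (note that read(2)
 * also returns 0 at end-of-file), or -1 on error.
 */
static ssize_t
event_read(int fd, void *buf, size_t len, int timeout_ms)
{
	fd_set rfds;
	struct timeval tv;
	int n;

	assert(fd >= 0 && fd < FD_SETSIZE);
	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	n = select(fd + 1, &rfds, NULL, NULL, &tv);
	if (n <= 0)
		return (n);		/* 0: timed out, -1: select failed */
	return (read(fd, buf, len));
}
```

An application with several event sources would put all of their descriptors in the same fd_set, which is the centralized event management referred to above.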
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: running 5.1-RELEASE with no procfs mounted (lockups?)
On Tue, 15 Jul 2003, Josh Brooks wrote: I have loaded two 5.1-RELEASE systems, both of them have PROCFS and PSEUDOFS in the kernel, and yet neither of them have a procfs mounted. There is no procfs line in /etc/fstab by default, and no procfs is mounted on the system in any way. Question 1: Is this intentional ? Is it no longer needed/recommended to run a procfs ? Most system functionality that relied on procfs has been rewritten to rely on other mechanisms. In general, I advise against running procfs--it's interesting, but conceptually it's very risky. If you look at the history of security advisories on systems that supported procfs (FreeBSD, Linux, Solaris), you'll get a sense of why: procfs represents processes as files, and the semantics of processes and of files are very different. For example, with processes, there are notions of revoked access; processes are reused to hold several programs often running with different credentials. The behavior I'm aware of that currently relies on procfs and has not yet been adapted to use ptrace() or sysctl() are: ps -e Relies on groping around in the address space of each process to display environmental variables. truss Relies on the event model of procfs; there have been some initial patches and discussion of migrating truss to ptrace() but I don't think we have anything very usable yet. I'd be happy to be corrected on this. :-) Also, linprocfs, which offers many of the functions of procfs, relies on pseudofs, and is required to run many Linux emulated programs. Often for rather bizarre reasons (retrieving command line arguments from the per-process cmdline file...). Question 2: Is this because I am running without procfs ? Or have these type of problems been seen in 5.1-RELEASE by other causes ? This is most likely unrelated. 
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: current state of the art / best practice for devfs in a jail ?
On Thu, 3 Jul 2003, Joshua Oreman wrote: On Thu, Jul 03, 2003 at 04:00:46AM -0700 or thereabouts, Josh Brooks wrote: I have been researching the various of ways people add devfs to a jail to give the jail certian /dev devices necessary to function ... Well, all I did was test your research :-) Gordon Tetlow (victim CC'd) was, I believe, working on changes to rc.d to allow automatic construction of jails at boot, and part of that was some best practice devfs rules for jail. Perhaps he could chime in now? :-) Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: per-directory quotas possible on 5.x ?
On Sun, 29 Jun 2003, Josh Brooks wrote: Normally, quotas work on a per-user, per-filesystem basis - so if a user has a home directory and other processes _not owned by that user_ are placing files and using up space into that directory, it will not count toward the quota (unless they get chowned/chgrpd to that user/group). Is there any way to enforce a quota on a directory, regardless of what ownership or group ownership the files and dirs inside the directory - that is to say, take directory X, located at an arbitrary spot on the system, I want it to grow no larger than size Y. I know this can be done by creating a lot of little partitions - maybe even vn-backed parttion-on-file, but that seems like a hack, as they would be hard to resize. I am looking for a way to force a changeable quota on a directory, regardless of what gets put in it, or who owns what gets put in it. Any hacks/asuggestions/comments of any kind are very appreciated. Unfortunately, the UFS file system model makes it difficult to implement this sort of feature. One major part of this is that files can exist in more than one directory at a time, by virtue of hard links; this in turn is relied on for file system checking, where a file may end up linked to more than one directory when certain failure modes occur and are recovered from. Another part of the problem is that the internals of UFS really disassociate the namespace from the storage mechanism, and since such a directory based quota system would determine the relationship between files based on the namespace and not a per-inode attribute, this also makes implementing such a system on a UFS file system difficult. FWIW, you can sometimes get similar semantics using group quotas and the fact that, on BSD, entries created in directories have the group of the parent directory in which they are created... 
Most of the systems I've seen that do quotas on a large scale do basically follow the many volumes model -- for example, large AFS cells may have tens or hundreds of thousands of volumes, and use volume size to impose quotas, which sounds like what you're looking for. When I've seen things like this done on UFS, it's usually been as a weak consistency accounting mechanism -- measure the size of various trees at intervals and bill based on the sampled size, rather than block allocation. As you may have noticed in trying the vn-backed mechanism, there are some inefficiencies that turn up in FreeBSD when you have large numbers of pseudo-devices, etc. The resizing problem is real, also, since we don't have online file system resizing. FWIW, a file system like HFS+ (which has a much more strict directory hierarchy) would lend itself to directory quotas much more. A port of HFS+ to FreeBSD was recently posted to freebsd-fs.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: Suid and gid files
On Mon, 23 Jun 2003, Socketd wrote: I just installed FreeBSD 5.1 release and ran a find / -perm +4000 and find / -perm +2000. My question is: are any of these files used by the system, in a way that prevents me from making them non-executable to the world? I have no shell users and don't use sendmail.

Setuid can be turned off on pretty much all of the binaries; however, as you turn off setuid bits, more and more things will not work for unprivileged users. During normal system operation, privileges are usually dropped as opposed to acquired, so the exceptions are usually for access to raw sockets, system devices, etc. I recently removed the setuid bit from the quota command in -CURRENT, and am in the throes of reviewing the remaining setuid/setgid pieces as part of developing our Security Architecture document. The one potentially problematic case that comes to mind is mail submission by sendmail; mechanisms such as cron, at, etc, expect to be able to generate mail from unprivileged users and that may break if you use sendmail as the MTA but without setuid. There are mail systems that don't require setuid, instead relying on LMTP, which might be preferable in your environment. I also find su very helpful, FWIW :-).

Btw why is /usr/sbin/ppp world readable? (not that it matters)

sproing:/usr/sbin ls -l ppp
-r-sr-xr-- 1 root network 367304 May 8 15:16 ppp*

Yeah, that is a little inconsistent, although not harmful as far as I can tell. I'll remove the read bit in -CURRENT and we'll see if anyone complains :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories
Re: struct ipc_perm
On Wed, 18 Jun 2003, Dmitry Sivachenko wrote:

> > > Is there any reason why struct ipc_perm is not protected by #ifdef _KERNEL in ipc.h? Is it supposed to be used from userland?
> >
> > It's needed by ipcs.
>
> Ah, I see. It is visible via struct msqid_ds. I developed a patch which requires addition of custom field to ipc_perm. I am trying to imagine which problems can it cause to userland programs.

We have local changes in the TrustedBSD development trees to extend all the structures in the kernel without modifying the ABI. We needed this to put labels in the various System V IPC object structures. We're not ready to merge them yet, but it will probably happen in the next month or so. If you'd like early access to the patch, we can drop you a copy. We'll merge it into the MAC tree in about a week.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories
Re: kqueue alternative?
On Sun, 15 Jun 2003, Matthew Hagerty wrote:

> I'm writing a little application that needs to watch a file that another process is writing to, think 'tail -F'. kqueue and kevent are going to do it for me on *BSD, but I'm also trying to support *cough* linux and other UN*X-type OSes. From what I can find on google, the linux community seems very opposed to kqueue and has not yet implemented it (they say: blah blah blah, aio_*, blah blah blah.)
>
> What alternatives do I have with OSes that don't support kqueue? I'd really hate to poll with stat(), but do I have any other choices?

I was recently told about a library named libevent from Niels Provos, which abstracts a variety of underlying event mechanisms behind a common API. You can learn a bit more about it here:

http://www.monkey.org/~provos/libevent/

It doesn't appear to support /dev/poll yet, but the web page suggests such support is planned. If it's not already a port, we should create one.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories
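For platforms with no event mechanism at all, the stat() polling fallback the question mentions is at least simple. The following is a rough sketch of that fallback only, not of kqueue or libevent; the class name `PollingTail` is invented for illustration, and the sketch assumes that a changed inode number or a shrinking size means the file was rotated or truncated.

```python
import os


class PollingTail:
    """Follow a file by polling stat(), in the spirit of 'tail -F'."""

    def __init__(self, path):
        self.path = path
        self.offset = 0      # how many bytes have been returned so far
        self.inode = None    # inode seen on the previous poll

    def poll(self):
        """Return bytes appended since the last call, or b'' if nothing is new."""
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return b""  # file is missing right now (e.g. mid-rotation)
        replaced = self.inode is not None and st.st_ino != self.inode
        truncated = st.st_size < self.offset
        if replaced or truncated:
            self.offset = 0  # start again from the top of the new content
        self.inode = st.st_ino
        if st.st_size == self.offset:
            return b""
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        return data
```

A caller would invoke `poll()` in a loop with a sleep between calls; the sleep interval is the trade-off between latency and wasted stat() calls that makes event interfaces such as kqueue attractive in the first place.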
Re: Jdk13/14 still hangs in 4.8 Prerelease. Outstanding Fix need (fwd)
Per Martin's request, I'm forwarding this response to the broader group involved in this thread. Basically, I think broadening the scope of processes permitted to make the scheduler call is fine, but you don't want to use the CANSIGNAL() code that's currently present, for several reasons. The simplest solution might be to only allow the scheduler change if the requesting process is targeting itself.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories

---------- Forwarded message ----------
Date: Tue, 25 Feb 2003 12:53:53 -0500 (EST)
From: Robert Watson [EMAIL PROTECTED]
To: Martin Blapp [EMAIL PROTECTED]
Subject: Re: Jdk13/14 still hangs in 4.8 Prerelease. Outstanding Fix need (fwd)

On Tue, 25 Feb 2003, Martin Blapp wrote:

> Basically, it changes p31b_proc() to not always return an error for non-root. If rwatson@ signs off on the security implications (should be minimal, basically means that you can change your own scheduling params and can change the params of other processes you own) then I would prefer this patch.

Hmm. I think the check there is a bit on the unsafe side, which could be why it was disabled. Basically, it permits the scheduler change in the following circumstances:

(0) Superuser always wins.
(1) Subject real uid is object real uid. E.g., any process I should randomly start or own.
(2) Subject effective uid is object real uid. If a tool is temporarily switched to my uid to exercise my privileges, sounds OK.
(3) Subject real uid is object effective uid (uh oh).
(4) Subject effective uid is object effective uid (uh oh).

The reason (3) and (4) are problems is that they affect daemons temporarily switching to a user's privileges to carry out a task -- such as mail delivery, or a userland NFS server or the like. It could be that these cases are a poor attempt at handling the loopback case, wherein a process can always modify its own scheduling. Take a look at p_cansched() in 5.x for a better idea of what I think the check should be.
In summary, the rules are:

(0) You can always reschedule the current process.
(1) If you're in a different jail, deny.
(2) Optionally call out to MAC.
(3) If the seeotheruids support says you can't see the other process, you can't reschedule it either, regardless of uids.
(4) If the real uids are the same, it's OK -- i.e., any arbitrary shell process (setuid or otherwise).
(5) If the subject effective uid is the same as the object real uid -- if temporarily adopting a user's privileges, we can reschedule the processes they own.
(6) Superuser always wins (subject to 0, 1, 2, 3).
(7) Deny.

I don't know why the check was turned off. The entire #if 0 / #else / #endif seems to have been around since revision 1.1. It's probably because whoever wrote it realized that it was moderately suspect. I would oppose simply enabling the current CANSIGNAL check -- it has serious problems. On the other hand, putting in a refined check sounds reasonable, and I'd be happy to review such a patch. Although the code from 5.x won't instantly work with 4.x without substantial modification, it might make a good starting point.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
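The ordering of rules (0) through (7) above matters, so here is a small model of it. This is a userland illustration of the decision order only, not the kernel's p_cansched(): the `Proc` record, the `mac_check` and `can_see` hooks, and the reading of rule (1) as "a jailed subject may not touch a process in another jail" are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Proc:
    """Hypothetical stand-in for the credential fields the rules refer to."""
    pid: int
    ruid: int                   # real uid
    euid: int                   # effective uid
    jail: Optional[int] = None  # None means "not jailed"


Hook = Optional[Callable[[Proc, Proc], bool]]


def can_sched(subj: Proc, obj: Proc, mac_check: Hook = None, can_see: Hook = None) -> bool:
    """Apply rules (0)-(7) in order; True means the reschedule is permitted."""
    if subj.pid == obj.pid:                                   # (0) own process
        return True
    if subj.jail is not None and subj.jail != obj.jail:       # (1) different jail
        return False
    if mac_check is not None and not mac_check(subj, obj):    # (2) optional MAC hook
        return False
    if can_see is not None and not can_see(subj, obj):        # (3) visibility policy
        return False
    if subj.ruid == obj.ruid:                                 # (4) same real uid
        return True
    if subj.euid == obj.ruid:                                 # (5) adopted user's uid
        return True
    if subj.euid == 0:                                        # (6) superuser
        return True
    return False                                              # (7) default deny
```

Note that an object whose effective uid matches the subject (a daemon temporarily running as the user) is not enough to pass: only the object's real uid is compared, which is exactly what excludes cases (3) and (4) of the older check.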
Re: Monitoring changes in extended attributes?
On Wed, 12 Feb 2003, Kevin Fogleman wrote:

> Is there an existing way to monitor the entire filesystem for changes to any file, particularly changes in extended attributes? I'm looking to write a program that builds an index of all user-accessible extended attributes for every file in the filesystem and then updates that index in real time according to modifications to existing files' attributes, creation of new files and deletion of files.
>
> I've read over the documentation for kqueue, but some things were left unclear. For example, it appears that kqueue needs a file descriptor for each file that one would want to monitor, making any large-scale file monitoring impractical. Is there any other way in FreeBSD to be notified of file modifications in a way that would allow one to monitor the whole file system or large portions of it? Also, I'm not very knowledgeable about file system conventions, so I'm wondering how one would detect the creation of new files? I don't really need to know whether a particular attribute changed, but rather just whether any of them changed.
>
> BTW, I have posted this question earlier to freebsd-questions, but nobody answered and, judging by the content of the other questions on that list, I thought that my question would be more appropriate here.

Currently, you can monitor particular files for meta-data changes, which include extended attribute modifications, and you can monitor directories for changes, which might include the addition of a new name (and hence possibly a file). However, there's currently no way to monitor at the granularity of a file system for events such as "Some EA changed" or "A new file was allocated". I guess such primitives haven't generally been needed in the past, although I can certainly imagine scenarios where they might be used.
Kqueue is the vehicle with which the two events I identified above can be monitored, and it's certainly possible to imagine adding new event categories to monitor a file system for global events, assuming it's a local file system. However, the question then becomes "Once I know that a file has been added, how do I find it?", which I would guess generally results in a recursive search, at which point I suspect you might as well just re-search the entire fs once in a while anyway. The functionality you're looking for sounds a bit more database-esque than in line with a traditional file store.

FWIW, Apple has a searchfs() system call and vnode operation to permit more efficient meta-data searches on HFS+; this makes some sense for HFS+ because it has a notion of a centralized meta-data store, whereas ours is laid out pretty sparsely over the tree and works a bit differently. They don't support generalized meta-data extended attributes right now, although they do have a few specific attributes beyond the standard set. Well, we actually have local patches to add EAs to their UFS file system that would probably work on HFS+, but they aren't in the central Darwin tree.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories
Re: Multi-level jailing.
On Mon, 17 Feb 2003, Pawel Jakub Dawidek wrote:

> I have prepared a patch for jail functionality against FreeBSD 5.0-CURRENT. It provides multi-level jailing and multiple IPs for jails.

Sounds cool, although I haven't had a chance to read the patch yet. Question: how did you handle the problem (if at all) that INADDR_ANY doesn't perform a wildcard binding with multiple IPs in the same jail? It's not strictly required that it be handled, but it was always one of the semantic problems I bumped into when I experimented with more IPs. A single-IP jail works because it maps INADDR_ANY into the only IP available. I'll try to get a box up and running with these changes in the next few days and give them a spin.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories
Re: max simultaneous TCP connections (32,763)?
On Sun, 26 Jan 2003, Sam Tannous wrote:

> I have two freebsd boxes (back to back) and I've been playing with a simple server on one machine and client on the other machine (this was simply an exercise with playing with kqueue). Both the server and the client are single processes and the client seems to stop at 32,763 connections. I've modified the port range, tcp keepalive, kern.ipc.somaxconn, maxfiles, maxsockets, nmbclusters. I even tried net.inet.tcp.tcbhashsize (up to 1024).
>
> Is there some other parameter I'm missing? Or is this a known limitation/bug?

Some of this has to do with limits on the available ephemeral ports for outgoing connections. Try adding additional IP addresses to the client machine, and forcing your client software to use specific IP addresses. TCP uniquely identifies a connection by the combination of local and remote IP addresses and port numbers, so assuming unconstrained use of the outgoing port space on a particular IP, TCP/IP can in theory support up to (approx) 64k outgoing connections from that IP to a given remote IP/port tuple. In practice, we only allocate out of specific ranges. By adding additional IP addresses for outgoing connections, you increase the number of potential connections to a particular remote IP/port tuple.

However, if you're not specifying a local IP address, the stack will pick the most appropriate local address for the route, which is probably the first IP address on the interface associated with the route to the other endpoint. Hard-coding local addresses in your application overrides that. I've never tried this (i.e., using multiple IPs to get around the TCP/IP limit), so if it doesn't work, let me know. In theory, it should.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories
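The arithmetic behind the suggestion above, and the mechanics of pinning a client socket to a chosen local address, can be sketched as follows. The port range and addresses in the example are purely illustrative assumptions, not the values any particular FreeBSD release uses, and the helper names are invented; like the advice in the message, this has not been verified against a real 32k-connection test.

```python
import socket


def max_outgoing(first_port, last_port, local_ips):
    """Upper bound on connections to ONE remote (ip, port) from these local IPs.

    Each connection needs a distinct (local ip, local port) pair, so the bound
    is the size of the usable local port range times the number of local IPs.
    """
    return (last_port - first_port + 1) * len(local_ips)


def connect_from(local_ip, remote):
    """Open a TCP connection to remote=(host, port) using local_ip as the source.

    Port 0 in source_address lets the stack choose the local port.
    """
    return socket.create_connection(remote, source_address=(local_ip, 0))
```

For instance, `max_outgoing(1024, 65535, ["10.0.0.1"])` gives 64512, and listing a second local address doubles it to 129024. A client would spread its calls to `connect_from` across its configured addresses (round-robin, say) rather than letting the stack pick the same source address every time.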
Re: umount of procfs fails
On Sun, 26 Jan 2003, Tim Kientzle wrote:

> Experimenting with 'mount' and stumbled across the following oddity:
>
>   mount -t procfs proc /mnt
>   umount -t /mnt

You're missing the proc after -t here, right?

> results in procfs still mounted on /mnt but no longer mounted on /proc. It appears that a umount of procfs is unmounting the most recently mounted instance rather than the instance mounted at the specified location. I haven't checked to see if other filesystem types have this problem.

I experimented a bit, and found the following: If I unmount /mnt using simply umount /mnt on -CURRENT or -STABLE, /mnt is unmounted. If I specify -t procfs, the same thing happens. If I mis-specify the mount type as -t proc, then it silently fails, and neither is unmounted.

Which isn't to say that it didn't happen for you, but you should probably provide a bit more information. First, could you identify the version of FreeBSD you're running? Second, can you include script output of the shell session in which you mount /proc and /mnt, run mount to confirm they are both mounted, then umount one, and run mount again to show that the wrong one was unmounted?

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories
Re: NFS ACLS's ?
On Fri, 27 Dec 2002, joe mcguckin wrote:

> Are there any strange interactions between NFS and filesystems that are not UFS? E.g. UFS2? Does NFS support new features that these fs's may implement?

NFS can represent many but not all of the services found in UFS1 and UFS2. Among the things it doesn't support are the retrieval and manipulation of BSD file user flags, system flags, extended attributes, and access control lists (ACLs). However, NFSv3 does correctly handle enforcement with these features, because clients rely on the server to evaluate protections on file system objects using an ACCESS RPC. NFSv2 evaluates protections on the client (if I recall correctly), so it may not behave properly.

There are RPC extensions to NFSv3 to retrieve and manipulate ACLs on Solaris, IRIX, et al, but we don't currently implement those extensions. Likewise, NFSv4 supports ACL management, but we don't yet implement NFSv4. It shouldn't be too hard to dig up information on the NFSv3 ACL RPC extensions and implement them on FreeBSD 5, since the semantics of our ACLs are highly compatible with Solaris and IRIX.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories
Re: panic: icmp_error: bad length
BTW, if this bug exists in 5.0 for the same reasons (or even different ones), we should try to generate a fix ASAP and get it committed.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories

On Thu, 12 Dec 2002, Ian Dowse wrote:

> In message [EMAIL PROTECTED], Luigi Rizzo writes:
> > the diagnosis looks reasonable, though i do not remember changing anything related to this between 4.6 and 4.7 so i wonder why the error did not appear in earlier versions of the code.
>
> Yes strange - actually, it looks like the "THERE IS NO FUNCTIONAL OR EXTERNAL API CHANGE IN THIS COMMIT" commit may be to blame :-) Some fragments below.
>
> Ian
>
> bridge.c 1.16.2.2:
> +#ifdef PFIL_HOOKS
> ...
> -     * before calling the firewall, swap fields the same as IP does.
> -     * here we assume the pkt is an IP one and the header is contiguous
> ...
> -    ip = mtod(m0, struct ip *);
> -    NTOHS(ip->ip_len);
> -    NTOHS(ip->ip_off);
>
> ip_fw.c 1.131.2.34:
> -    if (0 && BRIDGED) { /* not yet... */
> -        offset = (ntohs(ip->ip_off) & IP_OFFMASK);
> +    if (BRIDGED) { /* bridged packets are as on the wire */
> +        ip_off = ntohs(ip->ip_off);
>          ip_len = ntohs(ip->ip_len);
>      } else {
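The fragments above turn on whether ip_len and ip_off are still in network byte order when a bridged packet reaches the firewall code. As a rough illustration of why a mismatch can produce a "bad length" (this is not the kernel code, and the 84-byte packet length is just an example value), consider what happens when a 16-bit length field is read in the wrong order:

```python
import struct

# A 16-bit IP total-length field of 84 bytes as it appears on the wire
# (network byte order, i.e. big-endian): 0x00 0x54.
wire = struct.pack("!H", 84)

# Reading it as network order recovers the real length.
as_network_order = struct.unpack("!H", wire)[0]

# Reading the same two bytes as little-endian host order (for example when a
# byte swap is skipped, or applied twice) yields a wildly wrong length.
as_little_endian = struct.unpack("<H", wire)[0]

print(as_network_order, as_little_endian)
```

On a little-endian machine such as i386, code that assumes the field was already converted when it was not (or converts it twice) will see the second value, which is far larger than the packet actually is.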
Re: kernel/userland ssh filesystem for FreeBSD?
On Wed, 11 Dec 2002, Marco Molteni wrote:

> as you might know, both kde (via kio-fish) and gnome (via gnome virtual file system) provide a userland filesystem-like API that allows one to mount a remote filesystem using ssh. What I don't like about those solutions is that they require the application to use a particular API (kio slave or gnome vfs).
>
> Another approach, that provides a real filesystem interface, is the Linux Userspace File System. Quoting from http://lufs.sourceforge.net/lufs/intro.html:
>
>   LUFS is a hybrid userspace filesystem framework supporting an indefinite number of filesystems transparently for any application. It consists of a kernel module and an userspace daemon. Basically it delegates most of the VFS calls to a specialized daemon which handles them.
>
> Now the question: if I wanted to do something similar for FreeBSD, how would I do it? Any high-level hints?

FreeBSD actually includes a module for this very purpose to support the Coda file system, which uses a userspace cache manager to interact with directory services, manage the on-disk local cache, etc. I actually slightly prefer the Arla XFS kernel module, which behaves in an almost identical manner. Both create /dev nodes and communicate their needs via what are effectively RPC upcalls. They both follow the model that a daemon exists in userspace to support a file system mount, and will update the kernel with namespace information, as well as providing references to local cache files. Usually the userland daemon is threaded, and matches worker threads with kernel threads/processes currently blocked in file system activity. I know there was discussion of getting the XFS module to support more than one mountpoint at a time, but I'm not sure if that happened or not. The Arla code is distributed separately from FreeBSD, but there's a port, I believe.
Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories
Re: Some problems about KSE
A commit was made to correct the KSE crash shortly after 5.0-RC1. You can cvsup forward to a newer revision, or wait for RC2.

Robert N M Watson    FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]    Network Associates Laboratories

On Wed, 11 Dec 2002, ouyang kai wrote:

> Hi, everybody,
>
> I want to make sure whether we can write multi-threaded code based on KSE in FreeBSD 5.0-RC1. I have run make in '/usr/src/lib/libpthread', and I found some new things in '/usr/lib' as follows:
>
> lrwxr-xr-x 1 root wheel     11 Dec 11 16:04 libkse.so -> libkse.so.1
> -r--r--r-- 1 root wheel  68780 Dec 11 16:04 libkse.so.1
> -r--r--r-- 1 root wheel 164448 Dec 11 16:04 libkse_p.a
> -r--r--r-- 1 root wheel 153854 Dec 11 16:04 libkse.a
>
> So if I program, how can I use KSE? I can use pthread(3) in the traditional manner, only using '-lpthread' instead of '-pthread' in my makefile, right?
>
> When I use the /usr/src/tools/KSE/ksetest/ksetest program, it always causes my box to crash. I have reported this issue to Julian.
>
> I am reading KSE(2), and I have some puzzles about it.
>
> 1. What does upcall really mean? Is it represented by 'km_func'? If so, who specifies 'km_func': the UTS, the kernel, or the user program? I do not know.
>
> 2. When one process has more than one KSEG, to which KSEG should a signal be delivered? The manual says it is indeterminate. I do not know how the signal could be delivered to a specific KSEG exactly.
>
> Thank you!
>
> Best Regards
> Ouyang Kai