RE: bad RAM? prove it with a crash dump?
On Thu, 6 May 2010, Andrew Duane wrote: It is also useful to make sure that the garbage itself is different. As mentioned before, a single bit error in an otherwise valid value, or maybe a missing/scrambled byte, these are good indications of memory problems. If random places are often overwritten with something else, that could just be another piece of misbehaving code that is writing someplace it shouldn't. I've often found code that writes some buffer into e.g. a piece of memory it no longer owns that looks like memory corruption until you realize the garbage is always something specific like a vnode structure. There are trickier things too. I once had a machine with bad cache memory where once in a while you would get a cache line that had come from somewhere else in memory. This was particularly vexing when it happened to an I/O buffer, and I wound up with a large zip file that had 32 bytes of libc.so somewhere in the middle... :-( And of course, swapping out the RAM wouldn't have fixed it. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Modifying ELF files
On Thu, 8 Apr 2010, Patrick Mahan wrote:

In my job, we are producing applications and KLM's for our product that require them to be signed so that our installer will recognize and validate our images. The signature is stored in each app as

    unsigned char signature[40] __attribute__((section(".compsign")));

What I need to do is open the file for writing, locate the .compsign section and stuff in the signature, write it out and close the file. (simple ELF manipulation) An 'ls -l' shows the following:

    % ls compklm.ko
    -rw-r--r-- 1 pmahan pmahan 125296 Apr 6 22:50 /home/pmahan/temp/compklm.ko

When I try to run my program

    ./signfile --signature=A203239897C8EB360D1EB2C84E8E77B16E5B7C9A compklm.ko
    open: Text file busy

Googling and looking at the kernel sources, it seems that it detects this file contains 'shared text', that is, it is an executable file and does not allow me to open it for writing.

My understanding was that ETXTBSY occurs when you attempt to open for writing a file which is actually being executed, i.e. is mapped into some process. I'm not aware that open(2) actually looks at the file itself to see if it is an executable; that would be very surprising to me.

What does fstat -m compklm.ko say?

What happens if you cp compklm.ko foo.ko and try to sign foo.ko? You should then be able to do mv foo.ko compklm.ko; if compklm.ko is in fact mapped into some process, it will continue to use the original version, which will be kept around (invisibly) until all mappings go away. This is what compilers, install(8), etc, normally do.

Does your signfile program do anything with the target file before open(..., O_RDWR)?

-- Nate Eldredge n...@thatsmathematics.com
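A minimal sketch of the copy-then-rename approach suggested in the reply above, in C. The function name replace_via_copy and the file names used with it are hypothetical; the step that patches the signature into the copy is left as a comment because the thread does not show how signfile does it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy src to tmp, then replace dst with tmp using rename(2).
 * A process that still has the old dst mapped keeps using the old file.
 * Returns 0 on success, -1 on failure.
 */
int replace_via_copy(const char *src, const char *tmp, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    FILE *out = fopen(tmp, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char buf[4096];
    size_t n;
    int ok = 1;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);

    /* ... modify the copy here, e.g. write the signature bytes ... */

    if (fclose(out) != 0)
        ok = 0;

    /* Only swap the new file into place if the copy is complete. */
    if (!ok || rename(tmp, dst) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

Called as replace_via_copy("compklm.ko", "foo.ko", "compklm.ko"), this never opens the possibly busy file for writing; it only reads it and then renames the finished copy over it.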
Re: namei() returns EISDIR for / (Re: svn commit: r203990 - head/lib/libc/sys)
On Sun, 28 Feb 2010, Garrett Cooper wrote:

On Sun, Feb 28, 2010 at 5:11 PM, Alexander Best alexbes...@wwu.de wrote:

i have a small test app to check {rm|mk}dir()'s errnos with certain args like /, ., /proc and non-empty dirs. i'll submit it to this thread as soon as i also add testcases for syscalls like rename(), unlink(), etc. most of the errno codes returned after applying your patch look correct. i wonder however why rmdir("/proc") returns EACCES as unprivileged user. wouldn't it make more sense to also return EBUSY? why complain about permission related matters when even root won't be able to perform the operation.

Hmm.. good question. POSIX doesn't fully expound on this case (http://www.opengroup.org/onlinepubs/009695399/functions/rmdir.html), and either seems possible...

At: http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_03.html#tag_02_03 we have "If more than one error occurs in processing a function call, any one of the possible errors may be returned, as the order of detection is undefined." So we're okay standard-wise.

In general, though, I'd think it makes sense to do permissions checks before anything else, because in some cases the error code can leak information. For instance, if you try to open() a nonexistent file in a directory for which you don't have search permission ('x' bit), it's very important that open() fail with EACCES instead of ENOENT, since you aren't supposed to be able to find out whether or not the file exists. Obviously that doesn't apply in this case, because anyone is entitled to know that /proc is the root of a mounted filesystem, but it seems to me that it's a good habit to check permission first.

-- Nate Eldredge n...@thatsmathematics.com
RE: ntpd hangs under FBSD 8
On Mon, 22 Feb 2010, Peter Steele wrote: Just out of curiosity, can you attach to the process via gdb and get a backtrace? This smells like a locked pthread_join I hit in my own code a few weeks ago I'm not using the debug version of ntpd so the backtrace isn't too useful, but here's what I get: (gdb) bt #0 0x000800d52bfc in select () from /lib/libc.so.7 #1 0x00425273 in ?? () #2 0x0040540e in ?? () #3 0x00080058 in ?? () #4 0x in ?? () I bet ntpd doesn't call select() in all that many places. Instead of going to all this trouble to build a debugging libc, you could just grep for select() and place breakpoints on all occurrences. (It might also be obvious from looking at them which one is the offender.) Also, since a system call is causing the trouble, you might learn something from truss or ktrace. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: NFS write corruption on 8.0-RELEASE
On Fri, 12 Feb 2010, Dmitry Marakasov wrote: * Oliver Fromme (o...@lurza.secnetix.de) wrote: This is an excerpt from Solaris' mount_nfs(1M) manpage: File systems that are mounted read-write or that con- tain executable files should always be mounted with the hard option. Applications using soft mounted file systems may incur unexpected I/O errors, file corrup- tion, and unexpected program core dumps. The soft option is not recommended. FreeBSD's manual page doesn't contain such a warning, but maybe it should. (It contains a warning not to use soft with NFSv4, though, for different reasons.) Interesting, I'll try disabling it. However now I really wonder why is such dangerous option available (given it's the cause) at all, especially without a notice. Silent data corruption is possibly the worst thing to happen ever. Tell me about it. :) But in this case I'm not sure I understand. As I understand it, the difference between soft and hard is that in the case of soft, a timeout will result in the operation failing and returning EIO or the like (hence unexpected I/O errors). And if the operation is being done to fault in a mapped page, you'd have to notify the process asynchronously by sending a signal like SIGBUS which it may not be expecting (hence unexpected core dumps). But in what scenario would you see file corruption? Unless you have a buggy program that doesn't check return values from system calls or handles signals in a stupid way, I don't see how this can happen, and I'm not sure what the Sun man page is referring to. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
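As the reply above describes it, a timeout on a soft mount reaches the program as a failed system call, so a caller that checks every return value sees the error instead of silently losing data. A minimal sketch of such a caller in C; the helper name write_all is hypothetical, and the EINTR retry is an addition not discussed in the thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write exactly len bytes to fd, handling short writes.
 * Returns 0 on success, or -1 with errno set (for example EIO)
 * so the caller can report the failure instead of ignoring it.
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            return -1;      /* real error: propagate it */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A program that funnels its output through a helper like this and aborts on -1 would fall under "unexpected I/O errors" rather than file corruption.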
Re: ptrace bug or feature?
On Sun, 17 Jan 2010, Kostik Belousov wrote: It may be a missed feature, not a bug. There is obvious hack value in ability to modify syscall arguments from the debugger. Do you know whether other operating systems allow this ? Linux does, I've used it. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Suggestion: rename killall to fkill, but wait five years to phase the new name in
On Tue, 22 Dec 2009, Craig Small wrote: I also agree with Daniel; why would anyone want to literally kill every process? AFAIK, it's a helper program for shutdown(8) (or shutdown(1M) as they call it) and isn't really intended to be useful otherwise. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Email sent from at command going to the wrong account
On Mon, 14 Dec 2009, Holger Kunst wrote: Hi, The at command sends an email with the output of the scheduled job. I've experienced inconsistent results when running jobs, receiving emails in accounts not associated with the user currently logged in. To reproduce in FreeBSD 7.2-RELEASE-p2 Case #1 login as user a (new shell through ssh) echo echo 1 | at now -- user a will receive an email containing 1 - this is as expected Case #2 login as user a (new shell through ssh) login as user b How are you accomplishing this? exit echo echo 1 | at now -- user b will receive an email containing 1 - this is not as expected, since I am user a again A look at the source for at reveals that at is getting the mailname from getlogin(). Running a small test program that outputs getlogin(), confirms the above behavior: A log-in and out of another account makes getlogin() return that account's name, even though the shell has been closed and we are back to the original shell and the original user a. Is this the intended behavior? Any hints would be appreciated. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
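A sketch of the kind of small test program mentioned above, in C. It prints what getlogin() reports next to the name derived from the real uid. The comparison with getpwuid(getuid()) and all function names here are additions for illustration; the thread only says that at uses getlogin().

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Name recorded for the login session; may be NULL if none is available. */
const char *session_login_name(void)
{
    return getlogin();
}

/* Name looked up from the real uid of this process; NULL if no passwd entry. */
const char *uid_login_name(void)
{
    struct passwd *pw = getpwuid(getuid());
    return pw != NULL ? pw->pw_name : NULL;
}

/* Print both names so a mismatch between them is easy to spot. */
void print_login_names(void)
{
    const char *s = session_login_name();
    const char *u = uid_login_name();

    printf("getlogin():          %s\n", s != NULL ? s : "(none)");
    printf("getpwuid(getuid()):  %s\n", u != NULL ? u : "(none)");
}
```

Running print_login_names() in the original shell before and after the second login would show whether the two names diverge in the way the report describes.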
Re: Superpages on amd64 FreeBSD 7.2-STABLE
On Thu, 10 Dec 2009, Linda Messerschmidt wrote:

Also... On Thu, Dec 10, 2009 at 9:50 AM, Bernd Walter ti...@cicely7.cicely.de wrote:

I use fork myself, because it is easier sometimes, but people writing big programs such as squid should know better. If squid doesn't use vfork they likely have a reason.

Actually they are probably going to switch to vfork(). They were previously not using it because they thought there was some ambiguity about whether it was going to be around long term.

Well, the worst that would likely happen to vfork() is it would become an alias of fork(), and you'd be back to where you are now (or better if fork() were fixed in the meantime). I'd be more worried about the mysterious bugs which it's so easy to introduce with vfork() if you do anything at all nontrivial before exec() and accidentally touch the parent's memory. What about using posix_spawn(3)? This is implemented in terms of vfork(), so you'll gain the same performance advantages, but it avoids many of vfork's pitfalls. Also, since it's a POSIX standard function, you needn't worry that it will go away or change its semantics someday.

I actually am not a huge fan of vfork() since it stalls the parent process until the child exec()'s.

If you're doing so much work between vfork() and exec() that this delay is significant, then I would think you're really abusing vfork().

To me, this case actually highlights why that's an issue. If the explanation is that stuff happening in the parent process between fork() and the child's exec() causes the fragmentation, that's stuff that would be deferred in a vfork() regime, with unknown potential consequences. (At a minimum, decreased performance.)

Not necessarily. In the fork() case, presumably copy-on-write is to blame for the fragmentation. In the vfork() case, there's no copy at all.
-- Nate Eldredge n...@thatsmathematics.com
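A minimal sketch of the posix_spawn(3) suggestion above, in C, using the PATH-searching variant posix_spawnp. The wrapper name spawn_and_wait is hypothetical, and no file actions or spawn attributes are set, which a program like squid would presumably need for its pipes.

```c
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Start argv[0] (looked up on PATH) without calling fork() directly,
 * then wait for it to finish.
 * Returns the child's exit status, or -1 if it could not be started
 * or did not exit normally.
 */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* NULL file actions and attributes: the child inherits everything. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because there is no user code running between the spawn and the exec, the "touch the parent's memory by accident" class of vfork() bugs cannot be written here.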
Re: ucred when euid/egid
On Sun, 29 Nov 2009, Clifton Royston wrote:

On Sun, Nov 29, 2009 at 01:19:02PM +0300, Anthony Pankov wrote:

Thank you for the reply. So, seteuid/gid isn't enough to gain group access as for the real uid. But how can i achieve this? What functions should i call from 'theprog' to gain access for the groups the euid user belongs to? Maybe i am solving the problem in the wrong way? The full problem is:

There is a file owned by group filegroup:

    rw-rw someone:filegroup thefile

There is program data owned by group proggroup:

    rw-rw someone2:proggroup progdata

I need a program (theprog) that can access 'thefile' and 'progdata' simultaneously. The program can be executed by anyone.

This is a clearer statement of the problem, in terms of what you're trying to accomplish. If you can make the program data owned by a special program user, and require the users of the program to make their files group-accessible by this special filegroup, then you can do it fairly simply, like this:

Make each user's thefile be owned by group filegroup, for example:

    rw-rw someone:filegroup ~someone/thefile
    rw-rw someone2:filegroup ~someone2/thefile
    rw-rw someone3:filegroup ~someone3/thefile
    ...

Make the program's data file owned by *user* proguser:

    rw-rw proguser:proggroup progdata

Now you can make the program setuid proguser/setgid filegroup:

    r-sr-sr-x proguser:filegroup theprog

This lets it be executed by any user and access its own data (via the suid) and the files the users have put into filegroup (via the sgid).

If you can't make progdata owned by proguser, or if more groups are needed, you might be able to abuse newgrp(1), which will let you run a program with your real and effective gids set to any specified group of which your real uid is a member. This would require, though, that you break the code that requires access to those files into separate programs. (Though maybe they are as simple as cat'ing a file into a pipe or something.)
Example:

    setuid(proguser);
    FILE *data = popen("echo \"cat progdata\" | newgrp proggroup", "r");
    /* read data */

etc. If your program needs to do something really elaborate with the files that can't be factored out into a separate program, you could use newgrp to run a program that opens the file and passes its fd over a unix socket. But then it's really becoming a hack. :) Caution: I haven't tested any of this.

-- Nate Eldredge n...@thatsmathematics.com
Re: [patch] burncd: honour for envar SPEED
On Tue, 10 Nov 2009, Alexander Best wrote: ps: would be nice if strcasecmp could protect itself from segfault with one or both of the args being NULL. I disagree. What do you think it should do instead? Return 0? If it did, would you have found your bug? The same argument could be made for any of the string.h functions, but I don't think it actually holds water. Such checks add overhead, and only provide an illusion of safety. Sure, strcasecmp could avoid causing the segfault itself, but at the cost of letting a broken program continue and possibly cause more damage. It could call abort(), but then you'd just have the same result (program terminates) with a different signal, and doing your check in software rather than letting the MMU hardware do it. It could print a message, but that pollutes the program's output, and 15 seconds debugging the core dump will reveal the problem anyway. Having a library function protect itself in this manner is not actually helpful, IMHO. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
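The position argued above is that the caller, not strcasecmp(), should deal with a possibly NULL argument. A minimal sketch in C, assuming the NULL came from getenv() on an unset variable (the thread subject mentions the SPEED environment variable, but the patch itself is not shown here); the helper name env_equals_nocase is hypothetical.

```c
#include <stdlib.h>
#include <strings.h>

/*
 * Return 1 if environment variable `name` is set and equals `expected`
 * ignoring case, 0 otherwise. The NULL check happens here, in the
 * caller, so strcasecmp() never sees a NULL pointer.
 */
int env_equals_nocase(const char *name, const char *expected)
{
    const char *value = getenv(name);

    if (value == NULL)
        return 0;   /* variable not set: nothing to compare */

    return strcasecmp(value, expected) == 0;
}
```

With this shape, an unset variable is an ordinary, explicit case in the program's logic rather than something the library has to guess about.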
Re: help needed to fix contrib/ee crash/exit when receiving SIGWINCH
On Fri, 23 Oct 2009, Antony Mawer wrote: On Fri, Oct 23, 2009 at 1:35 PM, Alexander Best alexbes...@math.uni-muenster.de wrote: hi everyone, together with hugh mahon (the author of ee) i've been trying to fix a nasty bug in ee. for some reason ee exits (not crashes) and leaves the console corrupted when receiving SIGWINCH (`killall -SIGWINCH ee` should exit all running ee instances). I noticed this the other day when working on a new 8.0-RC1 system... in my case I was using putty (Windows ssh client) to access the system and maximised the window I had ee running in, and noticed ee just dumped me straight to the prompt. Seems a good start might be to compile ncurses with -g, link ee against it, put a breakpoint on the SIGWINCH handler, and start single stepping... -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: mmap(2) segfaults with certain len values and MAP_ANON|MAP_FIXED
On Wed, 21 Oct 2009, Alexander Best wrote: hi there, This is on a 32-bit platform I take it? just a little mmap(2) related question. running the following code causes a segfault: mmap( (void*)0x1000, 0x80047000, PROT_NONE, MAP_ANON|MAP_FIXED, -1, 0 ); I don't doubt it. You mapped over a big chunk of your address space with memory that's inaccessible (PROT_NONE). This probably includes your program's code. So when the mmap call returns from the kernel and tries to execute the next instruction of your program, it finds that the instruction pointer is pointing to inaccessible memory. Result: segfault. This is quite normal. What are you actually trying to accomplish with this? while the following doesn't: mmap( (void*)0x1000, 0x, PROT_NONE, MAP_ANON|MAP_FIXED, -1, 0 ); Did you check whether the mmap actually succeeded? I bet it didn't. You have a length that isn't a multiple of the page size and wraps around 32 bits. I bet you got an EINVAL, and the mmap call didn't actually do anything. is this a known problem? seems reproducible on all branches. Not a problem at all, I suspect. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Running a program through gdb without interfering
On Fri, 9 Oct 2009, Mel Flynn wrote:

On Friday 09 October 2009 11:38:29 Dag-Erling Smørgrav wrote:

Mel Flynn mel.flynn+fbsd.hack...@mailing.thruhere.net writes: is there a way to have a program run through gdb and gdb only record a segfault, but otherwise let the program run?

Yes, just run gdb /path/to/program and type run.

Not what I was looking for. The segfaults are random and the only way to somewhat reliably reproduce it is to have portmaster invoke it as its PM_SU_CMD. And no, running that same command again doesn't trigger the segfault, so it's something environmental. Hence I'm looking for something like:

    gdb -batch -x script_with_run_cmd.gdb -exec /usr/local/bin/sudo $argv

where somehow I need $argv to be passed as arguments to sudo. I'm thinking i should just wrap it and mktemp(1) a new command script for gdb to use with set args $*, but if anyone has a more clever idea, I'd love to hear it.

This won't work. You can't debug setuid programs (for reasons which should be obvious). You could do it if you ran everything as root, but it sounds like the bug doesn't occur in that case.

-- Nate Eldredge n...@thatsmathematics.com
Re: genuine cpu I386_CPU kernel support
On Wed, 23 Sep 2009, John Baldwin wrote:

On Wednesday 23 September 2009 1:21:59 pm Julian Elischer wrote:

John Baldwin wrote: Other things added since then assume at least a 486. Not having cmpxchg is a bit of a killer.

I think a 386 can assume non-SMP in which case that can be simulated just fine :-) it also simplifies a lot of the other breakages..

    #if (CPU == 80386) && defined(SMP)
    #error can't have smp on a 386
    #endif

No, it actually does not. The in-kernel version of cmpset for 386 was to disable interrupts while doing a cmp and jmp around a mov (even 386's have preemption, so you do have to disable interrupts). You can't do that in userland (cli is a privileged instruction), which probably mandates doing a cmpxchg emulator in the kernel for userland code. That and disabling interrupts is actually far less efficient than spl() for a UP 80386 machine. I suspect newer kernels will run slower on an 80386 than 4.x.

Another issue that I know affected Linux is that the 386 would allow kernel code (CPL 0) to write to a page that was marked read-only. The 486 and later would generate a page fault. Linux takes advantage of the 486 behavior to avoid having to do explicit access checks when copying to user space, though AFAIK it checks the CPU at boot time to decide if this can be done. I haven't checked whether FreeBSD uses this feature, but it would be another thing to watch out for.

-- Nate Eldredge n...@thatsmathematics.com
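To make the cmpset discussion above concrete, here is a sketch in C of what the operation does logically, next to a C11 atomics version. Both function names are hypothetical and this is not FreeBSD's atomic_cmpset code. The plain version is the "cmp and jmp around a mov" sequence; it is only safe if nothing can interrupt it, which is why the 386 kernel path had to disable interrupts.

```c
#include <stdatomic.h>

/*
 * Logical behaviour of compare-and-set: if *dst == expect, store src
 * and return 1; otherwise leave *dst alone and return 0.
 * NOT atomic as written: a preemption between the compare and the
 * store can lose an update.
 */
int cmpset_plain(volatile unsigned int *dst, unsigned int expect,
                 unsigned int src)
{
    if (*dst != expect)
        return 0;
    *dst = src;
    return 1;
}

/*
 * The same operation via C11 atomics, which on processors with a
 * compare-and-exchange instruction needs no interrupt masking.
 */
int cmpset_atomic(atomic_uint *dst, unsigned int expect, unsigned int src)
{
    return atomic_compare_exchange_strong(dst, &expect, src);
}
```

Userland code cannot make cmpset_plain safe by masking interrupts, which is the point made above about needing an in-kernel emulator on a real 386.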
Re: genuine cpu I386_CPU kernel support
On Tue, 22 Sep 2009, John Baldwin wrote: My comment is to just use 4.x (seriously). A true 386 is going to be quite slow and the overhead of many things added that work well on newer processors is going to be very painful on a 386 (probably on a 486 as well). 4.x runs fine on a 386 and should support all the hardware you can stick into a machine with an 80386 CPU. Unless, of course, you plan to put it on a network. I doubt that 4.x is up to date with respect to security patches. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: ZFS group ownership
On Sat, 12 Sep 2009, Giulio Ferro wrote: I don't know if this is the correct list to discuss this matter, if not I apologize in advance. freebsd-questions might have been better, but I don't think you're too far off. It wasn't necessary to post three times though :) [On UFS, files are created with the same group as the directory that contains them. On ZFS, they are created with the primary group of the user who creates them.] What I ask now is: is this a bug or a feature? Both, I think :) The behavior you describe on UFS (group comes from the directory) is standard for BSD-based systems like FreeBSD. On SysV-based systems, however, the default is that the group comes from the user, as you describe on ZFS. ZFS was originally developed for Solaris, a descendent of SysV, so it's not surprising that it also has this behavior. However, this is at least a documentation bug, since the open(2) man page describes the BSD behavior without mentioning exceptions. How can I achieve my goal in ZFS, that is allowing members of the same group to operate with the files / dirs they create? On SysV, you can get BSD-type behavior by setting the sgid bit on the directory in question, e.g. chmod g+s dir. Then new files will inherit their group from the directory. I suspect this will work on FreeBSD/ZFS too even though chmod g+s on a directory is undocumented. -- Nate Eldredge n...@thatsmathematics.com ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Intermittent system hangs on 7.2-RELEASE-p1
On Wed, 26 Aug 2009, Linda Messerschmidt wrote: I'm trying to troubleshoot an intermittent Apache performance problem, and I've narrowed it down using to what appears to be a brief whole-system hang that last from 0.5 - 3 seconds. They occur every few minutes. One thought would be to use ps to try to determine which process, if any, is charged with CPU time during the hang. If you could afford a little downtime, it would be worth seeing if the hang occurs in single-user mode (perhaps with a simple program that loops calling gettimeofday() and warns when the time between successive iterations is large). I once had a problem like this that I eventually traced to a power management problem. (Specifically, the machine had a modem, and would hang for a few seconds whenever the line would ring. It was apparently related to the Wake-On-Ring feature.) If I remember correctly, disabling ACPI made it go away. So that might be something to try, if rebooting is an option. What are the similarities and differences in hardware and software among the affected machines (you mentioned there were several)? -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
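A minimal sketch in C of the "simple program that loops calling gettimeofday()" suggested above. The function names, the iteration count and the threshold are all hypothetical choices; a real run would loop indefinitely with a threshold somewhere below the 0.5 second hangs being hunted.

```c
#include <stdio.h>
#include <sys/time.h>

/* Microseconds elapsed from *a to *b. */
long long elapsed_us(const struct timeval *a, const struct timeval *b)
{
    return (long long)(b->tv_sec - a->tv_sec) * 1000000LL
         + (long long)(b->tv_usec - a->tv_usec);
}

/*
 * Call gettimeofday() in a tight loop and warn whenever the gap between
 * two successive iterations exceeds threshold_us.
 * Returns the number of such stalls seen.
 */
long watch_for_stalls(long iterations, long long threshold_us)
{
    struct timeval prev, now;
    long stalls = 0;

    gettimeofday(&prev, NULL);
    for (long i = 0; i < iterations; i++) {
        gettimeofday(&now, NULL);
        long long gap = elapsed_us(&prev, &now);
        if (gap > threshold_us) {
            fprintf(stderr, "stall: %lld us between iterations\n", gap);
            stalls++;
        }
        prev = now;
    }
    return stalls;
}
```

If this reports stalls even in single-user mode, the hang is unlikely to be caused by Apache or any other userland process.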
Re: mmap/munmap with zero length
On Sun, 5 Jul 2009, Alexander Best wrote: i'm wondering why mmap and munmap behave differently when it comes to a length argument of zero. allocating memory with mmap for a zero length file returns a valid pointer to the mapped region. munmap however isn't able to remove a mapping with no length. wouldn't it be better to either forbid this in mmap or to allow it in munmap? POSIX has an opinion: http://www.opengroup.org/onlinepubs/9699919799/functions/mmap.html If len is zero, mmap() shall fail and no mapping shall be established. http://www.opengroup.org/onlinepubs/9699919799/functions/munmap.html The munmap() function shall fail if: ... [EINVAL] The len argument is 0. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Wed, 20 May 2009, Yuri wrote:

Seems like failing system calls (mmap and sbrk) that allocate memory is more graceful and would allow the program to at least issue the reasonable error message. And more intelligent programs would be able to reduce used memory instead of just dying.

It's a feature, called memory overcommit. It has a variety of pros and cons, and is somewhat controversial. One advantage is that programs often allocate memory (in various ways) that they will never use, which under a conservative policy would result in that memory being wasted, or programs failing unnecessarily. With overcommit, you sometimes allocate more memory than you have, on the assumption that some of it will not actually be needed.

Although memory allocated by mmap and sbrk usually does get used in fairly short order, there are other ways of allocating memory that are easy to overlook, and which may allocate memory that you don't actually intend to use. Probably the best example is fork(). For instance, consider the following program.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define SIZE 1000000000 /* 1 GB */

    int main(void)
    {
        char *buf = malloc(SIZE); /* 1 GB */
        memset(buf, 'x', SIZE); /* touch the buffer */
        pid_t pid = fork();
        if (pid == 0) {
            execlp("true", "true", (char *)NULL);
            perror("true");
            _exit(1);
        } else if (pid > 0) {
            for (;;); /* do work */
        } else {
            perror("fork");
            exit(1);
        }
        return 0;
    }

Suppose we run this program on a machine with just over 1 GB of memory. The fork() should give the child a private copy of the 1 GB buffer, by setting it to copy-on-write. In principle, after the fork(), the child might want to rewrite the buffer, which would require an additional 1GB to be available for the child's copy. So under a conservative allocation policy, the kernel would have to reserve that extra 1 GB at the time of the fork(). Since it can't do that on our hypothetical 1+ GB machine, the fork() must fail, and the program won't work.
However, in fact that memory is not going to be used, because the child is going to exec() right away, which will free the child's copy. Indeed, this happens most of the time with fork() (but of course the kernel can't know when it will or won't.) With overcommit, we pretend to give the child a writable private copy of the buffer, in hopes that it won't actually use more of it than we can fulfill with physical memory. If it doesn't use it, all is well; if it does use it, then disaster occurs and we have to start killing things. So the advantage is you can run programs like the one above on machines that technically don't have enough memory to do so. The disadvantage, of course, is that if someone calls the bluff, then we kill random processes. However, this is not all that much worse than failing allocations: although programs can in theory handle failed allocations and respond accordingly, in practice they don't do so and just quit anyway. So in real life, both cases result in disaster when memory runs out; with overcommit, the disaster is a little less predictable but happens much less often. If you google for memory overcommit you will see lots of opinions and debate about this feature on various operating systems. There may be a way to enable the conservative behavior; I know Linux has an option to do this, but am not sure about FreeBSD. This might be useful if you are paranoid, or run programs that you know will gracefully handle running out of memory. IMHO for general use it is better to have overcommit, but I know there are those who disagree. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Thu, 21 May 2009, per...@pluto.rain.com wrote: Nate Eldredge neldre...@math.ucsd.edu wrote: With overcommit, we pretend to give the child a writable private copy of the buffer, in hopes that it won't actually use more of it than we can fulfill with physical memory. I am about 99% sure that the issue involves virtual memory, not physical, at least in the fork/exec case. The incidence of such events under any particular system load scenario can be reduced or eliminated simply by adding swap space. True. When I said a system with 1GB of memory, I should have said a system with 1 GB of physical memory + swap. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Thu, 21 May 2009, Yuri wrote:

Nate Eldredge wrote: Suppose we run this program on a machine with just over 1 GB of memory. The fork() should give the child a private copy of the 1 GB buffer, by setting it to copy-on-write. In principle, after the fork(), the child might want to rewrite the buffer, which would require an additional 1GB to be available for the child's copy. So under a conservative allocation policy, the kernel would have to reserve that extra 1 GB at the time of the fork(). Since it can't do that on our hypothetical 1+ GB machine, the fork() must fail, and the program won't work.

I don't have a strong opinion for or against memory overcommit. But I can imagine one could argue that fork with intent of exec is a faulty scenario that is a relic from the past. It can be replaced by some atomic method that would spawn the child without overcommitting.

I would say rather it's a centerpiece of Unix design, with an unfortunate consequence. Actually, historically this would have been much more of a problem than at present, since early Unix systems had much less memory, no copy-on-write, and no virtual memory (this came in with BSD, it appears; it's before my time.)

The modern atomic method we have these days is posix_spawn, which has a pretty complicated interface if you want to use pipes or anything. It exists mostly for the benefit of systems whose hardware is too primitive to be able to fork() in a reasonable manner. The old way to avoid the problem of needing this extra memory temporarily was to use vfork(), but this has always been a hack with a number of problems. IMHO neither of these is preferable in principle to fork/exec.

Note another good example is a large process that forks, but the child rather than exec'ing performs some simple task that writes to very little of its copied address space. Apache does this, as Bernd mentioned. This also is greatly helped by having overcommit, but can't be circumvented by replacing fork() with something else.
If it really doesn't need to modify any of its shared address space, a thread can sometimes be used instead of a forked subprocess, but this has issues of its own. Of course all these problems are solved, under any policy, by having more memory or swap. But overcommit allows you to do more with less. Are there any other than fork (and mmap/sbrk) situations that would overcommit? Perhaps, but I can't think of good examples offhand. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
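For readers who have not used posix_spawn, here is a minimal sketch of the simplest case: no pipes, no file actions, no attributes. The helper name spawn_and_wait is made up for illustration; only posix_spawn and waitpid are the real POSIX interfaces.

```c
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper: start argv[0] with arguments argv and return its
 * exit status, or -1 if it could not be started or did not exit normally.
 */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    /* NULL file actions and attributes: the child inherits our state. */
    if (posix_spawn(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The interface gets noticeably heavier once descriptors have to be rearranged, since every dup2 or close must be described in advance through a posix_spawn_file_actions_t object rather than simply executed in the child.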
Re: C99: Suggestions for style(9)
On Tue, 28 Apr 2009, Peter Jeremy wrote: On 2009-Apr-26 09:02:36 +0200, Christoph Mallon christoph.mal...@gmx.de wrote: as some of you may have noticed, several years ago a new millenium started and a decade ago there was a new C standard. Your implication that FreeBSD is therefore a decade behind the times is unfair. Whilst the C99 standard was published a decade ago, compilers implementing that standard are still not ubiquitous. HEAD recently switched to C99 as default (actually gnu99, but that's rather close). Note that gcc 4.2 (the FreeBSD base compiler) states that it is not C99 compliant. However, if you take a look at http://gcc.gnu.org/gcc-4.2/c99status.html , you will see that it is very close. The vast majority of C99 features are implemented and working correctly. Even those which are marked as broken generally work in most cases, and fail only in rather obscure corner cases that real programs are unlikely to encounter. In particular, the features Christoph proposes to use work fine. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Bug in tcp wrappers?
On Sun, 15 Mar 2009, Mikko Työläjärvi wrote:

The real fix involves rewriting chunks of the libwrap code, or finding a version where someone has already done so.

It doesn't seem like it should be too bad. xgets is only called in three places. It would be easy enough to replace it with something like glibc's getline(3), which uses realloc to size a buffer appropriately. If nobody else feels like doing this, maybe I will.

-- Nate Eldredge neldre...@math.ucsd.edu
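As a rough illustration of the getline(3) approach, the sketch below reads lines of arbitrary length without a fixed-size buffer. It is not the libwrap code; longest_line is a made-up helper, and it assumes a libc that provides the POSIX.1-2008 getline.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return the length of the longest line in fp,
 * not counting the trailing newline.
 */
size_t longest_line(FILE *fp)
{
    char *line = NULL;   /* getline allocates and grows this with realloc */
    size_t cap = 0;      /* current capacity of line, managed by getline */
    size_t longest = 0;
    ssize_t len;

    assert(fp != NULL);

    while ((len = getline(&line, &cap, fp)) != -1) {
        if (len > 0 && line[len - 1] == '\n')
            len--;
        if ((size_t)len > longest)
            longest = (size_t)len;
    }
    free(line);          /* one buffer is reused for every line */
    return longest;
}
```

The caller never picks a maximum line length, which is the property a fixed-buffer reader lacks.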
Re: Debugging init process.
On Tue, 10 Mar 2009, vasanth raonaik wrote: Hello Team, I need to debug init process. I am not able to attach init to gdb and it throws As others mentioned, this is explicitly disabled. You could re-enable it by hacking the kernel, but it could cause other unexpected problems. Alternatively, there's always printf debugging. What is wrong with init, that you need to debug it? It's a fairly simple program that's been around for a long time and should be pretty stable. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Spin down HDD after disk sync or before power off
On Thu, 5 Mar 2009, Tobias Blersch wrote: Oliver Fromme wrote: Joerg Sonnenberger wrote: This is not true. Many hard disks don't like having to do an emergency shutdown as it affects the disk life time negatively. That's what happens if you poweroff the machine when the disks are still spinning. Can you point to any authoritative information (URL) about that claim, such as vendor specs, white paper or similar? http://www.hitachigst.com/tech/techlib.nsf/techdocs/28DCCB17E0EEC5A086256F4E006E2F5B Thats the specification for my notebooks hard drive. Section 6.6 Reliability gives data about how to power-off the disk. It also contains numbers of supported load/unloads and emergency unloads. Emergency unloads are invoked when the heads are still loaded and power fails. Ok, I didn't know that. There are some drives that can unload the heads normally on power loss and don't need any special handling, and I was under the mistaken impression that this was universal. But the documentation suggests that this should be a BIOS function. When the kernel tries to poweroff the system, isn't that normally done via the BIOS (perhaps with ACPI/APM)? So maybe the BIOS is supposed to unload the heads (by sending a standby/sleep command) before cutting the power. This makes sense in some ways. Suppose the drive is attached to a weird ATA controller that FreeBSD doesn't know anything about. (Maybe it's used by the other system in a dual-boot setup.) There's no way that FreeBSD could send it a power-down sequence, but the BIOS could. Perhaps the OP's BIOS for some reason doesn't do this correctly. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: ln: posixly confused
On Tue, 3 Mar 2009, Andriy Gapon wrote: Test case. Preparation: $ mkdir linktest $ cd linktest $ mkdir some_dir $ mkdir other_dir The test: $ ln -s some_dir the_link $ ln -s -f other_dir the_link Expected: the_link points to other_dir. Actual result: some_dir contains symlink other_dir - other_dir. From ln(1): SYNOPSIS ln [-s [-F]] [-f | -iw] [-hnv] source_file [target_file] ln [-s [-F]] [-f | -iw] [-hnv] source_file ... target_dir I thought that only true directory would trigger the second form. I thought that the second argument being a symlink (to a file or to a directory) should trigger the first form. I also read this: http://www.opengroup.org/onlinepubs/009695399/utilities/ln.html I think that the text there (and in ln(1)) implies what I expected, but this is not spelled out clearly. FWIW, Linux and Solaris have the same behavior as FreeBSD. The standard says the second form is triggered if the second argument names an existing directory. An informative note in the symlink() specification at http://www.opengroup.org/onlinepubs/009695399/functions/symlink.html says a symbolic link allows a file to have multiple logical names. Therefore, I think it's a fair interpretation to say that a symbolic link to an existing directory names it. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
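The distinction can be seen from C with stat(2), which follows symbolic links, and lstat(2), which does not. The sketch below recreates part of the test case in a temporary directory; all helper names are made up for illustration, and it shows only what the system calls report, not how ln(1) is implemented.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* 1 if path, after following symlinks, is a directory; 0 if not; -1 on error. */
int names_directory(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (stat(path, &sb) != 0)
        return -1;
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}

/* 1 if path itself is a symbolic link; 0 if not; -1 on error. */
int is_symlink(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (lstat(path, &sb) != 0)
        return -1;
    return S_ISLNK(sb.st_mode) ? 1 : 0;
}

/* Build some_dir and the_link in a scratch directory, inspect, clean up. */
int linktest(void)
{
    char dir[] = "/tmp/linktestXXXXXX";
    char target[64], link[64];
    int result;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(target, sizeof target, "%s/some_dir", dir);
    snprintf(link, sizeof link, "%s/the_link", dir);

    if (mkdir(target, 0700) != 0 || symlink("some_dir", link) != 0)
        result = -1;
    else
        result = is_symlink(link) == 1 && names_directory(link) == 1;

    unlink(link);
    rmdir(target);
    rmdir(dir);
    return result;
}
```

Here the_link is a symlink and also resolves to an existing directory, which is the sense of "names" used in the interpretation above.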
portupgrade spurious skips
Hi folks, In the past few months I've noticed a bug in portupgrade. When I update my ports tree and do `portupgrade -a', often a few ports will be skipped, supposedly because another port on which they depend failed to install. However, the apparently failed port actually did not fail, and if I rerun `portupgrade -a', some of the skipped ports will install successfully without complaint. After enough iterations I can eventually get all of them. I'd like to file a PR about this, but it's a little bit tricky coming up with a test case, since the behavior depends on having outdated packages installed, and on the dependencies between them. Moreover, after I run `portupgrade -a' and notice the problem, the state of the installed packages has changed and the same packages aren't skipped the next time. So my question is whether anyone has ideas about how to construct a reasonable test case that could help me make this reproducible and easier to investigate. Any thoughts? Thanks in advance. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: shmmax tops out at 2G?
On Tue, 24 Feb 2009, Garrett Cooper wrote: On Mon, Feb 23, 2009 at 12:16 PM, Bill Moran wmo...@potentialtech.com wrote: In response to Christian Peron c...@freebsd.org: On Mon, Feb 23, 2009 at 11:58:09AM -0800, Garrett Cooper wrote: [..] Why isn't the field an unsigned int / size_t? I don't see much value in having the size be signed... No idea :) This code long predates me. It's that way because the original Sun spec for the API said so. It makes little sense to change it just to unsigned. The additional 2G it would give doesn't really solve the tuning problem on a 64G system. This is simply a spec that has become outdated by modern hardware. Ah, but an unsigned integer on a 64-bit system supports that kind of precision ;). Or are you saying you're crazy enough to run PAE mode with that much RAM 0-o? int and unsigned on amd64 are 32-bit types. To get a 64-bit integer, you need (unsigned) long. -- Nate Eldredge neldre...@math.ucsd.edu___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
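A small sketch of the arithmetic, assuming a 32-bit int as on i386 and amd64. The function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Nonzero if a byte count can be stored in the given type. */
int fits_in_int(unsigned long long bytes)
{
    return bytes <= (unsigned long long)INT_MAX;
}

int fits_in_unsigned(unsigned long long bytes)
{
    return bytes <= UINT_MAX;
}

/* Illustrative checks; they hold wherever int is 32 bits wide. */
void check_examples(void)
{
    assert(fits_in_int((2ULL << 30) - 1));   /* 2 GiB - 1: the signed ceiling */
    assert(!fits_in_int(2ULL << 30));        /* 2 GiB no longer fits */
    assert(fits_in_unsigned(3ULL << 30));    /* unsigned buys up to 4 GiB - 1 */
    assert(!fits_in_unsigned(64ULL << 30));  /* still far short of 64 GiB */
}
```

Switching the field to unsigned only moves the ceiling from 2 GiB to 4 GiB; a 64-bit type is needed to describe a segment on the order of the 64G system mentioned above.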
Re: pahole - Finding holes in kernel structs
On Thu, 12 Feb 2009, Marcel Moolenaar wrote: On Feb 12, 2009, at 8:34 AM, Jille Timmermans wrote: Julian Stacey schreef: 1) Is it worth my time trying to rearrange structs? I wondered whether as a sensitivity test, some version of gcc (or its competitor ?) might have capability to automatically re-order variables ? but found nothing in man gcc Optimization Options. There is a __packed attribute, I don't know what it exactly does and whether it is an improvement. __packed is always a gross pessimization. The side-effect of packing a structure is that the alignment of the structure drops to 1. That means that any field will be read 1 byte at a time and reconstructed by logical operations. The other alternative is to read/write that member by unaligned operations, on platforms that support it. This also typically comes with a performance penalty, of course. Usually it means the hardware reads the two words that overlap the member and pieces it back together. But on such a platform the software does not need to handle it specially; it executes the same instruction, but it takes more time. The only reason to use this would be (1) if you needed to have your structure occupy as little memory as possible; for instance, if your structure had two elements, one 'int' and one 'char', and you had 1 billion of them, using __packed__ would save you 3 gigabytes. Or (2) if you need to conform to an externally defined data structure that already does this. Most places in the kernel, I don't think either of these would be true. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
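The int-plus-char case can be checked directly. This sketch uses the GCC/Clang attribute spelling, which is what FreeBSD's __packed macro expands to; the struct and function names are made up for illustration, and the 3-byte figure assumes a 4-byte int with 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Natural layout: the compiler pads after tag so arrays stay aligned. */
struct plain_rec {
    int  value;
    char tag;
};

/* Packed layout: no padding, and the alignment requirement drops to 1. */
struct packed_rec {
    int  value;
    char tag;
} __attribute__((packed));

/* Bytes saved per element by packing (3 with a 4-byte, 4-aligned int). */
size_t bytes_saved_per_record(void)
{
    assert(sizeof(struct packed_rec) == sizeof(int) + sizeof(char));
    return sizeof(struct plain_rec) - sizeof(struct packed_rec);
}

/* Alignment of the packed structure, in bytes. */
size_t packed_alignment(void)
{
    return _Alignof(struct packed_rec);
}
```

Three bytes per record times one billion records gives the 3 gigabytes quoted above. The price is that in an array of packed_rec, most value members sit at addresses that are not multiples of 4.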
Re: gcc 4.3.2 libgcc_s.so exception handling broken?
On Sat, 17 Jan 2009, xorquew...@googlemail.com wrote: Hello. I have some C code that's compiled with -fexceptions using the lang/gnat-gcc43 port. I'm on 6.4-RELEASE-p2. A function c_function in the C code takes a callback as an argument. I'm passing this function the address of a function ext_function defined in another language (Ada, to be precise, but it seems to happen with C++ too). The main body of my program is written in this language so C is effectively the foreign code (whatever). If ext_function raises an exception, the exception is NOT propagated through the C code, the process simply exits. I tried a simple example of this in C++ and it works as expected. (I am on 7.0-RELEASE/amd64.) So it isn't completely busted, at least. Can you post an example that exhibits the problem? Ideally, something complete that can be compiled and is as simple as possible. If you can do it with C++ rather than Ada it might be easier, so people don't have to install the Ada compiler. Also please mention the commands you use to compile, and what they output when you compile using -v, and what architecture you are on. -- Nate Eldredge neldre...@math.ucsd.edu ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Confused by segfault with legitimate call to strerror(3) on amd64 / sysctl (3) setting `odd' errno's
On Fri, 16 Jan 2009, Garrett Cooper wrote:

On Fri, Jan 16, 2009 at 2:52 AM, Thierry Herbelot thierry.herbe...@free.fr wrote:

On Friday 16 January 2009, Garrett Cooper wrote:

On Fri, Jan 16, 2009 at 2:21 AM, Christoph Mallon

    #include <errno.h>
    #include <stdio.h>
    #include <sys/stat.h>

    int main() {
        struct stat sb;
        int o_errno;
        if (stat("/some/file/that/doesn't/exist", &sb) != 0) {
            o_errno = errno;
            printf("Errno: %d\n", errno);
            printf("%s\n", strerror(o_errno));
        }
        return 0;
    }

With this, it's better on an amd64/RELENG_7 machine:

    % diff -ub badfile.c.ori badfile.c
    --- badfile.c.ori 2009-01-16 11:49:44.778991057 +0100
    +++ badfile.c 2009-01-16 11:49:03.470465677 +0100
    @@ -1,6 +1,7 @@
     #include <errno.h>
     #include <stdio.h>
     #include <sys/stat.h>
    +#include <string.h>

     int main()

Cheers TfH

That's hilarious -- why does it pass through without issue on x86 though? -Garrett

As pointed out, when you don't have a declaration for strerror, it's implicitly assumed to return `int'. This feature was widely used in the early days of C and so continues to be accepted by compilers, and gcc by default doesn't warn about it.

On x86, int and char * are the same size. So even though the compiler thinks strerror is returning an int which is being passed to printf, the code it generates is the same as for a char *. On amd64, int is 32 bits but char * is 64. When the compiler thinks it's using int, it only keeps track of the lower 32 bits, and the upper 32 bits get zeroed. So the pointer that printf receives has had its upper 32 bits zeroed, and no longer points where it should. Hence segfault.

Since running on amd64 I've seen a lot of bugs where people carelessly assume (perhaps without noticing) that ints and pointers are practically interchangeable, which works on x86 and the like but breaks on amd64. Variadic functions are special offenders because the compiler can't do much type checking.

Pop quiz: which of the following statements is correct?

    #include <stdlib.h>
    #include <unistd.h>

    execl("/bin/sh", "/bin/sh", 0);
    execl("/bin/sh", "/bin/sh", NULL);

-- Nate Eldredge neldre...@math.ucsd.edu
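One way to sidestep the quiz is to terminate the argument list with an explicitly typed null pointer, which is the form the POSIX documentation for the exec family uses. With the cast, a full-width pointer is passed through the variadic call no matter how NULL happens to be defined. A minimal sketch, with run_shell as a made-up helper:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run a command line through /bin/sh and return its
 * exit status, or -1 on failure to fork, wait, or exit normally.
 */
int run_shell(const char *command)
{
    pid_t pid;
    int status;

    assert(command != NULL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The cast makes the sentinel a pointer-sized null, not an int 0. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);   /* reached only if execl failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_shell("exit 7") returns 7.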
Re: lzo2 shows insane speed gap
On Mon, 29 Dec 2008, Christian Weisgerber wrote: The archivers/lzo2 port runs a series of regression tests after the actual build. These tests show extremely divergent behavior on different machines. There are two types of machines: Type #1: Running the tests takes roughly the same time as configure and compile did, whether it's 30 seconds on a fast machine or 10 minutes on an old slow one. Type #2: Running the tests takes much, much, MUCH longer. I've tried this across alpha, amd64, i386, and sparc64, partially on FreeBSD, partially on OpenBSD. The operating system doesn't matter and there is no pattern related to endianness or 32/64 bits. You can find machines that are the same architecture (e.g. amd64) and are of similar overall speed (e.g. an Intel Xeon Xeon E5405 and an AMD Phenom 9350e) and one of these machines will be type #1 and the other will be #2 and take _a hundred_ times longer to run the tests. A hundred times. I have never seen anything like this before. It might be good first to rule out compiler / library differences. First, can you isolate a single lzo command / input combination whose time differs dramatically? This would simplify tests compared to running the whole test suite. (It should be easy because it looks like the test suite prints the time for each test.) It might also simplify things to work on one fast and one slow machine. Then try copying the lzo binary from the fast machine to the slow machine (and vice versa) and see if the same test speeds up with the copied binary. If not, try again with the binary statically linked. If still not, it would be good to have a copy of the binary made available, along with more information about the fast and slow machines (CPU, amount of memory, load on the machine, kernel version, disk, etc). If the copied binary isn't faster than the natively produced one, then it would be good to have information about the compiler options, versions, etc. 
-- Nate Eldredge neldre...@math.ucsd.edu
Re: AMD64 qemu completely broken?
On Sun, 7 Dec 2008, Juergen Lock wrote: On Thu, Dec 04, 2008 at 02:43:47PM -0800, Nate Eldredge wrote: On Thu, 4 Dec 2008, Juergen Lock wrote: I forgot to say the qemu-devel port (as well as the later snapshots I posted about on -emulation) also support -curses, which shows the emulated vga text(!)console on qemu's tty. This works quite well with FreeBSD guests (even the isos) if you extend your xterm/whatever by one line (the default vga textconsole is 80x25 instead of 80x24.) As long as we're sharing tips about qemu: I've recently been working with qemu on amd64 and have set up a Debian etch i386 guest which is working well. I am using the qemu-devel and kqemu-kmod-devel ports. I am not using -kernel-kqemu at the moment; I thought I would get things working before trying to speed up. Using qemu I've finally achieved my goal of being able to use flash on FreeBSD/amd64 (in some sense :-O). Actually at least on RELENG_7 and later the original www/linux-flashplugin9 + www/nspluginwrapper don't work too bad at least for video sites these days (on 6 and 7.0 you need a patch and there it probably doesn't quite work on SMP because another patch concerning SMP can't be merged.) See e.g. this thread on -emulation for more: http://lists.freebsd.org/pipermail/freebsd-emulation/2008-October/005433.html Thanks for the pointer. I will probably wait until 7.1 is out and ports are defrosted, so I can go straight to flash10 and not to have to do everything twice, but this information should be very helpful. '-net tap' works fine, but requires root privileges and is more work to set up. Actually it doesn't require root privs to run, only to setup. (Ok you _might_ need sudo to ifconfig the tap device and/or bridge in the qemu-ifup script... But qemu itself can certainly run as user.) Okay. I was being lazy and letting qemu do some of that work for me. [*] Out of curiosity, I looked at some Unix Archive stuff and found the identical code in BSD's Net2, circa 1991. 
It is identified in a comment as a quick hack and adorned with several /* XXX */. Naturally the code and the comments survive intact, 17 years later. :-( This might be somewhat more understandable if you know that the original slirp code was written many moons ago and only later resurrected for emulation purposes. (It was originally invented for dialup users that logged into shellservers' gettys via serial modem lines so they could also use the box' inet connection locally before things like ppp were available...) Yep, I think I remember trying to use some slip implementation over a serial modem once. It's just unfortunate that qemu chose that code for their TCP/IP implementation rather than something else more modern. Not that I'm volunteering to update it :) -- Nate Eldredge [EMAIL PROTECTED] ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: RFC: small syscons and kbd patch
On Fri, 5 Dec 2008, Garrett Cooper wrote:

On Fri, Dec 5, 2008 at 1:11 AM, Christoph Mallon wrote:

Garrett Cooper wrote: (I feel like I'm getting off on a bikeshed topic, but...) 1. What dialect of C was it defined in? Is it still used in the standard dialect (honestly, this is the first time I've ever seen it before, but then again I am a younger generation user)?

Dialect? The ! operator is plain vanilla standard C. It takes a scalar operand and returns 1 if it compares equal to 0; otherwise it returns 0. !!, i.e. two consecutive ! operators, is one of the oldest tricks in the book, right next to (a > b) - (a < b) for comparison functions and countless other idioms.

3. What's the real loss of going to `? :', beyond maybe 3 extra keystrokes if it's easier for folks who may not be as experienced to read?

I'd like my bikeshed grass green, please. Christoph

If you really want to split hairs, ! only negates the logic value, whereas ~ actually negates the bits. So technically, you're not flipping 0 to make 1 and vice versa, but instead flipping 0 to make non-zero, etc. There is a clear distinction in hardware. The point was that !! isn't obvious at first glance in the C code. It's important for code to be readable as well as functional (that's why we have style(9)). Getting down to it I'd like to see what the compiler optimizes each as, because I can see dumb compilers saying `!!' translates to `not, bne = set, else set, continue', whereas `? :' could be translated to `bne, set, else set, continue'; I'm sure gcc has moved past these really minute details.

Out of curiosity, I tried some various compilers, including gcc on i386, amd64, and sparc; Intel's C compiler on i386; tcc (tiny, non-optimizing C compiler) on i386; and Sun's compiler (old version) on sparc. I compiled the following file:

    int bangbang(int x) { return !!x; }
    int ternary(int x) { return x ? 1 : 0; }

Intel's compiler generated different code for these two functions when optimization was turned off. bangbang used a conditional set instruction, while ternary used conditional jumps. With optimization on the two were identical. All other compilers generated identical code for the two functions whether optimization was on or off. (Of course, the generated code varied between compilers; tcc's in particular was decidedly non-optimized.)

I really don't think something as simple as this is worth worrying about in terms of code efficiency. Even if they weren't identical, the difference is at most a couple of instructions and a pipeline flush, and if that's a serious problem you need to be using assembly anyway. Besides, it's not a piece of code that comes up all that often.

The only basis for arguing about it is style, and I think we've established that it's purely a matter of taste. In particular, there isn't a clear favorite for which is easier to read. IMHO, style(9) should remain agnostic and let the programmer decide.

However, if people really feel that consistency is necessary here, I propose the following: if the cents digit of the closing price of the Dow Jones Industrial Average on this coming Monday, December 8, 2008, is even, then style(9) shall be edited to indicate that `!!x' is preferred. If odd, then style(9) shall prefer `x ? 1 : 0'. :-)

-- Nate Eldredge
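The comparison-function idiom mentioned alongside !! is usually written (a > b) - (a < b). A minimal sketch of it as a qsort comparator follows; the function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Yields -1, 0, or 1. Unlike the tempting "return a - b", it cannot
 * overflow when a and b are far apart (e.g. INT_MAX and INT_MIN).
 */
static int cmp_int(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;

    return (a > b) - (a < b);
}

/* Sort n ints in ascending order. */
void sort_ints(int *v, size_t n)
{
    assert(v != NULL || n == 0);
    if (n > 0)
        qsort(v, n, sizeof v[0], cmp_int);
}
```

Each relational operator yields exactly 0 or 1, which is the same property that makes !!x work.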
Re: RFC: small syscons and kbd patch
On Fri, 5 Dec 2008, Stephen Montgomery-Smith wrote:

Nate Eldredge wrote:

    int bangbang(int x) { return !!x; }
    int ternary(int x) { return x ? 1 : 0; }

Stylewise, I prefer

    int notzero(int x) { return x!=0; }

icc -O0 compiles notzero the same as bangbang (better than ternary). tcc produces better code for notzero than the other two. Sun cc without optimization produces slightly better code for notzero than the other two (one jump instead of two). For everything else all three produce equivalent code.

`x && 1' and `x || 0' are some other possibilities.

Anyway, maybe there is something more useful we could all be doing. :)

-- Nate Eldredge
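For completeness, a small sketch confirming that the spellings in this thread all normalize an int to 0 or 1. It includes `x && 1`, which is how this sketch reads the first of the two extra possibilities (an assumption about the original text). The helper all_agree is made up for illustration.

```c
#include <assert.h>

int bangbang(int x) { return !!x; }
int ternary(int x)  { return x ? 1 : 0; }
int notzero(int x)  { return x != 0; }
int and_one(int x)  { return x && 1; }
int or_zero(int x)  { return x || 0; }

/* Nonzero if every spelling gives the same answer for x. */
int all_agree(int x)
{
    int expected = bangbang(x);

    assert(expected == 0 || expected == 1);
    return ternary(x) == expected
        && notzero(x) == expected
        && and_one(x) == expected
        && or_zero(x) == expected;
}
```

This says nothing about the code a given compiler generates for each form; it only shows that the results are interchangeable, so the choice is one of style.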
Re: AMD64 qemu completely broken?
On Thu, 4 Dec 2008, Juergen Lock wrote: I forgot to say the qemu-devel port (as well as the later snapshots I posted about on -emulation) also support -curses, which shows the emulated vga text(!)console on qemu's tty. This works quite well with FreeBSD guests (even the isos) if you extend your xterm/whatever by one line (the default vga textconsole is 80x25 instead of 80x24.) As long as we're sharing tips about qemu: I've recently been working with qemu on amd64 and have set up a Debian etch i386 guest which is working well. I am using the qemu-devel and kqemu-kmod-devel ports. I am not using -kernel-kqemu at the moment; I thought I would get things working before trying to speed up. Using qemu I've finally achieved my goal of being able to use flash on FreeBSD/amd64 (in some sense :-O). savevm and loadvm don't work due to a security patch. Since my guest system is trusted I reverted the patch. I filed a PR as ports/129417 . I found that '-net user' is horribly broken on amd64 (qemu segfaults). It uses some ancient [*] BSD TCP/IP code (via slirp) which assumes that pointers are 32 bits and doesn't hesitate to shove them into random 32-bit corners of externally defined structures if it's convenient. Looks like a pain to clean up. '-net tap' works fine, but requires root privileges and is more work to set up. [*] Out of curiosity, I looked at some Unix Archive stuff and found the identical code in BSD's Net2, circa 1991. It is identified in a comment as a quick hack and adorned with several /* XXX */. Naturally the code and the comments survive intact, 17 years later. :-( -- Nate Eldredge [EMAIL PROTECTED] ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: tcsh loses the foreground process group?
On Tue, 2 Dec 2008, Steve Watt wrote: In article [EMAIL PROTECTED] you write: [ ... ] I'm running 6-STABLE (6.4-PRE as of 24 Nov right now), tcsh 6.15.00, which shows tcsh 6.15.00 (Astron) 2007-03-03 (i386-intel-FreeBSD) options wide,nls,dl,al,kan,sm,rh,color,filec as $version. The symptom is that when I do a long-ish running task inside a `` expansion that I then ^C, nobody gets the foreground process group... I never get a prompt back after the ^C, and ^T gets me load: 0.27 no foreground process group [ ... ] One portable reproduction: # cd /usr/src # less `egrep -lir '^Foo.*baz' *` ^Cload: 0.02 no foreground process group (I typed ^C ^T) SIGKILL to the shell seems to be the only way to get things back to normal. I've gotten one me too, which indicated that SIGHUP to the shell will also make it go away, but does not solve the problem. I've got another FreeBSD machine available that was running tcsh 6.14.00, and it does _NOT_ display the problem. When I build 6.15.00 on that same box (/usr/src is more up to date than the install right now), that does fail. Thus I'm pretty comfortable saying that it's a tcsh bug of some sort, and probably a regression. Hopefully this can be fixed (PR being filed now) before 6.4 releases... Thanks for the report. It looks like this is yet another manifestation of a problem in tcsh, where it does inappropriate things in a vfork'ed subshell. In my tests, running tcsh with -F (which causes it to use fork instead of vfork) causes the problem to go away. It is also present in 7.0-RELEASE and probably all later versions. There are several open bugs related to this problem, but so far they do not seem to have attracted the interest of any committers. Among them are: bin/41297 bin/52746 bin/125185 amd64/128259 bin/129378 (which you just opened) The fix is simple: make -F the default. There is a minor performance penalty, but that's a small price to pay for correct behavior. 
A more involved fix would be to make tcsh not do inappropriate things after vfork (modifying global variables), or at least clean up before exiting, but IMHO that is less clean; vfork really shouldn't be used here at all. -- Nate Eldredge [EMAIL PROTECTED] ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
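For contrast, the sketch below shows the narrow pattern that is generally considered safe after vfork: the child does nothing except exec or _exit, because it is borrowing the parent's address space until then. This is not tcsh's code; vfork_run is a made-up helper, and the _DEFAULT_SOURCE define is only there to expose the vfork declaration on glibc.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run path with argv via vfork and return the exit
 * status, or -1 on failure.
 */
int vfork_run(const char *path, char *const argv[])
{
    pid_t pid;
    int status;

    assert(path != NULL && argv != NULL);

    pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /*
         * Child: the parent is suspended and its memory is shared with us.
         * Touch no globals and do not return; only exec or _exit.
         */
        execv(path, argv);
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell that needs to do real work in the child before exec (job control, redirections, variable updates) falls outside this pattern, which is the argument for using fork there instead.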
Re: tcsh loses the foreground process group?
On Wed, 3 Dec 2008, Nate Eldredge wrote: Thanks for the report. It looks like this is yet another manifestation of a problem in tcsh, where it does inappropriate things in a vfork'ed subshell. In my tests, running tcsh with -F (which causes it to use fork instead of vfork) causes the problem to go away. It is also present in 7.0-RELEASE and probably all later versions. There are several open bugs related to this problem, but so far they do not seem to have attracted the interest of any committers. Among them are: bin/41297 bin/52746 bin/125185 amd64/128259 bin/129378 (which you just opened) I have opened bin/129405 as an omnibus PR for these problems. -- Nate Eldredge [EMAIL PROTECTED] ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: [Testers wanted] /dev/console cleanups
On Thu, 20 Nov 2008, Jeremy Chadwick wrote:

> On Wed, Nov 19, 2008 at 11:48:36PM -0800, Nate Eldredge wrote:
>> On Wed, 19 Nov 2008, Jeremy Chadwick wrote:
>>> On Thu, Nov 20, 2008 at 05:39:36PM +1100, Peter Jeremy wrote:
>>>> I hope that never gets committed - it will make debugging kernel problems much harder. There is already a kern.msgbuf_clear sysctl and maybe people who are concerned about msgbuf leakage need to learn to use it.
>>>
>>> And this sysctl is only usable *after* the kernel loads, which means you lose all of the messages shown from the time the kernel loads to the time the sysctl is set (e.g. hardware detected/configured). This is even less acceptable, IMHO.
>>
>> But surely you can arrange that the contents are written out to /var/log/messages first? E.g. a sequence like
>>
>> - mount /var
>> - write buffer contents via syslogd
>> - clear buffer via sysctl
>> - allow user logins
>
> This has two problems, but I'm probably missing something:
>
> 1) See my original post, re: users of our systems use dmesg to find out what the status of the system is. By status I don't mean from the point the kernel finished to now, I literally mean they *expect* to see the kernel device messages and all that jazz. No, I'm not making this up, nor am I arguing just to hear myself talk (despite popular belief). I can bring these users into the discussion if people feel it would be useful.

I forgot about that point. I can sympathize with those users; I feel the same way. It's a good way to learn about a system as a mere user (since usually sysadmins don't remember or bother to disable it).

However, in my experience dmesg isn't really the best thing for that purpose; the kernel message buffer tends to get wiped out once the system has been up for a while. (It fills with ipfw logs, ethernet link state changes, etc.) Maybe a better approach would be to point them to /var/log/messages or whichever log file stores them permanently. Or, better yet, do some syslogd magic to make a logfile that can be appropriately readable but doesn't have any overly sensitive messages directed there (e.g. kernel yes, sshd no).

> 2) I don't understand how this would work (meaning, technically and literally: I do not understand). How do messages like CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz (2992.52-MHz K8-class CPU) get written to syslog when syslogd isn't even running (or any filesystems) mounted at that time? There must be some magic involved there (since syslog == libc, not syscall) when syslogd starts, but I don't know how it works.

I think you're conflating a couple of things, and I also explained my idea poorly.

As I understand it (from memory, which is a little vague), syslogd gets messages from two places: from the kernel via /dev/klog, and from other processes via a Unix domain socket in /var/run. These messages then get sent to the appropriate log files.

The syslog(3) function of libc just connects and writes the message to the Unix domain socket. If syslogd isn't running to listen on that socket, syslog(3) won't work very well.

Now /dev/klog should be a direct line to the kernel's message buffer. So when syslogd starts and reads /dev/klog for the first time, it will get everything that's accumulated so far, including the earliest boot messages. It should then write them to the appropriate log files. This already works, which is why /var/log/messages contains the kernel copyright message, etc.

My idea is, after syslogd does this, but before the system goes multi-user, you should clear the kernel buffer. Early boot messages are already in the log files, so they won't be lost. Maybe the best thing would be to build this functionality into syslogd itself, to minimize the possibility of losing messages due to a race.

>> This way the buffer is cleared before any unprivileged users get to do anything. No kernel changes needed, just a little tweaking of the init scripts at most. If you should have a crash and suspect there is useful data in the buffer, you can boot to single-user mode (avoiding the clear) and retrieve it manually. Seems like this should make everyone happy.
>
> What I'm not understanding is the resistance towards Rink's patch, assuming the tunable defaults to disabled/off. It seems reasonable to me.

The only catch I can see is that if you have a crash and you want to look at the message buffer after rebooting, you'll have to remember to stop at the loader prompt and turn off that tunable. Which might be easy to forget in the heat of the moment.

--
Nate Eldredge
[EMAIL PROTECTED]
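As a rough illustration of the point that syslog(3) is only a libc wrapper around a Unix domain socket, here is a minimal sketch (not code from the thread) that hands one message to syslogd by hand. The function name send_to_syslogd is hypothetical, the socket path is taken from _PATH_LOG in <syslog.h> with an assumed FreeBSD fallback, and the "<priority>text" line is a simplification of what the real library sends (it also adds a timestamp and tag).

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <syslog.h>
#include <unistd.h>

#ifndef _PATH_LOG
#define _PATH_LOG "/var/run/log"  /* assumption: FreeBSD's default log socket */
#endif

/*
 * Hypothetical helper: send one datagram to the syslogd socket.
 * Returns 0 if the message was handed over, -1 otherwise (for example
 * when no syslogd is listening, or the message is too long).
 */
int send_to_syslogd(const char *msg)
{
    struct sockaddr_un addr;
    char line[512];
    int fd, n, rc;

    /* Simplified wire format: "<facility|level>text". */
    n = snprintf(line, sizeof(line), "<%d>%s", LOG_USER | LOG_NOTICE, msg);
    if (n < 0 || (size_t)n >= sizeof(line))
        return -1;

    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, _PATH_LOG, sizeof(addr.sun_path) - 1);

    /* With nobody listening on the socket this fails; nothing is queued. */
    rc = sendto(fd, line, (size_t)n, 0,
                (struct sockaddr *)&addr, sizeof(addr)) < 0 ? -1 : 0;
    close(fd);
    return rc;
}
```

Kernel messages never take this path: they sit in the kernel buffer until syslogd reads /dev/klog, which is why the boot messages survive even though syslogd starts late.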
Re: [Testers wanted] /dev/console cleanups
On Wed, 19 Nov 2008, Jeremy Chadwick wrote:

> On Thu, Nov 20, 2008 at 05:39:36PM +1100, Peter Jeremy wrote:
>> I hope that never gets committed - it will make debugging kernel problems much harder. There is already a kern.msgbuf_clear sysctl and maybe people who are concerned about msgbuf leakage need to learn to use it.
>
> And this sysctl is only usable *after* the kernel loads, which means you lose all of the messages shown from the time the kernel loads to the time the sysctl is set (e.g. hardware detected/configured). This is even less acceptable, IMHO.

But surely you can arrange that the contents are written out to /var/log/messages first? E.g. a sequence like

- mount /var
- write buffer contents via syslogd
- clear buffer via sysctl
- allow user logins

This way the buffer is cleared before any unprivileged users get to do anything. No kernel changes needed, just a little tweaking of the init scripts at most. If you should have a crash and suspect there is useful data in the buffer, you can boot to single-user mode (avoiding the clear) and retrieve it manually. Seems like this should make everyone happy.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: How can I add new binaries to the mfsroot image?
On Sun, 16 Nov 2008, Peter Steele wrote:

> I want to make a custom FreeBSD install CD-ROM with additional commands available in the mfsroot image. Adding the new commands to the image is easy enough, and I've made an install.cfg file on the CD-ROM as well so that when the CD runs the commands in install.cfg are automatically executed. This all works, except none of the new binaries I add to the mfsroot image run during the automated sysinstall session. If I reference one of the default commands (the ones stored in /stand) they run fine, but if I add a new FreeBSD binary to the /stand directory (e.g. gmirror), the command fails.

How does it fail? Is the binary you added statically linked?

> What's weird is that I can open a fixit shell after the install.cfg script fails and then run the same commands interactively and they work fine. Why would these commands work in an interactive fixit shell but not during the automated sysinstall session?

Wild guess: the shared libraries are present somewhere else on the CD, which perhaps is either not mounted or not pointed to by LD_LIBRARY_PATH or similar until the fixit shell is run.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: Unprivileged user can't set sticky bit on a file; why?
On Fri, 14 Nov 2008, Volodymyr Kostyrko wrote:

> Nate Eldredge wrote:
>> I came across this when trying to rsync some files which had the sticky bit set on the remote side. (It's the historical Unix archive from tuhs.org; the files in question are part of an unpacked V7 UNIX installation, for which the sticky bit of course had meaning. :-) ) It's annoying that this makes rsync fail; it messes up my mirroring script.
>
> You can ask rsync to change file attributes on the fly with the --chmod option. Just my 2c.

Thanks for this hint. --chmod=F-t solves my problem.

But I am still curious about this behavior.

--
Nate Eldredge
[EMAIL PROTECTED]
Unprivileged user can't set sticky bit on a file; why?
Hi folks,

FreeBSD doesn't allow an unprivileged user to set the sticky bit (mode S_ISTXT, octal 01000) on a file, though it does allow root to do so.

[EMAIL PROTECTED]:/tmp$ chmod +t foo
chmod: foo: Inappropriate file type or format
[EMAIL PROTECTED]:/tmp$ su
Password:
vulcan# chmod +t foo
vulcan# ls -l foo
-rw-r--r-T  1 nate  wheel  0 Nov 13 22:46 foo

Why is this? I don't expect the sticky bit to actually do anything on a regular file in this day and age (I know what its historical behavior was, and what it does for directories), but I'd think it would be harmless to set it. Linux lets a user set the sticky bit, and Solaris silently masks it off.

I came across this when trying to rsync some files which had the sticky bit set on the remote side. (It's the historical Unix archive from tuhs.org; the files in question are part of an unpacked V7 UNIX installation, for which the sticky bit of course had meaning. :-) ) It's annoying that this makes rsync fail; it messes up my mirroring script.

sticky(8) says the bit is ignored for regular files, which evidently isn't accurate. chmod(2) says "on UFS-based file systems (FFS, LFS) the sticky bit may only be set upon directories", which isn't right either since root is able to do it. src/sys/ufs/ufs/ufs_vnops.c has the following comment:

/*
 * Privileged processes may set the sticky bit on non-directories,
 * as well as set the setgid bit on a file with a group that the
 * process is not a member of.  Both of these are allowed in
 * jail(8).
 */

but does not explain why unprivileged processes should be forbidden to set the sticky bit.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: ukbd attachment and root mount
On Wed, 12 Nov 2008, Andriy Gapon wrote:

> on 05/11/2008 17:24 Andriy Gapon said the following:
> [...]
>> I have a legacy-free system (no PS/2 ports, only USB) and I wanted to try a kernel without atkbd and psm (with ums, ukbd, kbdmux), but was bitten hard when I made a mistake and kernel could not find/mount root filesystem. So I stuck at mountroot prompt without a keyboard to enter anything. This was repeatable about 10 times after which I resorted to live cd. Since then I put back atkbdc into my kernel. I guess BIOS or USB hardware emulate AT or PS/2 keyboard, so the USB keyboard works before the driver attaches. I guess I need such emulation e.g. for loader or boot0 configuration. But I guess I don't have to have atkbd driver in kernel.
>
> This turned out not to be a complete solution as it seems that there are some quirks about legacy USB here, sometimes keyboard stops working even at loader prompt (this is described in a different thread).
>
> ukbd attachment still puzzles me a lot. I look at some older dmesg, e.g. this 7.0-RELEASE one:
> http://www.mavetju.org/mail/view_message.php?list=freebsd-usb&id=2709973
> and see that ukbd attaches along with ums before mountroot. I look at newer dmesg and I see that ums attaches at about the same time as before but ukbd consistently attaches after mountroot.
>
> I wonder what might cause such behavior and how to fix it. I definitely would like to see ukbd attach before mountroot, I can debug this issue, but need some hints on where to start.

I haven't been following this thread, and I'm pretty sleepy right now, so sorry if this is irrelevant, but I had a somewhat similar problem that was fixed by adding

hint.atkbd.0.flags=0x1

to /boot/device.hints .

--
Nate Eldredge
[EMAIL PROTECTED]
Re: Asynchronous pipe I/O
On Wed, 5 Nov 2008, rihad wrote:

> Imagine this shell pipeline:
>
> sh prog1 | sh prog2
>
> As given above, prog1 blocks if prog2 hasn't yet read previously written data (actually, newline separated commands) or is busy. What I want is for prog1 to never block:
>
> sh prog1 | buffer | sh prog2

[and misc/buffer is unsuitable]

I found an old piece of code lying around that I wrote for this purpose. Looking at it, I can see a number of inefficiencies, but it might do in a pinch. You're welcome to use it; I hereby release it to the public domain.

Another hack that you could use, if you don't mind storing the buffer on disk rather than memory, is

sh prog1 > tmpfile &
tail -f -c +0 tmpfile | sh prog2

Here's my program.

/* Buffering filter. */

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/select.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

/* Size of a single buffer. */
#define BUFSIZE 512

struct buffer
{
  struct buffer *next;
  size_t length;
  unsigned char buf[BUFSIZE];
};

/* The reader appends at the tail of the list; the writer drains the head. */
struct buffer *reader;
struct buffer *writer;

int max_mem = 100 * 1024;
int current_mem;

#define OK 0
#define WAIT 1
#define GIVEUP 2

int read_one (int fd)
{
  int result;
  if (current_mem > (max_mem - sizeof(*reader->next)))
    {
      fprintf(stderr, "Reached max_mem!\n");
      return WAIT;
    }
  /* Get a new buffer. */
  reader->next = malloc(sizeof(*reader->next));
  if (reader->next)
    {
      current_mem += sizeof(*reader->next);
      fprintf(stderr, "\rReading: \t%d bytes in buffer ", current_mem);
    }
  else
    {
      fprintf(stderr, "Virtual memory exhausted\n");
      return WAIT;
    }
  reader = reader->next;
  reader->next = NULL;
  result = read(fd, reader->buf, BUFSIZE);
  if (result > 0)
    reader->length = result;
  else if (result == 0)
    {
      fprintf(stderr, "Hit EOF on reader\n");
      return GIVEUP;
    }
  else if (result < 0)
    {
      fprintf(stderr, "Error on reader: %s\n", strerror(errno));
      return GIVEUP;
    }
  return OK;
}

int write_one (int fd)
{
  struct buffer *newwriter;
  if (reader == writer)
    return WAIT; /* the reader owns the last buffer */
  if (writer->length > 0)
    {
      int result;
      /* A short write is not retried; the rest of this buffer is dropped. */
      result = write(fd, writer->buf, writer->length);
      if (result == 0)
        {
          fprintf(stderr, "Hit EOF on writer\n");
          return GIVEUP;
        }
      else if (result < 0)
        {
          fprintf(stderr, "Error on writer: %s\n", strerror(errno));
          return GIVEUP;
        }
    }
  newwriter = writer->next;
  free(writer);
  current_mem -= sizeof(*writer);
  fprintf(stderr, "\rWriting: \t%d bytes in buffer ", current_mem);
  writer = newwriter;
  return OK;
}

void move_data(int in_fd, int out_fd)
{
  int reader_state = OK;
  int writer_state = OK;
  int maxfd = ((in_fd > out_fd) ? in_fd : out_fd) + 1;
  reader = malloc(sizeof(*reader));
  if (!reader)
    {
      fprintf(stderr, "No memory at all!\n");
      return;
    }
  reader->next = NULL;
  reader->length = 0;
  writer = reader;
  current_mem = sizeof(*reader);
  while (1) /* break when done */
    {
      int result;
      fd_set read_set, write_set;
      FD_ZERO(&read_set);
      FD_ZERO(&write_set);
      if (reader_state == OK)
        FD_SET(in_fd, &read_set);
      if (writer_state == OK)
        FD_SET(out_fd, &write_set);
      result = select(maxfd, &read_set, &write_set, NULL, NULL);
      /* If we're ready to do something, do it.
         Also let the other end get a chance if something changed. */
      if (FD_ISSET(in_fd, &read_set))
        {
          reader_state = read_one(in_fd);
          if (writer_state == WAIT)
            writer_state = OK;
        }
      if (FD_ISSET(out_fd, &write_set))
        {
          writer_state = write_one(out_fd);
          if (reader_state == WAIT)
            reader_state = OK;
        }
      /* Check for termination */
      if (writer_state == GIVEUP)
        break; /* can't write any more */
      if (reader_state == GIVEUP && writer_state == WAIT)
        break; /* can't read any more, and wrote all we have */
    }
}

int main(void)
{
  move_data(0, 1);
  return 0;
}

--
Nate Eldredge
[EMAIL PROTECTED]
Re: memtest86+ can not link: binutils issue?
On Fri, 31 Oct 2008, Andriy Gapon wrote:

> on 30/10/2008 20:46 Peter Jeremy said the following:
>> On 2008-Oct-30 18:08:35 +0200, Andriy Gapon [EMAIL PROTECTED] wrote:
>>> ld --warn-constructors --warn-common -static -T memtest_shared.lds \
>>> -o memtest_shared head.o reloc.o main.o test.o init.o lib.o patn.o screen_buffer.o config.o linuxbios.o memsize.o pci.o controller.o random.o extra.o spd.o error.o dmi.o && \
>>> ld -shared -Bsymbolic -T memtest_shared.lds -o memtest_shared head.o reloc.o main.o test.o init.o lib.o patn.o screen_buffer.o config.o linuxbios.o memsize.o pci.o controller.o random.o extra.o spd.o error.o dmi.o
>>> head.o(.text+0x7): In function `startup_32':
>>> : undefined reference to `_GLOBAL_OFFSET_TABLE_'
>>> Segmentation fault (core dumped)
>>> gmake: *** [memtest_shared] Error 139
>>
>> I can't help here. _GLOBAL_OFFSET_TABLE_ is related to the binutils PIC support and it appears that the linker doesn't like the code (in head.S) is explicitly referencing it.
>
> Not only linking fails, but ld even crashes.

I agree this shouldn't happen.

>>> Can anybody suggest anything about this problem?
>>
>> It looks like stand-alone PIC code on FreeBSD needs some different incantations to Linux. My understanding is that several of the i386 bootstraps are relocatable so you might like to peruse the code in /usr/src/sys/boot/i386 for ideas.
>
> I wonder if this is something about our port of binutils or is it an issue in older version of binutils. I'll try to look at the boot code, thank you for the hint.

FreeBSD's version of binutils is quite old. I've definitely found bugs in it which are fixed in GNU's current version. So you might try building the official GNU binutils and see if that works any better. I don't know if it will fix your error but maybe it at least won't crash.

ld crashing is definitely a bug, and it would be nice if you could file a PR, including the object files. If the GNU version doesn't crash that would be useful information for the PR also, as it might encourage Them to consider importing a newer version.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: includes, configure, /usr/lib vs. /usr/local/lib, and linux coders
On Fri, 31 Oct 2008, Steve Franks wrote:

> Let's back up. What's the 'right' way to get a bloody linux program that expects all its headers in /usr/include to compile on freebsd where all the headers are in /usr/local/include? That's all I'm really asking. Specifically, it's looking for libusb and libftdi. If I just type gmake, it can't find it, but if I manually edit the Makefiles to add -I/usr/local/include, it can. Obviously, manually editing the makefiles is *not* the right way to fix it (plus it's driving me crazy).

C_INCLUDE_PATH=$C_INCLUDE_PATH:/usr/local/include
LIBRARY_PATH=$LIBRARY_PATH:/usr/local/lib
export C_INCLUDE_PATH LIBRARY_PATH
./configure
gmake

Adjust as appropriate if using csh. Personally, I set those environment variables in my .profile.

By the way, I think you're being a little unfair to blame this on Linux programs or programmers. Normally it's the user's responsibility to ensure that their compiler searches for include files, etc, in the appropriate place. Many Linux distributions put everything in /usr/include, which is searched by default. FreeBSD puts stuff from ports in /usr/local/include which isn't searched by default. I find that behavior inconvenient, which is why I set those environment variables, so I don't have to think about it.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: neophyte: tcsetattr() gives 22 error in i386, not in amd64?
On Fri, 24 Oct 2008, Steve Franks wrote:

> Hi,
>
> I'm getting a 22 errno from tcsetattr() on 7-STABLE i386 in code which was working under 7-STABLE amd64. Serial device is a ucom (silabs cp2103). Permissions on /dev/cuaU0 look fine. Cutecom/Minicom appears to open the port without error...

I don't see anything obviously wrong, but I'd bet a bug related to 32/64-bit types. Can you post a complete piece of code that can be compiled and run and demonstrates the problem? Also, try compiling with -Wall -W and investigate any warnings that are produced.

By the way, errno 22 is EINVAL, "Invalid argument". perror() is your friend.

> [snip code]

--
Nate Eldredge
[EMAIL PROTECTED]
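Since the poster's code was snipped, the following is only a hypothetical sketch of the advice above: report failures with perror() instead of a bare errno number, and start from the driver's current settings via tcgetattr() rather than from an uninitialized struct termios. The function name set_raw_9600 and the choice of 9600 baud, 8 data bits, no parity are assumptions made for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <termios.h>

/*
 * Hypothetical helper: put the terminal on fd into a raw 9600 8N mode.
 * Returns 0 on success, otherwise the errno value of the failing call,
 * after printing a readable message with perror().
 */
int set_raw_9600(int fd)
{
    struct termios t;
    int e;

    /* Fetch the current settings so every field starts out valid. */
    if (tcgetattr(fd, &t) != 0) {
        e = errno;
        perror("tcgetattr");
        return e;
    }

    /* Raw mode: no input translation, no output processing, no echo. */
    t.c_iflag &= ~(IGNBRK | BRKINT | PARMRK | ISTRIP | INLCR | IGNCR | ICRNL | IXON);
    t.c_oflag &= ~OPOST;
    t.c_lflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);
    t.c_cflag &= ~(CSIZE | PARENB);
    t.c_cflag |= CS8;

    if (cfsetispeed(&t, B9600) != 0 || cfsetospeed(&t, B9600) != 0) {
        e = errno;
        perror("cfsetspeed");
        return e;
    }

    if (tcsetattr(fd, TCSANOW, &t) != 0) {
        e = errno;
        perror("tcsetattr");   /* EINVAL would show up as "Invalid argument" */
        return e;
    }
    return 0;
}
```

Saving errno into a local before calling perror() matters because perror() itself may change it.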
Re: Why does adding /usr/lib32 to LD_LIBRARY_PATH break 64-bit binaries?
On Thu, 23 Oct 2008, Alexander Sack wrote:

> Alright, well I found some weirdness:
>
> [EMAIL PROTECTED] ~]# export LD_LIBRARY_PATH=/usr/bin:/usr/lib:/usr/lib32:/usr/lib64
> [EMAIL PROTECTED] ~]# LD_DEBUG=1 ls
> /libexec/ld-elf.so.1 is initialized, base address = 0x800506000
> RTLD dynamic = 0x80062ad78
> RTLD pltgot = 0x0
> processing main program's program header
> Filling in DT_DEBUG entry
> lm_init((null))
> loading LD_PRELOAD libraries
> loading needed objects
> Searching for libutil.so.5
> Trying /usr/bin/libutil.so.5
> Trying /usr/lib/libutil.so.5
> Trying /usr/lib32/libutil.so.5
> loading /usr/lib32/libutil.so.5
> /libexec/ld-elf.so.1: /usr/lib32/libutil.so.5: unsupported file layout
>
> That's because libutil.so.5 does not exist in /usr/lib only in /lib. The /usr/lib directory has:
>
> [EMAIL PROTECTED] ~]# ls -l /usr/lib/libutil*
> -r--r--r--  1 root  wheel  100518 Aug 21  2007 /usr/lib/libutil.a
> lrwxrwxrwx  1 root  wheel      17 Sep 11 11:44 /usr/lib/libutil.so -> /lib/libutil.so.5
> -r--r--r--  1 root  wheel  103846 Aug 21  2007 /usr/lib/libutil_p.a
>
> So rtld is looking for major number 5 of libutil, without the standard /lib in my LD_LIBRARY_PATH it searches /usr/lib, doesn't find it but:
>
> [EMAIL PROTECTED] ~]# ls -l /usr/lib32/libutil*
> -r--r--r--  1 root  wheel  65274 Aug 21  2007 /usr/lib32/libutil.a
> lrwxrwxrwx  1 root  wheel     12 Sep 11 11:45 /usr/lib32/libutil.so -> libutil.so.5
> -r--r--r--  1 root  wheel  46872 Aug 21  2007 /usr/lib32/libutil.so.5
> -r--r--r--  1 root  wheel  66918 Aug 21  2007 /usr/lib32/libutil_p.a
>
> And voilà, I'm broke since there is a libutil.so.5 in there.
>
> So my question to anyone out there, WHY does /usr/lib32 contain major numbers but /usr/lib does not? This seems like a bug to me (FreeBSD 7.0-RELEASE is the same) or at least a dubious design decision.

I think the distinction is this. rtld is looking for libutil.so.5 (with version number). This file has to be in /lib, in the root filesystem, so that programs can run before /usr is mounted. libutil.so on the other hand is not searched for by rtld, but by ld (driven by cc), when the program is built. /usr/lib is the traditional place for it to search; I'm not sure if it searches /lib at all. In the case of static libraries, /usr/lib is certainly the right place for libutil.a to go, so having libutil.so there makes sense in my mind.

I think your best bet is to dig into whatever is setting LD_LIBRARY_PATH and get it set correctly. Remove /usr/lib32 or at least ensure that /lib is searched first. Trying to change rtld's behavior is not the right approach, IMHO.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: Laptop suggestions?
On Wed, 22 Oct 2008, Gary Kline wrote:

> On Wed, Oct 22, 2008 at 01:06:29PM +0200, Dag-Erling Smørgrav wrote:
>> martinko [EMAIL PROTECTED] writes:
>>> I have always thought that Fn key in left most bottom corner of the keyboard is, especially for programmers, a very bad idea. :-(
>>
>> Seconded. Worse still, on my Lenovo T60, if the Fn key is held down longer than a fraction of a second, it generates an input event which just happens to correspond to Gnome's default key binding for the next track function in media players...
>
> I've seen that Fn key, but don't know what it is for. What? you press it, then follow with the integers [ 1, 2, 3 ... ]? At any rate, maybe you can remap the key with ~/.xmodmaprc.

Fn is usually used on laptop keyboards to allow two logical keys to share a single physical key. For example, see the keyboard pictured at http://www.notebookreview.com/assets/3415.jpg . On the extreme lower right is a key with - in white and End in blue. Pressing it by itself sends the keycode corresponding to an ordinary keyboard's - key. Holding Fn and pressing that key sends the keycode corresponding to an ordinary keyboard's End key. On many keyboards, pressing Fn by itself sends no keycode at all, so it cannot be remapped.

It is also sometimes used to control hardware features which on a desktop machine might have a different interface. For instance, on the laptop pictured, holding Fn and pressing F6 would increase the screen brightness, probably without sending a keycode. A desktop machine would probably have a button on the monitor itself to do this.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: indicating a debug image
On Thu, 16 Oct 2008, Chuck Robey wrote:

> I was wondering, for FreeBSD images, is there a symbol that one could look for, to indicate if image had debug symbols? I know you could destroy that by just stripping, I just wanted to know if there is any way to definitely tell, short of firing up gdb and looking for info.

There's really three possibilities:

1. Image has no symbols
2. Image has only non-debug symbols (e.g. global functions and variables)
3. Image has debug symbols (e.g. line numbers, local variables)

strip(1) or gcc -s produces #1. gcc without -g produces #2. gcc -g produces #3.

You can distinguish #1 because 'nm image' will give no output. nm and objdump don't appear able to distinguish #2 and #3, but readelf -w will give a bunch of output for #3 and none for #2.

Does that help?

--
Nate Eldredge
[EMAIL PROTECTED]
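The same three-way distinction can be made programmatically by walking the ELF section headers. The sketch below is an illustration under stated assumptions, not what nm or readelf actually do internally: it handles only 64-bit ELF files in the host's byte order, treats a SHT_SYMTAB section as "has symbols" (which matches what nm lists), and treats a section named .debug_info as "has debug symbols" (which assumes DWARF debugging output). The function name classify_elf is hypothetical.

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical classifier for the three cases listed above.
 * Returns 0 = no symbol table, 1 = symbol table only,
 *         2 = DWARF debug info present,
 *        -1 = unreadable file or not a 64-bit ELF with section headers.
 */
int classify_elf(const char *path)
{
    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    char *names = NULL;
    size_t nlen;
    unsigned i;
    int result = -1;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    if (fread(&eh, sizeof(eh), 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr) ||
        eh.e_shstrndx >= eh.e_shnum)
        goto out;

    /* Read the whole section header table. */
    sh = malloc((size_t)eh.e_shnum * sizeof(*sh));
    if (sh == NULL || fseek(f, (long)eh.e_shoff, SEEK_SET) != 0 ||
        fread(sh, sizeof(*sh), eh.e_shnum, f) != eh.e_shnum)
        goto out;

    /* Section names live in the string table selected by e_shstrndx. */
    nlen = (size_t)sh[eh.e_shstrndx].sh_size;
    names = malloc(nlen + 1);
    if (names == NULL ||
        fseek(f, (long)sh[eh.e_shstrndx].sh_offset, SEEK_SET) != 0 ||
        fread(names, 1, nlen, f) != nlen)
        goto out;
    names[nlen] = '\0';

    result = 0;
    for (i = 0; i < eh.e_shnum; i++) {
        if (sh[i].sh_name >= nlen)
            continue;   /* malformed name offset; skip this section */
        if (strcmp(names + sh[i].sh_name, ".debug_info") == 0) {
            result = 2;
            break;
        }
        if (sh[i].sh_type == SHT_SYMTAB)
            result = 1;
    }
out:
    free(names);
    free(sh);
    fclose(f);
    return result;
}
```

A dynamically linked binary keeps its .dynsym section even after strip(1); the sketch ignores it on purpose, so such a binary is still reported as case 0.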
Re: ZFS boot
On Sat, 11 Oct 2008, Pegasus Mc Cleaft wrote:

>> FWIW, my system is amd64 with 1 G of memory, which the page implies is insufficient. Is it really?
>
> This may be purely subjective, as I have never bench marked the speeds, but when I was first testing zfs on a i386 machine with 1gig ram, I thought the performance was mediocre. However, when I loaded the system on a quad core - core2 with 8 gigs ram, I was quite impressed. I put localized changes in my /boot/loader.conf to give the kernel more breathing room and disabled the prefetch for zfs.
>
> #more loader.conf
> vm.kmem_size_max=1073741824
> vm.kmem_size=1073741824
> vfs.zfs.prefetch_disable=1

I was somewhat confused by the suggestions on the wiki. Do the kmem_size sysctls affect the allocation of *memory* or of *address space*? It seems a bit much to reserve 1 G of memory solely for the use of the kernel, especially in my case when that's all I have :) But on amd64, it's welcome to have terabytes of address space if it will help.

> The best advice I can give is for you to find an old machine and test-bed zfs for yourself. I personally have been pleased with it and It has saved my machines data 4 times already (dieing hardware, unexpected power bounces, etc)

Sure, but if my new machine isn't studly enough to run it, there's no hope for an old machine. So I'm trying to figure out what I actually need.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: Is it possible to recover from SEGV?
On Sat, 11 Oct 2008, Yuri wrote:

> Let's say I have signal(3) handler set. And I know exactly what instruction caused SEGV and why.
>
> Is there a way to access from signal handler CPU registers as they were before signal, modify some of them, clear the signal and continue from the instruction that caused SEGV initially?

Absolutely. Declare your signal handler as

void handler(int sig, int code, struct sigcontext *scp);

You will need to cast the pointer passed to signal(3). struct sigcontext is defined in machine/sysarch.h I believe. struct sigcontext contains the CPU registers as they were when the faulting instruction began to execute. You can modify them and then return from the signal handler. The program will resume the faulting instruction with the new registers. You can also alter the copy of the instruction pointer in the struct sigcontext if you want it to resume somewhere else.

There is also a libsigsegv which looks like it wraps some of this process in a less machine-specific way.

Out of curiosity, what are you looking to achieve with this? And what architecture are you on?

--
Nate Eldredge
[EMAIL PROTECTED]
Re: continuous backup solution for FreeBSD
On Wed, 8 Oct 2008, Evren Yurtesen wrote:

> Thanks again for pointing out snapshots. It is more or less suitable :)

I'll just warn you: if you're planning to use snapshots for your own purposes, first do an extensive stress test on a non-critical machine with backed up data. I've had a lot of problems with snapshots occasionally causing deadlocks which hang the machine. This was under 6.x but I had the same problem under many previous versions, so I don't necessarily expect that it's fixed. Also, while it's never happened to me, I've heard other people report data corruption.

--
Nate Eldredge
[EMAIL PROTECTED]
Debugging reboot with Linux emulation
Hi folks,

I recently tried to run a Linux binary of Maple (commercial math software) on my FreeBSD 7.0-RELEASE/amd64 box, and the machine rebooted. I tried it again while watching the console, and no panic message appeared to be produced.

Does anyone have any ideas on how to debug problems of this nature? I realize I may not be able to get Maple to work, but in any case the system should not die like this, so I can at least try to fix that bug.

Incidentally, is it possible to run kdb with a USB keyboard? Hitting Ctrl-Alt-Esc gives me the kdb prompt, but I can't type, so I can do nothing except hit the power button. I do have hint.atkbd.0.flags=0x1 in /boot/device.hints. Unfortunately I don't have a PS/2 keyboard on hand, though I can try and get a hold of one if all else fails.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: Debugging reboot with Linux emulation
On Wed, 13 Aug 2008, Kostik Belousov wrote:

> Then, the issue of mixing our reboot(2)/linux fcntl(2) is irrelevant. The original reporter said that system just rebooted, and I believe that filesystems were not synced and not unmounted properly. Our reboot(2) does not have flag combination that could cause such behaviour, I think.

You are right, file systems were not unmounted, and I doubt that they were synced either. They had to be fscked when the system came back up.

> Also, I doubt that the program being run is statically linked or run by root. Confirmation ?

I did not run it as root. Sorry, I should have said that before. It is a little hard to trace their maze of shell scripts to figure out which binary was being run, but if I am looking at the right one, it is dynamically linked and branded SVR4. I will make sure later today.

> Overall, this looks like a nasty bug, hopefully in the linuxolator.

Indeed.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: read with timeout ??
On Fri, 8 Aug 2008, Chuck Robey wrote:

> Pieter de Goeje wrote:
>> I think poll(2) is also simpler than select for this purpose.
>
> It does look like that, I need to check the implementation a bit, because the name of this thing makes me really suspicious about how often it checks for an fd for being ready for a read. I know select comes right back, I was under the impression that poll didn't use signals to do this.

AFAIK the effects are identical, just the arguments are set up in a different way. Both of them will block until the fd is ready and then return immediately (subject to other processes running of course). The name "poll" is a misnomer because it doesn't actually work by polling, but you can pretend that it does (and does so infinitely often). Neither one uses signals per se, though if the underlying hardware device is interrupt-driven, that will be what (indirectly) triggers the wake-up.

poll does seem to be more convenient than messing about with fd_set's. select is older and so it comes to my mind first, that's all.

--
Nate Eldredge
[EMAIL PROTECTED]
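For comparison, a minimal sketch of a timed read built on poll(2) might look like the following. The wrapper name read_poll_timeout and its return convention (-2 for a timeout) are made up for this illustration; only poll() and read() themselves are real interfaces.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: wait up to timeout_ms milliseconds for fd to
 * become readable, then read from it.
 * Returns the byte count from read() (0 means EOF), -1 on error with
 * errno set, or -2 if the timeout expired with nothing to read.
 */
ssize_t read_poll_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* poll() sleeps in the kernel until the fd is ready or time runs out. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);   /* note: restarts the full timeout */

    if (rc < 0)
        return -1;
    if (rc == 0)
        return -2;
    return read(fd, buf, len);
}
```

The single struct pollfd replaces the fd_set bookkeeping and the "highest fd plus one" argument that select needs, which is most of what makes poll more convenient here.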
Re: read with timeout ??
On Thu, 7 Aug 2008, Chuck Robey wrote:

> I have my head lost in a code problem. I just hit a point where I need to do a read from an fd, but I need to associate it with a timeout, on the order of 1 second, something like that. I had the feeling that there's a function in FreeBSD's libc that makes that simple, but I forget the function name.
>
> If anyone can remember something like what I'm talking about, I sure would appreciate a function name. I can figure out how it works, if I could only dredge up that name.

man 2 select

--
Nate Eldredge
[EMAIL PROTECTED]
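As a minimal sketch of what that man page leads to (the wrapper name read_select_timeout and its -2 "timed out" return value are assumptions for illustration, not an existing libc function):

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: wait up to timeout_ms milliseconds for fd to
 * become readable, then read from it.
 * Returns the byte count from read() (0 means EOF), -1 on error,
 * or -2 if the timeout expired with nothing to read.
 */
ssize_t read_select_timeout(int fd, void *buf, size_t len, long timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int rc;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;   /* an fd_set cannot represent this descriptor */

    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* The first argument is the highest descriptor in any set, plus one. */
    rc = select(fd + 1, &rset, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return -2;
    return read(fd, buf, len);
}
```

For the one-second timeout in the question, the call would pass 1000 as timeout_ms.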
Re: consolekit on 7.0-STABLE i386
On Wed, 30 Jul 2008, sam wrote:

> hello
>
> my trouble
>
> FreeBSD static 7.0-STABLE FreeBSD 7.0-STABLE #23: Mon Jul 28 18:10:51 MSD 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/STATIC i386
>
> top output:
>
> 874 root17 00 8296K 2660K waitvt 1 0:00 0.00% console-kit-daemon
>
> vmstat output:
>
>  procs     memory      page                     disks     faults        cpu
>  r  b  w   avm    fre  flt  re  pi  po    fr   sr ad4 ad6   in    sy    cs us sy id
>  0 19  0  1113M   29M  493   1   0   0   265  129   0   0  133 45119  4588  8  5 87
>  0 20  0  1113M   29M  249   0   2   0  3311    0   0  22  157  7872  2262  5  7 88
>  0 19  0  1113M   29M  346   0   0   0   148    0   0   0  110 78963  1793  4  9 87
>  0 19  0  1113M   29M  115   0   0   0     0    0   0   0  105  5743  1731 13  1 85
>  0 19  0  1113M   29M  318   0   0   0   138    0   0   0  108 78837  1732  3 10 87
>  0 19  0  1113M   29M  112   0   0   0    32    0   0   1  100  5549  1682 11  1 88
>  0 19  0  1113M   29M  297   0   0   0   136    0   0   2  122 78880  1749  6  7 87
>
> consolekit in waitvt state, influencing on high volumes in procs-b

I don't understand what the problem is. It looks like consolekit is sleeping and not using any CPU. waitvt just indicates where in the kernel it's sleeping. I don't understand what you mean by "high volumes in procs-b".

--
Nate Eldredge
[EMAIL PROTECTED]
Re: General questions about virtual memory
On Wed, 30 Jul 2008, FreeBSD Hackers wrote:

> If anyone is willing to help me understand this, I would greatly appreciate it. I would also value your input if there are other resources (people, mailing lists, books, web pages, etc.) that you want to recommend instead of taking some time to help teach me.

As a slightly less orthodox suggestion, I learned a lot of this from the practice side rather than the theory side, and it seems like maybe this is where some of your questions lie. In addition to a textbook, you might find it useful to get a copy of the manual for your favorite CPU, which will explain, at the level of assembly language, how all these features work. (They are usually available free on the manufacturer's website, though you may have to hunt around a bit or register for a developer program or something.) You can read it in conjunction with the FreeBSD kernel source to see an actual example. I found this approach very instructive.

--
Nate Eldredge
[EMAIL PROTECTED]
Re: SCHED_4BSD bad interactivity on 7.0 vs 6.3
On Sun, 13 Jul 2008, Nate Eldredge wrote:

> On Sun, 13 Jul 2008, Kris Kennaway wrote:
>
>> Nate Eldredge wrote:
>>> On Sun, 13 Jul 2008, Kris Kennaway wrote:
>>>
>>>> Nate Eldredge wrote:
>>>>> Hi folks,
>>>>>
>>>>> Hopefully this is a good list for this topic.
>>>>>
>>>>> It seems like there has been a regression in interactivity from 6.3-RELEASE to 7.0-RELEASE when using the SCHED_4BSD scheduler. After upgrading my single-cpu amd64 box, 7.0 has much worse latency. When running a kernel compile, there is a noticeable lag to echo my typing or scroll my browser windows, and playing an mp3 frequently cuts out for a second or two. This did not happen on 6.3-RELEASE.
>>>>
>>>> Are you sure it's not the x.org server bug that was present in the version shipped with 7.0? Update to the latest version and see if your X interactivity improves.
>>>
>>> Yes, I had not yet upgraded my x.org port when testing this, so it was the same x.org that was fine under 6.3.
>>>
>>> Also:
>>>
>>>>> I wrote a small program which forks two processes that run gettimeofday() in a tight loop to see how long they get scheduled out. On 6.3 the maximum latency is usually under 100 ms. On 7.0 it is 500 ms or more even when nothing else is running on the system. When a compile is also running it is sometimes 1400 ms or more.
>>>
>>> This test shows a difference even in single user mode, when X is not running at all.
>>
>> It shows *a* difference, but perhaps not the *same* difference. Please humour me and rule it out.
>
> Okay. I am in the process of recompiling all my ports, so after that is done I will boot with a GENERIC kernel and see what happens.

After trying this, I can't seem to reproduce the sound skipping behavior, unless I do something fairly extreme like make -j 6. But the mouse does seem to skip when a compile is running, so I do believe there is a regression.

-- Nate Eldredge [EMAIL PROTECTED]
Re: SCHED_4BSD bad interactivity on 7.0 vs 6.3
On Sun, 13 Jul 2008, Kris Kennaway wrote:

> Nate Eldredge wrote:
>> Hi folks,
>>
>> Hopefully this is a good list for this topic.
>>
>> It seems like there has been a regression in interactivity from 6.3-RELEASE to 7.0-RELEASE when using the SCHED_4BSD scheduler. After upgrading my single-cpu amd64 box, 7.0 has much worse latency. When running a kernel compile, there is a noticeable lag to echo my typing or scroll my browser windows, and playing an mp3 frequently cuts out for a second or two. This did not happen on 6.3-RELEASE.
>
> Are you sure it's not the x.org server bug that was present in the version shipped with 7.0? Update to the latest version and see if your X interactivity improves.

Yes, I had not yet upgraded my x.org port when testing this, so it was the same x.org that was fine under 6.3.

Also:

>> I wrote a small program which forks two processes that run gettimeofday() in a tight loop to see how long they get scheduled out. On 6.3 the maximum latency is usually under 100 ms. On 7.0 it is 500 ms or more even when nothing else is running on the system. When a compile is also running it is sometimes 1400 ms or more.

This test shows a difference even in single user mode, when X is not running at all.

-- Nate Eldredge [EMAIL PROTECTED]
Re: SCHED_4BSD bad interactivity on 7.0 vs 6.3
On Sun, 13 Jul 2008, Kris Kennaway wrote:

> Nate Eldredge wrote:
>> On Sun, 13 Jul 2008, Kris Kennaway wrote:
>>
>>> Nate Eldredge wrote:
>>>> Hi folks,
>>>>
>>>> Hopefully this is a good list for this topic.
>>>>
>>>> It seems like there has been a regression in interactivity from 6.3-RELEASE to 7.0-RELEASE when using the SCHED_4BSD scheduler. After upgrading my single-cpu amd64 box, 7.0 has much worse latency. When running a kernel compile, there is a noticeable lag to echo my typing or scroll my browser windows, and playing an mp3 frequently cuts out for a second or two. This did not happen on 6.3-RELEASE.
>>>
>>> Are you sure it's not the x.org server bug that was present in the version shipped with 7.0? Update to the latest version and see if your X interactivity improves.
>>
>> Yes, I had not yet upgraded my x.org port when testing this, so it was the same x.org that was fine under 6.3.
>>
>> Also:
>>
>>>> I wrote a small program which forks two processes that run gettimeofday() in a tight loop to see how long they get scheduled out. On 6.3 the maximum latency is usually under 100 ms. On 7.0 it is 500 ms or more even when nothing else is running on the system. When a compile is also running it is sometimes 1400 ms or more.
>>
>> This test shows a difference even in single user mode, when X is not running at all.
>
> It shows *a* difference, but perhaps not the *same* difference. Please humour me and rule it out.

Okay. I am in the process of recompiling all my ports, so after that is done I will boot with a GENERIC kernel and see what happens.

-- Nate Eldredge [EMAIL PROTECTED]
SCHED_4BSD bad interactivity on 7.0 vs 6.3
Hi folks,

Hopefully this is a good list for this topic.

It seems like there has been a regression in interactivity from 6.3-RELEASE to 7.0-RELEASE when using the SCHED_4BSD scheduler. After upgrading my single-cpu amd64 box, 7.0 has much worse latency. When running a kernel compile, there is a noticeable lag to echo my typing or scroll my browser windows, and playing an mp3 frequently cuts out for a second or two. This did not happen on 6.3-RELEASE.

I wrote a small program which forks two processes that run gettimeofday() in a tight loop to see how long they get scheduled out. On 6.3 the maximum latency is usually under 100 ms. On 7.0 it is 500 ms or more even when nothing else is running on the system. When a compile is also running it is sometimes 1400 ms or more.

SCHED_ULE is much better, so I've switched over. But it's not the default yet, and most people are still going to be using SCHED_4BSD. It used to be acceptable but now it isn't. Does anyone know why it's regressed so badly?

-- Nate Eldredge [EMAIL PROTECTED]