Re: Linux mode is not enabled help
suken woo wrote:
> After a recent cvsup, the Linux emulator was disabled. I rebuilt it and get the same error:
>
> Linux mode is not enabled. Loading linux kernel module now...
> kldload: can't load linux: No such file or directory

This is common in -current. You either have to give an absolute path to the module file, or it has to be in one of the new path locations. The loader has the same problem when you try to load a kernel installed in /.

There is apparently a problem with the parsing of the path elements, though I have not bothered to find out what it is. I don't think the current directory works, either; maybe it's one of those over-zealous security things that keep happening.

To see your current path:

    sysctl kern.module_path

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: Trivial patch: fdisk doesn't recognize my partitions
[ ... Partition ID changes ... ]

Nate Lawson wrote:
>> But as I said, this is rather marginal and I really don't feel it should go in unless this xor-0x10 convention is more widespread.
>
> Partition Magic does this too. Isn't the correct failure mode just to print the partition ID in hex instead of expanding it?

Frankly, who cares? You guys still haven't told us, if these partitions are being hidden... WHY ARE WE NOT RESPECTING THE DECISION TO HIDE THE THINGS?

A user installed the software doing the hiding on purpose. The software changed the ID to hide it, on purpose. Windows ignores these partitions -- on purpose.

If you're not going to respect the user's wishes in this, then that's a different kettle of fish... like not respecting the user disabling things in the BIOS, because the probe routines still detect it.

If you're going that route, why does FreeBSD care about partition ID at all? All it is is a *hint*; it's not definitive. It's not like a protocol type encapsulation on a packet. It doesn't matter what the ID says, the rest of the partition table entry demarcates a region of a linear array of bytes that contain data. I think looking at the content of that linear array is what should determine what the content is, in the absence of a valid hint.

Specifically, if it has a valid disklabel on the thing, I don't care what partition ID it has on it, I give it to the disklabel handler. If it has a valid FAT32 FS on it, I give it to the FAT32FS. If it has a valid FFS superblock on it, I give it to FFS. Etc..

-- Terry
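A rough sketch of the content-based dispatch described above, for illustration only. The function name, the table layout, and the choice of signatures are assumptions made for this example, not FreeBSD code; the offsets assume an i386-style layout with the disklabel magic at the start of the second 512-byte sector. The 0x55AA signature alone only marks a boot sector, so a real FAT32 check would also have to validate the BIOS parameter block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One content signature: a byte pattern expected at a fixed offset. */
struct probe {
    const char    *name;    /* handler the region would be handed to */
    size_t         offset;  /* byte offset from the start of the region */
    const uint8_t *magic;   /* expected bytes */
    size_t         len;     /* number of bytes to compare */
};

/* Illustrative signatures only; offsets are assumptions for this sketch. */
static const uint8_t disklabel_magic[] = { 0x57, 0x45, 0x56, 0x82 }; /* 0x82564557, little-endian */
static const uint8_t boot_signature[]  = { 0x55, 0xAA };

/* Order matters: the first matching probe wins, disklabel before FAT. */
static const struct probe probes[] = {
    { "disklabel", 512, disklabel_magic, sizeof disklabel_magic },
    { "fat",       510, boot_signature,  sizeof boot_signature  },
};

/*
 * Return the name of the first handler whose signature matches the start
 * of the region, or NULL if nothing claims it. The partition ID is never
 * consulted.
 */
const char *probe_region(const uint8_t *buf, size_t buflen)
{
    assert(buf != NULL);
    for (size_t i = 0; i < sizeof probes / sizeof probes[0]; i++) {
        const struct probe *p = &probes[i];
        if (p->offset + p->len > buflen)
            continue;               /* region too short for this probe */
        if (memcmp(buf + p->offset, p->magic, p->len) == 0)
            return p->name;
    }
    return NULL;
}
```

A caller would read the first few sectors of the region and hand the result of probe_region() to whatever dispatches to the matching handler; a NULL result means the region is left alone.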
Re: Trivial patch: fdisk doesn't recognize my partitions
Garance A Drosihn wrote:
> My own opinion is that if I have explicitly hidden a partition, then freebsd should ignore it. There are times that I do this specifically so *freebsd* will ignore it, and I don't want freebsd trying to second-guess what I meant.

Exactly. If you wanted the dratted thing unhidden, then you would use the tool you hid it with to unhide it.

>> Specifically, if it has a valid disklabel on the thing, I don't care what partition ID it has on it, I give it to the disklabel handler. If it has a valid FAT32 FS on it, I give it to the FAT32FS. If it has a valid FFS superblock on it, I give it to FFS. Etc..
>
> The fact that the disklabel is valid does not mean that the filesystem in that partition is still valid. If I hide a partition, it may be that I had a very good reason for hiding it, and freebsd shouldn't be giving it to anything when the partition ID is not a recognized ID.

That's really for the FS code to deal with. Handling it any other way means that a corrupt disk can panic the machine. That's a really dumb thing to allow, particularly with removable media.

That's just a general principle, totally independent of hiding things; I only point it out because the people who wanted the hidden partition types known to FreeBSD are totally missing the point about what a partition is or isn't, and who's responsible for validating the data therein.

-- Terry
Re: Any ideas at all about network problem?
Craig Reyenga wrote:
> It worked fine in 4.7 and all previous versions, just DP2; dunno about DP1.

Well, you will have to back up to a version of the source code before DP2 that didn't have the problem, perform a binary search to find the exact delta that caused the problem, and examine the code differences in order to find the problem change, and why it causes the problem.

Personally, I'd start with DP1, but that's because I have a CDROM locally, and the CVS tree is not always buildable, since there is no software enforcement of buildability before a change is committed.

There are almost 2 years' worth of changes in there which were not brought back to the -STABLE branches from -CURRENT, so diffing 4.7 and DP2 isn't likely to get you anywhere, I think.

-- Terry
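The binary search for the offending delta can be sketched as follows. This is a generic illustration: revisions are reduced to integer indices (for example, positions in a list of checkout dates), and the is_bad callback stands in for "check out that revision, build it, and test the network". The names first_bad_revision and bad_from are made up for this example, and the search assumes that once the bug appears it stays in every later revision.

```c
#include <assert.h>

/* Caller-supplied test: nonzero if the tree at revision index `rev` shows the bug. */
typedef int (*is_bad_fn)(int rev, void *ctx);

/*
 * Given a known-good revision index and a known-bad one (good < bad),
 * return the index of the first bad revision. Each loop iteration halves
 * the interval, so about log2(bad - good) builds are needed.
 */
int first_bad_revision(int good, int bad, is_bad_fn is_bad, void *ctx)
{
    assert(good < bad);
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;
        if (is_bad(mid, ctx))
            bad = mid;      /* bug already present: look earlier */
        else
            good = mid;     /* still fine: look later */
    }
    return bad;
}

/* Stand-in predicate for demonstration: the bug appears at revision *ctx. */
int bad_from(int rev, void *ctx)
{
    return rev >= *(int *)ctx;
}
```

With roughly two years of changes between the branches, halving the interval each time keeps the number of builds small even when the revision list is long; unbuildable revisions would need to be skipped by hand.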
Re: Update to UFS2 Superblock Format
Kirk McKusick wrote:
>> Ah. No wonder. I tried editing /sys/boot/i386/boot2/Makefile to enable the UFS2 bootblock, but then disklabel complained that boot2 was too big. I will have to revert to UFS1.
>>
>> Thanks
>> Manfred
>
> You have hit upon the exact problem. UFS2 has a much bigger area reserved for the boot block, but the programs that set up disk labels and boot blocks don't know about it yet, so they assume that they have to cram into the much smaller UFS1 boot-block area.

Seems to be a candidate to explain the disklabel corruption, actually. The disklabel is expected to follow the initial boot code, and precede the region(s) it describes...

Basically, the boot blocks are going to have to know the disklabel offset, as promiscuous knowledge (i.e. hard-wired into the code).

-- Terry
Re: Problem with ntpdate
Daniel C. Sobral wrote:
> Giorgos Keramidas wrote:
> On 2002-11-28 17:00, Daniel C. Sobral [EMAIL PROTECTED] wrote:
> I found out that ntpdate just doesn't seem to be working at all during boot. Ntpd dies because of the time differential (Windows changes the time two hours because of the TZ). No message from ntpdate (I'll next try to divert it to syslog).

If you want to add code to fix this, it's trivial:

1) Read the CMOS clock directly
2) Read the CMOS clock via vm86()
3) If there is a difference measured in round units, apply it as an adjustment to the value each time you read directly.

Problem solved. Basically, it comes down to initializing an integer at boot time via a SYSCONFIG() created for that purpose.

Personally, I don't have any boxes with a BIOS that's still broken. I used to have one, but I disassembled it with Frank van Gilluwe's Sourcer, hacked the timezone adjustment out of the code, assembled the new code with MASM, and burnt some new PROMs for it. That was back in 1997.

If you are looking for advice, my advice is to fix your BIOS... it's easier. 8-).

Otherwise, it wouldn't be hard at all to hack the described fix into machdep.c, to make FreeBSD more tolerant of broken hardware (always one in the plus column).

-- Terry
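Step 3 of the list above might look something like the following. This is a userland sketch of the arithmetic only, not kernel code: the two readings are passed in as plain second counts, and the function name, the half-hour "round unit", and the two-second tolerance are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define ROUND_UNIT 1800L  /* assumed "round unit": half an hour, in seconds */
#define TOLERANCE  2L     /* assumed slack for the time between the two reads */

/*
 * direct_secs: clock value read straight from the CMOS.
 * bios_secs:   clock value as reported through the BIOS call.
 * Returns the offset to add to every later direct read, or 0 if the
 * difference is not (close to) a whole number of round units.
 */
long clock_adjustment(long direct_secs, long bios_secs)
{
    long diff = bios_secs - direct_secs;
    long mag = labs(diff);
    long units = (mag + ROUND_UNIT / 2) / ROUND_UNIT;  /* nearest unit count */
    long rounded = units * ROUND_UNIT;

    assert(ROUND_UNIT > 2 * TOLERANCE);
    if (units == 0 || labs(mag - rounded) > TOLERANCE)
        return 0;           /* ordinary drift, not a timezone-style shift */
    return diff < 0 ? -rounded : rounded;
}
```

The returned value corresponds to the integer that would be initialized once at boot and then added to each direct read.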
Re: Are SysV semaphores thread-safe on CURRENT?
Brian Smith wrote:
> On Mon, 18 Nov 2002 22:05:34 -0800, Terry Lambert wrote:
>> Use mmap of a backing-store file, and then use file locking to do record locking in the shared memory segment.
>
> Ok, I did this, and it actually works considerably better than the SysV shared memory. However flock() has the same problem as the SysV semaphores, where they block the entire process, allowing the same deadlock situation to occur. Has this flock() behavior changed in CURRENT? It seems like this behavior is much more likely to change than the SysV code.

Do you mean flock(), or do you mean fcntl(fd, F_SETLKW, ...)? If you are using range locks, then you mean fcntl().

That's unfortunate: there's an easy way to convert blocking file locks into non-blocking, plus a context switch. I thought the threads library already did this for you in the fcntl() wrapper, in /usr/src/lib/libc_r/uthread/uthread_fcntl.c, but apparently it doesn't. 8-(.

The easy way to do this is to convert the blocking request into a non-blocking request, with a retry; e.g., where you have a call to:

    err = fcntl(fd, F_SETLKW, flock);

Replace it with:

    while ((err = fcntl(fd, F_SETLK, flock)) == -1 && errno == EAGAIN) {
        sleep(1);    /* use nanosleep(), if 1 second is too big */
    }

This will cause the processor to be yielded to other threads for as long as the lock can't be acquired; acquisition will be retried until it succeeds (effectively, blocking only that thread in sleep()). The difference between F_SETLKW and F_SETLK is why I suggested the approach in the first place (FWIW).

The cost of doing this is that blocking requests will not be serviced in FIFO order, as they would be if F_SETLKW were being used. This may get expensive if you have a highly contended resource, because you are effectively implementing low-cost polling to obtain the lock.

The answer to this is that you are not supposed to use semaphores for highly contended resources, or if you do, use a spin-lock before you use the semaphore, so you can fail early at reduced expense.

Probably making the above code into an inline function and/or actually modifying the _fcntl() implementation in the threads library is the way to go.

Worst comes to worst, I can give you a kernel patch so that an fcntl() to assert a blocking lock on a non-blocking fd returns the EWOULDBLOCK error, with a patch against _fcntl() similar to the code in _read().

I didn't do that this time, because I don't know how much code really depends on a lock assert on a non-blocking fd blocking anyway, and no matter how you slice it, it's still going to have the same non-FIFO ordering, unless I implemented a FIFO ordered request queue as well (I'd have to, to be correct).

-- Terry
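The retry loop above, written out as the inline-style helper the message suggests. This is a sketch under stated assumptions: the function name lock_with_retry and the retry_ns parameter are invented here, nanosleep() is used instead of sleep(1), and EACCES is treated like EAGAIN because POSIX allows F_SETLK to fail with either when the lock is held elsewhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Acquire the lock described by *fl using the non-blocking F_SETLK,
 * sleeping retry_ns nanoseconds between attempts. Returns 0 once the
 * lock is held, or -1 with errno set for any error other than
 * "lock is currently held by someone else".
 */
int lock_with_retry(int fd, struct flock *fl, long retry_ns)
{
    struct timespec delay;

    assert(fl != NULL && retry_ns >= 0 && retry_ns < 1000000000L);
    delay.tv_sec = 0;
    delay.tv_nsec = retry_ns;

    while (fcntl(fd, F_SETLK, fl) == -1) {
        if (errno != EAGAIN && errno != EACCES)
            return -1;              /* real failure, e.g. EBADF */
        nanosleep(&delay, NULL);    /* only the calling thread sleeps */
    }
    return 0;
}
```

As noted above, waiters are not served in FIFO order with this approach, and a short retry interval on a heavily contended lock amounts to polling.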
Re: Are SysV semaphores thread-safe on CURRENT?
Daniel Eischen wrote:
> No, libc_r doesn't properly handle flock. Usually, all syscalls that take file descriptors as arguments honor the non-blocking mode of the file if set. I guess flock(2) doesn't and has its own option to the operation argument (LOCK_NB).
>
> I hacked libc_r to periodically check (every 100msecs) the flock. See if this fixes things:

Same thing I suggested, only I think he was really using fcntl(), not flock()? My patch wasn't integral to the library (it was more of a hack), and my default time was 1s, not 100ms.

Same non-FIFO request ordering, too. 8-(.

I guess the real question is what is an fcntl()/flock() supposed to do on a blocking call against a non-blocking fd? I could not tell, so I punted.

-- Terry
Re: system locks with vnode backed md(4)
Michal Mertl wrote:
> I'm now unable to make it dead-lock again. Yet it happened quite easily. I had more md backing files in the same directory at the beginning (to test Terry's suspicion mentioned in thread 'jail' on hackers@).
>
> After the first lock-up I tried 'while(1);tar xzf ports.tgz; rm -rf ports;end' on a normal filesystem, let it run for a long time (> 1h) and then I found the system almost dead-locked too (the system worked, but anything accessing disk was painfully slow; it might be the same problem or it might be different). It never ended (at least for ~30 mins, during which I wasn't able to do anything on it). syncer and bufdaemon and others were in wdrain. Disk as seen in systat -v showed maximal usage yet no inodes were resolved. Sometimes during that test I had a lock order reversal:

Hmm. This isn't actually the same, I think. This is just the point at which you have run out of available memory to maintain additional dependencies, with Giant held. The key to this diagnosis is that you let it run for a long time before it locked up.

The deadlock condition requires that two people do directory traversals at the same time in vnode backed files in the same directory. It has to do with the locks on the backing files as a result of root vnode traversal vs. the backing vnode in the parent directory. I haven't characterized it better than that.

-- Terry
Re: system locks with vnode backed md(4)
Robert Watson wrote:
> On Sat, 30 Nov 2002, Michal Mertl wrote:
>> I'm now unable to make it dead-lock again. Yet it happened quite easily. I had more md backing files in the same directory at the beginning (to test Terry's suspicion mentioned in thread 'jail' on hackers@).
>
> I've noticed that chroot() environments tend to make existing deadlock opportunities more likely. I'm not quite sure why that is. :-)

Lock to parent. It's the same reason you can lock up if you use automount, with all the automount mount points happening in the same subdirectory.

> There are a fair number of vnode locking deadlock scenarios that are unavoidable where we rely on grabbing vnode locks out of the directory structure lock order. This occurs for vnode-backed md devices, quotas, and UFS1 extended attributes, and probably some other situations. I suspect that Terry is correct that operations on the vnode backing file storage directory are triggering the problem, since that increases the chances that a vnode lock race to root will occur from both the file system backed into the md device, and for the md backing vnodes during blocking I/O.

See other postings. The race to root is the one I was originally commenting on. I'm not sure that it applies in this case; I think this case might be the out-of-memory-to-create-new-soft-dependencies case, where you can end up holding a lock on a buffer that needs to be flushed to recover memory, until you can satisfy the request to create a dependency (starvation deadlock). The race to root is a deadly embrace deadlock.

> If you can avoid directory operations on the md backing directory, that would probably be one way to avoid triggering the bug.

Yes. By placing each vnconfiged device in its own subdirectory, you avoid them. There's still a window on your host OS doing its own traversal, but that's (effectively) a whole FS lock, so it doesn't trigger a problem.

> Seeing it reproduced would probably confirm that this is the case.

It's a pain. I wasted a couple of days trying to reproduce it, without a box I could wipe and make into a scratch box, with little luck. I think that it requires reproducing the failing box in detail, which I wasn't willing to do (hence the workaround).

> On the other hand, there may be other deadlocks in the vnode/ufs/md code that can be more easily corrected than this general VFS problem, so details there would be very useful.

There are a number of them; they are all a pain. It's really tempting to just refactor the code so that all locking occurs at the same logical layer, without being held across function calls. That'd be a heck of a lot of work, though... probably worth it, in the end.

-- Terry
Re: C++ Issue On -CURRENT
Cy Schubert - CITS Open Systems Group wrote:
>> does the problem still occur if you add in 'using namespace std'?
>
> Thanks. That also fixed it.

Yeah. Just remember that the standard namespace isn't.

-- Terry
Re: Trivial patch: fdisk doesn't recognize my partitions
Bruce Evans wrote:
> On Thu, 28 Nov 2002, Poul-Henning Kamp wrote:
>> In message [EMAIL PROTECTED], Riccardo Torrini writes:
>>> As far as I know it uses an EXOR 0x10 to hide/unhide, but fdisk doesn't recognize 0x0B/0x0C fat32 when hidden (0x1B/0x1C)
>>
>> But as I said, this is rather marginal and I really don't feel it should go in unless this xor-0x10 convention is more widespread.
>
> Hiding partitions is a bug IMO, so it should have negative support. This convention would break many OS's conventions. E.g., NextSTEP | 0x10 gives BSDI.

If you think about it, if there is no one to claim it, it's reasonable to treat it as raw disk space, and try to find a partition on it. Really, there's no reason to care about partition type at all, since the contents will have the right magic numbers and the right data layout for a FATFS: you don't really care.

That's really only meaningful if you decide the hiding that magic.com does doesn't apply to you; if it applies to you, then, in fact, it's a good thing that it's not recognized: the magic.com program has successfully accomplished what it was written to accomplish -- so it's a non-problem.

-- Terry
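The arithmetic behind the hide/unhide convention and the collision mentioned in the quoted text can be checked in a few lines. The helper names are made up for this example; the partition IDs are the ones given in the thread (0x0B/0x0C for FAT32) plus the commonly listed values 0xA7 for NeXTSTEP and 0xB7 for BSDI.

```c
#include <assert.h>
#include <stdint.h>

#define HIDE_BIT 0x10u

/* Hide or unhide by flipping bit 4; applying it twice restores the ID. */
uint8_t toggle_hidden(uint8_t id)
{
    return (uint8_t)(id ^ HIDE_BIT);
}

/* Force the bit on, as in the "| 0x10" example from the thread. */
uint8_t force_hidden(uint8_t id)
{
    return (uint8_t)(id | HIDE_BIT);
}

/* Worked examples from the discussion. */
void check_hide_convention(void)
{
    assert(toggle_hidden(0x0B) == 0x1B);                /* FAT32 -> hidden FAT32 */
    assert(toggle_hidden(0x0C) == 0x1C);                /* FAT32 LBA -> hidden */
    assert(toggle_hidden(toggle_hidden(0x0B)) == 0x0B); /* unhide restores it */
    assert(force_hidden(0xA7) == 0xB7);                 /* NeXTSTEP collides with BSDI */
}
```

The last assertion is the objection in concrete form: a tool that interprets bit 4 as "hidden" on every ID cannot tell a hidden NeXTSTEP partition from a BSDI one.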
Re: [PATCH] Searching for users of netncp and nwfs to help
Julian Elischer wrote:
> Where does the passed in thread come from?

Your changes to make certain functions which are exported interfaces take a thread * instead of a proc * argument.

> Generally don't use a thread pointer other than yourself unless you have a lock on the proc structure, or the schedlock. Certainly never store it anywhere.. Particularly anywhere that may persist while you sleep in any way.
> -exception.. kernel threads- .. they are persistent.

The received lock response is going to come in on IPX, which is like UDP, so it's connectionless. The NCP is an el-cheapo timeout-and-retransmit layer on top of formatted IPX datagrams (NetWare runs on IPX, not SPX).

Basically, this means that the response and the async response to a lock request, or the async server-to-client notifications (shutdowns, etc.) can come in and activate any listener. The connection has to be looked up, and then the operation has to be processed in the context of the process that has the connection open. It does not care which thread it's processed on, only that it's in the process that owns the connection (there are address space issues).

The main problem here is that lockmgr() is being called to lock things that technically don't need to be locked, at all, really, to ensure that operations are not attempted concurrently. It's not really necessary: the server will refuse additional requests on a connection when there is one request outstanding.

The only exception isn't really relevant here, because the code that I've seen written doesn't really support packet burst mode data transfers (pseudo-windowed data streams layered on top of datagrams).

Basically, this means unless someone is willing to do the work to set up a virtual circuit -- three network handles per -- per each potentially outstanding thread, and then, further, maintain an idle pool for them, everyone should treat the code as if it were not thread reentrant... because it's not.

Gary Tomlinson, Duck, Ted Cowan, and others literally put man years into getting that working, and they had access to Novell source code in order to do the work in the NUC (NetWare UNIX Client) product. It's unlikely that it can be reverse-engineered without at least a PNW/NWU (Portable NetWare/NetWare for UNIX) source license.

The *only* reasons there's a thread in there now as a parameter are that (1) the top level interfaces require it and (2) the lockmgr() calls, which shouldn't need to be there, IMO, require it as a parameter.

-- Terry
Re: [PATCH] Searching for users of netncp and nwfs to help
Terry Lambert wrote:
> The main problem here is that lockmgr() is being called to lock things that technically don't need to be locked, at all, really, to ensure that operations are not attempted concurrently. It's not really necessary: the server will refuse additional requests on a connection when there is one request outstanding.

In case this wasn't clear to whoever was thinking of doing the work: add a serialization barrier at the ncp_* layer. You can remove it later, without any other code being adversely affected, if you add a connection pool later.

Note also that the credentials can be passed on the VC, if you don't mind not running on NetWare prior to 3.1b. I recommend this, since it means connection, but not credential, sharing between processes for threads in the work-to-do pool.

-- Terry
Re: Problem pulling particular directory from CVS
Paul A. Scott wrote:
> I do the following:
>
> cvs co src/contrib
> cvs checkout: in directory src/contrib/cvs:
> cvs checkout: cannot open CVS/Entries for reading: No such file or directory
> cvs [checkout aborted]: cannot write CVS/Template file: No such file or directory
>
> cvs co stops on the src/contrib/cvs directory and will not go further. I have plenty of space available on the file system. The problem may be a corrupt repository. Is there any way to do a checkout on src/contrib while bypassing src/contrib/cvs? Or, can this be fixed to work?

You are not being quite forthright, I think. This normally happens on a cvs -R co, rather than a cvs co, when you are asking for a specific date tag or a release tag which no longer exists, when running against a read-only repository.

I ran into a similar problem recently, when someone suggested I use cvs against a FreeBSD server in Germany, in order to match their version of the source code so I could create a patch for a problem they were having.

The answer is that the val-tags file is not writeable, and is being used. There was a long discussion on this file about 6 months back; I believe the resolution of that discussion was to make the ${CVSROOT}/CVSROOT/val-tags file unnecessary, but advisory, in the case that it was not writeable.

Probably you can get around the problem by updating your 'cvs', though it may also be necessary to update the 'cvs' on the remote host to have the new code, as well.

You can also checkout without a tag, or with a tag that is already in the val-tags file on the serving host. Alternately, have them add a line to val-tags with the tag you want to checkout, e.g. [indentation mine]:

    RELENG_2_2_2_RELEASE y

-- Terry
Re: mbuf header bloat ?
Andrew Gallatin wrote:
> What I (as a 3rd party driver author working in a GNUish autoconf/gnumake environment) do is to require a user building from source to specify the location of a configured kernel tree where make depend has been run (defaulting to GENERIC). I then pick up the various option and bus files out of that directory. When I build binary modules, I build from source as a normal user (using a 4.1.1 system in a chroot). Using an approach like this, a vendor could ship a MAC aware driver by picking up the options files from a MAC kernel build directory.

I believe he was talking about modules for which source code is not available.

> How is one supposed to build a 3rd party module these days?

One is not. The vendor supplies only a binary.

>> I think you under-estimate the complexity of variably sized key kernel data structures. mbuf.h is included all over the kernel, as well as in many user applications (although often for bogus reasons). My proposed strategy is the following:
>
> Bizarre. I had no idea userland apps used mbuf.h. That does indeed sound bogus.

On the contrary: it's a very clever thing to do.

-- Terry
Re: mbuf header bloat ?
Andrew Gallatin wrote:
> Terry Lambert writes:
>> Andrew Gallatin wrote:
>>> What I (as a 3rd party driver author working in a GNUish
>
> This is how I do it.
>
>>> ... How is one supposed to build a 3rd party module these days?
>
> How are you supposed to do it?
>
>> One is not. The vendor supplies only a binary.
>
> Damn it Terry, I AM the vendor. Sometimes I wonder if you even read the articles you reply to. I'm asking how the vendor (me) is supposed to build a binary module and I gave an example of how I currently do it.

You're the vendor in the first statement, and a consumer in the second. The topic of the post to which you were replying was third party binary compatibility. The answer is that if the structures change, then there is no binary compatibility without source code, period.

It seemed to me that you were assuming access to the source code for consumers of third party modules.

I think the issue that Robert is concerned about is MAC modules that are provided by a third party to a consumer of FreeBSD and the modules, and for which the structure changes and so on can not be permitted.

This makes sense, because the MAC code is being developed under a DARPA contract, and it's likely that the module source code and the modules won't be available to the end users, let alone the general public, without some kind of security clearance, and probably not even then.

-- Terry
Re: mbuf header bloat ?
Andrew Gallatin wrote:
> If you're a vendor of a device which inserts MAC mtags and needs options MAC, you put this code in your driver:
>
>     if (mbstat.m_mhlen != MHLEN) {
>         printf("Please rebuild your kernel with 'options MAC'\n");
>         goto attach_failed_no_mac;
>     }
>
> I've already got code like this in my driver to check that m_mclbytes and m_mlen are what I expect them to be, since people sometimes change them.

I think you are still not getting it, but it's not worth arguing over.

-- Terry
Re: pw_user.c change for samba
David W. Chapman Jr. wrote:
> I know we're in a code freeze right now, but would anyone have a problem with this patch once the freeze is up? This brings us closer to allowing samba to automatically join machines to the domain.

This change permits '$' in the account name, group name, and login class fields.

Why is this actually necessary for SAMBA? Is it necessary for all three of these to permit this, or is it sufficient to (for example) allow it in the group name?

-- Terry
Re: pw_user.c change for samba
David W. Chapman Jr. wrote:
>> Why is this actually necessary for SAMBA? Is it necessary for all three of these to permit this, or is it sufficient to (for example) allow it in the group name?
>
> Samba needs a user account for the domain machine account. The machine account always ends with a $, so it would only have to be for the account name.

I gathered that from the SAMBA site, too. The '$' is a pain. None of the examples in the original post would have worked, because the '$' was not '\$', and the shell would have blown chunks over the variable expansion.

It seems to me that this could cause a great deal of problems for scripts that process the password files, as they currently exist, if they use constructs like eval, or back-ticks, etc..

If it's allowed, it should probably only be allowed in the user name (i.e. the patch is wrong; it should probably add another parameter to the allowable values of 'int gecos', and change it to 'int checktype' or similar).

It seems to me that another alternative is that all these names end in '$'; therefore, when you are expecting one of these names, you could imply a '$', without needing to actually have it in the password file -- in other words, it's an attribute, not really part of the account name.

Will this open up a security hole for a normal user account being used to compromise the domain system security? Is it absolutely necessary to use an in-band method to distinguish these records from ordinary user accounts?

If the answer to either of these is no, then it seems that implying the '$', rather than permitting it directly, would be best, to keep scripts working.

-- Terry
Re: pw_user.c change for samba
David W. Chapman Jr. wrote:
>> If it's allowed, it should probably only be allowed in the user name (i.e. the patch is wrong; it should probably add another parameter to the allowable values of 'int gecos', and change it to 'int checktype' or similar).
>
> I don't have a problem with this, but the patch I sent in is the extent of my abilities to give me the desired results (making pw like samba).

See attached patch. It could still screw scripts (e.g. the perl script version of adduser) by allowing the $ in the login field, but at least it keeps it out of the login class and group fields. See below, though: I don't think '$' should be permitted.

>> It seems to me that another alternative is that all these names end in '$'; therefore, when you are expecting one of these names, you could imply a '$', without needing to actually have it in the password file -- in other words, it's an attribute, not really part of the account name.
>>
>> Will this open up a security hole for a normal user account being used to compromise the domain system security? Is it absolutely necessary to use an in-band method to distinguish these records from ordinary user accounts?
>
> I don't think the samba people would be willing to make this type of change just for FreeBSD since it works for most everyone else. I also don't think there is currently a way to store attributes about machines/users permanently in samba.

I think you misunderstand. The intent is to allow accounts without $ appended to be used as machine logins. Samba would see the '$', remove it, and check normally. The potential problem is that normal user accounts could be used in place of machines.

The proper BSD way to avoid this hack would be to add a login class samba_server (or whatever), and make Samba permit this type of check only if the user was in the correct login class.
-- Terry
Index: pw.h
===================================================================
RCS file: /cvs/src/usr.sbin/pw/pw.h,v
retrieving revision 1.13
diff -c -r1.13 pw.h
*** pw.h	5 Jul 2001 08:01:15 -0000	1.13
--- pw.h	27 Nov 2002 17:21:03 -0000
***************
*** 62,67 ****
--- 62,74 ----
  	W_NUM
  };
  
+ enum _checktype
+ {
+ 	PWC_DEFAULT,
+ 	PWC_GECOS,
+ 	PWC_LOGIN
+ };
+ 
  struct carg
  {
  	int ch;
***************
*** 105,111 ****
  int pw_user(struct userconf * cnf, int mode, struct cargs * _args);
  int pw_group(struct userconf * cnf, int mode, struct cargs * _args);
! char *pw_checkname(u_char *name, int gecos);
  
  int addpwent(struct passwd * pwd);
  int delpwent(struct passwd * pwd);
--- 112,118 ----
  int pw_user(struct userconf * cnf, int mode, struct cargs * _args);
  int pw_group(struct userconf * cnf, int mode, struct cargs * _args);
! char *pw_checkname(u_char *name, enum _checktype checktype);
  
  int addpwent(struct passwd * pwd);
  int delpwent(struct passwd * pwd);
Index: pw_user.c
===================================================================
RCS file: /cvs/src/usr.sbin/pw/pw_user.c,v
retrieving revision 1.51
diff -c -r1.51 pw_user.c
*** pw_user.c	24 Jun 2002 11:33:17 -0000	1.51
--- pw_user.c	27 Nov 2002 17:30:43 -0000
***************
*** 231,237 ****
  			}
  		}
  		if ((arg = getarg(args, 'L')) != NULL)
! 			cnf->default_class = pw_checkname((u_char *)arg->val, 0);
  
  		if ((arg = getarg(args, 'G')) != NULL && arg->val) {
  			int i = 0;
--- 231,237 ----
  			}
  		}
  		if ((arg = getarg(args, 'L')) != NULL)
! 			cnf->default_class = pw_checkname((u_char *)arg->val, PWC_DEFAULT);
  
  		if ((arg = getarg(args, 'G')) != NULL && arg->val) {
  			int i = 0;
***************
*** 293,299 ****
  	}
  
  	if ((a_name = getarg(args, 'n')) != NULL)
! 		pwd = GETPWNAM(pw_checkname((u_char *)a_name->val, 0));
  	a_uid = getarg(args, 'u');
  
  	if (a_uid == NULL) {
--- 293,299 ----
  	}
  
  	if ((a_name = getarg(args, 'n')) != NULL)
! 		pwd = GETPWNAM(pw_checkname((u_char *)a_name->val, PWC_LOGIN));
  	a_uid = getarg(args, 'u');
  
  	if (a_uid == NULL) {
***************
*** 455,461 ****
  		if ((arg = getarg(args, 'l')) != NULL) {
  			if (strcmp(pwd->pw_name, "root") == 0)
  				errx(EX_DATAERR, "can't rename `root' account");
! 			pwd->pw_name = pw_checkname((u_char *)arg->val, 0);
  			edited = 1;
  		}
  
--- 455,461 ----
  		if ((arg = getarg(args, 'l')) != NULL) {
  			if (strcmp(pwd->pw_name, "root") == 0)
  				errx(EX_DATAERR, "can't rename `root' account");
! 			pwd->pw_name = pw_checkname((u_char *)arg->val, PWC_LOGIN);
  			edited = 1;
  		}
  
***************
*** 595,601 ****
  	 * Shared add/edit code
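The alternative raised in the message above (Samba sees the '$', removes it, checks normally, and only accepts the result for accounts in a dedicated login class) might look roughly like the following. This is only a sketch under stated assumptions: strip_machine_suffix() and machine_login_allowed() are hypothetical names, the class name "samba_server" is the placeholder from the message, and none of this is Samba's or pw's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy `name` into `out`, dropping one trailing '$'.
 * Returns 1 if a '$' was stripped (a machine-account request), 0 if there
 * was none, and -1 if the result would be empty or would not fit in `out`.
 */
int
strip_machine_suffix(const char *name, char *out, size_t outlen)
{
	size_t len;
	int machine = 0;

	assert(name != NULL && out != NULL);
	len = strlen(name);
	if (len > 0 && name[len - 1] == '$') {
		machine = 1;
		len--;
	}
	if (len == 0 || len >= outlen)
		return (-1);
	memcpy(out, name, len);
	out[len] = '\0';
	return (machine);
}

/*
 * Hypothetical policy check: ordinary logins pass through; a machine
 * login is accepted only if the account is in the dedicated login class.
 */
int
machine_login_allowed(int is_machine, const char *login_class)
{
	if (!is_machine)
		return (1);
	return (login_class != NULL &&
	    strcmp(login_class, "samba_server") == 0);
}
```

With this split, the password file never contains a '$'; the suffix is treated as an attribute of the request, and the login class is what keeps an ordinary user account from being used in place of a machine.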
Re: pw_user.c change for samba
Garance A Drosihn wrote: the machine account always ends with a $ So it would only have to be for the account name I think I'd prefer a somewhat more involved change, one which allowed $ only for account-name, and only as the last character. That seems like a good idea to me. But then, I'm not volunteering to write it... :-) My change doesn't allow it only for the last, but it does restrict it to the login name. I notice that pw.h exports the code. If someone is using the function from outside, that's probably something that needs to be considered. I've changed the prototype, so that it will at least complain on compilation, if someone is using the code that way. I think the $ on the end worked because of the dangling $ handling in the shell they happened to be using; the original example names are still broken for some shells, with no back-quoting. -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
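For reference, the stricter rule suggested above ('$' allowed only in the login name, and only as the last character) could be sketched as below. The PWC_* names come from the patch; everything else is an assumption for illustration: name_is_valid() is a hypothetical stand-in for pw_checkname(), and the reject sets are made up rather than copied from pw(8).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum checktype { PWC_DEFAULT, PWC_GECOS, PWC_LOGIN };

/*
 * Hypothetical validator, not pw(8)'s pw_checkname().
 * Returns 1 if `name` is acceptable for the given field type, else 0.
 */
int
name_is_valid(const char *name, enum checktype type)
{
	/* Illustrative reject sets only. */
	const char *bad = (type == PWC_GECOS) ?
	    ":" : " ,\t:+&#%$^()!@~*?<>=|\\/\"";
	size_t i, len;

	assert(name != NULL);
	len = strlen(name);
	if (len == 0)
		return (0);
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];

		if (iscntrl(c))
			return (0);
		/* A single trailing '$' is tolerated on login names only. */
		if (type == PWC_LOGIN && c == '$' && i == len - 1 && i > 0)
			continue;
		if (strchr(bad, c) != NULL)
			return (0);
	}
	return (1);
}
```

Group names and login classes go through PWC_DEFAULT and so still reject '$' anywhere, which matches the intent of keeping it out of those fields.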
Re: pw_user.c change for samba
Oops. Better patch attached (damn Makefile dependencies are broken unless you manually build them via make depend). -- Terry
Index: pw.h
===================================================================
RCS file: /cvs/src/usr.sbin/pw/pw.h,v
retrieving revision 1.13
diff -c -r1.13 pw.h
*** pw.h	5 Jul 2001 08:01:15 -0000	1.13
--- pw.h	27 Nov 2002 17:21:03 -0000
***************
*** 62,67 ****
--- 62,74 ----
  	W_NUM
  };
  
+ enum _checktype
+ {
+ 	PWC_DEFAULT,
+ 	PWC_GECOS,
+ 	PWC_LOGIN
+ };
+ 
  struct carg
  {
  	int ch;
***************
*** 105,111 ****
  int pw_user(struct userconf * cnf, int mode, struct cargs * _args);
  int pw_group(struct userconf * cnf, int mode, struct cargs * _args);
! char *pw_checkname(u_char *name, int gecos);
  
  int addpwent(struct passwd * pwd);
  int delpwent(struct passwd * pwd);
--- 112,118 ----
  int pw_user(struct userconf * cnf, int mode, struct cargs * _args);
  int pw_group(struct userconf * cnf, int mode, struct cargs * _args);
! char *pw_checkname(u_char *name, enum _checktype checktype);
  
  int addpwent(struct passwd * pwd);
  int delpwent(struct passwd * pwd);
Index: pw_group.c
===================================================================
RCS file: /cvs/src/usr.sbin/pw/pw_group.c,v
retrieving revision 1.13
diff -c -r1.13 pw_group.c
*** pw_group.c	22 Jun 2000 16:48:41 -0000	1.13
--- pw_group.c	27 Nov 2002 17:44:10 -0000
***************
*** 135,141 ****
  				grp->gr_gid = (gid_t) atoi(a_gid->val);
  
  			if ((arg = getarg(args, 'l')) != NULL)
! 				grp->gr_name = pw_checkname((u_char *)arg->val, 0);
  		} else {
  			if (a_name == NULL)	/* Required */
  				errx(EX_DATAERR, "group name required");
--- 135,141 ----
  				grp->gr_gid = (gid_t) atoi(a_gid->val);
  
  			if ((arg = getarg(args, 'l')) != NULL)
! 				grp->gr_name = pw_checkname((u_char *)arg->val, PWC_DEFAULT);
  		} else {
  			if (a_name == NULL)	/* Required */
  				errx(EX_DATAERR, "group name required");
***************
*** 145,151 ****
  			extendarray(&members, &grmembers, 200);
  			members[0] = NULL;
  			grp = &fakegroup;
! 			grp->gr_name = pw_checkname((u_char *)a_name->val, 0);
  			grp->gr_passwd = "*";
  			grp->gr_gid = gr_gidpolicy(cnf, args);
  			grp->gr_mem = members;
--- 145,151 ----
  			extendarray(&members, &grmembers, 200);
  			members[0] = NULL;
  			grp = &fakegroup;
! 			grp->gr_name = pw_checkname((u_char *)a_name->val, PWC_DEFAULT);
  			grp->gr_passwd = "*";
  			grp->gr_gid = gr_gidpolicy(cnf, args);
  			grp->gr_mem = members;
Index: pw_user.c
===================================================================
RCS file: /cvs/src/usr.sbin/pw/pw_user.c,v
retrieving revision 1.51
diff -c -r1.51 pw_user.c
*** pw_user.c	24 Jun 2002 11:33:17 -0000	1.51
--- pw_user.c	27 Nov 2002 17:30:43 -0000
***************
*** 231,237 ****
  			}
  		}
  		if ((arg = getarg(args, 'L')) != NULL)
! 			cnf->default_class = pw_checkname((u_char *)arg->val, 0);
  
  		if ((arg = getarg(args, 'G')) != NULL && arg->val) {
  			int i = 0;
--- 231,237 ----
  			}
  		}
  		if ((arg = getarg(args, 'L')) != NULL)
! 			cnf->default_class = pw_checkname((u_char *)arg->val, PWC_DEFAULT);
  
  		if ((arg = getarg(args, 'G')) != NULL && arg->val) {
  			int i = 0;
***************
*** 293,299 ****
  	}
  
  	if ((a_name = getarg(args, 'n')) != NULL)
! 		pwd = GETPWNAM(pw_checkname((u_char *)a_name->val, 0));
  	a_uid = getarg(args, 'u');
  
  	if (a_uid == NULL) {
--- 293,299 ----
  	}
  
  	if ((a_name = getarg(args, 'n')) != NULL)
! 		pwd = GETPWNAM(pw_checkname((u_char *)a_name->val, PWC_LOGIN));
  	a_uid = getarg(args, 'u');
  
  	if (a_uid == NULL) {
***************
*** 455,461 ****
  		if ((arg = getarg(args, 'l')) != NULL) {
  			if (strcmp(pwd->pw_name, "root") == 0)
  				errx(EX_DATAERR, "can't rename `root' account");
! 			pwd->pw_name = pw_checkname((u_char *)arg->val, 0);
  			edited = 1;
  		}
  
--- 455,461 ----
  		if ((arg = getarg(args, 'l')) != NULL) {
  			if (strcmp(pwd->pw_name, "root") == 0)
  				errx(EX_DATAERR, "can't rename `root' account");
! 			pwd->pw_name = pw_checkname((u_char *)arg->val, PWC_LOGIN);
  			edited = 1;
  		}
  
***************
*** 595,601 ****
  	 * Shared add/edit code
  	 */
  	if ((arg = getarg(args, 'c')) != NULL) {
!
Re: bonobo-activation core dump help
suken woo wrote: hi, all: setting the env with zh_CN.EUC, and run X but got the following errors. pid 495 (bonobo-activation-s), uid 1001: exited on signal 11 (core dumped) Compile the bonobo-activation-s binary with debugging symbols, so that you can debug the core file and see where it's crashing, and then correct the bonobo source code so that it doesn't crash. Once that's done, fix the locale file, which appears to be out of date, to include the missing messages that the code is not handling properly. Basically, bonobo is crashing on bad data, when it should handle it and not crash. -- Terry
Re: pw_user.c change for samba
Juli Mallett wrote: The '$' is a pain. None of the examples in the original post would have worked, because the '$' was not '\$', and the shell would have blown chunks over the variable expansion. Your foundation is flawed, we allow $ in passwd just fine, and the only problem here is whether pw should let someone do something we support which they might need to do. Apply the patch. Then try to add a user with a trailing $ via adduser(1), and note the failure. -- Terry
Re: pw_user.c change for samba
Giorgos Keramidas wrote: On 2002-11-27 12:55, Terry Lambert [EMAIL PROTECTED] wrote: Will this open up a security hole for a normal user account being used to compromise the domain system security? Probably 'yes'. I haven't tried this, but I guess one could name his machine Administrator. When that username is passed around, is it clear that it is a machine name and not a user name? I guess that if someone could trick a remote SMB server this way into believing that his username is 'Administrator', just by changing his local machine's name, we have a problem... That's a namespace issue... they would still need a password. I think that a login class would fix it. That would mean that you could not have a user and a machine with the same name, but if you want to be technical, doing it the other way, I can't have a user named Administrator$ and a machine named Administrator, so either way, there's a namespace incursion. -- Terry
Re: pw_user.c change for samba
NAKAJI Hiroyuki wrote: David W. Chapman Jr. [EMAIL PROTECTED] wrote: Wouldn't pw still have to be updated? I haven't looked at adduser but I thought it was a wrapper for pw? No. My /usr/sbin/adduser, updated on Nov/23/2002 21:58 JST, does not call the pw command. It adds the account to /etc/master.passwd and invokes 'pwd_mkdb'. See the 'sub new_users' function in /usr/sbin/adduser. There are two adduser scripts. One is perl, and one was written to use pw and provide the same semantics, in a shell script, as part of the perl purge that happened recently. One of them pukes on the trailing $, and the other doesn't. It's confusing, unless you caught that we were talking about most recent -current. -- Terry
Re: Problem pulling particular directory from CVS
Paul A. Scott wrote: setenv CVSROOT :pserver:[EMAIL PROTECTED]:/home/ncvs cvs login cvs co src/contrib When it gets to directory src/contrib/cvs, I get: cvs checkout: cannot open CVS/Entries for reading: No such file or directory cvs [checkout aborted]: cannot write CVS/Template file: No such file or directory Nothing hidden, totally forthright. Except that's a different error than the one you said before. 8-). This particular error usually occurs when you are doing this as root, and have an overly-anal umask set. To correct it, you should delete the subtree from that point, and at an upper level, type: cvs update -d The subdirectories that would have been included in the original checkout will be brought in and created (-d), without you needing to repeat the checkout. Probably you can get around the problem by updating your 'cvs', Running 'cvs -v' on FreeBSD 4.5: Concurrent Versions System (CVS) 1.10 `Halibut' (client/server) This version breaks on checkout of src/contrib/cvs Running 'cvs -v' on FreeBSD 4.7: Concurrent Versions System (CVS) 1.11.1p1-FreeBSD (client/server) This version works. Thanks. I'll update my cvs. I still find it hard to believe you aren't using a particular tag; the other procedure outlined above should work for you with the old CVS against the error message you are getting now. One possibility is that the source tree you are using has a sticky tag set? In any case, if you have a workaround, you're probably more interested in the fact it works than in why. 8-) 8-). -- Terry
Re: ACPI problem with laptop?
John Angelmo wrote: Terry Lambert wrote: Is this a Dell Latitude? They are known to have heat problems. There's also the possibility that the CPU is a desktop CPU in the laptop; people aren't supposed to do that, either, but it can crank up the heat. No, it's an Evo N114 with an Athlon 4 in it; I think that this is a mobile CPU. It may be that Windows ensures that the computer runs cooler by down-clocking it. Have you applied the most recent ACPI patches, and turned on debugging output (at least hw.acpi.verbose=1) to see if it fixes the problem (and if it doesn't, at least report what's going on)? -- Terry
Re: ACPI problem with laptop?
Terry Lambert wrote: Have you applied the most recent ACPI patches, and turned on debugging output (at least hw.acpi.verbose=1) to see if it fixes the problem (and if it doesn't, at least report what's going on)? It looks like the author of the ACPI code has already replied to your post; apply the patch he suggests, and turn on the debugging he suggests. He knows far more about ACPI than the rest of us. -- Terry
[PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems
Nate Lawson wrote: It's not so much that I volunteered as I said that I'd help with thread/proc issues.. The trouble was that there are places where it used a proc in the old code, but in some cases it needs to be a proc, and in other cases it now needs to be a thread. But all they stored was the proc. Also, from my memories of the code you needed to understand the protocol to know which needed to be which, and I don't know that protocol. In addition whoever does it needs to remember that any structure that stores a thread pointer is probably in error, as threads are transient items and any stored thread pointer is probably a wild pointer within a few milliseconds of being stored. :-) I'll take a whack at it and send it out by tomorrow, working or not. Don't bother. 8-). The attached patch makes it compile, and takes a shot at doing the right thing. The thread stuff is problematic; it's useful only for a blocking context. The process stuff is there to identify the connection, actually, which can mean huge latencies (hence the caching of procp). It helps to know that the protocol is exclusively request/response per session, that the current code handles only a single session per process (not one per thread), and that lock requests are answered both synchronously and asynchronously (request/response, then async message on timeout or success). -- Terry smbfs_thr.diff.gz Description: GNU Zip compressed data
Re: [PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems
Terry Lambert wrote: I'll take a whack at it and send it out by tomorrow, working or not. Don't bother. 8-). The attached patch makes it compile, and takes a shot at doing the right thing. Just a followup... select definitely won't work (IMO), but needs someone who is threads-savvy with kernel locks to deal with it; I cribbed lock flow from elsewhere, and it looks wrong to me. So this is *definitely* 56K of diffs that *only* address compilation completely, and not full function. -- Terry
Re: [PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems
First of all, the patch was just to get to the point of compilability; other people said they would take it from there. I don't have a NetWare server to test against in my apartment. I'd be just as happy to _let_ the other people who wanted to take it from there do so, now that I made it compile. That said... Julian Elischer wrote: some comments: firstly: ncp_conn_locklist(int flags, struct proc *p) { ! struct thread *td = TAILQ_FIRST(&p->p_threads); /* XXX */ ! ! return lockmgr(&listlock, flags | LK_CANRECURSE, 0, td); } can't you use unidiffs? :-) Only if I want the code to be unreadable. 8-) 8-). Can't you apply the patch to a -current tree checked out for that purpose, and get whatever flavor of diffs you want, e.g. cvs diff -u? Unidiffs don't support fuzz, and unless you are committing the thing, I'd rather not have to recreate it iteratively until it passes someone's filter. A context diff gives me that. ok there is a Macro to find the first thread in a process. FIRST_THREAD_IN_PROC(p) I didn't see this. It's definitely the way to go. Any chance of that being documented any time soon, so that no one else does the obvious thing, instead, like I did? I use it because it makes it easy to find the places that are DEFINITELY BROKEN. The marker for a KSE breakage is: XXXKSE, but places that use FIRST_THREAD_IN_PROC(p) are marked implicitly since nearly any time it is used, the code is trying to do something that doesn't make sense. (except in fork and maybe exit and exec, and in things like linux emulation where you KNOW there is only one thread). if you see TAILQ_FIRST(&p->p_threads) (or FIRST_THREAD_IN_PROC(p)) you can pretty much guarantee that what is happening there needs to be completely rewritten, possibly to the extent of rewriting its callers and the callers' callers. That is the problem I hit when trying to convert it to start with... Actually, I really wanted a LAST_THREAD_IN_PROC(p).
The reality is that I want the most recently inserted thread to use as a blocking context, on the theory that it would not be used until much later, and that all the pages it cares about are much more likely to be in core, particularly on a process with tons of threads. The only reason it's being used at all is that the lockmgr() class of functions needs a blocking context for its calls, and it's an explicit parameter (arguably, it should be curthread). I can edit the patch (since it's not a unidiff 8-) 8-) 8-)), or I can post a new one, if you want (it's only 9K, compressed). But realize that all it means is "give me a thread that I can use as a blocking context while I'm waiting on lockmgr()". Since I don't know the way that process IDs come into the session control (you have to understand the protocol for that) I basically hit a wall on trying to work out what to rewrite, and how. The wire protocol is always request/response. Always. As I stated before, the only exception is a lock, with/without a timeout. In that case, you get the synchronous response to your synchronous request, which basically means "request has been queued for servicing", and you later get a separate notification. The notification is by connection. The connection is per process, because we are talking about a connection where the credentials are associated with the connection in question. The connection provides a state context (waiting for request vs. request in progress). Making additional requests over the same VC will result in a "request in progress, go away you dork" response to the client. When an async response to a lock request comes in, it comes in on a separate VC (each connection has 3 VCs, only 2 of which are normally used: the one for request/response, and the one for async notifications, which is overloaded to handle lock responses). The connection is mapped back to a process to map it back to the blocker.
What this basically means is that NCPs can't be handled as multithreaded, without establishing a VC per thread. It just does not work. Therefore NCP requests are going to serialize in the kernel, no matter what you do in happy-thread-town. If you are asking for the code to be thread safe, then you are basically talking about multithreading the whole stack: NWFS -> NCP Client -> IPX -> IP. Probably it would be a better idea if TCP/IP was multithreaded first? BTW the obnoxious FIRST_THREAD_IN_PROC will go away when we have got rid of most of the broken code and be replaced in places for which it is ok with p2td() which will be: __inline struct thread * p2td(struct proc *p) { KASSERT(((p->p_flag & P_KSES) == 0), ("Threaded program uses p2td()")); return (TAILQ_FIRST(&p->p_threads)); } Uh, how exactly is that less obnoxious, given it's the same code with a different name and an obnoxious inline instead of a macro? 8-). You can always get from a thread to a single process but the reverse always presents
Re: [PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems
Terry Lambert wrote: Did you want me to update the patch to use your FIRST_THREAD_IN_PROC macro and resend it? OK; here it is, whether you wanted it or not. -- Terry smbfs_thr.diff.gz Description: GNU Zip compressed data
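For anyone who has not seen the macro under discussion, here is a small userland sketch of what "first thread in proc" versus "most recently inserted thread" means on a tail queue. The struct layouts are cut down to the bare minimum and are not the kernel's; LAST_THREAD_IN_PROC is hypothetical (it is only wished for in this thread), and the sketch assumes a <sys/queue.h> that provides TAILQ_LAST.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Cut-down stand-ins for the kernel structures; illustration only. */
struct thread {
	int td_id;
	TAILQ_ENTRY(thread) td_plist;	/* linkage on the owning proc */
};

struct proc {
	TAILQ_HEAD(threadlist, thread) p_threads;
};

#define FIRST_THREAD_IN_PROC(p)	TAILQ_FIRST(&(p)->p_threads)
/* Hypothetical counterpart: the most recently inserted thread. */
#define LAST_THREAD_IN_PROC(p)	TAILQ_LAST(&(p)->p_threads, threadlist)

void
proc_init(struct proc *p)
{
	assert(p != NULL);
	TAILQ_INIT(&p->p_threads);
}

void
proc_add_thread(struct proc *p, struct thread *td)
{
	assert(p != NULL && td != NULL);
	TAILQ_INSERT_TAIL(&p->p_threads, td, td_plist);
}
```

Either macro only answers "give me some thread of this process"; neither makes a cached proc pointer a safe substitute for the thread that is actually blocking, which is the objection raised in the thread.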
Re: [PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems
Julian Elischer wrote: The answer is that the code doesn't care what thread; it would prefer to not have to think in terms of threads at all, but if you want to force it to, then it's going to think in terms of blocking contexts for the benefit of FreeBSD code it calls, and nothing else. Hence the confusion as to whether to use a thread or a proc.. Not confusing at all. The only issue is where references to the connection structure cache the proc, in which case it uses the first thread on the cached proc; otherwise, it uses the thread that was passed in. Did you want me to update the patch to use your FIRST_THREAD_IN_PROC macro and resend it? you could but the fact that FIRST_THREAD_IN_PROC() is used indicates that the whole thing is broken anyway. Your edits are mostly mechanical and don't actually solve the problem. To do that you probably need to actually rewrite some of it I think. They were _intended_ to be mechanical edits. It fixes the problem for the people who were willing to fix it, but didn't have any idea of how to do the edits. I can't really rewrite the code for you, without risking that Novell would claim that I did it with knowledge of the NUC implementation... you _do_ remember the last time Novell and BSD had an issue over code, right, back in 1994, after they bought USL? It's probably better that the patch I've done get to the people who volunteered to fix the code, once it could be compiled, and that the people who volunteered to help them with the threads issues do so. I've done as much as I can without legal risk. -- Terry
Re: ACPI and apm_saver?
John Baldwin wrote: On 22-Nov-2002 Terry Lambert wrote: Someone needs to write an acpi_saver.ko. No, they need to write a dpms_saver.ko instead. :) acpi doesn't really have the same functionality as far as screen blanking IIRC. You're right. Is it just me, or is there a lot of month-old mail stuck in a queue somewhere? -- Terry
Re: -current unusable after a crash
Kris Kennaway wrote: On Mon, Nov 25, 2002 at 10:24:46AM -0500, Robert Watson wrote: I thought, this might be due to the priority of the background fsck and have once left it alone for several hours -- with no effect. The usual fsck takes a few minutes. We really need to disable background fsck if the system panicked. I've seen far too much bizarre filesystem behaviour that went away the next time I did a full fsck. I don't think this is really possible. I went looking for a generic application-use CMOS area for this sort of thing a while back, and I was unable to find one. If you made system dumps mandatory (or marked swap with a non-dump header in case of panic), this still would not handle the silent reboot, double panic, or single panic with disk I/O trashed cases. 8-(. There was a discussion about these issues when background fsck first went in. My opinion of having it on by default is that if you are going to play that loose, you might as well mount the FSs async, and be done with it. -- Terry
Re: -current unusable after a crash
Mikhail Teterin wrote: On Monday 25 November 2002 12:24 pm, Kris Kennaway wrote: On Mon, Nov 25, 2002 at 10:24:46AM -0500, Robert Watson wrote: I thought, this might be due to the priority of the background fsck and have once left it alone for several hours -- with no effect. The usual fsck takes a few minutes. We really need to disable background fsck if the system panicked. Otherwise, is there a need for fsck at all? Can sudden power loss be reliably distinguished from a panic? No, nor from hardware failures (disk/controller/other), without NVRAM to save the crash reason in the case there is one. -- Terry
Re: mbuf header bloat ?
Bosko Milekic wrote: This is not entirely true. You can allocate an mbuf chain without holding Giant if the caches are well populated - and they should be in the common/general case. You can in fact modify the allocator to just not do a kmem_malloc() if called with M_DONTWAIT, but I don't think you'd want to do this at this point. In fact, one of the first changes I make in a kernel when I go to do a networking product of any kind is to allocate the mbufs in machdep.c out of physical RAM, and then pre-link them onto a free-list, instead of using the standard (comparatively very slow) mbuf allocator. The gist of the argument boils down to the fact that network buffer allocations have different requirements than general all-purpose allocations (by design, the last time I checked), and that is why an mbuf/cluster allocator exists. Everything allocated at interrupt time has pretty much the same requirements. The only real difference with mbufs is that the allocation failure cases are generally better handled than all other allocation failure cases within the kernel (or people would not have been beating up Jeff about a month ago for the kmem_map space issue). -- Terry
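The "pre-link them onto a free-list" approach mentioned above can be illustrated with a few lines of userland C. This is a sketch only: struct pbuf, the pool_* names, and the sizes are made up for illustration, it uses a static array where the message describes carving buffers out of physical RAM in machdep.c, and it has no locking or interrupt protection at all.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_NBUFS	4	/* illustrative; a real pool would be far larger */
#define POOL_BUFSZ	256

struct pbuf {
	struct pbuf *next;		/* free-list link */
	unsigned char data[POOL_BUFSZ];
};

static struct pbuf pool_store[POOL_NBUFS];	/* stand-in for reserved RAM */
static struct pbuf *pool_free;			/* head of the free list */

/* Link every buffer onto the free list once, at startup. */
void
pool_init(void)
{
	size_t i;

	pool_free = NULL;
	for (i = 0; i < POOL_NBUFS; i++) {
		pool_store[i].next = pool_free;
		pool_free = &pool_store[i];
	}
}

/* O(1) allocation: pop the head, or return NULL when exhausted. */
struct pbuf *
pool_get(void)
{
	struct pbuf *b = pool_free;

	if (b != NULL) {
		pool_free = b->next;
		b->next = NULL;
	}
	return (b);
}

/* O(1) release: push the buffer back onto the head of the list. */
void
pool_put(struct pbuf *b)
{
	assert(b != NULL);
	b->next = pool_free;
	pool_free = b;
}
```

The point of the design is that allocation never calls into the general-purpose allocator and never sleeps; exhaustion is an immediate NULL that the caller has to handle.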
Re: mbuf header bloat ?
Bosko Milekic wrote: [ ... packet size distribution ... ] I am equally curious about this. One of the design assumptions for mbufs and clusters, according to McKusick et al. (and I believe another text which currently escapes me) is that packets are typically either very small or fairly large. Given the MAC label additions (yes it would be nice if this was done using the m_tag interface but at the very least one can say that they are implemented fairly 'consistently' despite the fact that they appear imposing to the general mbuf structure), and the currently available data region in the mbuf, it is absolutely necessary to know whether the assumption of packet size distribution still holds before a decision is made on how to modify the MAC label implementation - if at all. In fact, it is even more useful to consider the idea of variable sized mbufs. The actual size you want is whatever size is needed for the incoming packets for the MTU of the sender. Practically, this means 8K (a compromise on the 9K jumbograms vs. page size), 1536 (512*3), etc.. I get concerned with all this decoration of mbufs (MAC vs. m_tag vs. whatever) that people are doing, since this type of thing is going to reduce overall capacity more than m_pullup(), etc.. -- Terry
Re: I'm impressed, but ...
Philip Paeps wrote: On 2002-11-25 14:41:22 (-0500), Hiten Pandya wrote: On Mon, Nov 25, 2002 at 01:49:34AM +0100, Philip Paeps wrote: | unknown: PNP0401 can't assign resources (port) | unknown: PNP0501 can't assign resources (port) Can you try changing the hardware tunable, hw.pci.allow_unsupported_io_range, to the value of 1 in your loader.conf. I think this should do it. You can then check this value after you booted by `sysctl hw.pci`. I'm afraid that doesn't cure the 'problem'. I think Hiten responded based on the "can't assign resources" messages, without reading all the way through; I sometimes do knee-jerk responses to problem reports, as well. The reason his advice didn't help you suppress the messages is that the failure is in port and IRQ assignments, not in memory window assignments. The problem is related to multiple claimants for the device: the BIOS vs. the OS. If you change the BIOS settings for PnP OS, the messages should go away. Note that the messages are just warnings; they will not make anything not work, given your configuration. The maildirs issue, I won't comment on, at this time. -- Terry
Re: mbuf header bloat ?
Bosko Milekic wrote: [ ... memory allocator ... ] FWIW: The Sequent Dynix allocator paper has been converted, and is now available online: Experience With an Efficient Parallel Kernel Memory Allocator Paul E. McKenney, Jack Slingwine, Phil Krueger Sequent Computer Systems, Inc. Software Practice And Experience http://citeseer.nj.nec.com/484408.html This is the same reference that is in the books UNIX Internals: The New Frontiers and UNIX For Modern Architectures. It's the reference I always give out, when locking in allocators comes up, but now I have something other than a ratty photocopy to give people. 8-). -- Terry
Re: -current unusable after a crash
Kris Kennaway wrote: On Mon, Nov 25, 2002 at 02:02:14PM -0800, Terry Lambert wrote: I don't think this is really possible. Yeah :( If you made system dumps mandatory (or marked swap with a non-dump header in case of panic), this still would not handle the silent reboot, double panic, or single panic with disk I/O trashed cases. 8-(. And the panics that affect the disk/filesystem are likely to not give a crashdump, but at the same time are likely to cause FS problems for bgfsck :-( Actually, the worst problems come when the corruption does not result in a crash subsequently. If you just crashed again, you could simply set in the superblock a flag that said "background fsck in progress", and if that flag was set at boot time, then do a full fsck (knowing you died during a background fsck). If you don't get a second crash, and you reboot, you're screwed. You could add another utility to say "force full fsck" -- basically, to set the flag manually. This is a pain because you have to do it through an fcntl() or ioctl(), since there are no block devices to use to do the work, and you can't open a mounted device to write it; even if you know what you are doing, the OS enforces this like it's smarter than you. We ran into exactly this same problem in the InterJet, when we first paid Kirk to have soft updates ported to FreeBSD (I actually did the preliminary "make it compile" work, and Julian did most of the debugging; I helped some after that, but my boss didn't like me doing it). The point was to get rid of the need for a UPS in the InterJet. A log structured FS doesn't actually have this problem, but is a real pain because of the need for a cleaner to run constantly, to garbage collect, which makes things that used to take deterministic time take variable time. Not very good for multimedia or streaming content serving.
The InterJet handled this by having a DC holdup time following AC failure notification, which was enough to throw a stick into the spokes, to prevent the wheels from turning, and the bicycle falling over the cliff. Another way to handle it would be CMOS, with a BIOS initialization (e.g. set bit 1 of the crash state) that didn't affect the bits that indicated the failure mode. Unfortunately, the computer manufacturers have not really agreed on a standard for this sort of thing, nor do they think anyone in OS space or userland should be able to own a section of CMOS memory (no OS allocation policy, tagging, etc.). -- Terry
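The superblock-flag idea from the message above reduces to a small boot-time decision. The sketch below is hypothetical throughout: struct sb_state, its field names, and boot_fsck_mode() are invented for illustration and do not correspond to actual FFS superblock fields or to anything fsck does today.

```c
#include <assert.h>
#include <stddef.h>

enum fsck_mode { FSCK_NONE, FSCK_BACKGROUND, FSCK_FULL };

/* Hypothetical view of the relevant on-disk superblock state. */
struct sb_state {
	int clean;		/* FS was cleanly unmounted */
	int bgfsck_flag;	/* set when a background fsck starts, or
				 * set by hand to force a full fsck */
};

/* Decide what to run at boot, most pessimistic case first. */
enum fsck_mode
boot_fsck_mode(const struct sb_state *sb)
{
	assert(sb != NULL);
	if (sb->bgfsck_flag)
		return (FSCK_FULL);	/* died during bgfsck, or forced */
	if (sb->clean)
		return (FSCK_NONE);
	return (FSCK_BACKGROUND);
}
```

As the message points out, this only catches the case where the machine goes down again while the flag is set; corruption that does not cause a second crash is not detected by it.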
Re: -current unusable after a crash
Brad Knowles wrote: At 2:02 PM -0800 2002/11/25, Terry Lambert wrote: If you made system dumps mandatory (or marked swap with a non-dump header in case of panic), this still would not handle the silent reboot, double panic, or single panic with disk I/O trashed cases. 8-(. How about we do the safe thing, and only do background fsck if we can prove that the system state is something where it would be suitable? Or would that mean that we almost never do background fsck? It would mean that you can *never* background fsck safely. The problem is that you need to distinguish a power failure, which is technically the only safe time to do it, from all other failure modes. You can distinguish, at least on R/W FS's, whether or not to do any fsck (by looking at the clean bit), but all other bets are off. One approach that works well for desktop systems is to implement a soft read-only. We did this at Artisoft in 1995/1996, when we ported the VFS stacking framework to Windows 95, and first implemented a soft updates for FFS/UFS, which we ported to run on Windows 95 under the stacking framework. The way a soft read-only works is to leave the FS mounted read/write, and then insert at write attempts, everywhere that read-only is checked, a check for a soft read-only bit on the in-core superblock. Basically, we flush out all writeable state to the FS, and then set the clean bit in the superblock, and flush it to disk, if I/O on the FS has been idle for a while. Then, when someone wants to write it, we reset the dirty bit, flush the superblock back out to disk, and, once we know that the change has been committed to stable storage, we permit the write operation to continue. 
There are actually some problems that now exist in the sync code in FreeBSD that result in unnecessary writes to the disk, these days, which make it hard to implement this (the system basically syncs disk buffers that don't need to be synced, at intervals); that would have to be fixed before such a system can be used. The result is a box you can just turn off, without trashing the FS, assuming it's relatively quiescent with respect to FS writes (e.g. desktop systems, as I said at the top). Similarly, if the system were to panic, lose power, whatever, at this point, then the FSs would be clean, and come back up with no need to fsck. -- Terry
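The soft read-only scheme described in this message is essentially a two-state machine around the superblock's clean bit. The following is a simulation-only sketch: struct soft_fs and the fs_* and sb_commit names are hypothetical, the "disk" is just a couple of counters, and it does not reflect the Artisoft implementation or any FreeBSD code.

```c
#include <assert.h>
#include <stddef.h>

struct soft_fs {
	int clean_on_disk;	/* clean bit as last written to disk */
	int dirty_buffers;	/* pending data writes, not yet flushed */
	int sb_writes;		/* superblock writes so far, for illustration */
};

/* Stand-in for a synchronous superblock write to stable storage. */
static void
sb_commit(struct soft_fs *fs, int clean)
{
	fs->clean_on_disk = clean;
	fs->sb_writes++;
}

/* Called when I/O on the FS has been idle for a while. */
void
fs_idle(struct soft_fs *fs)
{
	assert(fs != NULL);
	if (fs->clean_on_disk)
		return;			/* already quiescent */
	fs->dirty_buffers = 0;		/* flush all writable state first */
	sb_commit(fs, 1);		/* then mark the FS clean on disk */
}

/* Called wherever a hard read-only mount would be checked. */
void
fs_write(struct soft_fs *fs)
{
	assert(fs != NULL);
	if (fs->clean_on_disk)
		sb_commit(fs, 0);	/* dirty state must reach disk first */
	fs->dirty_buffers++;		/* only then let the write proceed */
}
```

The ordering is the whole point: the clean bit goes to disk only after everything else has been flushed, and it is cleared on disk before the first new write is allowed through, so power loss while clean_on_disk is set needs no fsck.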
Re: -current unusable after a crash
Marcin Dalecki wrote: I don't think this is really possible. I went looking for a generic application-use CMOS area for this sort of thing a while back, and I was unable to find one. Well you should please take a look at the fast boot option of moderately modern BIOSes. Something along those lines went right now into the Linux kernel. Seems pretty adequate to me, since you would even be able to control it through the BIOS setup... Is there documentation available for this anywhere? The BIOS vendor documentation, not the Linux source code. My gut feeling is that this isn't going to be too helpful, without AC failure notification with a DC holdup time. The problem is that the best case is power failure, and the worst case is a corrupted GDT and a double panic off a trap 12 in the trap 12 handler (such that you would get a trap 12 when you tried to write back to the CMOS that this was the worst case, not the best case). Basically, you are still stuck needing power failure notification, so you can write the cause of the failure back. At startup, you have to set the saved state to worst possible failure: no way to update cause of failure in CMOS, and then back off to softer failure modes from there. I think this Fast boot stuff is useful, but the way it's useful is if your main memory is reflected to a separate area of the disk, so that you can bring up the system image very quickly. Basically, it means that it's not at all useful for the problem at hand, unless it provides for power fail notification. -- Terry
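The "assume the worst at startup, then back off" protocol from this message can be sketched as follows. This is a userland illustration only: the cause codes, the one-byte `struct nvram` stand-in for a CMOS cell, and every function name are hypothetical, and nothing here talks to real hardware.

```c
#include <stdint.h>

/* Hypothetical shutdown-cause codes. */
enum shutdown_cause {
    CAUSE_CLEAN_SHUTDOWN = 0,
    CAUSE_POWER_FAILURE  = 1,
    CAUSE_PANIC          = 2,
    CAUSE_UNKNOWN_WORST  = 3    /* nothing was able to record a cause */
};

/* Stand-in for one byte of nonvolatile storage (e.g. a CMOS cell). */
struct nvram {
    uint8_t cause;
};

/* At boot: read what the last run left behind, then assume the worst. */
enum shutdown_cause boot_read_and_arm(struct nvram *nv)
{
    enum shutdown_cause last = (enum shutdown_cause)nv->cause;

    nv->cause = CAUSE_UNKNOWN_WORST;
    return last;
}

/* Called only when the system is still sane enough to record a cause. */
void record_cause(struct nvram *nv, enum shutdown_cause c)
{
    nv->cause = (uint8_t)c;
}

/* Power failure is the only case treated as safe for background fsck. */
int background_fsck_safe(enum shutdown_cause last)
{
    return last == CAUSE_POWER_FAILURE;
}
```

A silent reboot or a double panic never reaches `record_cause()`, so the next boot reads back `CAUSE_UNKNOWN_WORST`. That is why the scheme still depends on power failure notification: without it, a plain power loss is indistinguishable from the worst case.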
Re: I'm impressed, but ...
Philip Paeps wrote: The maildirs issue, I won't comment on, at this time. I hope I can provide enough information for someone to solve it though :-) It would be nice to be able to read my mail 'reliably' :-) The problem is not the amount, but the type of information. You need to characterize the problem well enough that you can write a little program that can repeat it on someone else's machine, without them having to create an installation identical to yours on a scratch box ...particularly when it looks like if they tried that, it would work for them. Right now, there are other people using the same software that can't repeat the problem. Without knowing whether both, neither, or only one of you is using NFS, etc., it's really impossible to even point you in the right direction (NFS is my hunch, in this case; it's a common reason for use of maildirs, to try and side-step locking issues). You probably need to get together with the other person who said they were *not* having a problem, and do a detailed compare on system configuration, if all other things are equal. -- Terry
Re: -current unusable after a crash
Dan Nelson wrote: Is there documentation available for this anywhere? The BIOS vendor documentation, not the Linux source code. http://www.microsoft.com/hwdev/resources/specs/simp_bios.asp http://www.microsoft.com/hwdev/resources/specs/simp_boot.asp is the best I could find; you'll need a Word doc viewer. It's mainly geared toward detecting boot failure rather than abnormal shutdowns, though. What we need is a matching Simple Shutdown Flag variable. The license you have to agree to in order to download it permits implementation for firmware, but not for the OS: 1(a)(iii), 1(b), 2(b)(b), 3. According to the documentation at the end of the page of the URL you posted above, the OS has to be fully PnP compliant for it to work as expected, and multiboot is not supported. For those interested in pursuing this, more information (no license agreement required) is available from: http://www.microsoft.com/hwdev/platform/performance/fastboot/fastboot-winxp.asp Though I expect you won't be able to implement without the specification. I guess looking at the Linux code as a reference is OK... they violated the license, not you, so it's not the same thing as you violating the license. -- Terry
Re: Objective-C threads
[ ... Objective C ... ] Chad David wrote: And I thought this thread was dead :). It just showed up in the inbox last night; it must have been stuck in your mail server. Sorry about that. I don't really feel a need to convince. If people are too busy (or just do not care) to maintain ObjC within FreeBSD, then I'll just have to do it locally. That's kind of what I was implying would be the correct course of action for a while. 8-). I have gotten literally hundreds of patches into FreeBSD by ignoring the FreeBSD process, and submitting the patches back to the vendor from which FreeBSD obtains the code, so this is a successful strategy. Manipulation is a life strategy :). Anything that works is better than anything that doesn't. 8-). -- Terry
Re: DP2 Fatal Trap
Scott Sipe wrote: I didn't make a backup copy (or mark down the errors) of the bad file or try rebooting which in retrospect would have been a good idea..sorry--I just fixed the file and saved it so I could compile some ports--and that worked. Just FWIW: if it was a transfer error, a backup copy would have been made from cache. As such, it would only be useful in seeing if the data was zeroed. It wouldn't help in the lost word during burst transfer case, or the reboot to test for self-healing case. If it happens again, then: (1) hd the file, and determine if the data was lost, or is merely not displaying because it was zeroed, and (2) reboot to see if it heals. Both of these are very important tests. I have an IWILL KK266 motherboard which has an AMI MegaRaid controller and a VIA Apollo KT133A chipset. The FreeBSD drive is primary master ad0 on the VIA IDE line (both Current and Stable are on the same disk). I have a dvd drive and a cdrw on the secondary channel. Then 2 harddisks, one each on the RAID controller (I use the bios to alternate which drives are used for booting--the RAID or the IDE) [ ... other information ... ] OK. None of this resembles hardware for which bugs have been reported to the -current or -hackers mailing lists. This is an affirmation (but not positive confirmation) that the problem is not in the disk controller, disk, or FreeBSD driver. The fact that you've not had strange problems with -stable indicates for certain that it's not a disk or controller problem. That leaves some other bug (which is what I thought in the first place) or a driver bug. I don't think it's a driver bug, but I can't prove it isn't. 1) Yes it happened with a generic kernel straight off the DP2 install CD. OK. No recompiles, fancy driver load directives, etc.. If John Baldwin wants to try and repeat your problem, he *may* need a copy of your rc.conf. DO NOT SEND THIS UNLESS REQUESTED TO DO SO.
2) I had the problems directly off DP2 iso image burned cd install, so can that tell you what you need to know about the cvs date or do you want me to do more? OK; what this means, because there was no tag laid down, and there was not a published checkout datestamp that can be used to duplicate a -current system (according to John, it's a checkout of -CURRENT, hacked to change the name to DP2 for the build), is that I will have to build a known kernel locally, so that I have a source tree that duplicates the failure for you. Do you have it booting DP2 enough to replace the kernel, or is it fully reverted to -STABLE at this point? It would be very hard for me to build a full release CDROM ISO image and transfer it, without sending it through the mail. 3) Yes, I'm at college on a fast connection (though with a limited upload) so if you need to I can setup an ftp login for you on my computer. If you can live with just kernel replacements, then if you can set this up, I can give you a kernel, which we will then hope *fails*, as soon as tomorrow, or whenever is convenient, and, after you verify that it does, indeed, fail, then I can do the fix and give you another kernel 2-3 days after that (depending on the porting required, since it involves assembly language). Let me know. -- Terry
Re: Lots of swapping from 'kldload acpi'
Bruce Evans wrote: The existence of SYSINIT is a bug, but shouldn't the order of SYSINITs be such that they are run before MOD_LOAD events? SYSINITs have no way to communicate failure, so they are especially broken when they are used to allocate resources. Users of the resources have no way to tell whether the SYSINITs worked except to check for the existence of the resources before using them, but if they can do that properly then they can just call the resource allocation functions. The SYSINITs in the module load case can communicate failure, since it's the module load routine that handles them. The real answer here is that SYSINIT is intended for system initialization time operations. If the system fails to initialize properly, there's really no way to communicate failure (for example, a failure prior to the console being up is very hard to report on a still-born console that has not been set up or initialized to do reporting). The function is void, because it makes no sense to trap an error condition you can't handle (Dennis Ritchie said this). It's true that the functions are void; however, in the context of the driver, they call functions which are locally defined, and therefore could update a static errno-type value, very easily, to report their status. Alternately, they could call a callback function, which was defined only in the module load case. I can provide sample code, if necessary. -- Terry
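A minimal sketch of the "static errno-type value" approach mentioned above, written as ordinary userland C. It deliberately does not use the real SYSINIT macro or the kernel module event handler; `mydrv_sysinit`, `mydrv_modload` and the simulated allocation failure are hypothetical names invented for this illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Status left behind by the void init routine; 0 means success. */
static int mydrv_init_error;

/* Hypothetical resource allocation that can fail. */
static int mydrv_alloc_resources(int simulate_failure)
{
    return simulate_failure ? ENOMEM : 0;
}

/* SYSINIT-style routine: it returns void, so it records its status instead. */
void mydrv_sysinit(void *arg)
{
    int simulate_failure = (arg != NULL);

    mydrv_init_error = mydrv_alloc_resources(simulate_failure);
}

/* Module-load-style handler: runs the init routine, then reports its status. */
int mydrv_modload(void *arg)
{
    mydrv_init_error = 0;
    mydrv_sysinit(arg);
    return mydrv_init_error;
}
```

In the boot-time case nobody reads `mydrv_init_error`, which matches the observation that there is often nowhere to report to; in the module load case the load routine can turn it into a real failure return.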
Re: Autodefaults in disklabel on 5.0dp2 install
Garance A Drosihn wrote: This is something I noticed while installing 5.0-dp2. I'm not sure how much we'd want to change it. I'm installing dp2 on a 4-gig disk. I want to split that in two, with dos for the first 2 gig and freebsd in the last 2 gig. When I got to the disklabel step, I tried the Auto Defaults option to split up the freebsd partition. It picked partition sizes of:
  128 meg - /
  1231 meg - swap space
  208 meg - /var
  208 meg - /tmp
  83 meg - /usr
This is a machine with 768 meg of memory, but I think the install is more likely to work with less swap space and something more than 83 meg for /usr. I know it's tricky to come up with an algorithm which will pick decent sizes for every combination of disk and memory sizes, but perhaps it should wire in some kind of minimum size for /usr. Also, maybe something to the effect that neither /var nor /tmp should end up larger than /usr. I have not looked at the source, so maybe it's just a simple case of the swap calculation being done based on the size of the hard drive instead of the size of the freebsd partition. The default swap size calculation is done on the basis of a multiple of the physical memory size. Specifically, the physical memory may be completely consumed by kernel structures, up to the KVA size, and therefore in the case of a system dump, it can require up to the size of the physical memory, plus 64K, at a bare minimum for a successful system dump. So even if you were to reduce the swap size, you should not reduce it below 768M + 64K. You will, of course, agree that a prerelease named DP2 should have the ability to successfully system dump, as that is one of the primary reasons it's being handed out: to catch problems, and to provide detailed bug reports about them, sufficient to correct them before the official release. You should be able to take space away from swap, even after the Auto, and give it to /usr.
If you choose to give less swap than is necessary for a system dump, expect no help with problems with the system which may arise during your testing: not because no one wants to help, but because you aren't going to be able to provide sufficient information to enable them to help. If you can accept that limitation (i.e. you are trying DP2 to find problems in user space software which needs to run on it, ONLY), then fine; otherwise, you might want to consider using a bigger disk and/or removing some RAM. -- Terry
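The arithmetic behind the "768M + 64K" floor can be written down as a short sketch. The function names and the `DUMP_HEADER_BYTES` constant are made up for illustration; only the rule itself (physical memory plus 64K as the bare minimum for a dump) comes from the message above, and this is not the code sysinstall uses.

```c
#include <stdint.h>

#define DUMP_HEADER_BYTES (64ULL * 1024)   /* the "plus 64K" from the text */

/* Smallest swap partition that can still hold a full system dump. */
uint64_t min_dump_swap_bytes(uint64_t physmem_bytes)
{
    return physmem_bytes + DUMP_HEADER_BYTES;
}

/* Bytes that can be moved out of swap without losing the ability to dump. */
uint64_t reclaimable_swap_bytes(uint64_t swap_bytes, uint64_t physmem_bytes)
{
    uint64_t minimum = min_dump_swap_bytes(physmem_bytes);

    return swap_bytes > minimum ? swap_bytes - minimum : 0;
}
```

For the machine in this thread (768 meg of RAM, 1231 meg of auto-sized swap), `reclaimable_swap_bytes()` comes out at 463 meg minus 64K, which is the most that could be handed to /usr while keeping dumps possible.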
Re: DP2 Fatal Trap
Vallo Kallaste wrote: You got me wrong. I'm user and do not know and don't want to know about any CPU architecture and bugs. But I've got problems and simply trying to provide any data possible to gather by myself. Either CPU hardware or software bug, fine. You're claiming to know the bug and possible fix, but don't want to publish it, fine. I do not object to publication of code that embodies a workaround to the problem, so long as that workaround doesn't specifically disclose the root cause problem itself. I don't want to think about it because with my knowledge this is going to nowhere and only wasting my time. Things you see above are my results using consistent testing environment, take it or leave it. I'll stick with DISABLE_PSE enabled and DISABLE_PG_G disabled for the time being. I'll make the same offer of a fixed kernel binary, for testing purposes, if you are willing to test two: one to be sure that there is no serendipity involved, and one with the patch. We can skip the first one if you can give me a CVS date or tag to checkout to get code identical to code you have locally, which has the problem. E.g. if you have a local copy of the CVS tree, and you check out with a date tag of, say, last Wednesday, and the kernel you build from that code has the problem, then I can check out identical code, patch it, and give you a binary to try. -- Terry
Re: Autodefaults in disklabel on 5.0dp2 install
Garance A Drosihn wrote: Hmm. I hadn't really thought much about the specifics of what is needed. I was just wondering if we might want to think about the auto size algorithm a bit more. The value, by default is 2 * `sysctl hw.physmem`. This is kind of ugly, because it doesn't include the space for the kernel itself. There isn't a sysctl variable with the actual size of the real physical RAM, without the kernel and page table space subtracted out of it, unfortunately (I tried to get a patch in on this a year ago, last June). So even if you were to reduce the swap size, you should not reduce it below 768M + 64K. Well, I can see that the auto default partition mechanism should probably take that into account too. I'm just saying that the current algorithm gives the user (any generic user, not me specifically) a useless result. It would be nice if it came up with more usable sizes. The 2 * physical memory *is* a useful size, for most cases. The base assumption here is that you installed that much memory for a reason, and you intend to use it. That yields a proper swap size. It also permits you to (almost) double the amount of physical RAM installed on the machine, and still successfully crash dump, so it gives you room to upgrade your hardware. You will, of course, agree that a prerelease named DP2 should have the ability to successfully system dump, as that is one of the primary reasons it's being handed out: to catch problems, and to provide detailed bug reports about them, sufficient to correct them before the official release. Well, I mentioned this now with an eye towards 5.0-release, although I realize my original message didn't indicate that. This same issue comes up when installing recent releases from 4-stable. My point is that the resulting partition sizes (in this case) are unusable. There is no point in worrying about the ability to save a system dump on a system where the initial install has pretty close to zero chance of succeeding. 
It's very difficult to get disks that small these days, without partitioning for multiboot, or going to a back room somewhere, and blowing dust off a box. 8-). I think the most common place this would happen is someone trying out FreeBSD. That makes it an important problem for -RELEASE, I think. Not knowing what people intend to install up front, it's hard to know how much space is necessary for various files. I think someone needs to come up with minimum ratios and hog partitions -- like swap -- that can be stolen from in order to get a running system (it should also spit out an alarming message about not being able to support crash dumps, with just an OK button, to steal from Windows 8-)). I don't think anyone would object to patches; the code you want to hack will be in /usr/src/usr.sbin/sysinstall/label.c. Let me repeat that I'm not sure how much we'd want to change it. I just wanted to point out how the current algorithm behaves when given this particular combination of disk and memory sizes. Spitting out Insufficient resources is a sure way to scare someone off. 8-(. -- Terry
Re: fsck's, current vs earlier releases
Garance A Drosihn wrote: I have 4.6.2-release, 4.7-release, and 5.0-dp2-release on a single PC. After some bouncing between versions, and an occasional 'disklabel' command, I seem to have the partitions for 4.6.2 in an odd state. Both 4.7 and 5.0-dp2 have no problem mounting them, but if I try to boot up the 4.6.2 system it fails because 4.6.2 finds that values in super block disagree with those in first alternate. 4.6.2 wants me to 'fsck' the partitions manually, but I *think* I remember that using the older fsck might cause trouble. Yes. You need to recompile a -STABLE fsck on the older version of the OS, so that it can do the right thing about the don't care areas of the superblock. A generic fix would grow don't care regions down, and care regions up, with a boundary offset (which started in the top 25% for forward compatibility, and grew down), above which you didn't complain about differences. It's too late to implement that, now that people have hacked up the superblock, and there is an existing installed base with it hacked the wrong way for automatic binary compatibility. Right now, all you can do is compile a new version of fsck for your old version of the OS, to make it ignore differences in those areas. -- Terry
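The "generic fix" described above, a boundary offset above which differences between the primary and alternate superblock are ignored, could look roughly like the sketch below. It treats the superblock as a flat byte array with a single care/don't-care boundary, which is an assumption made purely for illustration; the real FFS superblock and the real fsck comparison are not laid out this way, and `superblocks_agree` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Compare a primary superblock against an alternate, but only over the
 * "care" region [0, care_limit). Bytes at or above care_limit are treated
 * as "don't care" and never cause a mismatch.
 */
int superblocks_agree(const unsigned char *primary,
                      const unsigned char *alternate,
                      size_t sb_size, size_t care_limit)
{
    size_t n = care_limit < sb_size ? care_limit : sb_size;

    return memcmp(primary, alternate, n) == 0;
}
```

With a scheme like this, an older fsck would keep working as long as newer releases only added fields above the boundary it was built with, which is the forward compatibility the message says it is now too late to get automatically.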
Re: ACPI problem
Ertan Kucukoglu wrote: I want to use the power key to shutdown the system. It is a Compaq Evo 300, P4 1.6GHz, 368MB RAM. Yesterday the OS was 5.0DP2. I can not power it off. It comes to the point where it should cut the power off; a message like 'System is shutting down using ACPI' is displayed, and after a while it just panics at free(). This morning I cvsup'ed and did a buildworld on the machine. This time it does not panic, but a 'Timeout' error message comes up and the system resets itself, leading to a new boot. Is this problem because of my hardware or something else? You will most likely need to dump your ACPI table and file a bug report, assigning it to the ACPI code owner, who will then tell you what is wrong with your BIOS, and what you need to do to fix it (or give you a patch to the ACPI code in FreeBSD, to make it tolerate your BIOS). The first step will be a bug report. -- Terry
Re: gcc 3.2.1 release import?
Vallo Kallaste wrote: On Fri, Nov 22, 2002 at 10:03:50AM -0800, David O'Brien wrote: I would like to see GCC 3.2.1 release be our 5.0-R compiler. However, the GCC 3.2.1 release date kept slipping and in fact was nebulous for a while. The same for our 5.0-R. So this has made it hard to decide what to do. I suspect GCC 3.2.1-R wouldn't cause us much or any problems. But the question is does the project as a whole have the resources to deal with any problems that do creep up? It is a hard judgement call. Somebody with knowledge and time should generate patches, so it's possible to at least test and report problems (or success). Given that enough people give it a try and report, there's possibility for import, IMHO. But who will bell the cat? I vote for Snuffles. -- Terry
Re: gcc 3.2.1 release import?
Vallo Kallaste wrote: On Sat, Nov 23, 2002 at 02:52:27PM -0800, Terry Lambert [EMAIL PROTECTED] wrote: Somebody with knowledge and time should generate patches, so it's possible to at least test and report problems (or success). Given that enough people give it a try and report, there's possibility for import, IMHO. But who will bell the cat? I vote for Snuffles. Don't understand. Some inside joke or something based on US centric TV? What are you trying to tell me? Remember I'm not native. 1930's/1940's cartoon character, Snuffles, the mouse. Was in a lot of cartoons. Played the Why? game that children play, to the annoyance of his chosen victim, and the amusement of the people. One story was a script based on the Aesop's Fable about belling the cat. Here is a short reprise:
1) All of the mice decided that Something Needs To Be Done About The Cat(tm)
2) They had a big meeting
3) Finally, one mouse, who no one listened to very often, suggests that they put a bell around its neck, so they will be able to tell when it's coming, and escape, to live in peace, in their mousey ways
4) No one wants to bell the cat; it's a perfect idea, which lacks for implementation, and there are no potential implementors to take the risk on behalf of the group
In the cartoon version, at this point, the mouse who made the suggestion is volunteered by his comrades for the deadly duty (Snuffles). -- Terry
Re: gcc 3.2.1 release import?
Steve Kargl wrote: Don't worry about it; it's being totally blown out of proportion; there's no way anyone will commit to importing a 2 day old 3.2.1, which is why I put the smileys there. Well, the 2-day old 3.2.1 fixes numerous problems with our 3.2.1 [FreeBSD] 20021009 (prerelease). Compiling this

    void ice(int m, int n, double *f) {
        int i, j;
        for (j = 0; j < n; j++) {
            for (i = 1; i < m; i++) {
                f[i] = (double) (i * j);
                f[i + j] = (double) ((i + 1) * j);
            }
        }
    }

with gcc -O2 -c yields an ICE in FreeBSD-current. The 2-day old gcc 3.2.1 does not blow chunks on the above code. What does it do for all the other code in -ports, and in the comp.source.* archives, and that anyone else has ever written, such that you know it doesn't cause more problems than it solves? Supposedly, bringing in 3.2 was going to solve more problems than it caused. It turns out the 4.x compiler, GCC 2.95.3, also does not have an ICE as a result of compiling that code. What is food to one, is to others bitter poison. -- Titus Lucretius Carus When you are updating tools, it's actually about risk/reward; the risk of not supporting IA64, and the risk to object file compatibility, have (supposedly) been addressed. The only other reasonable path would be to tie FreeBSD releases to GCC releases, plus some period of time for burn-in, and that really isn't reasonable: 3.3 was supposed to be out already; should FreeBSD's release schedule slip every time GCC's slips? -- Terry
Re: Searching for users of netncp and nwfs to help debug 5.0 problems
Robert Watson wrote: It sounds like there are a couple of problems here -- that we need a debugging guide (How to prepare a useful bug report for a kernel panic, How to prepare a useful bug report for a sysinstall failure, etc) A bug-filing wizard would be useful. The send-pr system doesn't cut it, and most people are unaware of how to file a decent bug report. It doesn't help when the process involves another computer, a serial cable, recompiling a kernel to use a serial console and turn DDB support on, special configuration for system dump images, and changing the size of your swap partition to support the amount of RAM you have put into the machine. The bug-filing process has to be self-contained, and would be best served by a literal transcription of the message that comes up as a result of the problem. The number one thing that can be done, though, is completely and totally within FreeBSD's control: unique error reports for each possible error. This is more or less a design issue. -- that we need a better way to find developers on a particular topic who are willing to pick up more debugging burden. Most developers do not like to clean up after the messes they make; this isn't unique to FreeBSD, but FreeBSD does seem to have a larger number of prima donnas than other projects. There has also been a lot of kingdom-building; John Dyson, who I still greatly admire, used to squat on six or seven distinct areas of the kernel, but only had time to work on one or two of them, even when Oracle was paying him to work on FreeBSD full time for their Network Computer product. There were warm bodies willing to work on several of the areas he felt he owned. Simultaneous progress was possible in all of these areas at the same time... IFF the people had been allowed to work on the code, rather than being told No, I know the answer there, and I will fix it soon. The time of soon never came, and the effort proceeded serially, and less progress was made.
Respectfully, I submit that the same thing happens on a daily basis, and that John Dyson was not the only one who was trying to juggle too many balls at the same time, though I will not name names: you all know who you are, and most of us in the peanut gallery know, too. I would have guessed that, in general, problems with finding a responsible party developer would lie more in the areas of the system that don't have an active maintainer (viz. owner), which is a harder problem to address. If that's not a correct impression, then it's something that's probably easier to fix :-). I think, in general, that FreeBSD attracts just as many developers as Linux, or any other project, but fails to let them in. One approach might be to decide rough ownership of areas of the system: if people are going to act like they own the areas, then make them explicitly responsible. Do this at a sufficiently fine granularity, and you'll see that certain individuals own perhaps dozens of areas. Then add a field to the PR: area owner. Go through each of the PR's, and add this field. Don't let an owner do any additional work until they've closed that PR, one way or another, to the satisfaction of the submitter (this is to ensure that screw you is not a satisfactory resolution). Put a time limit on it. If the PR contains a patch, and the owner does nothing in the allotted time, then give the PR submitter a commit bit, and give ownership of the area over to them. At the very least, PR's will be closed, and more people actively writing code will end up with commit bits. -- Terry
Re: No entries in /proc :: feature or problem ??
Robert Watson wrote: (2) truss currently relies on procfs, albeit not working very well. There were a set of patches floating around to make truss use ptrace(), which is the direction we probably do want to take this. If someone could finish up that work, it would be great. The reasons to deprecate procfs are many-fold -- not least that there are existing interfaces in the kernel that provide most or all of its features at a substantially lower risk. You just have to see the kernel-related security advisories for FreeBSD, Linux, Solaris, etc, over the last five years to understand why we want to turn it off if we can. :-) It would be nice if a condition of turning it off were a working truss. A priori. -- Terry
Re: ACPI and apm_saver?
Donn Miller wrote: ACPI works pretty well on my HP Pavilion Notebook. But is it possible to get the apm saver (apm_saver.ko) to work with ACPI? From what I understand, acpi is a superset of apm, and is able to provide some emulation of apm functionality. So, by this principle, shouldn't apm_saver work with acpi? APM and ACPI are mutually exclusive. A similar question to yours might be: I had a Toyota Corolla, and I've traded it in on a Mack Semi Tractor; can I use the floor mats from my Corolla in the new Semi? 8-). Someone needs to write an acpi_saver.ko. -- Terry
Re: gcc 3.2.1 release import?
Steve Kargl wrote: Supposedly, bringing in 3.2 was going to solve more problems than it caused. It turns out the 4.x compiler, GCC 2.95.3, also does not have an ICE as a result of compiling that code. You know the reason why 3.2 pre-release was brought into the tree, right? GCC has changed the C++ ABI between 3.1.1 and 3.2. If FreeBSD 5.0 shipped with 2.95.3, then 5.x would use 2.95.3 until 6.0 was released. Try getting support from the GCC folks for 2.95.3. I'm well aware of that. I was merely pointing out that all compiler versions have different bugs, and you might as well suggest a known quantity instead of an unknown one, if your sole goal in life is to avoid a particular internal compiler error, instead of looking at all the code involved. I respect David's judgement about bringing 3.2.1 into the tree, but your statement above (totally blown out...) suggests you don't follow GCC development. Several significant bugs were fixed between our pre-release version and 3.2.1. I *understand* that they fixed several bugs that are present in the pre-release, and they *hope* they didn't introduce any new ones. Given their track record in this regard (e.g. the internal compiler error in 3.2.1-prerelease that wasn't there in 2.95.3), I have little faith in their hope. Unless someone is willing to stand up as a shield to personally take the slings and arrows from any new compiler bugs, which *might* range up to and including delaying the 5.0-RELEASE as a result of it, after import and bmake, not compiling some things that worked with 3.2.1-prerelease, it can wait until after the 5.0-RELEASE. As you yourself pointed out: the C++ ABI change is in already, so it's no longer the substantial risk it used to be.
Unless there's another ABI change (which the advocates of importing the prerelease assured us there would not be), then the only thing that not updating breaks is the example code that was posted, and I think we can all live with that until at least the day after the 5.0-RELEASE. 8-). -- Terry
Re: DP2 Fatal Trap
John Baldwin wrote: Is it any help to know that my problems on P4 stopped after enabling DISABLE_PSE? Initially I had both of these enabled, but seems that one is enough. Just FYI. If we can verify that DISABLE_PG_G has no effect then that would be nice. It has an effect: writing CR3 or a TSS resulting in a changed CR3 will not invalidate TLB entries with the G flag set, if PGE is set in CR4. I know what PG_G does, Terry. My question is that if DISABLE_PG_G has no effect on the _problems_ people are having. It can have an effect, if the problem is being exhibited on a P3 or an AMD processor, but not on a P4, unless it has 512M of memory; the jury is out on other memory sizes, after Matt Dillon's dynamic sizing changes (my own suggestion in this area was to conservatively not go overboard in allocating a multiplier of physical memory for page mappings, when doing so would push the space the mappings could cover well over the available physical address space, if you'll remember). I think the processor in the bug report that started this thread was an AMD K6? There really is a CPU bug, John, and the new FreeBSD locore.s code is triggering it. A stock FreeBSD 4.4, for example, will not exhibit this problem, on the same hardware. -- Terry
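The rule stated above (a CR3 write does not invalidate TLB entries with the G flag set when PGE is set in CR4) can be captured as a small predicate. This is a simplified model for illustration, not kernel code: the bit values follow the IA-32 definitions, while the function name is made up, and it says nothing about the CPU erratum being argued over in this thread.

```c
#include <stdint.h>

#define CR4_PGE 0x00000080u   /* CR4 bit 7: page global enable */
#define PG_V    0x001u        /* PTE bit 0: present/valid */
#define PG_G    0x100u        /* PTE bit 8: global page */

/* Returns nonzero if a cached translation for this PTE survives a CR3 write. */
int tlb_entry_survives_cr3_write(uint32_t cr4, uint32_t pte)
{
    if ((pte & PG_V) == 0)
        return 0;             /* not-present entries are not cached */
    return (cr4 & CR4_PGE) != 0 && (pte & PG_G) != 0;
}
```

Read this way, a kernel option such as DISABLE_PG_G changes the answer for kernel mappings from "survives" to "flushed on every CR3 write", which is why it can mask or unmask a problem without being the cause of it.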
Re: DP2 Fatal Trap
Vallo Kallaste wrote:
> I have now definitive answer for _my_ case and environment. Just finished full package build for my workstation bundle port (92 ports), including XFree-4, KDE3, mozilla-devel and whatnot. It all went very well running kernel which had:
>
> DISABLE_PSE   enabled
> DISABLE_PG_G  disabled
>
> Are you interested in the reverse? Can it be that enabling DISABLE_PSE incorporates DISABLE_PG_G somehow?

I give up. You guys obviously still think it's a software problem that you can characterize and fix using binary elimination to find the offending code. It's not.

-- Terry
Re: DP2 Fatal Trap
Scott Sipe wrote:
> It was a trap 12, and definitely that address...I think something more overarching must be going on though. I'm able to login with /bin/sh (not csh/tcsh) and so I've been trying various things--I can't compile a kernel because I get bus errors, same with many ports I've been trying to install. pkg_adding seems fine. Any chance this could be acpi related?

How about this...

o Are you using a GENERIC kernel?
o Do you have a timestamp that can be used to check out a /usr/src/sys from CVS that will let me build the same kernel?
o Do you have a place I can upload two or more 3/4MB kernel files for you to try?

Let's say the answer to all three questions is yes. Assuming I can build you a binary kernel from your sources which then fails on your machine, I believe I can fix the problem, and give you a new binary kernel that fixes it, if it's the problem I think it is.

That way, we all win: you get a working kernel, and I get to convince people that the problem is what I said it was in the first place: a CPU bug that has to be specifically worked around.

-- Terry
Re: gcc 3.2.1 release import?
David Schultz wrote:
> It really comes down to a question of living with known bugs, or risking gaining a new set of unknown bugs. In theory, the set of bugs in an actual release should be smaller than the set of bugs in a prerelease.

In theory, practice will be the same as theory. 8-) 8-).

-- Terry
Re: Searching for users of netncp and nwfs to help debug 5.0 problems
[ ... bug filing wizard ... ]

Brad Knowles wrote:
> Speaking as someone who is about to step off the deep end and start trying to actually run and test -CURRENT on my system here at home, I believe that this kind of resource would be vitally important.
>
> In contrast, I've had a few crashes this past week from other programs here on my PowerBook G4 running MacOS X (primarily Chimera, based on the Mozilla Gecko engine with native Aqua interface), and they have made it very easy for me to report crashes. They have integrated tools to extract the maximum amount of information from the system as to exactly what other programs were running, what the program stack was, and a whole host of other things. All I have to do is type in my e-mail address, optionally describe what I was trying to do at the time, and have a functioning Internet connection so that they can upload the reports.
>
> I'd share some examples with you, but they are *huge*.

The amount of information is much less important than the utility of the information to someone who is attempting to resolve the bug. Most bug reports contain a lot of information that is really useless, not topical to the bug in question ("I was running XYZ and the kernel panicked!"), etc..

In terms of kernel problems, the absolutely most useful information is the DDB traceback, followed by a DDB traceback mapped against a debug kernel, followed by a system dump image and a debug kernel, etc.. By default, the system, as distributed, is not set up to supply this.

In terms of general problems, diagnostic messages are pretty lame; setting aside the argument against data interfaces not being ripped out being in itself lame, one big example is the libkvm error message "kinfo_proc size mismatch (expected %d, got %d)". An error message should give you enough information that you could deal with the problem reasonably; for the example message, a better message would be:

	_kvm_err(kd, kd->program, "kinfo_proc size mismatch (expected %d, got %d)", sizeof(struct kinfo_proc), kd->procbase->ki_structsize);
	_kvm_err(kd, kd->program, "recompile libkvm and recompile and reinstall %s", kd->programsrcdir);

> Now, we can say that running -CURRENT is not for people who want to be molly-coddled. But I believe it's a good idea to give people better tools to help make a better system. I am convinced that we can find a better compromise.

Plus we aren't really talking about -CURRENT, we are talking about 5.0-DP2, or almost 5.0-RC1, if we're being honest.

> > If the PR contains a patch, and the owner does nothing in the allotted time, then give the PR submitter a commit bit, and give ownership of the area over to them. At the very least, PR's will be closed, and more people actively writing code will end up with commit bits.
>
> Gack. I'm not sure even I would be quite this radical -- any moron (like me ;-) can generate a PR that might include a patch. IMO, better would be to give the area to another person who is suitably qualified, has the available cycles, and presumably already has a commit bit.

Moron PR's are easily filterable: they are closable by the owner with little effort. PR's that are ignored, on the other hand, rather than being closed, are most likely legitimate, but inconvenient.

-- Terry
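The point about actionable diagnostics can be sketched outside of libkvm. The helper below is hypothetical: `format_size_mismatch` and its `srcdir` argument are names invented for this illustration and are not part of libkvm. It only shows one way to fold the symptom and the suggested remedy into a single message.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not a libkvm interface: build a diagnostic that
 * states the mismatch and also tells the user what to rebuild.
 * Returns what snprintf() returns, so a value >= len means truncation.
 */
int format_size_mismatch(char *buf, size_t len, const char *program,
    size_t expected, size_t got, const char *srcdir)
{
    return snprintf(buf, len,
        "%s: kinfo_proc size mismatch (expected %zu, got %zu); "
        "recompile libkvm, then recompile and reinstall %s",
        program, expected, got, srcdir);
}
```

A caller would pass `sizeof(struct kinfo_proc)` as `expected` and the size reported by the kernel as `got`; where the source directory of the program would come from is left open here, as it is in the message above.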
[PATCH] Re: smbfs panic
Tim Robbins wrote:
> Here's a backtrace of a smbfs panic. Looks like it does not correctly handle the smbfs_getpages error it is encountering and leaves garbage vnodes lying around. The panic probably comes from the VI_LOCK macro call on smbfs_node.c line 321.
>
> # cp blah.tar.gz ~tim
> cp: /home/tim/blah.tar.gz: Bad address

Read the list archives. We discussed this to death.

The correct thing to do is to back off and retry -- that's decimal 60, which is hex 0x3c, which, according to /sys/netsmb/smb_rq.h is:

	#define SMBR_REXMIT	0x0004	/* request should be retransmitted */
	#define SMBR_INTR	0x0008	/* request interrupted */
	#define SMBR_RESTART	0x0010	/* request should be repeated if possible */
	#define SMBR_NORESTART	0x0020	/* request is not restartable */

If you don't want to try to implement this, note that the read/write works (try 'dd' instead of 'cp', so you aren't using mmap'ed pages, and see that 'dd' works).

I don't think it's worth writing the code to attempt the retry, until you know that the code will work -- that it's backed up far enough that the retry won't fail. This basically means you need to know why it's failing in the first place, which means you need to be familiar with how the server is implemented (not likely, unless your name is Luke Howard, you're a Microsoft employee, or you have a Windows source license -- otherwise it's probably about two weeks worth of work).

The attached patch works around the problem by disabling the getpages and putpages code in the smbfs. This basically turns paging operations into reads and writes, which we know from using 'dd' instead of 'cp' will work.

Note: this is only a workaround: it disables obviously incorrect code, but doesn't provide replacement code for the bogus code.
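The bit arithmetic behind "decimal 60 is hex 0x3c" can be checked with a small decoder. This is only a sketch: the four flag values are copied from the message above, while `smbr_decode` is a name made up for this example and does not exist in netsmb.

```c
#include <stddef.h>
#include <stdio.h>

/* Flag values as quoted from /sys/netsmb/smb_rq.h in the message. */
#define SMBR_REXMIT     0x0004  /* request should be retransmitted */
#define SMBR_INTR       0x0008  /* request interrupted */
#define SMBR_RESTART    0x0010  /* request should be repeated if possible */
#define SMBR_NORESTART  0x0020  /* request is not restartable */

/*
 * Illustrative helper (not part of netsmb): write the names of the
 * SMBR_* bits set in `flags` into buf, separated by '|'.
 * Returns the length written; names that do not fit are dropped.
 */
size_t smbr_decode(unsigned int flags, char *buf, size_t len)
{
    static const struct {
        unsigned int bit;
        const char *name;
    } tab[] = {
        { SMBR_REXMIT,    "SMBR_REXMIT" },
        { SMBR_INTR,      "SMBR_INTR" },
        { SMBR_RESTART,   "SMBR_RESTART" },
        { SMBR_NORESTART, "SMBR_NORESTART" },
    };
    size_t used = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(tab) / sizeof(tab[0]); i++) {
        if ((flags & tab[i].bit) == 0)
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
            used ? "|" : "", tab[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* drop the partial name and stop */
            break;
        }
        used += (size_t)n;
    }
    return used;
}
```

Since 0x3c = 0x04 + 0x08 + 0x10 + 0x20, decoding 60 names all four of the listed flags.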
-- Terry Index: smbfs_io.c === RCS file: /cvs/src/sys/fs/smbfs/smbfs_io.c,v retrieving revision 1.13 diff -c -r1.13 smbfs_io.c *** smbfs_io.c 4 Aug 2002 10:29:30 - 1.13 --- smbfs_io.c 21 Nov 2002 05:53:23 - *** *** 400,405 --- 400,406 return error; } + #if BROKEN_PAGE_IO /* * Vnode op for VM getpages. * Wish wish get rid from multiple IO routines *** *** 655,660 --- 656,662 return rtvals[0]; #endif /* SMBFS_RWGENERIC */ } + #endif/* BROKEN_PAGE_IO */ /* * Flush and invalidate all dirty buffers. If another process is already Index: smbfs_node.h === RCS file: /cvs/src/sys/fs/smbfs/smbfs_node.h,v retrieving revision 1.2 diff -c -r1.2 smbfs_node.h *** smbfs_node.h18 Sep 2002 09:27:04 - 1.2 --- smbfs_node.h21 Nov 2002 05:53:47 - *** *** 75,83 #define VTOSMB(vp)((struct smbnode *)(vp)-v_data) #define SMBTOV(np)((struct vnode *)(np)-n_vnode) struct vop_getpages_args; - struct vop_inactive_args; struct vop_putpages_args; struct vop_reclaim_args; struct ucred; struct uio; --- 75,85 #define VTOSMB(vp)((struct smbnode *)(vp)-v_data) #define SMBTOV(np)((struct vnode *)(np)-n_vnode) + #if BROKEN_PAGE_IO struct vop_getpages_args; struct vop_putpages_args; + #endif/* BROKEN_PAGE_IO */ + struct vop_inactive_args; struct vop_reclaim_args; struct ucred; struct uio; *** *** 89,96 --- 91,100 struct smbfattr *fap, struct vnode **vpp); u_int32_t smbfs_hash(const u_char *name, int nmlen); + #if BROKEN_PAGE_IO int smbfs_getpages(struct vop_getpages_args *); int smbfs_putpages(struct vop_putpages_args *); + #endif/* BROKEN_PAGE_IO */ int smbfs_readvnode(struct vnode *vp, struct uio *uiop, struct ucred *cred); int smbfs_writevnode(struct vnode *vp, struct uio *uiop, struct ucred *cred, int ioflag); void smbfs_attr_cacheenter(struct vnode *vp, struct smbfattr *fap); Index: smbfs_vnops.c === RCS file: /cvs/src/sys/fs/smbfs/smbfs_vnops.c,v retrieving revision 1.24 diff -c -r1.24 smbfs_vnops.c *** smbfs_vnops.c 26 Sep 2002 14:07:43 - 1.24 --- smbfs_vnops.c 21 Nov 2002 05:52:32 - *** *** 96,102 
--- 96,104 { vop_create_desc, (vop_t *) smbfs_create }, { vop_fsync_desc, (vop_t *) smbfs_fsync }, { vop_getattr_desc,(vop_t *) smbfs_getattr }, + #if BROKEN_PAGE_IO { vop_getpages_desc, (vop_t *) smbfs_getpages }, + #endif/* BROKEN_PAGE_IO */ { vop_inactive_desc, (vop_t *) smbfs_inactive }, { vop_ioctl_desc, (vop_t *) smbfs_ioctl }, { vop_islocked_desc, (vop_t *) vop_stdislocked }, *** *** 108,114 --- 110,118 { vop_open_desc, (vop_t *) smbfs_open }, { vop_pathconf_desc,
Re: Cross-Development with NetBSD
Ruslan Ermilov wrote:
> On Thu, Nov 21, 2002 at 12:10:14AM -0700, M. Warner Losh wrote:
> > In message: [EMAIL PROTECTED] Wilkinson,Alex [EMAIL PROTECTED] writes:
> > : Is FreeBSD likely to follow in the footsteps of NetBSD and create
> > : a framework to do crossbuilds ?
> > :
> > : http://ezine.daemonnews.org/200211/xdevnetbsd.html
> >
> > FreeBSD already has cross builds for a while, since before NetBSD's cross build infrastructure. However, NetBSD's infrastructure is a little more extensive because it is possible to do incremental builds and build full releases that work in a cross build environment.
>
> What do you mean by "incremental builds" and "full releases that work ..."?

You know, like changing one line in /usr/src/lib/libstand on a source tree on a x86 box, typing "make release", and having only the things that need to be rebuilt being rebuilt, resulting in a working FreeBSD-Alpha or FreeBSD-SPARC64 release CDROM image.

-- Terry
Re: smbfs problems
[ ... smbfs ... ]

Vallo Kallaste wrote:
> Now the writing part:
>
> After creating 5MB file using /dev/urandom, I'm trying to copy it over to users/vallo smb share mounted at /mnt, which fails. The copy is interruptible using Ctrl-C. Examination at NT4 server shows 0 byte file. Umount of /mnt fails with device busy. Umount -f /mnt fails to return prompt, but after interrupting the smbfs is unmounted. There is no kernel messages or something in syslog. The copy operation returns failure ~3 seconds after start.

Try using 'dd' instead of 'cp', or the patch I posted last night.

The "shows 0 byte file" is a normal artifact of how file metadata is handled in Windows filesystems: unlike UNIX, a partial file does not have the metadata, including the file size, updated until the file is closed. Therefore FTP restart is pretty meaningless on Windows, unless you have an FTP client that closes and reopens for writing the file it is transferring at intervals, among other things -- one of which is that any interrupted create operation will leave a 0 length file.

I don't know why your umount fails with device busy; what you need to do is look at the connections which are open, and why it cares about whether or not they are abandoned, in the unmount case. I rather expect that the connection(s) are jammed up, so you can't close them, so you have virtual circuit instances that you can't get rid of.

I would expect even a force to take whatever time it takes to dump the open handles, plus 2MSL plus however much time it keeps the connection in the half-close state, waiting for a FIN/ACK from the server.

-- Terry
Re: smbfs problems
Vallo Kallaste wrote:
> Sorry forgot to add one detail. Although dd'ing the same file to smbfs mount works, it'll sometimes modify the file being copied (size is different). It doesn't happen reliably, sometimes the file is copied fine, sometimes not. At the times the file isn't copied right there's an error message:
>
> root:vallo# dd if=testfile of=/mnt/vallo/test1
> dd: /mnt/vallo/test1: Bad address
> 9356+0 records in
> 9355+0 records out
> 4789760 bytes transferred in 20.350003 secs (235369 bytes/sec)
>
> It seems to me that adding conv=sync flag to dd removes the abovementioned failure case. 10 tries of dd with this flag added did fine.

The 'conv=sync' flag to 'dd' pads the operation out to a record boundary, if the input of the operation is not a full record in length.

This observation is consistent with an incomplete final write, for lack of data. Probably this has to do with the TCP_PUSH option and/or the SMB server's connection flags.

-- Terry
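For readers unfamiliar with conv=sync, the padding step can be sketched as follows. This is not dd's source; `sync_pad` is a name chosen for this example, and it assumes the NUL fill that dd uses when no block-oriented conversion is requested. As a cross-check of the record size in the quoted output, 9355 records of dd's default 512 bytes give 9355 * 512 = 4789760 bytes, which matches the byte count reported.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch of conv=sync style padding (illustrative, not dd's code).
 * buf must have room for bs bytes; nread is what read() returned.
 * A short, non-empty read is padded with NUL bytes up to a full
 * record. Returns the number of bytes that should then be written.
 */
size_t sync_pad(unsigned char *buf, size_t nread, size_t bs)
{
    if (nread > 0 && nread < bs) {
        memset(buf + nread, 0, bs - nread);
        return bs;          /* short record padded to full size */
    }
    return nread;           /* full record, or 0 at end of input */
}
```

With this in place every write issued by the copy loop is exactly one record long, which is the property the reply above relies on.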
Re: Cross-Development with NetBSD
Ruslan Ermilov wrote: NetBSD builds a directory full of tools that you can later use to incrementally build, say, 'ls' or 'cat' because one can define USETOOLS to be 'yes' and have the make automatically pick them up when rebuilding. There are a few of the details I'm a little unclear on, but that's the jist of it. We also can, this just requires a few really tiny tweaks to Makefile.inc1, and I've posted them already some time ago -- basically, for each architecture you should build the subset of buildworld targets (WMAKE_TGTS), up to and including _libraries (if you want to build roughly any bit later), and them you can ``make {depend|all} SUBDIR_OVERRIDE=bin/cat'' for each of the desired TARGET_ARCH. Any ETA on when this will be committed? I know that the Alpha and sparc64 binaries produced on i386 work. I thought that the Alpha boot blocks ended up too large in the cross-build case? They did, last time I tried it. I know that cross-compiling i386 on either Alpha or sparc64 is broken (GCC sometimes produces different assembler output than the native compiler). I lack the necessary hardware to actually test/fix the issues with cross-releases. I don't think he was attacking you, personally, to ask you to fix the problem, I think he was just noting the problem exists. One thing that would help a lot -- and probably be helpful in general -- would be a binary compare tool that ignored date stamps in things like libraries, tar images, etc., so that you could compare where things differ, easily, allowing someone to track down differences. It would be helpful in general to be able to compare what you built vs. a release version, to assemble binary only delta lists, for preparations for upgrade tools, etc.. I keep meaning to do this, but I really don't want to have to release the tool under the GPL, if I don't have to. -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
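The core of the compare tool wished for above could look roughly like this. It is a sketch under stated assumptions, not an existing utility: `struct skip_range` and `masked_compare` are invented names, and the caller is assumed to know where the date stamps live. For ar(1) archives, for example, each 60-byte member header keeps its date in the 12 bytes at offset 16.

```c
#include <stddef.h>

/* A byte range [off, off + len) to ignore while comparing. */
struct skip_range {
    size_t off;
    size_t len;
};

/* Return 1 if pos falls inside any of the nskip ranges, else 0. */
static int in_skip(size_t pos, const struct skip_range *skip, size_t nskip)
{
    for (size_t i = 0; i < nskip; i++) {
        if (pos >= skip[i].off && pos - skip[i].off < skip[i].len)
            return 1;
    }
    return 0;
}

/*
 * Compare two buffers of equal length n, ignoring bytes inside the
 * skip ranges. Returns -1 if they match everywhere else, otherwise
 * the offset of the first difference that is not ignored.
 */
long masked_compare(const unsigned char *a, const unsigned char *b,
    size_t n, const struct skip_range *skip, size_t nskip)
{
    for (size_t pos = 0; pos < n; pos++) {
        if (a[pos] != b[pos] && !in_skip(pos, skip, nskip))
            return (long)pos;
    }
    return -1;
}
```

A real tool would still need format-specific walkers (ar, tar and so on) to produce the skip list; only the masked comparison itself is shown here.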
Re: recommended VAIO for ACPI hacking (Re: cvs commit: www/en/releases/5.0R todo.sgml)
Mitsuru IWASAKI wrote:
> > > Also, development resources are limited. For example, none of ACPI developers has VAIO.
> >
> > Well, I don't know enough to be a developer but I do have a VAIO (Z600TEK) and can test things. Just ask.
>
> BTW, I'm planning to buy VAIO (maybe used one) to improve ACPI support. Any recommendations?

While I personally want you to buy a PCG-XG29 (what I have 8-)), I think the most problems have been reported on the Z505 and Z5xx series.

If you are going to buy used, the Z5?? and PCG-XG2? (especially the PCG-XG28, not 29) are probably what will be available to you.

-- Terry
Re: DP2 Fatal Trap
John Baldwin wrote:
> On 21-Nov-2002 Scott Sipe wrote:
> > On Thursday 21 November 2002 01:36 pm, John Baldwin wrote:
> > > Hmm, is this from a GENERIC kernel?
> >
> > This is straight from DP2 iso image cd install, X-Developer install, first boot after the install finished, generic kernel etc.
>
> Ok, generic kernel is the only really important part. :) Can you do me a favor and see if you have a /boot/kernel.GENERIC/kernel.debug or a /boot/kernel/kernel.debug? If so, can you please do 'gdb -k kernel.debug' and then at the prompt do 'l *instruction pointer' where instruction pointer is the second part of the instruction pointer from the panic message? (I.e., w/o the leading '0x8:' part.)

It's the PSE and PGE, John.

Are you sure you won't agree to not disclose, so I can tell you what's happening?

Bosko has a patch which he will give you if you ask him for it that (mostly) works around the problem.

-- Terry
Re: DP2 Fatal Trap
John Baldwin wrote:
> > Is it any help to know that my problems on P4 stopped after enabling DISABLE_PSE? Initially I had both of these enabled, but seems that one is enough. Just FYI.
>
> If we can verify that DISABLE_PG_G has no effect then that would be nice.

It has an effect: writing CR3 or a TSS resulting in a changed CR3 will not invalidate TLB entries with the G flag set, if PGE is set in CR4.

-- Terry
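A toy model may make the PG_G behaviour described above concrete. The sketch below is not kernel code: `struct tlb_entry` and `tlb_flush_on_cr3_write` are made-up names, and a real TLB is hardware state, not an array. It only encodes the stated rule: a CR3 reload drops ordinary entries, while entries with the G flag survive if PGE is set in CR4.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a single TLB entry (illustrative only). */
struct tlb_entry {
    bool valid;   /* entry currently holds a translation */
    bool global;  /* page was mapped with the G (global) flag */
};

/*
 * Model a CR3 write, or a task switch that changes CR3. Non-global
 * entries are always invalidated; global entries are invalidated only
 * when CR4.PGE is clear. Returns how many entries survive the reload.
 */
size_t tlb_flush_on_cr3_write(struct tlb_entry *tlb, size_t n, bool cr4_pge)
{
    size_t survivors = 0;

    for (size_t i = 0; i < n; i++) {
        if (!tlb[i].valid)
            continue;
        if (tlb[i].global && cr4_pge)
            survivors++;            /* G entry kept across the reload */
        else
            tlb[i].valid = false;   /* ordinary entry is dropped */
    }
    return survivors;
}
```

In this model, building with DISABLE_PG_G corresponds to never setting `global` on any entry, so every CR3 reload empties the whole array.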
Re: 5.0 DP2 on Vaio Z600
Richard Tobin wrote:
> I'm trying to install DP2 on a Sony Vaio Z600TEK laptop, but it hangs at
>
> Probing devices, please wait (this can take a while)
>
> [ ... ]
>
> Any suggestions?

Tell it to not load ACPI.

-- Terry
Re: gcc 3.2.1 release import?
Marc Recht wrote:
> > There is neither a gcc 3.2.1 nor a gcc 3.3 yet, so I wouldn't use any of them in a stable release.
>
> gcc 3.2.1 has been uploaded on ftp.gnu.org at Nov. 19th.

So it's been extensively tested by the full user base for the last two days, and you should have known about it before you posted. 8-) 8-).

-- Terry
Re: gcc 3.2.1 release import?
Marc Recht wrote:
> > So it's been extensively tested by the full user base for the last two days, and you should have known about it before you posted. 8-) 8-).
>
> My original question was only if it will be imported before 5.0R. David O'Brien already answered it with no. That's fine with me.

Don't worry about it; it's being totally blown out of proportion; there's no way anyone will commit to importing a 2 day old 3.2.1, which is why I put the smileys there.

-- Terry
Re: DP2 (and earlier) Problems *addition
Scott Sipe wrote:
> Sorry, should have done this with the first email. The dmesg from my stable boot:

Yank half your memory, and try it again, and let us know.

-- Terry
Re: DP2 (I think!) crash booting from floppies
local.freebsd.current wrote:
> I got a pair of floppies from:
>
> ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/i386/5.0-20021103-SNAP/floppies/
>
> and booted them on a Dell Dimension XPS D300 which is currently running 4.7. It's a PII/300 with an Adaptec 2940 SCSI and an STB Riva graphics card. When booting the kernel off the second floppy I get:
>
> Booting [/kernel]...
> /
> Fatal trap 12: page fault while in vm86 mode
> fault virtual address = 0x9f800
> instruction pointer = 0xf000:0x8c3e

Patch which was never integrated. Build a new kernel.

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=341812+0+archive/2002/freebsd-current/20021027.freebsd-current

-- Terry
Re: Device permissions with DEVFS
Kip Macy wrote:
> Sorry, if I'm repeating something already said, but the tone of your mail would indicate that I'm not. This doesn't sound like an intrinsic limitation of devfs, just an issue with how it is structured now. There should just be a central file for all the devices which devfs sucks in at build (or maybe boot) time specifying the appropriate permissions and any other configuration information.

A separate ELF data section for this information would allow kernel modules to have this information edited with a tool, as well as permitting the kernel itself to be edited with the same tool, so that site defaults could be persistently changed from the source tree defaults.

Indeed, this would allow the permissions to be listed in the case that Bruce was complaining about, which is the inability to see what will happen when the hardware is present, if it's not available to the tester.

An extension of this would permit chmod's against the devfs to be written back to the kernel or driver module affected, assuming your secure level is low enough and your flags are set to permit it, which also gets rid of the common complaint about persistence (which is really just a handy thing to use to bitch about devfs when you can't come up with any legitimate complaints, IMO).

-- Terry
Re: Device permissions with DEVFS
Marcin Cieslak wrote:
> What's wrong with having /etc/minor_perm et consortes a la Solaris? With sensible kernel defaults to allow booting without your favourite root partition.

What's wrong with just having /dev?

-- Terry
Re: Device permissions with DEVFS
Poul-Henning Kamp wrote:
> > I have to say that the ownership issue has been a pet peeve of mine for some time: I would really like the kernel to know about exactly two magic id values: uid 0 (suser uid, default uid, default devfs owner), and gid 0 (default gid, default devfs owner). Hard-coding of other non-0 values in the kernel leads to many potential (and real) problems.
>
> While you are right in principle, I think we should not overengineer here. People who are likely to give operator a different gid are also very likely to compile their own kernels (which I admit does not solve the 3rd party KLD issue but...)
>
> Devfs(8) provides a mechanism for setting the m/o/g and a few other attributes, and it would in theory be possible to let all devices come up 0/0/0 and have /etc/rc set the policy from /etc/rc.

One fix for this would be to have a UID/GID list that's used to derive both the default uid/gid values in devices, and the contents of the default passwd file, so that they matched.

It seems to me that this issue is merely one of getting the UNIX auth database and the default device attributes to agree, right?

-- Terry
Re: New NVidia drivers on -current
David Holm wrote:
> Note that the patch has already been applied so no need to patch your kernel! BTW, why hasn't anyone set the mailing list to automatically set the reply-to address to [EMAIL PROTECTED]?

Because the poster may not be a list subscriber, and the most important person to reply to is the poster.

It's up to the poster's MUA to know that the poster is a subscriber to the list, and set the Reply-To: on the basis of that knowledge.

This could be done on the server, but there are two reasons not to do it in the mailing list manager:

1) It's computationally expensive, and all processing that could be done on either the server or the client, should be done on the client, to ensure that the deployment scales.

2) There is a draft RFC which is under consideration by the IETF, and is likely to become an issued RFC, which requires that certain headers not be altered by mailing list managers; specifically, all non hop-to-hop headers should not be modified by the mailing list manager, and Reply-To: is an end-to-end header, not a hop-to-hop header.

So if it isn't in violation of an RFC for the mailing list software to set the Reply-To: now, it likely will be, in the near future.

-- Terry
Re: Run two copies of named from rc.conf?
Brad Knowles wrote:
> It depends on how you do it. You could $INCLUDE the exterior file inside the interior file, if that subset of information is the same. You could also use BIND 9 views. Otherwise, split-horizon can be a pain.

If you have a LAN behind a transient network connection, and you want your LAN to function without degradation as a result of losing the link ("Who ever heard of DSL going out?"), then you want to have your on site DNS server be authoritative.

But. If you are transiently connected, then if the on site DNS server is authoritative, then there is no way to look up externally hosted services via DNS, unless the external DNS, also a hosted service, and therefore not transiently connected, is authoritative.

One potential answer to this is that the external DNS is a secondary of a stealth primary running at your local site. However, this has the unfortunate effect that a persistent outage will become a general outage, should it last longer than the TTL for the externally visible records. In addition, there are no NOTIFY updates sent to the secondaries, if the primary is offline when it is updated. In addition, making the primary MX on site means a 3 minute delay on all external mail send attempts to the site domain(s), as the connection attempt times out and falls back to the secondaries, which are externally hosted.

Finally, externally hosted resources may require changes as the actual facilities are changed around. This includes relocation of primary and secondary external MX's, relocation of web services, relocation of database and other outsourced services, relocation of shopping cart services, etc.. This may include relocation of the primary IP address of the customer site, which would also require a change to the IP address configured into the secondaries of the stealth primaries.

Basically, what this boils down to is that you are never fully authoritative for a domain for which there exist externally hosted services, and such services must have priority over transiently connected services. For this to work, you have to have a DNS server that's external (hosted, and therefore always available), as well as being seen to be authoritative.

For local authority, then, you must delegate authority, without delegating it as a subdomain, to the external server. The easiest way to do this is to, on a local lookup miss, forward the request to an external server, even if you are the authoritative server, AND to replicate local DNS information to the external authoritative server, as well. DNS does not support this right now, even with BIND 9's views.

The entire point of people coming onto the Internet for the first time is to make themselves appear real, clueful, etc., and that means a virtual non-transient connection, which basically means external hosting of visible services by a third party, so that it looks like the company has a full time Internet connection, rather than looking like a Mom and Pop with only a dialup or other transient connection.

Yeah, that doesn't sit very well with you, if you are a company who wants to sell one server to each of 100 customers, rather than 6 servers to a hosting provider, but tough: there's no law that requires me to protect your business model, unless you are a member of the music or motion picture industry, and have bribed enough senators.

-- Terry
Re: Run two copies of named from rc.conf?
Brad Knowles wrote:
> Sorry, I wasn't thinking of transient networks. Indeed, that does make things a lot uglier. I'll have to think some more about all the various implications, however.

One of the draft RFC's in the FTP directory I referenced is a Best Current Practices document.

-- Terry
Re: Run two copies of named from rc.conf?
John De Boskey wrote:
> This is an interesting thread, but it seems to be getting a bit off target. I need to kick off 2 name servers. The first is authoritative for the domain as seen externally and the 2nd which is authoritative for the internal network. The internal forwards to the external when appropriate. These networks are not transient.

Then you want a single BIND 9 install with two views, one bound to the internal IP, the other to the external.

You don't want what I've suggested. And you don't want what you originally asked for. 8-).

-- Terry
Re: Are SysV semaphores thread-safe on CURRENT?
Brian Smith wrote:
> > Sure SysV semaphores are thread-safe. When a thread blocks on one, the entire process blocks (no threads run). You won't get any safer than that ;-)
>
> Yikes that isn't good. Is that only in STABLE? or does CURRENT do that as well? I guess I'll have to protect the semop() call with a pthread mutex to prevent two threads locking a single semaphore by the same process (creating a deadlock situation). Is this the recommended method of preventing these problems?

Yeah: don't make blocking system calls for which there are no asynchronous equivalents. Use the POSIX interfaces for use by pthreads, instead.

> (the SysV semaphore is protecting shared memory accessed by multiple processes). Thanks for the info... it explains a lot!

Use mmap of a backing-store file, and then use file locking to do record locking in the shared memory segment.

-- Terry
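The mmap-plus-record-locking suggestion might be sketched as below. This is a minimal illustration under assumptions: `shared_map` and `region_lock` are helper names invented here, the backing file path is up to the caller, and error handling is reduced to return codes. One caveat worth stating: fcntl() locks are owned by the process, so they serialize cooperating processes but do not by themselves keep two threads of the same process apart, and F_SETLKW is itself a blocking call.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Lock or unlock bytes [off, off + len) of the backing file.
 * type is F_RDLCK, F_WRLCK or F_UNLCK; F_SETLKW waits until granted.
 * Returns 0 on success, -1 on error (as fcntl() does).
 */
int region_lock(int fd, short type, off_t off, off_t len)
{
    struct flock fl = { 0 };

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = off;
    fl.l_len = len;
    return fcntl(fd, F_SETLKW, &fl);
}

/*
 * Open (creating if needed) the backing file, size it to `size` bytes
 * and map it shared. On success the descriptor is stored in *fdp so
 * it can be used for locking; on failure MAP_FAILED is returned.
 */
void *shared_map(const char *path, size_t size, int *fdp)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    void *p;

    if (fd < 0)
        return MAP_FAILED;
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return MAP_FAILED;
    }
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return MAP_FAILED;
    }
    *fdp = fd;
    return p;
}
```

Each process would call `shared_map()` on the same file, then bracket every access to a given record with `region_lock(fd, F_WRLCK, off, len)` and `region_lock(fd, F_UNLCK, off, len)`, using the record's offset within the mapping as the lock range.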
Re: Run two copies of named from rc.conf?
John De Boskey wrote:
> It would be nice if rc.conf could start a 2nd copy of named (split dns). Comments on the following simplistic patch?

Interior and exterior DNS is a useful case; however, there are multiple ways to set it up; in general, it's not possible to have interior authoritative DNS at the same time you have exterior authoritative DNS (this was a mistake we made on the InterJet, for a long time), without modifying the DNS server to forward requests for which it has incomplete information (e.g. the PNS draft RFC I wrote).

See the files in:

ftp://ftp.whistle.com/pub/terry/drafts/

for details.

-- Terry
Re: DISABLE_PSE DISABLE_PG_G still needed?
Vallo Kallaste wrote:
> Hi
>
> The kernel compiled from yesterday sources and with the abovementioned options disabled will not finish make -j2 buildworld on P4. Dies with bus error:
>
> /usr/src/lib/libc/gen/termios.c: In function `tcgetpgrp':
> /usr/src/lib/libc/gen/termios.c:104: internal error: Bus error
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <URL:http://www.gnu.org/software/gcc/bugs.html> for instructions.
>
> Are those options still needed? They are commented out in NOTES and shouldn't be necessary, right?

What happens if you add those options? Does it still crash?

-- Terry
Re: DISABLE_PSE DISABLE_PG_G still needed?
Vallo Kallaste wrote:

This may be a bit overstated. I removed those options from my kernel a few weeks ago and have no problems at all. Are you certain the problem is not specific to a particular CPU?

Sorry, this can be CPU specific, but I'm not sure. I'll try to reproduce it on my home P2 system and P3-SMP lying under my desk at work.

How much memory do these systems have?

-- Terry
Re: DISABLE_PSE DISABLE_PG_G still needed?
Robert Watson wrote:

On Fri, 15 Nov 2002, John Baldwin wrote:

It only happens with P4's. I haven't seen it locally on a p4 test machine at work that I have built test releases on. Also, it would be nice to see if just adding one of the options fixed the problems. As for NOTES, those options should not be enabled in NOTES as they would defeat the purpose of LINT since they disable code.

Does this apply generally to all P4's, or just a subset? If all, it may be we want to add a P4-workaround to GENERIC so that P4's work better out of the box. If it's a select few, I wonder if there's some way to test for the problem early in the boot... One of the recurring themes here has (a) been P4 processors, and (b) been a fear that because of timing changes introduced by the DISABLE_FOO flags, the real bug is still there, but less visible in the tests people are running.

The amount of RAM will also affect it. It can also happen on P3's and AMD K6's. It is a CPU bug related to the use of 4M pages.

Bosko understands the problem (I have explained it to him under non-disclosure), and he has a patch which avoids it without really disclosing the problem, which I'm OK with. Using the patch cranks the amount of base memory required for a minimal FreeBSD up to 16M, and loads the kernel at 4M, instead of 1M. This avoids, on purpose, the problem that the older FreeBSD locore.s used to avoid by accident.

The alternative is to take up to a 15% performance hit by not using 4M and global pages, or to revert the locore.s code so that it does not tickle the hardware bug.

-- Terry
Re: DISABLE_PSE DISABLE_PG_G still needed?
Wesley Morgan wrote:

Based on this, are you recommending that the DISABLE_* still be used? Will I never see the problem with 512mb of ram?

When Matt Dillon made some of the machdep.c allocation sizes dependent on the physical RAM size, it made the problem much less predictable, based on the amount of RAM, so without sitting down and doing some math to find out exactly where each byte of memory was going, I could not tell you for a given amount of RAM and CPU.

What I will tell you is that there is a stair function involved in the amount of RAM you can install, and there is a following function that looks similar, for the allocations made by machdep.c now. The problem will occur when there is a gap between the stair and the follower, e.g.:

[ASCII plot, flattened in the archive: axis labels "RAM available" and "RAM used"; a stair-step curve with a follower curve, and three gaps between the two, each marked "bug triggered"]

Bosko understands the problem (I have explained it to him under non-disclosure), and he has a patch which avoids it without really disclosing the problem, which I'm OK with. Using the patch cranks

So basically, there is a DEFECT in something that either Intel or AMD has sold me (you, everyone) and they will not disclose the defect, honor any warranties, or provide fixes for the problem?

No. The non-disclosure was mine. I am not an Intel/AMD employee; I discovered the defect independently. As far as I know, they are aware of the problem from Microsoft, but have no idea as to its root cause. It is likely that AMD licensed Intel designs, in order for AMD to have the same problem.

You should be aware that Microsoft recommends a registry setting that disables the use of 4M pages to work around the problem on customer systems that have the problem. They don't have the PG_G problem that FreeBSD 5.x has, for the same reason that FreeBSD 4.3 didn't have it: serendipity.

How... crappy. Reminds me of the Redhat/DMCA suppressed patch. I think consumers have a right to know about any defects in something they have bought.

Argue with your congressman; it was a U.S. law that suppressed the patch, in that case.

And I also think that the marketer should assume some liability for selling defective hardware (even though software makers seem to be able to get away with it).

Even defects that haven't been discovered or characterized by them? Argue with the U.S. Supreme Court and the tobacco industry on that one. Degree of product liability is based on the prior knowledge of potential harm.

-- Terry
Re: Calendar Changes.
Tony Harverson wrote:

My attention was drawn a little while ago to the fact that the South African holidays in /usr/share/calendar/ were far out of date (most not having been celebrated since 1994), and so I decided to clean them up. As soon as I got into actually working in that directory, it struck me that it's hard to know just where to fix things. [ ... ] I'm getting the impression the whole thing grew organically, rather than with a design in mind..

I'm getting the impression that South Africa's major historical events have occurred at almost random times, with the resulting list of official holidays growing organically, rather than with a design in mind... 8-) 8-).

Maybe we should put a cap on major historical events?

Yes, Oingnatia, it would be nice for your people to be free, but if you become a representative democracy, and make a holiday of it, we will have to edit calendar.holiday, and that would be a pain; could you keep your murderous dictator until next July 17th? Then we will be able to just symlink you to your neighbor, Mugataland, since that's when they killed off their former military junta. Or if you could at least wait to throw off the chains of oppression until after 5.0 is released, we'd really appreciate it. Thanks.

Things involving people often grow organically, rather than being planned; I think we just have to live with it...

-- Terry
Re: -current make jdk13 with native_threads error
Bill Huey (Hui) wrote:

On Wed, Nov 13, 2002 at 01:51:38AM -0800, Bill Huey wrote:

That's all been removed from a MFC of libc_r recently. Native

Uh, you're running on -current I presume (without reviewing the original post), but the same logic still applies.

They didn't say; I assumed they were, because of the line number in the header file for the undefined timeval struct matching the -current source code, but not 4.7, and because they posted to the -current list. 8-).

Thanks for the HotSpot info, BTW; it was worth squirreling away for me, and I'm sure they will find it useful...

-- Terry
Re: binutils symbol hiding and versioning (was Re: [PATCH] note the __sF change in src/UPDATING)
Loren James Rittle wrote:

FYI, the libstdc++-v3 maintainers on the FSF side are only guaranteeing forward ABI compatibility of any sort if libstdc++.so is built with symbol versioning and symbol hiding.

FWIW: symbol versioning is incredibly broken. It attempts to do in UNIX what interface versioning does in Windows, through the use of class factories accessed via IUnknown.

The point of the exercise is to allow multiple simultaneous versions of an interface to be exported by a single library.

The main reason this is bad is the same reason that Novell must, in their SDK's, support interface versions all the way back to NetWare 1.x: in order to have the largest possible user base, a software vendor would have to be stupid to write to anything other than the lowest common denominator of interfaces: it's really stupid to limit your customers to running NetWare 4.2 or above, when there is still such a large installed base of 3.x, 4.0, and 4.1 versions of NetWare. The only thing you do when you do that is to disenfranchise potential customers for your product.

Windows uses this approach because they do not have the concept of shared object versioning; VCRTL32.DLL is VCRTL32.DLL, no matter what, so a version change that permits old applications to continue working is the same DLL, plus extensions (since there is no version in the file name, multiple versions can only exist simultaneously if they exist in the same file).

It would be a very big mistake to actually utilize symbol name versioning on a UNIX system, and buy into this model, even if the idea was supported by the tools.

That Linux has bought into the idea of using this is, frankly, Linux's problem, and they are going to regret it in the future as much as, or more than, they regretted implementing shared library support in the SVR3.2 way, of linking libraries to fixed locations, and carving up chunks of the user virtual address space to implement them, back when they first supported shared libraries.

-- Terry
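For contrast with symbol versioning, here is a minimal sketch of file-level shared object versioning, assuming the major/minor convention spelled out later in this thread (a program linked against lib.so.M.N keeps working against lib.so.M.N+K). The function names and the file-name pattern are illustrative assumptions, not the behavior of any particular runtime linker.

```python
import re

# Assumed naming pattern: <base>.so.<major>.<minor>, e.g. libfoo.so.3.2
_SONAME = re.compile(r"^(?P<base>.+\.so)\.(?P<major>\d+)\.(?P<minor>\d+)$")


def parse(name):
    """Split a versioned library file name into (base, major, minor)."""
    m = _SONAME.match(name)
    if m is None:
        raise ValueError(f"not a major.minor shared object name: {name!r}")
    return m["base"], int(m["major"]), int(m["minor"])


def satisfies(linked, installed):
    """True if `installed` can stand in for the library a program linked against.

    Major must match exactly (existing interfaces changed otherwise);
    minor may be equal or newer (interfaces were only added).
    """
    l_base, l_major, l_minor = parse(linked)
    i_base, i_major, i_minor = parse(installed)
    return l_base == i_base and l_major == i_major and i_minor >= l_minor
```

Under this rule `satisfies("libfoo.so.3.2", "libfoo.so.3.5")` holds, while a major bump or an older minor does not, and old and new major versions coexist as separate files.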
Re: building -STABLE from within -CURRENT
Wilkinson,Alex wrote:

In a dual boot situation, is it possible to be logged into -CURRENT and build -STABLE ( ie -STABLE filesystems live on separate fdisk partitions and are exported ) ?

Only if you unpack the contents of CDROM #2 from a -STABLE system into a chroot environment.

Specifically, the compiler and FFS changes have added a number of incompatibilities that are insurmountable for cross-building, from my experience.

Check the -current mailing list archives, as it applies to being able to cross-build versions of FreeBSD; this was covered in detail last week (and about every two weeks, previous to that).

-- Terry
Re: Samba on -current
Jason Vervlied wrote:

I am having problems with a Samba share on my -current box, I just installed from 20021103-SNAP. I did recompile my kernel with the following options. [ ... ] I also added the SMP options to the kernel. I used the same options under -stable and experienced no issues. Here is the error I get when I try to copy a file from my samba share

[jvervlied@current 80-85]$ cp bad_religion-yesterday.mp3 ~/
cp: /home/jvervlied/bad_religion-yesterday.mp3: Bad address

This was discussed in detail about 3 weeks ago. I suggested one workaround, which would be to disable the Samba-specific getpages/putpages code, since the timeout is in the getpages, where an operational status code says that the attempt is both not recoverable and should be retried.

Another partial fix is to retry for a count, but the unrecoverable part of the error indicates that the operation needs to be retried at a much higher level (potentially, all the way to the point of reestablishing the session).

See the archived previous discussion for details.

One alternative is to use dd instead of cp to copy the file, thus avoiding the mmap'ed data faults that come from cp to the SMBFS.

-- Terry
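The dd workaround amounts to moving the data with plain read and write calls rather than through a memory mapping. A minimal sketch of that style of copy follows, in Python for brevity; the function name and the 64 KB block size are arbitrary choices for illustration, and this is not a description of how dd or cp is actually implemented.

```python
import os


def copy_with_read_write(src, dst, block_size=64 * 1024):
    """Copy src to dst using only read()/write(); returns bytes copied."""
    copied = 0
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                block = os.read(in_fd, block_size)
                if not block:  # empty read means end of file
                    break
                view = memoryview(block)
                while view:  # write() may accept fewer bytes than offered
                    written = os.write(out_fd, view)
                    view = view[written:]
                copied += len(block)
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)
    return copied
```

Because no page of the source file is ever mapped, an I/O failure shows up as an error return from read rather than as a fault in the middle of a write on another filesystem.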
Re: Samba on -current
Sheldon Hearn wrote:

It's a known problem. Consider reading the -current mailing list to keep up to date with known problems. It was discussed last week. No solution is known at this time. Use dd(1) instead of cp(1) as an interim workaround.

Actually, it's fixable 3 ways:

o Full fix for cp, ugly

  Remove the getpages/putpages from the implementation of the SMBFS' VOPs table. This will force it to fall back to code that uses read/write instead, which doesn't have the problem. Performance will suck, but the copies will work as expected, though mmap won't.

o Incomplete fix, ugly, may be enough anyway

  Put a retry loop in the getpages/putpages code (mostly, the getpages code), to retry the operation at that level; if the failure does not occur at the session or handle level, then this will cover up the problem. If the session or handle reference is failing, there is insufficient data to rewind state to the point where it can be retried, due to the fact that you would need to go up several calls, and then back down into the VOP, to reestablish the handle to retry the call again. If you had to reestablish the session, you'd have to go even higher.

o Complete fix, a lot of work

  The code needs to be refactored, so that a restart with the handle or session invalid works. This means separating out the session and handle management from the standard code path, so that it can be restarted at any point, so that the state doesn't need to be unwound. The problem here is that you are in a trap handler of a write on another FS, faulting on a page that's backed by the SMBFS, so it's not like you can recover enough information otherwise to recreate the handle or session, if necessary. So you would have to ask for the handle from the cache, and then the session for the handle from the cache, if the handle was not valid.

-- Terry
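The shape of the second ("incomplete") fix can be sketched as a bounded retry that only covers failures recoverable at the current level. This is an illustration in Python, not SMBFS kernel code; the exception classes and `with_retries` are hypothetical names introduced here.

```python
class RetryableError(Exception):
    """Hypothetical: a transient failure that can be retried in place."""


class SessionLostError(Exception):
    """Hypothetical: handle or session is invalid; must be rebuilt higher up."""


def with_retries(operation, attempts=3):
    """Run operation(), retrying transient failures up to `attempts` times.

    SessionLostError is deliberately not caught: at this level there is not
    enough state to re-establish the handle or session, so it propagates.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except RetryableError as exc:
            last_error = exc
    raise last_error
```

The loop hides the problem only when the failure really is transient; a lost handle or session still surfaces to the caller, which is why the third fix moves that management out of the normal code path.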
Re: binutils symbol hiding and versioning (was Re: [PATCH] note the __sF change in src/UPDATING)
Loren James Rittle wrote:

FWIW: symbol versioning is incredibly broken. It attempts to do in UNIX what interface versioning does in Windows, through the use of class factories accessed via IUnknown.

You might be absolutely correct in general. However, please read http://gcc.gnu.org/onlinedocs/libstdc++/abi.txt . It is clear that symbol versioning is not being used at all like you supposed it might have been (mis-)used.

To provide multiple API's in a single shared library image, to avoid a number of images being necessary? That's exactly how it's being used; they call it age gracefully, but what that means is that multiple API versions remain supported for a very, very long time, without having separate libraries involved.

Technically, the symbol decoration used in C++ is also in error; it's there simply to avoid having to change the object file format to accommodate interface attributes separately from the symbol name (just as adding an @@version name suffix does). If this were not there, the linker could automatically create glue code for doing argument coercion, and it would save a lot of code that currently has to be supplied by programmers.

FWIW: There is no concept of IUnknown or implementation factories (and, yes, I do understand those concepts) in how libstdc++.so (v3) is using symbol versioning. I invite you to take a close look at how that library is actually using symbol hiding and versioning before you attempt to cast your judgment. If you have informed comments, then please direct them to [EMAIL PROTECTED] not me personally (as a libstdc++-v3 maintainer, I will read them over there like all others).

I'm well aware of how it's used; the IUnknown reference was an analogy; the document you refer to specifically states that it's to avoid a proliferation of shared library files. That's exactly the purpose of IUnknown version information for class factories, as well.

Part of the problem here is that GCC dropped the minor version number from shared libraries in binutils, and FreeBSD and Linux followed suit. Now this turns out to be a mistake, and rather than admitting the mistake, instead now we have more decoration occurring in the symbol name to fake up another orthogonal namespace. Traditional UNIX systems have minor versions on shared libraries to address this, and bump major versions only if existing interfaces change, not when interfaces are added (thus program foo linked against lib.so.M.N works just fine against lib.so.M.N+K).

If you don't like the Microsoft DLL version analogy, a different analogy is the Aztec C support for directories, by naming files with their complete path, and treating the character / as a path component separator in order to achieve a namespace escape, when the Mac FS didn't support directory hierarchies. In all cases, what's happening is a namespace incursion in order to permit coexistence of otherwise conflicting implementations.

Short summation: We only mark the first version of the library that a new symbol is added. E.g. there will never be [EMAIL PROTECTED] and also [EMAIL PROTECTED] The first release after an ELF library version number bump resets all tags to be the same. As clearly documented, libstdc++.so (v3) will bump the major version number just like FreeBSD does on installed shared libraries to express other sorts of C++ compiler or library ABI change.

This still fails when the OS version does not bump each time the compiler version bumps. I guess this is OK, if you are a compiler vendor, but less OK, when you are an OS vendor. 8-).

If the system compiler of FreeBSD still wanted to install multiple versions of libstdc++.so (v3) with major number bumps for other reasons (because it is considered safer for compatibility by the system designers), that would be quite fine. But completely ignoring the symbol hiding features will make the FreeBSD C++ system compiler and environment worse than the Linux version and worse than a g++ installed from equivalent FSF sources IMHO (since we will leak all sorts of internal implementation symbols that are not supposed to influence user application link behavior). Anyways, Alex was already going to look into trying this for the FreeBSD 5.0 system compiler so hopefully this will not be the case.

No, it will make it incompatible, which is rather annoying, but it's an introduced incompatibility that came from the compiler, and we shouldn't pretend it isn't.

In any case, the issue was in attempting to prevent the exposure of data interfaces, and symbols not part of the defined API; this is still going to cause problems for the reasons this discussion came up in the first place: other language compilers that need to link against system libraries, and share implementation instances so that they can be linked against foreign object files that use the same underlying implementation. For the purposes of this discussion, that's the stdio implementation, as exposed