Re: missing files in readdir(3) on NFS export of ZFS volume (since v28?)
On Tue, Mar 08, 2011 at 07:40:19PM +0100, Pawel Jakub Dawidek wrote:
> > Since I upgraded to ZFS v28 I noticed missing files from NFS. The
> > files are still accessible through NFS but they don't show up on a
> > readdir(3).
>
> Could you try r219404?

It's fixed! Great work! Thanks a million!

-- 
Sent from my FreeBSD server
Pierre Beyssac p...@fasterix.frmug.org
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: missing files in readdir(3) on NFS export of ZFS volume (since v28?)
Hello Rick,

Thanks for your reply.

On Mon, Mar 07, 2011 at 06:12:43PM -0500, Rick Macklem wrote:
> Readdir (in both NFS servers) depends on ZFS to reply EOPNOTSUPP for
> VFS_VGET() when it cannot be done, so that Readdir will switch to
> using VOP_LOOKUP(). Just a wild guess, but maybe ZFS v28 isn't doing
> this?

My client was plain and simple ls(1). I said readdir(3) because I wrongly assumed ls used that, but from looking at the code it actually uses fts_open(3) and friends instead...

-- 
Sent from my FreeBSD server
Pierre Beyssac p...@fasterix.frmug.org
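The fallback Rick describes can be sketched in userland C. This is only an illustration under stated assumptions: toy_vget(), toy_lookup() and resolve_entry() are hypothetical stand-ins for VFS_VGET(), VOP_LOOKUP() and the server's Readdir logic, not the kernel code. The point is that the by-name path is taken only when the by-number path answers EOPNOTSUPP; a file system that returns some other error would make the entry disappear from the listing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory entry: a name and the file number it maps to. */
struct entry {
    const char *name;
    unsigned long fileno;
};

/* Stand-in for VFS_VGET(): fetch by file number. This toy file system
 * never supports it, so it always answers EOPNOTSUPP. */
static int toy_vget(unsigned long fileno, const struct entry **out)
{
    (void)fileno;
    (void)out;
    return EOPNOTSUPP;
}

/* Stand-in for VOP_LOOKUP(): fetch by name within the directory. */
static int toy_lookup(const struct entry *dir, size_t n, const char *name,
                      const struct entry **out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(dir[i].name, name) == 0) {
            *out = &dir[i];
            return 0;
        }
    }
    return ENOENT;
}

/* Resolve one entry: try the by-number path first and fall back to the
 * by-name path only on EOPNOTSUPP. Any other error is returned as-is. */
static int resolve_entry(const struct entry *dir, size_t n,
                         const struct entry *want, const struct entry **out)
{
    int error;

    assert(dir != NULL && want != NULL && out != NULL);
    error = toy_vget(want->fileno, out);
    if (error == EOPNOTSUPP)
        error = toy_lookup(dir, n, want->name, out);
    return error;
}
```

Calling resolve_entry() for each name in a directory array yields 0 for entries that exist and ENOENT otherwise.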
missing files in readdir(3) on NFS export of ZFS volume (since v28?)
Hello,

I'm running a 9-current server as compiled on Sat Mar 5 02:17:14 CET 2011.

Since I upgraded to ZFS v28 I noticed missing files from NFS. The files are still accessible through NFS but they don't show up on a readdir(3).

On the NFS server (files are stored on a ZFS v15 volume, not yet upgraded to the v28 format):

% cd /usr/ports/devel/autoconf
% ls -i
311401 Makefile   204505 files      204509 pkg-plist
204504 distinfo   204508 pkg-descr

On the NFS client side (FreeBSD 8.2-RELEASE):

% cd /usr/ports/devel/autoconf
% ls -i
204504 distinfo   204508 pkg-descr
204505 files      204509 pkg-plist

Yet the missing file can be accessed:

% head -3 Makefile
# New ports collection makefile for: autoconf
# Date created: 7th December 2006
# Whom: a...@freebsd.org

Note that the missing files are scattered throughout the volume, with no relation to the inode number, as shown on a diff:

@@ -1,8 +1,6 @@
 37 drwxr-xr-x 70 pb staff 93 4 mar 19:11 /usr/ports
- 42 -rw-r--r-- 1 pb staff 241 24 jan 2007 /usr/ports/astro/tclgeomap/pkg-plist
 53 drwxr-xr-x 2 pb staff 6 22 fév 12:04 /usr/ports/astro/tkgeomap
 63 drwxr-xr-x 4 pb staff 6 29 jul 2008 /usr/ports/Tools
- 75 drwxr-xr-x 33 pb staff 34 25 nov 15:59 /usr/ports/accessibility
 83 drwxr-xr-x 12 pb staff 14 9 fév 2009 /usr/ports/arabic
 11 51 drwxr-xr-x 900 pb staff 901 6 mar 14:36 /usr/ports/audio
 123 -rw-r--r-- 1 pb staff 584 25 aoû 2006 /usr/ports/astro/tkgeomap/pkg-descr
@@ -16,10 +14,8 @@
 233 drwxr-xr-x 3 pb staff 7 24 mar 2010 /usr/ports/astro/wcslib
 244 -rw-r--r-- 1 pb staff 1414 5 jan 2010 /usr/ports/astro/wcslib/Makefile
 255 drwxr-xr-x 31 pb staff 33 1 jan 23:16 /usr/ports/french
-262 -rw-r--r-- 1 pb staff 197 5 jan 2010 /usr/ports/astro/wcslib/distinfo
 27 63 drwxr-xr-x 1110 pb staff 23 fév 15:37 /usr/ports/games
 283 drwxr-xr-x 2 pb staff 4 24 mar 2010 /usr/ports/astro/wcslib/files
-292 -rw-r--r-- 1 pb staff 236 5 jan 2010 /usr/ports/astro/wcslib/files/6-patch-configure
 303 -rw-r--r-- 1 pb staff 677 5 jan 2010 /usr/ports/astro/wcslib/files/patch-GNUmakefile
 312 -rw-r--r-- 1 pb staff 401 17 jul 2009 /usr/ports/astro/wcslib/pkg-descr
 324 -rw-r--r-- 1 pb staff 1515 5 jan 2010 /usr/ports/astro/wcslib/pkg-plist
...

Reverting to an old 9-current kernel (January 10, before the ZFS v28 patches) fixes the problem...

-- 
Sent from my FreeBSD server
Pierre Beyssac p...@fasterix.frmug.org
Re: DVD+R burning flakey after ATAng.
On Mon, Oct 20, 2003 at 09:10:39PM -0400, David Gilbert wrote:
> This all started screwing up with ATAng. At first, ATAng didn't
> support atapicam, but that was rectified. Now the dvd+rw port
> (growisofs) doesn't work at all ... it finishes with an error that
> I'm loath to coaster another (expensive) DVD to find out. burncd will

I've only tried DVD+R burning with growisofs since ATAng. My drive is a Ricoh MP5125. I've previously burned quite a few DVD+RW with burncd; it mostly worked before ATAng and it still does.

One problem I have seen is at the fixation phase. growisofs occasionally returns an error at the end; sometimes it doesn't seem to matter and sometimes it does (no TOC, preventing any access). In the latter case I then tried a burncd fixate on it, which turned the coaster back into a DVD. That's funny since burncd is not supposed to handle DVD+R AFAIK, but the Ricoh drive apparently did the right thing anyway. As usual, YMMV.

In my case it's clearly not related to the ISO image I put on the DVD.

-- 
Pierre Beyssac
Free domains: http://www.eu.org/
Re: 5.2-RELEASE TODO
On Wed, Oct 01, 2003 at 10:00:18AM -0400, Robert Watson wrote:
> FreeBSD 5.2 Open Issues
>
> This is a list of open issues that need to be resolved for FreeBSD
> 5.2. If you have any updates for this list, please e-mail
> [EMAIL PROTECTED]
>
> Must Resolve Issues for 5.2-RELEASE

It would be nice if fixing the Raidframe driver could be added to the must resolve list, or a note added to the release notes explaining that it's broken.

It's been totally unusable since before 5.1-RELEASE (raidctl panics). See kern/50541.

-- 
Pierre Beyssac
bug in big pipe code causing performance problems
Hello,

If no one objects to it, I'd like to commit the following ASAP. It fixes an obvious bug in the big pipe code, a discrepancy between how free space is calculated in write vs poll. The bug affects stable as well.

The bug's implications are less obvious: it makes write(2) on a non-blocking pipe return EAGAIN when poll(2)/select(2) return the descriptor as ready for write. This in turn causes the libc_r code to busy-wait on pipes, causing a major performance hog. As an example, the patch boosts the multimedia/transcode port (a big user of pipes and threads) by a factor of 2-3.

Index: sys_pipe.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/sys_pipe.c,v
retrieving revision 1.140
diff -u -r1.140 sys_pipe.c
--- sys_pipe.c  30 Jul 2003 18:55:04 -0000      1.140
+++ sys_pipe.c  30 Jul 2003 21:19:41 -0000
@@ -1041,7 +1041,8 @@
                 if ((space < uio->uio_resid) && (orig_resid <= PIPE_BUF))
                         space = 0;
 
-                if (space > 0 && (wpipe->pipe_buffer.cnt < PIPE_SIZE)) {
+                if (space > 0 &&
+                    wpipe->pipe_buffer.cnt < wpipe->pipe_buffer.size) {
                         if ((error = pipelock(wpipe,1)) == 0) {
                                 int size;       /* Transfer size */
                                 int segsize;    /* first segment to transfer */

-- 
Pierre Beyssac
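The symptom (poll says "writable", write says EAGAIN) can be probed from userland. The sketch below is an assumption-laden test harness, not part of the patch: count_poll_write_mismatches() is a hypothetical helper that fills a non-blocking pipe with 512-byte writes (512 being the POSIX minimum for PIPE_BUF) and counts how often a write fails with EAGAIN right after poll(2) reported POLLOUT. On a kernel where both paths agree on free space, the count should stay at 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Fill a non-blocking pipe with small writes and count how often
 * poll(2) reports the write end as writable while write(2) then fails
 * with EAGAIN. Returns the mismatch count, or -1 on setup failure. */
static int count_poll_write_mismatches(void)
{
    int fds[2];
    char chunk[512];
    int mismatches = 0;
    int flags;

    /* Keep each write within the minimum atomic pipe write size. */
    assert(sizeof chunk <= _POSIX_PIPE_BUF);

    if (pipe(fds) == -1)
        return -1;
    flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    memset(chunk, 'x', sizeof chunk);

    for (;;) {
        struct pollfd pfd;
        ssize_t n;

        pfd.fd = fds[1];
        pfd.events = POLLOUT;
        pfd.revents = 0;
        if (poll(&pfd, 1, 0) <= 0 || !(pfd.revents & POLLOUT))
            break;              /* pipe reported full: stop */
        n = write(fds[1], chunk, sizeof chunk);
        if (n == -1 && errno == EAGAIN)
            mismatches++;       /* the symptom described above */
        if (n == -1)
            break;              /* stop on any write error */
    }
    close(fds[0]);
    close(fds[1]);
    return mismatches;
}
```

A threads library that loops on poll-then-write would spin once per mismatch, which is the busy-wait described in the message.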
Re: bug in big pipe code causing performance problems
On Wed, Jul 30, 2003 at 11:32:49PM +0200, Pierre Beyssac wrote:
> -                if (space > 0 && (wpipe->pipe_buffer.cnt < PIPE_SIZE)) {
> +                if (space > 0 &&
> +                    wpipe->pipe_buffer.cnt < wpipe->pipe_buffer.size) {

PS: not-so-obvious after all, since the above is equivalent to (space > 0) by itself, so I won't commit the above as is, and the real fix might be something more complicated...

-- 
Pierre Beyssac
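The equivalence noted in the PS can be checked by brute force. The sketch assumes, as the write path appears to do, that space is computed as size - cnt (possibly forced to 0 afterwards, which makes both conditions false anyway); patched_cond() and count_differences() are hypothetical names used only for this check.

```c
#include <assert.h>

/* Condition proposed in the patch. */
static int patched_cond(int space, int cnt, int size)
{
    return space > 0 && cnt < size;
}

/* Count the (cnt, size) pairs, for sizes up to max_size, where the
 * patched condition differs from plain "space > 0", assuming
 * space = size - cnt. */
static int count_differences(int max_size)
{
    int size, cnt, diffs = 0;

    assert(max_size >= 0);
    for (size = 0; size <= max_size; size++) {
        for (cnt = 0; cnt <= size; cnt++) {
            int space = size - cnt;

            if (patched_cond(space, cnt, size) != (space > 0))
                diffs++;
        }
    }
    return diffs;
}
```

Since space > 0 already implies cnt < size under that assumption, the second test adds nothing, which is why the patch as posted cannot change behaviour by that clause alone.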
broken DRM for ATI Radeon
Since the new DRM was integrated into current at the end of April, I've been unable to use DRI with my ATI card. The X server starts and apparently works, then suddenly (when scrolling an xterm or doing some memory-intensive operation like 3D rendering) enters a busy loop.

After an awful lot of searching and attempting to debug the code, I've finally been able to find out that the server loops on an ioctl(DRM_IOCTL_DMA) returning EBUSY, which means that the DRM driver can't get a free buffer from radeon_cp.c:radeon_freelist_get().

My XFree server is the latest version of the package (XFree86-Server-4.3.0_8). Earlier versions exhibited the same behaviour.

Does anyone have a clue on where to investigate some more and fix that?

-- 
Pierre Beyssac
panic in netinet/tcp_syncache.c: syncache_timer
I'd like a review of the following fix I'd like to commit.

The syncache_timer() function seems to have a locking problem causing panics, even after yesterday's patches. This apparently occurs when there are unexpired syncache entries while the corresponding listening socket is closed. tcp_close() destroys the relevant lock in the inpcb structure, which causes INP_LOCK() on that structure in the next syncache_timer() call to panic.

I'm testing the patch below, which simply removes the inpcb locking and avoids the panic. It seems safe to me since we're running splnet, but I'm not sure it's correct since I suppose the locking is there for a reason...

--- tcp_syncache.c.old  Sat Dec 21 03:03:22 2002
+++ tcp_syncache.c      Sat Dec 21 17:50:10 2002
@@ -384,14 +384,12 @@
                                 break;
                         sc = nsc;
                         inp = sc->sc_tp->t_inpcb;
-                        INP_LOCK(inp);
                         if (slot == SYNCACHE_MAXREXMTS ||
                             slot >= tcp_syncache.rexmt_limit ||
                             inp->inp_gencnt != sc->sc_inp_gencnt) {
                                 nsc = TAILQ_NEXT(sc, sc_timerq);
                                 syncache_drop(sc, NULL);
                                 tcpstat.tcps_sc_stale++;
-                                INP_UNLOCK(inp);
                                 continue;
                         }
                         /*
@@ -400,7 +398,6 @@
                          * entry on the timer chain until it has completed.
                          */
                         (void) syncache_respond(sc, NULL);
-                        INP_UNLOCK(inp);
                         nsc = TAILQ_NEXT(sc, sc_timerq);
                         tcpstat.tcps_sc_retransmitted++;
                         TAILQ_REMOVE(&tcp_syncache.timerq[slot], sc, sc_timerq);

-- 
Pierre Beyssac
To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
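The staleness test visible in the patch context (comparing the inpcb generation count with the one saved in the syncache entry) can be illustrated in isolation. This is a toy model with hypothetical names (toy_pcb, toy_sc), not the kernel structures, and it deliberately says nothing about locking: the comparison only helps if the pcb memory is still valid when it is read, which is exactly the part the thread is debating.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the inpcb and the syncache entry. */
struct toy_pcb {
    unsigned long gencnt;       /* bumped whenever the pcb is recycled */
};

struct toy_sc {
    struct toy_pcb *pcb;
    unsigned long saved_gencnt; /* generation seen when the entry was made */
};

/* Record the generation of the pcb the entry belongs to. */
static void toy_sc_init(struct toy_sc *sc, struct toy_pcb *pcb)
{
    assert(sc != NULL && pcb != NULL);
    sc->pcb = pcb;
    sc->saved_gencnt = pcb->gencnt;
}

/* Model a close or reuse of the pcb. */
static void toy_pcb_recycle(struct toy_pcb *pcb)
{
    pcb->gencnt++;
}

/* An entry is stale once its pcb has moved on to another generation. */
static int toy_sc_is_stale(const struct toy_sc *sc)
{
    return sc->pcb->gencnt != sc->saved_gencnt;
}
```

A timer routine built on this model would drop any entry for which toy_sc_is_stale() is true instead of retransmitting for it.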
Re: panic in netinet/tcp_syncache.c: syncache_timer
On Sat, Dec 21, 2002 at 10:57:35AM -0800, Jeffrey Hsu wrote:
> Can you try upgrading to rev 1.29 of tcp_syncache.c which I committed
> yesterday? I suspect that should fix this problem.

No, I believed that too when I saw your patch, but it didn't solve my problem.

-- 
Pierre Beyssac
Re: panic in netinet/tcp_syncache.c: syncache_timer
On Sat, Dec 21, 2002 at 12:03:24PM -0800, Jeffrey Hsu wrote:
> Pierre, can you see if this patch fixes your problem? Thanks.

Yes, it does. Actually I tried that before, but then I thought locking at this place was probably unnecessary because it seemed to apply to the generation count only.

As a matter of fact I just committed the previous patch I sent before I saw your mail... probably we should commit yours instead?

-- 
Pierre Beyssac
Re: Wine-2002.10.07 port on FreeBSD 5.0-current
On Wed, Oct 30, 2002 at 05:29:39PM +0100, Dag-Erling Smorgrav wrote:
> That revision doesn't change the structure, just how it is defined, so
> binary compatibility is not an issue. As for source compatibility,
> just use the DBREG_DRX macro, which exists in both -STABLE and
> -CURRENT (it was merged into -STABLE two years ago).

It's too bad source compatibility hasn't been preserved. This macro is a real PITA to use with a static structure (see my Wine patch below).

Argument d is not properly parenthesized; I'll commit the following patch if nobody objects:

- #define DBREG_DRX(d,x) (d->dr[(x)]) /* reference dr0 - dr7 by
-                                        register number */
+ #define DBREG_DRX(d,x) ((d)->dr[(x)]) /* reference dr0 - dr7 by
+                                          register number */

Now for Gerald and Krzysztof, could you try the attached patch? Works for me under current at least. I'll test it under stable, then I'll send it to Wine.

-- 
Pierre Beyssac

--- context_i386.c.orig Wed Aug 14 22:59:03 2002
+++ context_i386.c      Fri Nov  8 11:29:34 2002
@@ -373,6 +373,15 @@
         struct dbreg dbregs;
         if (ptrace( PTRACE_GETDBREGS, pid, (caddr_t) &dbregs, 0 ) == -1)
                 goto error;
+#ifdef DBREG_DRX
+        /* needed for FreeBSD, the structure fields have changed under 5.x */
+        context->Dr0 = DBREG_DRX((&dbregs), 0);
+        context->Dr1 = DBREG_DRX((&dbregs), 1);
+        context->Dr2 = DBREG_DRX((&dbregs), 2);
+        context->Dr3 = DBREG_DRX((&dbregs), 3);
+        context->Dr6 = DBREG_DRX((&dbregs), 6);
+        context->Dr7 = DBREG_DRX((&dbregs), 7);
+#else
         context->Dr0 = dbregs.dr0;
         context->Dr1 = dbregs.dr1;
         context->Dr2 = dbregs.dr2;
@@ -380,6 +389,8 @@
         context->Dr6 = dbregs.dr6;
         context->Dr7 = dbregs.dr7;
 #endif
+
+#endif
     }
     if (flags & CONTEXT_FLOATING_POINT)
     {
@@ -437,6 +448,17 @@
     {
 #ifdef PTRACE_SETDBREGS
         struct dbreg dbregs;
+#ifdef DBREG_DRX
+        /* needed for FreeBSD, the structure fields have changed under 5.x */
+        DBREG_DRX((&dbregs), 0) = context->Dr0;
+        DBREG_DRX((&dbregs), 1) = context->Dr1;
+        DBREG_DRX((&dbregs), 2) = context->Dr2;
+        DBREG_DRX((&dbregs), 3) = context->Dr3;
+        DBREG_DRX((&dbregs), 4) = 0;
+        DBREG_DRX((&dbregs), 5) = 0;
+        DBREG_DRX((&dbregs), 6) = context->Dr6;
+        DBREG_DRX((&dbregs), 7) = context->Dr7;
+#else
         dbregs.dr0 = context->Dr0;
         dbregs.dr1 = context->Dr1;
         dbregs.dr2 = context->Dr2;
@@ -445,6 +467,7 @@
         dbregs.dr5 = 0;
         dbregs.dr6 = context->Dr6;
         dbregs.dr7 = context->Dr7;
+#endif
         if (ptrace( PTRACE_SETDBREGS, pid, (caddr_t) &dbregs, 0 ) == -1)
                 goto error;
 #endif
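Why the extra parentheses around the macro argument matter can be shown with a miniature of the structure. The names below (toy_dbreg, TOY_DRX, toy_set_get) are hypothetical; the real struct dbreg and DBREG_DRX live in the FreeBSD machine headers and are not reproduced here.

```c
#include <assert.h>

/* Hypothetical miniature of struct dbreg: eight debug registers. */
struct toy_dbreg {
    unsigned int dr[8];
};

/* Parenthesized form, as proposed above: the argument may be any
 * pointer expression, including &var. */
#define TOY_DRX(d, x) ((d)->dr[(x)])

/* With the unparenthesized form (d->dr[(x)]), TOY_DRX(&regs, 7) would
 * expand to &regs->dr[7], which parses as &(regs->dr[7]) and does not
 * compile when regs is a structure rather than a pointer. That is why
 * callers of the old macro had to write TOY_DRX((&regs), 7). */

/* Store a value in one register through the macro and read it back. */
static unsigned int toy_set_get(unsigned int idx, unsigned int value)
{
    struct toy_dbreg regs = { {0} };

    assert(idx < 8);
    TOY_DRX(&regs, idx) = value;    /* usable as an lvalue */
    return TOY_DRX(&regs, idx);
}
```

With the fixed macro, both TOY_DRX(&regs, 7) and TOY_DRX((&regs), 7) are valid, so existing callers that added their own parentheses keep working.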
Re: Wine-2002.10.07 port on FreeBSD 5.0-current
On Fri, Nov 08, 2002 at 12:08:32PM +0100, Gerald Pfeifer wrote:
> Unfortunately (in the sense that both of you duplicated effort),
> Alfred independently came up with a similar patch which went in as
> $PORTSDIR/emulators/wine/files/patch-context_i386 and which I already
> fed upstream to the Wine folks.

Fine, but if included as is in Wine because, it will break compatibility with Net/OpenBSD because DBREG_DRX is a FreeBSDism... that's why I surrounded my patch with a #ifdef DBREG_DRX (which seems cleaner than a #ifdef __FreeBSD__).

I'll send my patch with these additional explanations to Wine, then.

-- 
Pierre Beyssac
Re: Wine-2002.10.07 port on FreeBSD 5.0-current
On Fri, Nov 08, 2002 at 02:04:01PM +0100, Pierre Beyssac wrote:
> Fine, but if included as is in Wine because, it will break
> compatibility with Net/OpenBSD because DBREG_DRX is a FreeBSDism...

Sorry for the phrasing, remove the spurious "because" to make sense of it :)

-- 
Pierre Beyssac
Re: XFree86 and -CURRENT crashing
On Wed, Apr 03, 2002 at 03:14:43AM -0500, Coleman Kane wrote:
> I have been having major issues with XFree86 recently. It seems to
> just completely halt the machine hard whenever I try starting it. If I
> run it from a remote terminal with -verbose all the way up, it seems
> to halt at the section just after it says it's loading the RENDER
> module. I was wondering if anyone else knew of or has the same
> problem. Basically, it is a typical Athlon system running a Radeon DDR
> 32MB card, but the

If DRI is enabled in your XF86Config, try commenting out 'Section DRI' and start XFree again.

If it fixes the crash, I bet you are using the DRI kernel module stuff (port drm-kmod). I'm using it for a Radeon card, and I need to recompile/reinstall it almost every time I update my kernel to avoid crashes when starting XFree.

-- 
Pierre Beyssac
Re: FreeBSD-localised OpenSSH hangs with Foundry SSH1 server
On Tue, Apr 02, 2002 at 10:40:08AM +0200, Dag-Erling Smorgrav wrote:
> > Uh, no, it does not seem to work in ssh_config, only in sshd_config.
>
> Hmm, that needs fixing then.

I have written the following patch, seems to work ok.

Pierre

--- readconf.c.orig     Tue Mar 19 14:29:02 2002
+++ readconf.c  Tue Apr  2 19:44:46 2002
@@ -116,7 +116,8 @@
         oKbdInteractiveAuthentication, oKbdInteractiveDevices, oHostKeyAlias,
         oDynamicForward, oPreferredAuthentications, oHostbasedAuthentication,
         oHostKeyAlgorithms, oBindAddress, oSmartcardDevice,
-        oClearAllForwardings, oNoHostAuthenticationForLocalhost
+        oClearAllForwardings, oNoHostAuthenticationForLocalhost,
+        oVersionAddendum
 } OpCodes;
 
 /* Textual representations of the tokens. */
@@ -188,6 +189,7 @@
         { "smartcarddevice", oSmartcardDevice },
         { "clearallforwardings", oClearAllForwardings },
         { "nohostauthenticationforlocalhost", oNoHostAuthenticationForLocalhost },
+        { "versionaddendum", oVersionAddendum },
         { NULL, oBadOption }
 };
 
@@ -675,6 +677,13 @@
                 }
                 if (*activep && *intptr == -1)
                         *intptr = value;
+                break;
+
+        case oVersionAddendum:
+                ssh_version_set_addendum(strtok(s, "\n"));
+                do {
+                        arg = strdelim(&s);
+                } while (arg != NULL && *arg != '\0');
                 break;
 
         default:
FreeBSD-localised OpenSSH hangs with Foundry SSH1 server
I had problems connecting with the FreeBSD openssh client to a Foundry BigIron gigabit switch running ssh 1.2.27, whereas I can connect fine to the same switch when using a locally-compiled OpenSSH 3.1p1.

The culprit is apparently the length of the version string sent by FreeBSD and received by the Foundry switch. If it is over 24 characters, the Foundry ssh daemon just sits there and hangs for a few minutes until it times out and closes the connection. If I shorten the client version string to be "OpenSSH_3.1 FreeBSD", everything works ok again.

The closest thing to a standard description of the SSH1 protocol I could find is below. It clearly sets an upper limit of 40 characters for the version part of the identification string. This is lower than the 42 chars of "OpenSSH_3.1 FreeBSD localisations 20020318", but higher than the maximum of 24 characters accepted by the Foundry implementation. So it looks like neither side is strictly compliant with something that's not really a standard anyway.

It would be easier on me (and other Foundry switch users) and in the interest of interoperability with broken ssh implementations if the FreeBSD-specific string could be shortened (to at most 11 chars, which is exactly enough to put "des20020307" in there for example ;-), made user-configurable, or altogether removed.

http://www.snailbook.com/docs/protocol-1.5.txt

  Protocol Version Identification

  After the socket is opened, the server sends an identification
  string, which is of the form "SSH-<protocolmajor>.<protocolminor>-
  <version>\n", where <protocolmajor> and <protocolminor> are integers
  and specify the protocol version number (not software distribution
  version). <version> is server side software version string (max 40
  characters); it is not interpreted by the remote side but may be
  useful for debugging.

Pierre
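The length arithmetic above is easy to check mechanically. The sketch below assumes the addendum is appended to the base version after a single space, as the strings quoted in the report suggest; version_length(), fits() and the two limit macros are hypothetical names, and 24 is the limit observed on the switch in this report, not a documented one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SSH1_SPEC_VERSION_MAX 40 /* "max 40 characters" in the quoted text */
#define FOUNDRY_VERSION_MAX   24 /* limit observed on the switch, per the report */

/* Length of the software version part once an optional addendum is
 * appended after a single space. */
static size_t version_length(const char *base, const char *addendum)
{
    assert(base != NULL);
    if (addendum == NULL || *addendum == '\0')
        return strlen(base);
    return strlen(base) + 1 + strlen(addendum);
}

/* Nonzero if base plus addendum stays within the given limit. */
static int fits(const char *base, const char *addendum, size_t limit)
{
    return version_length(base, addendum) <= limit;
}
```

With base "OpenSSH_3.1", the addendum "FreeBSD localisations 20020318" gives 42 characters (over both limits), "FreeBSD" gives 19, and an 11-character addendum such as "des20020307" gives 23, which is why 11 characters is a safe bound for the switch.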
Re: FreeBSD-localised OpenSSH hangs with Foundry SSH1 server
On Mon, Apr 01, 2002 at 11:32:07PM +0200, Dag-Erling Smorgrav wrote:
> > if the FreeBSD-specific string could be shortened (to at most 11
> > chars, which is exactly enough to put des20020307 in there for
> > example ;-), made user-configurable, or altogether removed.
>
> Look for VersionAddendum in /etc/ssh/sshd_config (it can be used in
> ssh_config as well).

Uh, no, it does not seem to work in ssh_config, only in sshd_config.

Pierre
i386/boot2.c patches
Would anyone mind if I commit the following to i386/boot2.c? I've reviewed and tested them.

add -n option to boot2 to disallow user interruption:
http://www.FreeBSD.org/cgi/query-pr.cgi?pr=36016

boot2 cleanup (modulo style(9) and a minor typo in a comment):
http://www.FreeBSD.org/cgi/query-pr.cgi?pr=36015

-- 
Pierre Beyssac
random woes (no RSA support in libssl and libcrypto)
Just in case someone else gets caught (which is sure to happen): you may get the following obscure message from ssh after updating your -current:

ssh: no RSA support in libssl and libcrypto. See ssl(8).

This just means you need to remake your /dev/urandom (ln -f random urandom).

It seems the compatibility with the previous minor of urandom has been silently removed (I assume this happened with the last update/cleanup of the random device). It took me two hours to figure it out.

-- 
Pierre Beyssac
Re: CURRENT instability
On Mon, Mar 19, 2001 at 08:30:12AM +0100, Pascal Hofstee wrote:
> AMD K6-2 350
>
> I noticed the vague stack smashes posting earlier ... and i think it's
> very likely this is the same bug

Same here, random crashes -- AMD K6-2 300; no panic, no crash dump, just a complete system freeze if you happen to use too much CPU. I had to temporarily revert to an older kernel.

-- 
Pierre Beyssac
Re: CURRENT instability
On Mon, Mar 19, 2001 at 11:19:02PM +0100, Pierre Beyssac wrote:
> Ok, thanks, note that your previous patch works fine, at least my make
> world is still running :-)

Famous last words; I had a freeze soon afterwards. It does seem to have improved the situation quite a bit, though.

Now running another make world with the new patch...

-- 
Pierre Beyssac
can't use vat (and only vat) with newpcm
I'm having problems running vat with newpcm: after opening the device the sound begins for a fraction of a second, then stops.

Maybe this has to do with the fact that vat uses /dev/audio and not /dev/dsp; I've tried to open /dev/dsp instead and change the device format but the result is the same. Or maybe this has to do with the non-blocking opening done by vat (I've changed this too, same result).

But strangely enough, a cat /dev/audio, amp or mpg123 all work ok.

The chip is shown in the probe messages below. The kernel is configured with just "device pcm0". From a ktrace, vat does nothing but write to the device.

pcm0: CS4236B at port 0x534-0x537,0x388-0x38b,0x220-0x22f irq 5 drq 1,0 on isa0
unknown0: Game at port 0x3a0-0x3a7 on isa0
unknown1: Ctrl at port 0xf00-0xf07 on isa0
unknown2: MPU at port 0x330-0x331 on isa0

-- 
Pierre Beyssac
Re: egcs -O breaks ping.c:in_cksum()
On Tue, Nov 16, 1999 at 03:17:43PM +1100, Bruce Evans wrote:
> On Mon, 15 Nov 1999, Pierre Beyssac wrote:
> > -       volatile u_short answer = 0;
> > +       union {
> > +           u_int16_t us;
> > +           u_int8_t  uc[2];
> > +       } answer;
>
> This has indentation bugs.

Uh, which one(s) do you mean exactly? The 4-space indented union (I just followed style(9)) or the double space before uc[2] (it was just to align us and uc vertically)?

> ping.c still assumes that u_short is u_int16_t everywhere else.

But in_cksum() is more or less self-contained. Probably it's more consistent (even within in_cksum, which uses u_short elsewhere) to change back the union to u_short and u_char, though.

> This `answer' variable has nothing to do with the final `answer'
> variable. The latter should not be a union. The original code
> apparently reuses `answer' to do manual register allocation for
> ancient compilers.

Agreed.

> Perhaps the above should be written as:
>
>         sum += ntohs(*(u_char *)w << 8);
>
> to avoid the undefined union access (answer.us).

Uh... I'm not sure I don't prefer the union, actually :-)

-- 
Pierre Beyssac
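The two ways of folding in the odd trailing byte that are being compared here (the union, and ntohs of the byte shifted left by 8) can be checked against each other for every byte value. This is a small standalone sketch using <stdint.h> types rather than the u_int16_t/u_char spellings in the thread; the function names are made up for the comparison.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>

/* Odd trailing byte via a union: the byte goes in the first memory
 * position of the 16-bit word, the second position is zero. */
static uint16_t odd_byte_union(uint8_t b)
{
    union {
        uint16_t us;
        uint8_t uc[2];
    } u;

    assert(sizeof u.us == sizeof u.uc);
    u.uc[0] = b;
    u.uc[1] = 0;
    return u.us;
}

/* Odd trailing byte via ntohs(): build the word as it would look in
 * network order, then convert to host order. */
static uint16_t odd_byte_ntohs(uint8_t b)
{
    return ntohs((uint16_t)(b << 8));
}

/* Count the byte values for which the two forms disagree. */
static int count_mismatches(void)
{
    int b, bad = 0;

    for (b = 0; b < 256; b++) {
        if (odd_byte_union((uint8_t)b) != odd_byte_ntohs((uint8_t)b))
            bad++;
    }
    return bad;
}
```

On both little-endian and big-endian hosts the two forms produce the same 16-bit value, so the choice between them is about style and aliasing rules rather than results.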
Re: egcs -O breaks ping.c:in_cksum()
On Tue, Nov 16, 1999 at 10:58:06AM +0200, Sheldon Hearn wrote:
> The word ``union'' doesn't appear in style(9) and a 1 tab indent is
> used consistently in the examples of structs. Use 1 tab.

Right, I reread style(9) and I apparently misunderstood the following part, which only applies to code (mainly inside a statement):

    Indentation is an 8 character tab. Second level indents are four
    spaces.

-- 
Pierre Beyssac
Re: egcs -O breaks ping.c:in_cksum()
On Mon, Nov 15, 1999 at 05:59:23PM -0800, Kris Kennaway wrote:
> On Tue, 16 Nov 1999, Pierre Beyssac wrote:
> > I've checked, the answer is no: apparently, in_cksum() in
> > routed/rdisc.c is only called in two places, both with an even size.
>
> Can it hurt to pre-emptively fix it anyway in case some future change
> pulls the rug out from underneath?

We could, but since the danger is purely theoretical for now (and probably will stay that way forever), I don't see any advantage in cluttering up the code. Since routed is sometimes sync'ed from external sources, it would only make life harder for the people doing the merges.

Plus, everyone steals in_cksum from ping, not from routed (at least, that's what I do :-)

Since in_cksum is used in several places (there's another optimized copy in libstand), a cleaner solution would be to put it in some library.

-- 
Pierre Beyssac
Re: egcs -O breaks ping.c:in_cksum()
On Tue, Nov 16, 1999 at 07:29:35PM +0100, Poul-Henning Kamp wrote:
> In message [EMAIL PROTECTED], Pierre Beyssac writes:
> > Since in_cksum is used in several places (there's another optimized
>
> Isn't there one in libalias already ?

Right. I missed it because it's called PacketAliasInternetChecksum()...

-- 
Pierre Beyssac
egcs -O breaks ping.c:in_cksum()
I've discovered the following problem, either due to egcs or the source code for in_cksum in ping, I'm not sure.

The symptom is that in_cksum() returns an invalid result on an odd-size buffer when compile optimization is on. You can check this by trying "ping -s 65 localhost" and seeing that no reply is ever received. The kernel ICMP bad checksum count just increases.

The problem is apparently due to the following code fragment:

        register u_short answer = 0;
[...]
        /* mop up an odd byte, if necessary */
        if (nleft == 1) {
                *(u_char *)(&answer) = *(u_char *)w ;
                sum += answer;
        }

Removing the "register" declaration for 'answer' doesn't help. OTOH, adding "volatile" does help and seems to fix the problem. Only I'm not sure that's the correct fix because I'm unsure about the exact semantics of "volatile"; it might well be an egcs bug.

Attached is a test program if anyone wishes to experiment. Try to compile with and without -O and see the difference. The correct output is "cksum=f9f6", the wrong output is "cksum=f5f6".

-- 
Pierre Beyssac

#include <sys/types.h>
#include <stdio.h>

/*
 * in_cksum --
 *      Checksum routine for Internet Protocol family headers (C Version)
 */
u_short
in_cksum(addr, len)
        u_short *addr;
        int len;
{
        register int nleft = len;
        register u_short *w = addr;
        int sum = 0;
        volatile u_short answer = 0;

        /*
         * Our algorithm is simple, using a 32 bit accumulator (sum), we add
         * sequential 16 bit words to it, and at the end, fold back all the
         * carry bits from the top 16 bits into the lower 16 bits.
         */
        while (nleft > 1) {
                sum += *w++;
                nleft -= 2;
        }

        /* mop up an odd byte, if necessary */
        if (nleft == 1) {
                *(u_char *)(&answer) = *(u_char *)w ;
                sum += answer;
        }

        /* add back carry outs from top 16 bits to low 16 bits */
        sum = (sum >> 16) + (sum & 0xffff);     /* add hi 16 to low 16 */
        sum += (sum >> 16);                     /* add carry */
        answer = ~sum;                          /* truncate to 16 bits */
        return(answer);
}

int main()
{
        unsigned char tb[] = { 1, 2, 3, 4, 5 };
        printf("cksum=%04x\n", in_cksum(tb, sizeof tb));
}
Re: egcs -O breaks ping.c:in_cksum()
On Mon, Nov 15, 1999 at 06:52:23PM +0200, Sheldon Hearn wrote:
> On Mon, 15 Nov 1999 17:48:31 +0100, Pierre Beyssac wrote:
> > I've discovered the following problem, either due to egcs or the
> > source code for in_cksum in ping, I'm not sure.
>
> See PR 13292.

Wow, thanks! August 21st, so it's not really new... Maybe I can at least commit the addition of "volatile" to the source code. That will work around that particular bug until egcs is fixed...

That doesn't say how many occurrences of similar code there are in the rest of the system, of course.

-- 
Pierre Beyssac
Re: egcs -O breaks ping.c:in_cksum()
On Mon, Nov 15, 1999 at 01:35:15PM -0500, Garrett Wollman wrote:
> If, rather than casting pointers, the code used a union (containing
> one u_int16_t and one array[2] of u_int8_t), the compiler would have
> enough information to know about the aliases.

You're right, this seems to work even with optimization turned on. If nobody objects, I'll commit it.

--- ck.c.old    Mon Nov 15 19:41:35 1999
+++ ck.c        Mon Nov 15 19:39:43 1999
@@ -13,7 +13,10 @@
         register int nleft = len;
         register u_short *w = addr;
         int sum = 0;
-        volatile u_short answer = 0;
+        union {
+            u_int16_t us;
+            u_int8_t  uc[2];
+        } answer;
 
         /*
          * Our algorithm is simple, using a 32 bit accumulator (sum), we add
@@ -27,15 +30,16 @@
 
         /* mop up an odd byte, if necessary */
         if (nleft == 1) {
-                *(u_char *)(&answer) = *(u_char *)w ;
-                sum += answer;
+                answer.uc[0] = *(u_char *)w ;
+                answer.uc[1] = 0;
+                sum += answer.us;
         }
 
         /* add back carry outs from top 16 bits to low 16 bits */
         sum = (sum >> 16) + (sum & 0xffff);     /* add hi 16 to low 16 */
         sum += (sum >> 16);                     /* add carry */
-        answer = ~sum;                          /* truncate to 16 bits */
-        return(answer);
+        answer.us = ~sum;                       /* truncate to 16 bits */
+        return(answer.us);
 }
 
 int main()

-- 
Pierre Beyssac
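For comparison, the same checksum can be written without any pointer casts or unions by copying each 16-bit word out of the buffer with memcpy(), which sidesteps the aliasing question altogether. This is a sketch of that alternative technique, not the code that was committed; cksum_sketch() is a hypothetical name, and the two-step fold assumes buffers small enough (a few tens of kilobytes, as for IP packets) that the 32-bit accumulator cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum over len bytes, reading words in host memory order
 * as the original routine does. Assumes len is small enough that the
 * 32-bit sum cannot overflow (true for IP-sized packets). */
static uint16_t cksum_sketch(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t sum = 0;
    uint16_t word;

    assert(data != NULL || len == 0);

    /* Add sequential 16-bit words. */
    while (len > 1) {
        memcpy(&word, p, sizeof word);
        sum += word;
        p += 2;
        len -= 2;
    }

    /* Mop up an odd byte: it occupies the first byte of a zero-padded word. */
    if (len == 1) {
        unsigned char last[2];

        last[0] = *p;
        last[1] = 0;
        memcpy(&word, last, sizeof word);
        sum += word;
    }

    /* Fold the carries from the top 16 bits into the low 16 bits. */
    sum = (sum >> 16) + (sum & 0xffff);
    sum += (sum >> 16);
    return (uint16_t)~sum;
}
```

On a little-endian machine, the five-byte buffer { 1, 2, 3, 4, 5 } from the original test program gives 0xf9f6, matching the "correct output" quoted in the report; in memory the result is the byte pair f6 f9 regardless of byte order.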
Re: egcs -O breaks ping.c:in_cksum()
[in_cksum bugs]

Fix committed for ping.

There's another bug in sbin/routed/rdisc.c:in_cksum() on odd packet sizes, although I'm not sure it's ever triggered (does routed ever generate odd-size packets?). It's a portability bug (works only on little-endian machines).

I'll commit the same fix if there's no objection.

-- 
Pierre Beyssac
Re: PATCH for testing
On Mon, Nov 15, 1999 at 02:27:10PM -0800, Matthew Dillon wrote:
> And, also, we need to get rid of the 'e' option to ps entirely. It's a
> major security hole.

Not more so than option 'u', or even 'a', if you ask me. It's common knowledge under Unix that you shouldn't put anything sensitive in the command line or the environment.

When there's any risk, the best option is to remove 'ps' altogether, IMHO.

-- 
Pierre Beyssac
BSD: there's worse out there, but it's listed on the stock exchange
Re: egcs -O breaks ping.c:in_cksum()
[in_cksum bugs]
> There's another bug in sbin/routed/rdisc.c:in_cksum() on odd packet sizes, albeit I'm not sure it's ever triggered (does routed ever generate odd-size packets?).

I've checked, the answer is no: apparently, in_cksum() in routed/rdisc.c is only called in two places, both with an even size.

-- 
Pierre Beyssac [EMAIL PROTECTED] [EMAIL PROTECTED]
BSD: there are worse options, but they are listed on the stock exchange
Free domains: http://www.eu.org/ or mail [EMAIL PROTECTED]
Re: Still waiting for xl driver reports
On Sun, Oct 10, 1999 at 04:59:34PM -0400, Bill Paul wrote:
> A while back I posted a message here saying that I'd changed the xl driver a bit to hopefully improve performance for 3c90xB and later adapters (i.e. the "cyclone," "hurricane" and "tornado" chipsets). I asked for people to report if the changes helped, hurt, made no difference or were totally broken. So far not one person has said so much as a word to me on this subject.

I use the xl device (two cards: xl0 and xl1) in my kernel (compiled on October 4th) and hadn't realized I was using the new driver. I only noticed weird things last Sunday when the interface hung while I was running tcpdump on it (I was logged in remotely, which explains why I saw the problem immediately)... Then there was a "xl1: watchdog timeout" and things went back to normal.

I compiled today's kernel to see if it made any difference, and I can confirm the problem is still there; I can reproduce it easily if you wish. The watchdog timeout seems to happen every time I get out of promiscuous mode, and (very seldom) at other random moments.

> To reiterate, this only concerns people with the following adapters:

Here's what I have:

xl0: 3Com 3c900-COMBO Etherlink XL irq 11 at device 10.0 on pci2
xl0: Ethernet address: 00:10:5a:bf:13:96
xl0: selecting 10baseT transceiver, half duplex
xl1: 3Com 3c905B-TX Fast Etherlink XL irq 11 at device 17.0 on pci0
xl1: Ethernet address: 00:c0:4f:67:0b:82

It's on xl1 that I have problems, although I'm not sure this wouldn't happen on xl0 too (xl0 is configured down at the moment).

Now regarding performance, I haven't made extensive tests. The closest thing to that that I have in mind is to stress that 2*100Mbps trunking link we configured this morning between two of our Ethernet switches :-)

-- 
Pierre Beyssac [EMAIL PROTECTED]
Re: Wierd su/vnc issue
On Mon, Sep 20, 1999 at 09:55:19PM +, Adam Strohl wrote:
> Well, is there a work around, telneting out does the same thing, using the -l user option used to allow me to log into remote hosts, not

The workaround I found was to boot single-user, then "su - mylogin".

-- 
Pierre Beyssac [EMAIL PROTECTED]
Re: DANGER: login and friends with libscrypt/libdescrypt
On Tue, Sep 21, 1999 at 08:29:26AM +0200, Mark Murray wrote:
> > - /usr/bin/login and friends are now linked against libscrypt instead of libcrypt.
>
> This is a link bug. The Makefile says "-lcrypt". JDP?

Then there's the problem that libcrypto.so.3 won't magically be a link to a working libdescrypt.so.3 if the latter doesn't exist, especially if you don't have crypto sources.

Then, the fact that login gets a SIGSEGV in strcmp from inside PAM doesn't look very normal to me either. I suppose there's an error check missing somewhere when libscrypt is called while you use DES passwords.

(gdb) where
#0  0x280d0cf4 in strcmp () from /usr/lib/libc.so.3
#1  0x28115365 in pam_sm_authenticate () from /usr/lib/pam_unix.so
#2  0x280754b9 in pam_getenvlist () from /usr/lib/libpam.so.1
#3  0x2807577d in _pam_dispatch () from /usr/lib/libpam.so.1
#4  0x28074b37 in pam_authenticate () from /usr/lib/libpam.so.1
#5  0x804a88a in setlogin ()
#6  0x8049c3a in setlogin ()
#7  0x804986d in setlogin ()

-- 
Pierre Beyssac [EMAIL PROTECTED]
top and w breakage
It looks like top and w are broken (whereas ps works) with a -current from last night.

top displays weird stuff:

  PID USERNAME PRI NICE  SIZE   RES STATE   TIME   WCPU    CPU COMMAND
-1067947544 375545 185 50663 3078M 0K ? 64??? 134561792.00% 657040 0 root -22 49209 0K 0K ? 82 0:00 0.00% 1455.13%

w works except that the "WHAT" column is apparently always "-".

I have reinstalled libkvm and a kernel in sync with the rest of the system... Have I missed something?

-- 
Pierre Beyssac [EMAIL PROTECTED]
Re: DANGER: login and friends with libscrypt/libdescrypt
On Tue, Sep 21, 1999 at 07:51:18PM +0200, Mark Murray wrote:
> This appears to have been lost. Hmm. I might be the culprit. Fixing now...

Uh, this seems to have been fixed by Peter a moment ago. Now the only thing that I'd like to know is: where do I get the current CVS sources for libdescrypt, so that this doesn't prevent me from logging in next time?

peter       1999/09/21 07:44:28 PDT

  Modified files:
    lib/libcrypt         Makefile
  Log:
  Somebody deleted the SONAME override causing the symlink to be expanded
  at link time and the target name compiled into the binaries. ie:
  everything used libscrypt or libdescrypt explicitly.

  Revision  Changes    Path
  1.21      +5 -1      src/lib/libcrypt/Makefile

peter       1999/09/21 07:47:37 PDT

  Modified files:
    secure/lib/libcrypt  Makefile
  Log:
  Restore SONAME setting, otherwise libdescrypt.so.3 doesn't end up with
  a special SONAME of libcrypt.so.3 and the runtime symlink doesn't work.

  Revision  Changes    Path
  1.19      +5 -1      src/secure/lib/libcrypt/Makefile

-- 
Pierre Beyssac [EMAIL PROTECTED]
Re: DANGER: login and friends with libscrypt/libdescrypt
On Tue, Sep 21, 1999 at 08:09:02PM +0200, Mark Murray wrote:
> Usual places? It is in Internat as well.

Yes, my question was more or less _where_ are the usual places :-) because internat.freebsd.org is apparently down at the moment.

I finally got it from:
ftp://ctm.freebsd.org/pub/FreeBSD/development/CTM-international/int-cvs-cur/

-- 
Pierre Beyssac [EMAIL PROTECTED]
Re: top and w breakage
On Tue, Sep 21, 1999 at 01:30:02PM -0500, Chris Costello wrote:
> You haven't given a lot of information. Was this the result of a full buildworld, installworld and then new kernel?

Sorry I was unclear, yes: a full buildworld+installworld+new kernel (the same installworld that broke login and friends for me). This includes kernel modules of course.

Something more (probably unrelated): when I exit top with ^C, I get a SIGSEGV there:

(gdb) where
#0  0x2808a19d in tputs () from /usr/lib/libncurses.so.5
#1  0x804c16b in free ()
#2  0x804d1f8 in clear ()
#3  0xbfbfdfd4 in ?? ()
#4  0x80493c5 in free ()

Am I cursed (pun intended) or what :-) ?

-- 
Pierre Beyssac [EMAIL PROTECTED]
Re: DANGER: login and friends with libscrypt/libdescrypt
On Tue, Sep 21, 1999 at 08:59:42PM +0200, John Hay wrote:
> Can you explain a bit more what "apparently down" mean please?

I didn't manage to connect to it by FTP when I sent the previous message. Apparently I didn't wait long enough: connectivity between France and this machine is just dreadfully slow. That's why I tried ping, which makes checking easier than FTP, but I didn't know ping is filtered for this machine.

> It seems up to me. I'm logged in on it at the moment and did an

It's up for me too now, but it's extremely sloow. I get FTP timeouts 3 times out of 4.

-- 
Pierre Beyssac [EMAIL PROTECTED]
Re: top and w breakage
On Tue, Sep 21, 1999 at 08:40:07PM +0200, Pierre Beyssac wrote:
> Sorry I was unclear, yes: a full buildworld+installworld+new kernel (the same installworld that broke login and friends for me). This includes kernel modules of course.
>
> Something more (probably unrelated): when I exit top with ^C, I have a SIGSEGV there:

Following up: top seems to work OK for me now (-current from tonight), as well as 'w'. I must have done something wrong somewhere. Sorry for the false alarm...

-- 
Pierre Beyssac [EMAIL PROTECTED] [EMAIL PROTECTED]
{Free,Net,Open}BSD, Linux: there are worse options, but they cost more
Free domains: http://www.eu.org/ or mail [EMAIL PROTECTED]
DANGER: login and friends with libscrypt/libdescrypt
I've just been bitten by the following, so I figured I might as well warn others. From a quick glance it doesn't seem to have been mentioned, or not clearly enough, on this list.

- the libscrypt/libdescrypt major number was bumped a few days ago.
- /usr/bin/login and friends are now linked against libscrypt instead of libcrypt.

The bottom line is, if:

- you don't have crypto sources on your machine
- you were using a symbolic link from libcrypt* to libscrypt*/libdescrypt*
- you used that to link to an old libdes binary

then ***test*** your compiled login binary before you reinstall everything.

Thankfully I kept an older -current on another machine on which I could find a working copy of login, which saved me from totally ruining my night. But I'll be sure to install complete crypto sources first thing tomorrow morning on my machines.

-- 
Pierre Beyssac [EMAIL PROTECTED]
Re: Wierd su/vnc issue
On Mon, Sep 20, 1999 at 09:16:13PM +, Adam Strohl wrote:
> adams@nightfall(21:13:30)$ su
> Password:load: 0.35 cmd: su 407 [ttyin] 0.00u 0.00s 0% 664k
>
> As soon as I hit enter I get more:
> load: 0.32 cmd: su 407 [ttyin] 0.00u 0.00s 0% 664k

I saw that, too. That might be a strange side effect of the libcrypt-libscrypt change (note how it happens with programs which ask for a password).

I just get SIGSEGVs right now when I try su, but I've had the same as you at least once before that.

-- 
Pierre Beyssac [EMAIL PROTECTED]
Re: net.inet.tcp.always_keepalive on as default ?
On Tue, Jun 01, 1999 at 03:02:47PM -0700, Matthew Dillon wrote:
> I think keepalive's could easily be turned on by default. At BEST, one of the first things I did 5 years ago was to turn them on permanently on all of our machines.

I'd like to disagree on the "by default" part, on the following assumptions:

1. If you turn on keepalive by default, you will have to increase the keepalive timeout value well over the current 2 hours (at least that's what most people suggest to alleviate the concerns about having keepalives on).

2. Changing this default value of 2 hours will affect ALL applications. Many of them (and their users) are more or less expecting a 2-hour value. For example that's the case for Telnet: you probably don't want to wait for ONE WEEK before a connection dies if you are currently using keepalives!

I don't see what this fuss is all about. If for _some_ big servers there are many dead connections around after a while (*), why don't THEY use a sysctl at boot time to change the default state, rather than impose on the rest of us a change that might not be as innocuous as it looks?

(*) In theory, for an FTP server, most such connections will occur when the user does a PUT, not a GET. In a GET, the server has data to push and will time out anyway. In the case of the control connection, there's an application timeout in most ftpds, which close the connection after some configurable amount of time.

> This used to be a HUGE argument in the days where networks were less reliable and dialup lines were scarse. It is not an argument now, however.

Go explain that to my cable provider :-). Keeping a long-lived connection through them is a real challenge; keepalives on would make my life even more difficult.

> Whatever we do, we should not start messing around with the internals of the kernel trying to 'fix' a non-problem. Keepalives work just dandy as they are currently implemented, we do not have to mess with it beyond possibly changing the default in rc.conf.

Possibly, but *only* as a last resort if there are good reasons for it, IMHO. But I haven't seen any such reason yet.

I also think that a dynamically configurable keepalive, at least kernel-wide or, better, configurable on a per-socket basis, would be a good thing.

-- 
Pierre Beyssac p...@enst.fr

To Unsubscribe: send mail to majord...@freebsd.org with unsubscribe freebsd-current in the body of the message
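As a point of reference for the per-socket idea: the on/off switch itself can already be set per socket with the standard SO_KEEPALIVE socket option, independently of the kernel-wide default. The sketch below shows only that switch; the helper names set_keepalive and get_keepalive are made up for illustration, and nothing here changes the 2-hour timing discussed in the thread.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Turn keepalive probes on (on != 0) or off (on == 0) for one socket.
 * Returns 0 on success, -1 on error with errno set.
 * "set_keepalive" is a hypothetical helper name.
 */
int
set_keepalive(int fd, int on)
{
	return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/*
 * Report the current per-socket setting: 1 if enabled, 0 if disabled,
 * -1 on error with errno set.
 */
int
get_keepalive(int fd)
{
	int on = 0;
	socklen_t len = sizeof(on);

	if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, &len) == -1)
		return -1;
	return on != 0;
}
```

An application that cares, such as a Telnet or FTP daemon, can call set_keepalive(fd, 1) on its own connections and leave everyone else's default untouched.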
Re: mbuf starvation
On Wed, May 12, 1999 at 01:34:05PM -0700, Matthew Dillon wrote:
> asleep() allows a subroutine deep in the call stack to specify an asynchronous blocking condition and then return a temporary failure up through the ranks. At the top level, the scheduler sees and acts upon the asynchronous blocking condition. Higher level routines do not

So if I get it right, this would give something like the code below. Is that the idea? What's missing in the asleep/await code to use them in such a way?

soxxx()
{
    for (;;) {
        await(mbuf_slp);
        /* code */
        error = xxx(mbuf_slp);
        if (error != ENOBUFS)
            break;
    }
}

m_retry()
{
    /* find an mbuf... */
    if (/* got an mbuf to return */)
        return mbuf;
    else {
        asleep(mbuf_slp);
        return NULL;
    }
}

m_free()
{
    /* Free mbuf... */
    wakeup(mbuf_slp);
}

And, unless I'm missing something, we still need to properly check for NULL return values from m_get and friends.

-- 
Pierre Beyssac p...@enst.fr
Re: mbuf starvation
On Wed, May 12, 1999 at 10:50:42AM -0700, Archie Cobbs wrote:
> >       MCLGET(m, M_WAIT);
> > +     if (m == 0) {
>
> I think this part of the patch is useless. MCLGET() does not set m to NULL when it fails, it simply doesn't set the M_EXT flag.
>
> ...unless things have changed recently.

No, they apparently haven't. You're absolutely right.

-- 
Pierre Beyssac p...@enst.fr
mbuf starvation
I'm trying to plug some of the holes where mbuf shortage is not checked for. In particular, there are the following unchecked calls to MGET and friends in /sys/kern/uipc_socket.c:sosend() (see patches below).

Would anyone mind if I commit these? I won't be able to commit them before next Sunday evening, so if anyone deems them useful, they're welcome to commit before then.

Another big problem is that there's a check in m_retry and friends that panics when falling short of mbufs! I really believe this does more harm than good, because it prevents correct calling code (checking for NULL mbuf pointers) from exiting gracefully with ENOBUFS...

These could most certainly help with 3.2-RELEASE too. Same problem, I won't be able to do anything more before Sunday.

--- uipc_socket.c.orig  Wed May  5 16:48:57 1999
+++ uipc_socket.c       Wed May 12 16:55:27 1999
@@ -497,15 +497,27 @@
             } else do {
                 if (top == 0) {
                     MGETHDR(m, M_WAIT, MT_DATA);
+                    if (m == 0) {
+                        error = ENOBUFS;
+                        goto release;
+                    }
                     mlen = MHLEN;
                     m->m_pkthdr.len = 0;
                     m->m_pkthdr.rcvif = (struct ifnet *)0;
                 } else {
                     MGET(m, M_WAIT, MT_DATA);
+                    if (m == 0) {
+                        error = ENOBUFS;
+                        goto release;
+                    }
                     mlen = MLEN;
                 }
                 if (resid >= MINCLSIZE) {
                     MCLGET(m, M_WAIT);
+                    if (m == 0) {
+                        error = ENOBUFS;
+                        goto release;
+                    }
                     if ((m->m_flags & M_EXT) == 0)
                         goto nopages;
                     mlen = MCLBYTES;
@@ -617,6 +629,8 @@
         flags = 0;
     if (flags & MSG_OOB) {
         m = m_get(M_WAIT, MT_DATA);
+        if (m == 0)
+            return (ENOBUFS);
         error = (*pr->pr_usrreqs->pru_rcvoob)(so, m, flags & MSG_PEEK);
         if (error)
             goto bad;
--- uipc_mbuf.c.orig    Fri Apr 30 12:33:50 1999
+++ uipc_mbuf.c         Wed May 12 17:05:02 1999
@@ -263,10 +263,7 @@
     if (m != NULL) {
         mbstat.m_wait++;
     } else {
-        if (i == M_DONTWAIT)
-            mbstat.m_drops++;
-        else
-            panic("Out of mbuf clusters");
+        mbstat.m_drops++;
     }
     return (m);
 }
@@ -291,10 +288,7 @@
     if (m != NULL) {
         mbstat.m_wait++;
     } else {
-        if (i == M_DONTWAIT)
-            mbstat.m_drops++;
-        else
-            panic("Out of mbuf clusters");
+        mbstat.m_drops++;
     }
     return (m);
 }

-- 
Pierre Beyssac p...@enst.fr
Re: mbuf starvation
On Wed, May 12, 1999 at 05:43:27PM +0200, Stefan Bethke wrote:
> I've discussed this with Garett back in September. The reason is quite simple: unless all cases of not checking for a NULL pointer returned are fixed (or instrumented with a panic), it is better to fail with a panic than with some obscure problem later on.

Yes, I would agree in the general case, but in this particular case the reasoning is flawed: you panic every time, while there are many cases that are currently handled gracefully by the caller. In other words, you don't leave any choice to the caller. That's bad.

IMHO it's not even a good thing in general, because mbuf starvation can and _will_ happen as a normal condition, not because of bugs but because of high resource use.

It can have its uses for debugging purposes, as a compilation option.

-- 
Pierre Beyssac p...@enst.fr
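To make the argument above concrete, here is a small userland sketch, not kernel code: an allocator that returns NULL on exhaustion so that a careful caller can fail with ENOBUFS, with the panic kept only behind a debugging option. Everything in it (the pool, buf_get, send_one, the BUF_PANIC_ON_EXHAUSTION macro) is hypothetical and merely stands in for m_get(), sosend() and friends.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define POOL_SIZE 4	/* hypothetical, deliberately tiny buffer pool */

struct buf {
	int  in_use;
	char data[64];
};

static struct buf pool[POOL_SIZE];
static unsigned long drops;	/* plays the role of mbstat.m_drops */

/* Return a free buffer, or NULL when the pool is exhausted. */
struct buf *
buf_get(void)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			return &pool[i];
		}
	}
#ifdef BUF_PANIC_ON_EXHAUSTION	/* debugging aid only, off by default */
	fprintf(stderr, "panic: out of buffers\n");
	abort();
#endif
	drops++;
	return NULL;
}

/* Give a buffer back to the pool. */
void
buf_free(struct buf *b)
{
	b->in_use = 0;
}

/* A well-behaved caller: checks for NULL and reports ENOBUFS. */
int
send_one(void)
{
	struct buf *b = buf_get();

	if (b == NULL)
		return ENOBUFS;
	/* A real caller would fill and hand off b; this sketch keeps it allocated. */
	return 0;
}
```

With the macro undefined, exhausting the pool simply makes send_one() return ENOBUFS; compiling with -DBUF_PANIC_ON_EXHAUSTION restores the fail-fast behaviour for people hunting callers that forget the NULL check.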
Re: m_get(M_WAIT, ...) _can_ return NULL ?
On Sat, May 08, 1999 at 03:32:33PM +0200, Luigi Rizzo wrote:
>     m = m_get(M_WAIT, ...)
>     m->m_len = something.
>
> looking at the code, it seems that m_get() _can_ return a NULL pointer even if one specifies M_WAIT. Could this be a potential weakness, and in this case, how should we go and fix it -- by making m_get never return if there is no memory, or by hunting all such occurrences of the code ?

It's a well-known problem, it's a potential weakness, and there are several pending PRs related to this (from a quick search on the subject; there are probably several more I didn't find):

    kern/9883
    kern/10872

As Alfred said, this has been discussed several times on the lists, and IIRC the conclusion was that it's not easy to fix the code everywhere. It seemed best not to make m_get() wait, because it can result in unexpected blocking (the classical starvation problem).

I'd be willing to give it a look, but don't expect it to be 100% solved anytime soon.

-- 
Pierre Beyssac p...@enst.fr
Re: Any action on PR 10570 ? getting closer to 65K :-(
On Fri, Apr 30, 1999 at 04:28:55PM +0800, adr...@freebsd.org wrote:
> I'll submit a followup to the pr, and a patch after I've verified it doesn't panic the system, but that will be sometime early next week (I can't setup a BGP connection to flood routes in and out before then..)

Uh, from reading the PR it looks like it can be triggered by creating a little more than 2^16 routes to the same destination, and then deleting some of them to fall back under 2^16. I'm going to give it a try now and I'll send you the script if that works. I'm also making a world with short changed to int to see if it works.

Wouldn't it be sensible to issue a warning (or panic) when increasing the reference count reaches 0, rather than causing a later kernel segfault? It would involve some overhead though, and I'm not sure having 2^32 routes is currently realistic since most machines don't even have that many bytes of RAM, but it might be true one day...

-- 
Pierre Beyssac p...@enst.fr
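The wraparound and the suggested check can be illustrated in a few lines of userland C. This is only a sketch: struct ref16 and ref16_acquire are hypothetical names, and an unsigned short is used in place of the kernel's short so that the wrap is well defined by the C standard.

```c
#include <limits.h>

/* Hypothetical 16-bit reference count, standing in for a short field. */
struct ref16 {
	unsigned short count;
};

/*
 * Take one more reference. Returns 0 on success, or -1 if the counter
 * is already at its maximum and one more increment would wrap it to 0.
 */
int
ref16_acquire(struct ref16 *r)
{
	if (r->count == USHRT_MAX)
		return -1;	/* refuse (or warn) instead of silently wrapping */
	r->count++;
	return 0;
}
```

The cost is one comparison per increment, which is the overhead mentioned above; widening the counter avoids the check altogether.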
Re: Any action on PR 10570 ? getting closer to 65K :-(
On Fri, Apr 30, 1999 at 07:08:18PM +0800, adr...@freebsd.org wrote:
> There isn't a reference to either IFAFREE or ifa_refcnt outside of /usr/src/sys on my tree.

But changing its size changes the offsets of ifa_metric and ifa_rt, which come after it in the structure.

Also, netstat, route and ifconfig, at least, more or less depend on offsets in those structures, though I don't know exactly how (some of it has been rewritten to use ioctl() or the routing socket, I believe).

-- 
Pierre Beyssac p...@enst.fr
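The offset concern can be checked with offsetof() on a pair of toy structures. The layouts below are simplified and hypothetical (they are not the real struct ifaddr, and the field names are only reminiscent of it); the point is just that widening a field moves everything declared after it.

```c
#include <stddef.h>

/* Simplified, hypothetical layout with a 16-bit reference count. */
struct ifaddr_short {
	unsigned short flags;
	short          refcnt;
	int            metric;
};

/* Same layout with the reference count widened to int. */
struct ifaddr_int {
	unsigned short flags;
	int            refcnt;
	int            metric;
};

/* Number of bytes by which "metric" moves when refcnt is widened. */
size_t
metric_shift(void)
{
	return offsetof(struct ifaddr_int, metric) -
	    offsetof(struct ifaddr_short, metric);
}
```

Any userland program compiled against the old layout and reading the structure directly (for example through kernel memory) would then look for the later fields at the wrong offsets until it is rebuilt.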
Re: Any action on PR 10570 ? getting closer to 65K :-(
I'm running a (-CURRENT) system with the short-int patch applied and it seems to run OK so far. Here's a script to exercise the panic; you just need to define default as the IP for your default gateway.

On Fri, Apr 30, 1999 at 07:28:26PM +0800, adr...@freebsd.org wrote:
> I didn't say you shouldn't make world again, I was just pointing out that there shouldn't be a need to modify anything else in userland.

Uh, not directly anyway, but it seems that at least netstat displays the reference count as a signed short. Hence it displays a negative integer if you happen to have more than 2^15 references to the route. I suppose this will have to be given a closer look, as whatever interface it uses to get the count from the kernel might have to be changed.

OTOH, the good news is that the old netstat executable still works as before.

#!/bin/sh
default=xxx.xxx.xxx.xxx
count=65537
start=167772161 # 10.0.0.1
end=`expr $start + $count`

echo Adding $count routes
i=$start
while [ $i -lt $end ]; do
    #route add $i $default
    #i=`expr $i + 1`
done

i=$start
while [ $i -lt $end ]; do
#   echo -n 'Press RETURN to delete one route: '
#   read a
    route delete $i $default
    i=`expr $i + 1`
done

-- 
Pierre Beyssac p...@enst.fr
Re: Any action on PR 10570 ? getting closer to 65K :-(
On Fri, Apr 30, 1999 at 04:08:35PM +0200, Pierre Beyssac wrote:
> and it seems to run ok so far. Here's a script to exercise the panic, you just need to define default as the IP for your default
>
> while [ $i -lt $end ]; do
>     #route add $i $default
>     #i=`expr $i + 1`

Ooops, sorry, you need to uncomment the above 2 lines, too :-)

-- 
Pierre Beyssac p...@enst.fr
Re: Any action on PR 10570 ? getting closer to 65K :-(
On Fri, Apr 30, 1999 at 07:29:14AM -0700, John Polstra wrote:
> It would be pretty hard to create 2^32 routes, given that IPv4 only has 32-bit addresses. :-)

And here comes... IPv6 :-)

> Also, if you time it I suspect you'll find that it would take a geological lifetime on a fast machine to add that many routes.

Some people crack 40-bit DES in no time nowadays, so who knows what to expect...

> I think it makes more sense to increase the size of the reference count as discussed, rather than adding checks that add more complexity and overhead.

I agree. Let's count on an int being extended to 64 bits within the next few decades :-)

-- 
Pierre Beyssac p...@enst.fr
Re: panic: vfs_busy: unexpected lock failure
On Mon, Mar 15, 1999 at 01:24:46PM -0800, Matthew Dillon wrote:
> Compile up a kernel with 'options DDB' and get a backtrace when it panics next ( 'trace' command from DDB prompt ).

OK, here goes. The kernel is compiled without -g for the moment, but I've provided the function offsets in case that helps.

vfs_busy() at vfs_busy+0x6d
lookup()+0x3b9
namei()+0x180
stat()+0x44
syscall()+0x187

I also get what seem to be spurious EPROTONOSUPPORT errors that show up in cp while copying files...

-- 
Pierre Beyssac p...@enst.fr
Re: panic: vfs_busy: unexpected lock failure
On Tue, Mar 16, 1999 at 11:11:44AM -0800, Matthew Dillon wrote:
>         (cnp->cn_flags & NOCROSSMOUNT) == 0) {
>         if (vfs_busy(mp, 0, 0, p))
>             continue;
>         ...
>
> You shouldn't be crossing a mount point. Are you by chance doing a recursive copy onto itself? e.g. cp -rp src dest where dest is mounted under src somewhere ?

No. At first it was from an NFS-mounted volume to another NFS-mounted volume. I then found that it panicked the same way when I copied from a local FFS volume to the same NFS volume.

The NFS volumes are automounted by amd under /a. That may well have something to do with the panic: that's a recent change in my configuration; I previously used NFS mounts in /etc/fstab, which didn't cause me any trouble.

> Of course, it is still a serious kernel bug. I would like to try to reproduce it in order to track it down. How are things mounted on your system ( df ) and what are the *exact* arguments you are using with cp?

Here's the df (I removed some of the amd dummy mount points).

$ df
Filesystem       1K-blocks     Used    Avail Capacity  Mounted on
/dev/wd0s1a          49583    34595    11022    76%    /
/dev/wd1s1e        5975845  3556146  1941632    65%    /home
/dev/wd0s1f         148823     1290   135628     1%    /tmp
/dev/wd0s1g        5380597  1615221  3334929    33%    /usr
/dev/wd0s1e         396895    38127   327017    10%    /var
procfs                   4        4        0   100%    /proc
[ ten pid...@bofh:/xyz lines removed ]
pid...@bofh:/cal         0        0        0   100%    /cal
huuh:/home/huuh    1217519  1064153   141191    88%    /a/huuh/home/huuh

The failing cp is:

$ cp -rp /home/beyssac/src/sendmail-8.9.3/cf/ /home/beyssac/nfs/junk/

In the above, /home/beyssac/nfs is a symbolic link to /cal/huuh/cal/beyssac, which is automounted by amd (last line in the above df).

-- 
Pierre Beyssac p...@enst.fr
Re: panic: vfs_busy: unexpected lock failure
On Tue, Mar 16, 1999 at 12:52:32PM -0800, Matthew Dillon wrote:
> A.. And if you make those AMD mounts normal nfs mounts it doesn't fry? If so, then we have a bug in AMD somewhere.

I tried the cp several times again on a regular NFS mount, to make sure, and no, it doesn't seem to panic.

So yes, that seems to be AMD-related. Couldn't it be in the vfs layer though?

-- 
Pierre Beyssac p...@enst.fr
panic: vfs_busy: unexpected lock failure
Hello,

My FreeBSD box keeps panicking when I try to do a simple cp -rp from a local disk to an NFS-mounted disk. The NFS server is a Solaris 2.5 box; the NFS partition is mounted through amd. The files I try to copy are just sendmail's cf directory (lots of small files) and the panic happens every time I try (with cp -rp; not with piped tars).

The kernel is today's, with NFS compiled in (it's not a module).

I get the following message:

panic: vfs_busy: unexpected lock failure

-- 
Pierre Beyssac p...@enst.fr
Re: UPDATE3: ATA/ATAPI driver new version available.
On Sun, Mar 07, 1999 at 10:49:39PM +0100, Søren Schmidt wrote:
> Third update to the new ATA/ATAPI driver:

I tried your driver (update 2), and it solves all my CDROM problems (it hung after mount). I'm keeping it. Great work, thanks!

ata-pci0: Intel PIIX4 IDE controller rev 0x01 on pci0.7.1
ata0 at 0x01f0 irq 14 on ata-pci0
ata1 at 0x0170 irq 15 on ata-pci0
acd0: CD-ROM CDU701/1.0f CDROM drive at ata1 as master
acd0: drive speed 2412KB/sec, 128KB cache
acd0: supported read types: CD-R, CD-RW, CD-DA, packet track
acd0: Audio: play, 256 volume levels
acd0: Mechanism: ejectable tray
acd0: Medium: CD-ROM 80mm data disc loaded, unlocked

-- 
Pierre Beyssac p...@enst.fr
Re: NTP nanokernel support (experimental)
On Sat, Mar 06, 1999 at 08:00:53PM +0100, Ollivier Robert wrote:
> The two outputs I sent were with 4.0.90f. When I run 4.0.92c, ntpd is not able to get any accurate data from the device whereas 4.0.90f does. I get lots of these in /var/log/messages and it doesn't sync at all.
>
> -=-=-
> Mar 6 14:02:25 tara ntpd[7600]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 3 bits
> Mar 6 14:02:29 tara ntpd[7600]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 2 bits

That's typical of a bad parity setting on your serial port. Try a stty on that port; I bet it will show that PARENB is set. Unset it and things should go back to normal.

> Maybe it is a problem with 4.0.92c...

Yes, it's a problem with most of the ntpd 4.0.9x series. There's absolutely no reason why you should enable PARENB for a raw DCF77 driver; yet that's what ntpd's configure does, at least under FreeBSD.

I sent a bug report to the ntpd team a while ago but haven't heard from them.

-- 
Pierre Beyssac p...@fasterix.frmug.org p...@fasterix.freenix.org
{Free,Net,Open}BSD, Linux: there are worse options, but they cost more
Free domains: http://www.eu.org/ or mail dns-mana...@eu.org
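For anyone who prefers to fix the port from a program rather than with stty, clearing PARENB is a one-line change to the termios settings. The helper below is only a sketch: clear_parity is a made-up name, and it is not taken from ntpd's sources.

```c
#include <termios.h>

/*
 * Clear the PARENB flag (parity generation and checking) in a termios
 * settings structure, leaving every other flag alone.
 * "clear_parity" is a hypothetical helper name.
 */
void
clear_parity(struct termios *t)
{
	t->c_cflag &= ~(tcflag_t)PARENB;
}
```

A typical use would be to read the current settings of the open serial port with tcgetattr(fd, &t), call clear_parity(&t), and write them back with tcsetattr(fd, TCSANOW, &t).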
Re: NTP nanokernel support (experimental)
On Sun, Mar 07, 1999 at 08:09:50PM +0100, Poul-Henning Kamp wrote:
> I don't think Ollivier is doing PPP/hardpps() yet, at least I have not given him the semi-magic code needed for it :-)
>
> I wouldn't recommend trying it either, he is bound to have a 1 msec jitter on the DCF77 waves at his place, and that is a lousy diet for hardpps().

I agree; jitter with a low-cost DCF77 receiver is even more than that (5 to 10 ms). I tend to believe it's partly due to how the AM signal is demodulated and not that much to the location (Paris is not that far from Frankfurt after all).

Maybe a lower jitter could be obtained by averaging 10 or 100 samples; I suppose that's how high-quality receivers work. This might be done in the ntpd driver.

-- 
Pierre Beyssac p...@fasterix.frmug.org p...@fasterix.freenix.org
{Free,Net,Open}BSD, Linux: there are worse options, but they cost more
Free domains: http://www.eu.org/ or mail dns-mana...@eu.org
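The averaging idea is simple enough to sketch. The function below just computes the mean of a batch of offset samples; it is a hypothetical illustration (mean_offset is a made-up name, and this is not how the ntpd driver is actually written). If the jitter on each sample is independent and zero-mean, averaging N samples should shrink its standard deviation by roughly a factor of sqrt(N), so 100 samples at 5 to 10 ms each would land somewhere near 1 ms; whether real DCF77 jitter behaves that way is an assumption.

```c
#include <stddef.h>

/*
 * Mean of n clock-offset samples (for instance in milliseconds).
 * Returns 0.0 when n is 0. "mean_offset" is a hypothetical name.
 */
double
mean_offset(const double *samples, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += samples[i];
	return n ? sum / (double)n : 0.0;
}
```

A median or a trimmed mean would be more robust against the occasional wildly wrong sample, at the cost of sorting each batch.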
Re: IDE CDROM not found with PIIX4 chipset, -current kernel
On Thu, Feb 25, 1999 at 02:09:31PM +1030, Greg Lehey wrote:
> > Wow. Thanks a million! I didn't even have to go so far, I just deleted wd2 and wd3 and acd0 now appears as if by magic. I can't tell you how extremely stupid I now feel...
>
> You shouldn't do. What you did shouldn't have any effect on the problem.

Yet it definitely has. The CDROM never probed before, now it does (at least every time I tried: only about half a dozen reboots for the moment). So there's probably something strange with the probe code and how it interacts with wd[0-3] declarations.

So the CDROM now probes correctly, but OTOH I didn't manage to use it. I get the following message every time:

bofh /kernel: atapi1.0: controller not ready for cmd

The machine then seems to hang, more or less (X11 kindly moves the mouse pointer, but the network is dead).

> FreeBSD still has difficulties with some ATAPI CD-ROM drives. If you continue to have trouble (and I suspect you will), you should enter a PR with send-pr or at http://www.freebsd.org/send-pr.html. Make sure to give exact details of your hardware and software configuration.

I tried to find out why it worked with a Linux kernel by comparing our IDE code and theirs, but it was way beyond my comprehension. I'm not trained for the black magic of IDE probing. I'll try to investigate some more, then I'll try a bug report.

-- 
Pierre Beyssac p...@enst.fr
Re: IDE CDROM not found with PIIX4 chipset, -current kernel
On Thu, Feb 25, 1999 at 04:20:20PM +1100, Bruce Evans wrote:
> doesn't see the slave, and control passes to the atapi probe which does support cdrom drives. It works worst with a cdrom master and no slave.

Sadly, that's exactly the out-of-the-box Dell configuration: 2 disks on the first IDE, just a CDROM on the second IDE.

I'd have changed the hardware configuration to swap the second disk with the CDROM, but I thought it would be much better if it could be made to work, since it apparently can...

-- 
Pierre Beyssac p...@enst.fr
Re: IDE CDROM not found with PIIX4 chipset, -current kernel
On Thu, Feb 25, 1999 at 06:40:35PM +0200, Sheldon Hearn wrote:
> > I'd have changed the hardware configuration to swap the second disk with the CDROM, but I thought it would be much better if it could be made to work, since it apparently can...
>
> Nope, it's almost certainly better to have your random access disks on two different controllers.

Oh yes, I won't try to argue against that since I quite agree. I'll try this once I get tired of trying my current config :-).

I was just considering that the more configurations that work, the better for FreeBSD's ease of installation/configuration. The problem is bound to happen again since this is a configuration (CDROM by itself on a second IDE controller) currently sold by Dell. Don't ask me why, Dell always tends to make weird choices regarding how they install IDE devices and controllers. I suspect this has to do with this weird, differently abled stuff sold by a major company which lost the source code for it, arrogant enough to call it an OS.
-- 
Pierre Beyssac p...@enst.fr
IDE CDROM not found with PIIX4 chipset, -current kernel
I've been having problems with an IDE controller on my motherboard. I can't seem to get it to recognize a CDROM drive (it's the only device plugged into the second IDE controller). The kernel seems to time out on that second controller during the probe phase. Maybe that's one of the infamous PIIX 4 bugs... The machine is a Dell Optiplex.

The first weird thing is that the CDROM drive is found if I plug another IDE drive as slave on the second controller.

The second weird thing is that it's recognized perfectly correctly if I boot a Linux installation disk... So I assume they found a workaround, if that's a PIIX 4 bug.

I've tried to change the device flags (with and without DMA), to no avail. I've followed every recent patch to the IDE code, none of these seems to help (the last kernel I tried is a 4.0-current from Monday).

Here's my dmesg:

ide_pci0: Intel PIIX4 Bus-master IDE controller rev 0x01 on pci0.7.1
...
wdc0 at 0x1f0-0x1f7 irq 14 flags 0xa0ffa0ff on isa
wdc0: unit 0 (wd0): Maxtor 90640D4, DMA, 32-bit, multi-block-16
wd0: 6149MB (12594960 sectors), 12495 cyls, 16 heads, 63 S/T, 512 B/S
wdc0: unit 1 (wd1): Maxtor 90640D4, DMA, 32-bit, multi-block-16
wd1: 6149MB (12594960 sectors), 12495 cyls, 16 heads, 63 S/T, 512 B/S
wdc1 at 0x170-0x177 irq 15 flags 0xa0ffa0ff on isa

... then nothing, it times out after a while and goes on with the boot, no CDROM is ever found.

Here's an excerpt from my kernel config. Did I miss something obvious?

controller wdc0 at isa? port IO_WD1 bio irq 14 flags 0xa0ffa0ff vector wdintr
disk       wd0 at wdc0 drive 0
disk       wd1 at wdc0 drive 1

controller wdc1 at isa? port IO_WD2 bio irq 15 flags 0xa0ffa0ff vector wdintr
disk       wd2 at wdc1 drive 0
disk       wd3 at wdc1 drive 1

options    ATAPI         #Enable ATAPI support for IDE bus
options    ATAPI_STATIC  #Don't do it as an LKM
device     acd0          #IDE CD-ROM
-- 
Pierre Beyssac p...@enst.fr
Re: IDE CDROM not found with PIIX4 chipset, -current kernel
On Thu, Feb 25, 1999 at 01:01:09AM +0200, Sheldon Hearn wrote:
> > controller wdc1 [...]
> > disk       wd2 at wdc1 drive 0
> > disk       wd3 at wdc1 drive 1
>
> Um... Do you really have 4 wd devices plugged in?

Uh, well, no, just 2 (on the first IDE controller).

> Assuming you don't have 4 drives, I'd suggest you slave your ATAPI CDROM device on your primary IDE controller. So you'd do something like:

Wow. Thanks a million! I didn't even have to go so far, I just deleted wd2 and wd3 and acd0 now appears as if by magic. I can't tell you how extremely stupid I now feel...

OTOH, I copied this from the GENERIC kernel config file, assuming it recognizes an ATAPI CDROM when it finds one. So my puzzled question is now: how can the GENERIC kernel work in that case, since it declares wd[0-3]?
-- 
Pierre Beyssac p...@enst.fr
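
For reference, here is a sketch of what the kernel config excerpt from the original post would look like with the wd2 and wd3 lines deleted, as described in the message above. Every line is copied from the posted excerpt; only the column spacing is a guess at the original layout. It is a reconstruction, not a tested configuration.

```
# Sketch only: the excerpt from the original post with wd2/wd3 removed
# from wdc1, as described in this thread. Untested reconstruction.
controller wdc0 at isa? port IO_WD1 bio irq 14 flags 0xa0ffa0ff vector wdintr
disk       wd0 at wdc0 drive 0
disk       wd1 at wdc0 drive 1

# Second IDE controller: no wd disks declared, only the CDROM is attached
controller wdc1 at isa? port IO_WD2 bio irq 15 flags 0xa0ffa0ff vector wdintr

options    ATAPI         #Enable ATAPI support for IDE bus
options    ATAPI_STATIC  #Don't do it as an LKM
device     acd0          #IDE CD-ROM
```

Note that, per the follow-up in this thread, acd0 probes with this change but the drive still failed in use with "atapi1.0: controller not ready for cmd", so this only addresses detection.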