Re: Best way to get a system on current?
On Fri, Oct 12, 2001 at 11:36:30AM -0500, Bob Willcox wrote:

On Fri, Oct 12, 2001 at 09:20:35AM -0700, David Wolfskill wrote:

Might help if you provided a pointer to the problems you had in the "upgrade from -STABLE" case. For that matter, a bit more detail on the "install failed to mount the filesystems for the install" in the "-CURRENT snapshot" case would be of interest, as well.

As for the snapshot install, since its errors were only written to the screen I have to work from memory here as well. I believe the first complaint had to do with the filesystems to be mounted (/mnt/usr, for example) not being specified in fstab. Since all of the mounts to /mnt failed, the system fails pretty soon, apparently running out of space in /.

Ahh, so I'm not the only one that ran into this problem. I thought I'd balked something up myself, so I did some extensive fiddling to try and rectify the problem. I got it working eventually by issuing newfs manually on each of the new partitions, mounting them on their respective /mnt mount points (i.e. /mnt, /mnt/var, /mnt/usr, etc), then symlinking these back to their root mount point equivalents (/var, /var/tmp, /usr, etc).

I actually did all of this while the sysinstall dialog was still up on the first terminal -- once I'd fiddled with all the mount points and selected to try and install the bin distribution again, it worked. Not exactly an elegant solution, unfortunately.

It'd be interesting to hear if anyone else has this problem.

Thanks, Bob

Regards,

Trent.

--
Trent Nelson - Software Engineer - [EMAIL PROTECTED]
A man with unlimited enthusiasm can achieve almost anything. --unknown

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: Some interrupt coalescing tests
On Sat, 13 Oct 2001, Terry Lambert wrote:

Mike Silbersack wrote:

One issue to be careful of here is that the removal of the tcptmpl actually causes a performance hit that wasn't there in the 4.3 code. My original complaint about tcptmpl taking up 256 instead of 60 bytes stands, but I'm more than half convinced that making it take up 60 bytes is OK... or at least is more OK than allocating and deallocating each time, and I don't yet have a better answer to the problem. 4.3 doesn't have this change, but 4.4 does.

I need benchmarks to prove the slowdown, Terry. The testing I performed (which is limited, of course) showed no measurable speed difference. Remember that the only time the tcptempl mbuf ever gets allocated now is when a keepalive is sent, which is a rare event. The rest of the time, it's just copying the data from the preexisting structures over to the new packet. If you can show me that this method is slower, I will move it over to a zone allocated setup like you proposed.

I'm not sure if the number was lower because the celeron couldn't run the flooder as quickly, or if the -current box was dropping packets. I suspect the latter, as the -current box was NOTICEABLY slowed down; I could watch systat refresh the screen.

This is unfortunate; it's an effect that I expected with the -current code, because of the change to the interrupt processing path. To clarify here, the slowdown occurred both with and without the patch, right? The problem here is that when you hit livelock (full CPU utilization), then you are pretty much unable to do anything at all, unless the code path goes all the way to the top of the stack.

Yep, the -current box livelocked with and without the patch. I'm not sure if -current is solely to blame, though. My -current box is using a PNIC, which incurs additional overhead relative to other tulip clones, according to the driver's comments. And the 3com in that box hasn't worked in a while... maybe I should try debugging that so I have an additional test point.

The conclusion? I think that the dc driver does a good enough job of grabbing multiple packets at once, and won't be helped by Terry's patch except in a very few cases.

10% is a good improvement; my gut feeling is that it would have been less than that. This is actually good news for me, since it means that my 30% number is bounded by the user space program not being run (in other words, I should be able to get considerably better performance, using a weighted fair share scheduler). As long as it doesn't damage performance, I think that it's proven itself.

Hm, true, I guess the improvement is respectable. My thought is mostly that I'm not sure how much it's extending the performance range of a system; testing with more varied packet loads as suggested by Alfred would help tell us the answer to this.

In fact, I have a sneaking suspicion that Terry's patch may increase bus traffic slightly. I'm not sure how much of an issue this is, perhaps Bill or Luigi could comment.

This would be interesting to me, as well. I gave Luigi an early copy of the patch to play with a while ago, and also copied Bill. I'm interested in how you think it could increase traffic; the only credible reason I've been able to come up with is the ability to push more packets through, when they would otherwise end up being dropped because of the queue full condition -- if this is the case, the bus traffic is real work, and not additional overhead.

The extra polling of the bus in cases where there are no additional packets to grab is what I was wondering about. I guess in comparison to the quantity of packet data going by, it's not a real issue.

In short, if we're going to try to tackle high interrupt load, it should be done by disabling interrupts and going to polling under high load;

I would agree with this, except that it's only really a useful observation if FreeBSD is being used as purely a network processor. Without interrupts, the polling will take a significant portion of the available CPU to do, and you can't burn that CPU if, for example, you have an SSL card that does your handshakes, but you need to run the SSL sessions themselves up in user space.

Straight polling isn't necessarily the solution I was thinking of, but rather some form of interrupt disabling at high rates. For example, if the driver were to keep track of how many interrupts/second it was taking, perhaps it could up the number of receive buffers from 64 to something higher, then disable the card's interrupt and set a callback to run in a short bit of time, at which point interrupts would be reenabled and the interrupt handler would be run. Ideally, this could reduce the number of interrupts greatly, increasing efficiency under load. Paired with this could be receive polling during transmit, something which does not seem to be done at present, if I'm reading correctly. I'm not sure
Re: Some interrupt coalescing tests
Mike Silbersack wrote:

Hm, true, I guess the improvement is respectable. My thought is mostly that I'm not sure how much it's extending the performance range of a system; testing with more varied packet loads as suggested by Alfred would help tell us the answer to this.

I didn't respond to Alfred's post, and I probably should have; he had some very good comments, including varying the load. My main interest has been in increasing throughput as much as possible; as such, my packet load has been geared towards moving the most data possible. The tests we did were with just connections per second, 1k HTTP transfers, and 10k HTTP transfers. Unfortunately, I can't give you separate numbers without the LRP: we didn't bother, since after the connection rate went from ~7000/second to 23500/second with LRP, it wasn't worth it.

The extra polling of the bus in cases where there are no additional packets to grab is what I was wondering about. I guess in comparison to the quantity of packet data going by, it's not a real issue.

It could be, if you were doing something that was network triggered, relatively low cost, but CPU intensive; on the whole, though, there's very little that isn't going to be network related, these days, and what there is, will end up not taking the overhead, unless you are also doing networking. Maybe it should be a tunable? But these days, everything is pretty much I/O bound, not CPU bound.

The one thing I _would_ add -- though I'm waiting for it to be a problem before doing it -- is to limit the total number of packets processed per interrupt by keeping a running count. You would have to be _AMAZINGLY_ loaded to hit this, though, since it would mean absolutely continuous DMAs. I think it is self-limiting, should that happen, since once you are out of mbufs, you're out. The correct thing to do is probably to let it run out, but keep a separate transmit reserve, so that you can process requests to completion.

I don't know if anyone has tested what happens to apache in a denial of service attack consisting of a huge number of partial GET requests that are incomplete, and so leave state hanging around in the HTTP server...

[ ... polling vs. interrupt load ... ]

Straight polling isn't necessarily the solution I was thinking of, but rather some form of interrupt disabling at high rates. For example, if the driver were to keep track of how many interrupts/second it was taking, perhaps it could up the number of receive buffers from 64 to something higher, then disable the card's interrupt and set a callback to run in a short bit of time, at which point interrupts would be reenabled and the interrupt handler would be run. Ideally, this could reduce the number of interrupts greatly, increasing efficiency under load. Paired with this could be receive polling during transmit, something which does not seem to be done at present, if I'm reading correctly.

I'm not sure of the feasibility of the above, unfortunately - it would seem highly dependent on how short of a timeout we can realistically get along with how many mbufs we can spare for receive buffers.

Yes. Floyd and Druschel recommend using high and low watermarks on the amount of data pending processing in user space. The most common approach is to use a fair share scheduling algorithm, which reserves a certain amount of CPU for user space processing, but this is somewhat wasteful, if there is no work, since it denies quantum to the interrupt processing, potentially wrongly.

-- Terry
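The rate-triggered switch between interrupt-driven and timer-driven receive discussed above can be sketched as a small state machine with high and low watermarks. This is only an illustration: the thresholds, the one-second window and every name in it (rx_mode, rx_mode_note_event, rx_mode_tick) are hypothetical and come from no FreeBSD driver, and it uses the interrupt rate as the trigger rather than the amount of data pending in user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical thresholds, in receive events per second; not from any real driver. */
#define INTR_HIGH_WATER 8000u   /* above this, mask the card interrupt and poll */
#define INTR_LOW_WATER  2000u   /* below this, go back to interrupts */

struct rx_mode {
    bool     polling;   /* true: interrupt masked, a timer callback drains the ring */
    uint32_t events;    /* receive events counted in the current window */
};

/* Count one receive interrupt (or one timer-driven poll that found work). */
static void rx_mode_note_event(struct rx_mode *m)
{
    assert(m != NULL);
    m->events++;
}

/*
 * Run once per window (assumed here to be one second).
 * Returns true when the mode changed, i.e. when the caller would
 * mask or unmask the card's interrupt.
 */
static bool rx_mode_tick(struct rx_mode *m)
{
    assert(m != NULL);
    bool was_polling = m->polling;

    if (!m->polling && m->events > INTR_HIGH_WATER)
        m->polling = true;              /* too many interrupts: start polling */
    else if (m->polling && m->events < INTR_LOW_WATER)
        m->polling = false;             /* load has dropped: interrupts again */

    m->events = 0;                      /* start a new window */
    return m->polling != was_polling;
}
```

The gap between the two watermarks is what keeps the mode from flapping when the load hovers around a single threshold.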
Multiple NFS server problems with Solaris 8 clients
I am using a -current box as a homedir server for my Solaris clients and have noticed a weird problem. When I log in my homedir gets mounted ok, but when I type ls -l it just waits until I ^C it. If I run snoop on Solaris I see a getattr request being sent and an answer being received, but apparently it gets ignored by Solaris. This happens on both Sol x86 and Sparc (both with MU5 installed).

Another problem I see is that rebooting the client causes the server to ignore requests afterwards. I see SYNs sent to the server but no response at all...

One more problem is in nfsd: if I set it to use udp only it starts eating all the cpu cycles it can get, but only the master process. Trussing the process shows no system calls whatsoever being performed.

BTW This is -current built yesterday (oct 13).

Paul

PS Snoop logs or tcpdump logs are available for those who know what to look for...
Re: ACPI panic at boot time in -current
Hi all,

From: Brian Somers [EMAIL PROTECTED]
Date: Fri, 12 Oct 2001 01:15:38 +0100

::Hi,
::
::I was wondering if anybody has any suggestions about why this might
::be happening in -current:

cut

::pccbb1: RF5C478 PCI-CardBus Bridge irq 0 at device 10.1 on pci0
::pccbb1: PCI Memory allocated: 10001000
::acpi_pcib0: possible interrupts: 9
::panic: free: multiple freed item 0xc14f75f0
::Debugger(panic)
::Stopped at Debugger+0x44: pushl %ebx
::db t

I also get the same kind of panic, after import of ACPICA 20010920 snapshot. If I boot very current kernel with old acpi.ko based on 20010831 snapshot, system boots up just fine.

Panic message from current kernel with acpi.ko based on 20010920 snapshot:

-8-8-8-8-8-8-8
uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0xfc60-0xfc7f at device 7.2 on pci0
acpi_pcib0: possible interrupts: 9
panic: free: multiple freed item 0xc13843d0
Debugger(panic)
Stopped at Debugger+0x44: pushl %ebx
db
-8-8-8-8-8-8-8

Related dmesg from the same kernel with old acpi.ko based on 20010831 snapshot:

-8-8-8-8-8-8-8
uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0xfc60-0xfc7f at device 7.2 on pci0
acpi_pcib0: matched entry for 0.7.INTD (source \_SB_.LNKD)
acpi_pcib0: possible interrupts: 9
acpi_pcib0: routed interrupt 9 via \_SB_.LNKD
usb0: Intel 82371AB/EB (PIIX4) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1: Philips Semiconductors hub, class 9/0, rev 1.10/1.10, addr 2
uhub1: 3 ports with 3 removable, self powered
-8-8-8-8-8-8-8

Hope this helps,
Haro

=--
Munehiro (haro) Matsuda
Business Incubation Dept., Kubota Corp.
1-3 Nihonbashi-Muromachi 3-Chome Chuo-ku Tokyo 103-8310, Japan
Tel: +81-3-3245-3318 Fax: +81-3-3245-3315
Email: [EMAIL PROTECTED]
New features for -current
More than a year ago (9.9.2000) I submitted a PR (kern/21154) asking to rename the saver modules from the current *_saver.ko to saver_*.ko, so that names under /boot/kernel are uniform with sound (snd_*), interfaces (if_*), splash (splash_*) and netgraph (ng_*). I tried to figure out where the names are used and found only /etc/rc.i386, where

kldstat -v | grep -q _saver || kldload ${saver}_saver

needs to be changed to:

kldstat -v | grep -q saver_ || kldload saver_${saver}

Is this really so stupid? I think order is important...

Another question: I noticed good support for USB peripherals like scanner, mp3 player (rio), mouse and ethernet, but nothing to use a photo camera. (Yes, I bought an inexpensive USB digital photo camera, an Agfa ePhoto-CL18, and I tried to compile gphoto because it recently added support for the CL-18, without success.) It would be a great idea to add /dev/uphoto and, even better, a sort of photo-file-system, where read is mapped to downloading an image, unlink to deleting one, and maybe creating a file to taking a picture, so we can use ls, cp, rm and touch to access the photo camera...

Riccardo.
Re: [acpi-jp 1343] Re: ACPI panic at boot time in -current
Hi, Intel folks.

I've just found the bug in rsutils.c, which does a double free(): AcpiUtRemoveReference() and ACPI_MEM_FREE(). Here is a fix.

Index: rsutils.c
===
RCS file: /home/ncvs/src/sys/contrib/dev/acpica/rsutils.c,v
retrieving revision 1.1.1.7
diff -u -r1.1.1.7 rsutils.c
--- rsutils.c 4 Oct 2001 23:12:13 - 1.1.1.7
+++ rsutils.c 14 Oct 2001 15:23:13 -
@@ -490,7 +490,6 @@
      */
 Cleanup:
 
-    ACPI_MEM_FREE (ByteStream);
     return_ACPI_STATUS (Status);
 }
 

I suspect that this should have been removed in the ACPICA 20010831-to-20010920 changes.

Matsuda-san, thanks. I missed the original mail in current ML written by brian.

Thanks
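The kind of bug described in this thread, where a buffer is released once through a reference drop and then freed again in a cleanup path, can be shown in isolation. The sketch below is not ACPICA code: refobj, refobj_new, refobj_release, convert and the free_calls counter are all hypothetical names, and the counter exists only so the single release can be checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer; not an ACPICA type. */
struct refobj {
    unsigned       refs;
    unsigned char *data;
};

static unsigned free_calls;   /* instrumentation: how many objects were really freed */

static struct refobj *refobj_new(size_t len)
{
    struct refobj *o = malloc(sizeof(*o));
    if (o == NULL)
        return NULL;
    o->data = calloc(len ? len : 1, 1);
    if (o->data == NULL) {
        free(o);
        return NULL;
    }
    o->refs = 1;
    return o;
}

/* Drop one reference; the memory is freed when the last one goes away. */
static void refobj_release(struct refobj *o)
{
    assert(o != NULL && o->refs > 0);
    if (--o->refs == 0) {
        free(o->data);
        free(o);
        free_calls++;
    }
}

/* A routine with a single cleanup path, shaped like the fixed code. */
static int convert(size_t len)
{
    int status = 0;
    struct refobj *o = refobj_new(len);

    if (o == NULL)
        return -1;
    o->data[0] = 1;         /* stand-in for the real work */
    refobj_release(o);      /* the one and only release */
    /*
     * Cleanup: nothing left to free here.  An extra free(o->data) at
     * this point would be the "multiple freed item" from the panic.
     */
    return status;
}
```

The point of the pattern is that ownership is handed to the reference count, so the cleanup label must not also free the buffer directly.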
Re: New features for -current
Riccardo Torrini [EMAIL PROTECTED] wrote:

Would be a great idea add /dev/uphoto and even better a sort of photo-file-system, where read is mapped to download image, unlink to delete and maybe create file to take a picture so we can use ls, cp, rm and touch to access photo camera...

Yes, great idea, Riccardo -- please do it. :-)

However, there is no standard for accessing digital photo cameras via USB. Recently, some of them seem to comply with the mass storage protocol (BSD's umass driver), but the majority of them use proprietary protocols. Even a single vendor uses different protocols for different cameras of its own. So, basically you would have to write a separate kernel driver for every camera. This isn't feasible.

It is probably much better to handle these issues in userland code. As an example, you could have a look at the oPhoto tool, which handles the Kodak DC240, DC280 and DC3400 under FreeBSD (and possibly also others, but _not_ the Kodak DC220, DC260 and DC265). These are all USB photo cameras. The tool is written in userland code and uses the generic ugen driver to access the camera, which works pretty well.

If you absolutely want to access the images like a real filesystem (I don't think this would have any real advantage), you could wrap an NFS userland server around the code. Bloating the kernel with such stuff is a bad idea, IMO.

Regards
Oliver

PS: oPhoto: http://www.fromme.com/ophoto/

--
Oliver Fromme, secnetix GmbH Co KG, Oettingenstr. 2, 80538 München
Any opinions expressed in this message may be personal to the author and may not necessarily reflect the opinions of secnetix in any way.
All that we see or seem is just a dream within a dream (E. A. Poe)
gunzip returned -1 when installing
I tried to install the latest 5-current via ftp. However, when sysinstall fetches the bin distribution, the following dialog is shown (sorry, I forgot to copy a screenshot):

User Confirmation Requested
Unable to transfer the bin distribution from ...
Do you want to try to retrieve it again ?

I switched to VTY2 to see what messages are there. The last 5 lines are:

/stand/cpio: root/.profile linked to .profile
.profile
DEBUG: wait for gunzip returned status of -1!
COPYRIGHT
12888 blocks

Hmm, sysinstall says something goes wrong with gunzip. Can anybody confirm this behaviour, or is this my local error?

--
- Makoto `MAR' MATSUSHITA
Re: Multiple NFS server problems with Solaris 8 clients
Paul van der Zwan wrote:

If I run snoop on Solaris I see a getattr request being sent and an answer being received but apparently it gets ignored by Solaris. This happens on both Sol x86 and Sparc (both with MU5 installed)

Please do a tcpdump, and examine it; I suspect you will find that your problem is that the IP address it was sent to is not the same as the IP address it was replied from. In general, this is because the code doesn't explicitly use recvfrom/sendto semantics, and just takes the route. This will most often occur when you mount it using an IP alias, but the primary (non-alias) IP address is on the same subnet as the alias. It can also occur if you are using two address sets on the same wire, and do not use an intervening router.

Another problem I see is that rebooting the client causes the server to ignore requests afterwards. I see SYNs sent to the server but no response at all...

Again, you will need to tcpdump it. One prospect is for the ARP table to be different on the "who has" after the reboot. I've noticed that a ping socket gets a route, and even after an ICMP redirect, I still get a bunch of redirects, since FreeBSD does not update the route table for already created clones (this is a bug in FreeBSD's routing code).

Another possibility is that the reboot reset the sequence number; a common thing is to ensure that the random sequence number used is later than the one that was used last for the same IP/port pairs. The client will most likely reuse the same numbers, or lower numbers, even if it is RFC compliant as to non-guessable sequence numbers (you will see this on the tcpdump). FreeBSD will not guarantee increasing sequence numbers -- and will thus ignore the packets -- unless you enable the sysctl to disable the pure random sequence number hack. Look for it via the command sysctl -A | grep -i seq.

NB: FreeBSD also does not reset connections in TIME_WAIT, if it gets packets from the same IP/port on the client while the server is in TIME_WAIT because the connections are dead. This is a common hack (NT does this by default, and so does Solaris), but it opens you up for connection force-down attacks for active connections, if your network is improperly firewalled.

One more problem is in nfsd, if I set it to use udp only it starts eating all cpu cycles it can get, but only the master process. Trussing the process shows no system calls whatsoever being performed.

The I/O daemons make a system call and never return to user space. To track down this problem, truss is of no use: you must use DDB in the kernel (or remote kernel debugging, if you have two systems available: see the FreeBSD Developer's Handbook), and find out what it's doing in the kernel when this happens... I suspect that you are having one of the problems above, and are being packet-flooded by the clients, when they get no response, or at least none they like, from the server.

BTW This is -current built yesterday (oct 13).

You may also want to try 4.3 or 4.4 instead.

PS Snoop logs or tcpdump logs are available for those who know what to look for...

I'll look at them if they are up on a web site, but not if you mail them, so _DON'T_ mail them to me!

-- Terry
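The "later than the one that was used last" test mentioned above has to cope with 32-bit sequence numbers wrapping around. A minimal sketch of a wraparound-safe comparison follows; it is the same idiom as the SEQ_GT-style macros in BSD's netinet/tcp_seq.h, but seq_after and isn_acceptable are hypothetical names, and the acceptance policy is an illustration rather than what the FreeBSD stack actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when sequence number a is "after" b in modulo-2^32 space.
 * The subtraction wraps as unsigned; the cast to a signed 32-bit value
 * assumes the usual two's complement conversion.
 */
static bool seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}

/*
 * Hypothetical policy: accept a new initial sequence number for a reused
 * address/port pair only if it is later than the last one seen.
 */
static bool isn_acceptable(uint32_t last_seen, uint32_t new_isn)
{
    assert(last_seen != new_isn || !seq_after(new_isn, last_seen));
    return seq_after(new_isn, last_seen);
}
```

A client that comes back after a reboot with a lower (or purely random) initial sequence number fails this test, which matches the "server ignores the SYNs" symptom described in the message.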
Re: Multiple NFS server problems with Solaris 8 clients
Actually, I've also noticed problems in FreeBSD-current - ls and reads work, but things like mkdir hang. Here's the tcpdump output:

Script started on Sun Oct 14 12:21:50 2001
quarm.feral.com root tcpdump -vv -i fxp0 host antares
tcpdump: listening on fxp0
12:21:58.498568 antares.1294025654 > quarm.nfs: 116 getattr [|nfs] (DF) (ttl 64, id 2722, len 156)
12:21:58.498746 quarm.nfs > antares.1294025654: reply ok 116 getattr [|nfs] (DF) (ttl 64, id 29331, len 156)
12:21:58.501021 antares.1294025655 > quarm.nfs: 116 getattr [|nfs] (DF) (ttl 64, id 2723, len 156)
12:21:58.501184 quarm.nfs > antares.1294025655: reply ok 116 getattr [|nfs] (DF) (ttl 64, id 29332, len 156)
12:21:58.501657 antares.1294025656 > quarm.nfs: 116 getattr [|nfs] (DF) (ttl 64, id 2724, len 156)
12:21:58.501707 quarm.nfs > antares.1294025656: reply ok 116 getattr [|nfs] (DF) (ttl 64, id 29333, len 156)
12:21:58.502062 antares.1294025657 > quarm.nfs: 116 getattr [|nfs] (DF) (ttl 64, id 2725, len 156)
12:21:58.502117 quarm.nfs > antares.1294025657: reply ok 116 getattr [|nfs] (DF) (ttl 64, id 29334, len 156)
12:21:58.502475 antares.1294025658 > quarm.nfs: 116 getattr [|nfs] (DF) (ttl 64, id 2726, len 156)
12:21:58.502519 quarm.nfs > antares.1294025658: reply ok 116 getattr [|nfs] (DF) (ttl 64, id 29335, len 156)
12:21:58.598618 antares.1018 > quarm.nfsd: . [tcp sum ok] 437975440:437975440(0) ack 4039870942 win 24820 (DF) (ttl 64, id 2727, len 40)

- OKAY - that was the ls that worked

12:22:10.893273 antares.1294025660 > quarm.nfs: 116 getattr [|nfs] (DF) (ttl 64, id 2728, len 156)
12:22:10.893409 quarm.nfs > antares.1294025660: reply ok 116 getattr [|nfs] (DF) (ttl 64, id 29367, len 156)
12:22:10.893740 antares.1294025661 > quarm.nfs: 120 getattr [|nfs] (DF) (ttl 64, id 2729, len 160)
12:22:10.992986 quarm.nfsd > antares.1018: . [tcp sum ok] 117:117(0) ack 236 win 62459 (DF) (ttl 64, id 29368, len 40)

- that was the mkdir (that hung)

^C
218 packets received by filter
0 packets dropped by kernel
Re: Multiple NFS server problems with Solaris 8 clients
Hi,

One more problem is in nfsd, if I set it to use udp only it starts eating all cpu cycles it can get, but only the master process. Trussing the process shows no system calls whatsoever being performed.

The last one is a known problem. There is an (unfinished) patch available to solve this problem. Thomas Moestl [EMAIL PROTECTED] is still working on some issues of the patch. Please contact him if you would like to know more. Here is the URL for the patch:

http://home.teleport.ch/freebsd/userland/nfsd-loop.diff

Martin

Martin Blapp, [EMAIL PROTECTED]
--
Improware AG, UNIX solution and service provider
Zurlindenstrasse 29, 4133 Pratteln, Switzerland
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP Fingerprint: 57E 7CCD 2769 E7AC C5FA DF2C 19C6 DCD1 1B3A EC9C
--
Re: Multiple NFS server problems with Solaris 8 clients
The last one is a known problem. There is an (unfinished) patch available to solve this problem. Thomas Moestl [EMAIL PROTECTED] is still working on some issues of the patch. Please contact him if you would like to know more. Here is the URL for the patch:

http://home.teleport.ch/freebsd/userland/nfsd-loop.diff

That patch is a bit out of date, because Peter has since removed a big chunk of kerberos code from nfsd. I was actually just looking at this problem again, so I include an updated version of Thomas's patch below. This version also removes entries from the children[] array when a slave nfsd dies, to avoid the possibility of accidentally killing unrelated processes.

The issue that remains open with the patch is that currently if a slave nfsd dies, then all nfsds will shut down. This is because nfssvc() in the master nfsd returns 0 when the master nfsd receives a SIGCHLD. This behaviour is probably reasonable enough, but the way it happens is a bit odd.

Thomas, I'll probably commit this within the next few days if you have no objections, and if you don't get there before me. The exiting behaviour can be resolved later if necessary.
Ian

Index: nfsd.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/nfsd/nfsd.c,v
retrieving revision 1.21
diff -u -r1.21 nfsd.c
--- nfsd.c 20 Sep 2001 02:18:06 - 1.21
+++ nfsd.c 14 Oct 2001 20:19:18 -
@@ -52,6 +52,8 @@
 #include <sys/syslog.h>
 #include <sys/wait.h>
 #include <sys/mount.h>
+#include <sys/linker.h>
+#include <sys/module.h>
 
 #include <rpc/rpc.h>
 #include <rpc/pmap_clnt.h>
@@ -64,6 +66,7 @@
 
 #include <err.h>
 #include <errno.h>
+#include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <strings.h>
@@ -86,12 +89,16 @@
 int	nfsdcnt;		/* number of children */
 
 void	cleanup(int);
+void	child_cleanup(int);
 void	killchildren(void);
-void	nonfs (int);
-void	reapchild (int);
-int	setbindhost (struct addrinfo **ia, const char *bindhost, struct addrinfo hints);
-void	unregistration (void);
-void	usage (void);
+void	nfsd_exit(int);
+void	nonfs(int);
+void	reapchild(int);
+int	setbindhost(struct addrinfo **ia, const char *bindhost,
+	    struct addrinfo hints);
+void	start_server(int);
+void	unregistration(void);
+void	usage(void);
 
 /*
  * Nfs server daemon mostly just a user context for nfssvc()
@@ -126,13 +133,12 @@
 	fd_set ready, sockbits;
 	fd_set v4bits, v6bits;
 	int ch, connect_type_cnt, i, len, maxsock, msgsock;
-	int nfssvc_flag, on = 1, unregister, reregister, sock;
+	int on = 1, unregister, reregister, sock;
 	int tcp6sock, ip6flag, tcpflag, tcpsock;
-	int udpflag, ecode, s;
-	int bindhostc = 0, bindanyflag, rpcbreg, rpcbregcnt;
+	int udpflag, ecode, s, srvcnt;
+	int bindhostc, bindanyflag, rpcbreg, rpcbregcnt;
 	char **bindhost = NULL;
 	pid_t pid;
-	int error;
 
 	if (modfind("nfsserver") < 0) {
 		/* Not present in kernel, try loading it */
@@ -141,8 +147,8 @@
 	}
 
 	nfsdcnt = DEFNFSDCNT;
-	unregister = reregister = tcpflag = 0;
-	bindanyflag = udpflag;
+	unregister = reregister = tcpflag = maxsock = 0;
+	bindanyflag = udpflag = connect_type_cnt = bindhostc = 0;
 #define	GETOPT	"ah:n:rdtu"
 #define	USAGE	"[-ardtu] [-n num_servers] [-h bindip]"
 	while ((ch = getopt(argc, argv, GETOPT)) != -1)
@@ -313,8 +319,6 @@
 		daemon(0, 0);
 		(void)signal(SIGHUP, SIG_IGN);
 		(void)signal(SIGINT, SIG_IGN);
-		(void)signal(SIGSYS, nonfs);
-		(void)signal(SIGUSR1, cleanup);
 		/*
 		 * nfsd sits in the kernel most of the time. It needs
 		 * to ignore SIGTERM/SIGQUIT in order to stay alive as long
@@ -324,40 +328,31 @@
 		(void)signal(SIGTERM, SIG_IGN);
 		(void)signal(SIGQUIT, SIG_IGN);
 	}
+	(void)signal(SIGSYS, nonfs);
 	(void)signal(SIGCHLD, reapchild);
 
-	openlog("nfsd:", LOG_PID, LOG_DAEMON);
+	openlog("nfsd", LOG_PID, LOG_DAEMON);
 
-	for (i = 0; i < nfsdcnt; i++) {
+	/* If we use UDP only, we start the last server below. */
+	srvcnt = tcpflag ? nfsdcnt : nfsdcnt - 1;
+	for (i = 0; i < srvcnt; i++) {
 		switch ((pid = fork())) {
 		case -1:
 			syslog(LOG_ERR, "fork: %m");
-			killchildren();
-			exit (1);
+			nfsd_exit(1);
 		case 0:
 			break;
 		default:
 			children[i] = pid;
 			continue;
 		}
-
+		(void)signal(SIGUSR1, child_cleanup);
 		setproctitle("server");
-		nfssvc_flag = NFSSVC_NFSD;
-		nsd.nsd_nfsd = NULL;
-		while (nfssvc(nfssvc_flag,
Re: ACPI panic at boot time in -current
From: Mitsuru IWASAKI [EMAIL PROTECTED]
Date: Mon, 15 Oct 2001 00:46:57 +0900 (JST)

::Hi, Intel folks. I've just found the bug in rsutils.c which double
::free(); AcpiUtRemoveReference() and ACPI_MEM_FREE(). Here is a fix.
::
::Index: rsutils.c
::===
::RCS file: /home/ncvs/src/sys/contrib/dev/acpica/rsutils.c,v
::retrieving revision 1.1.1.7
::diff -u -r1.1.1.7 rsutils.c
::--- rsutils.c 4 Oct 2001 23:12:13 - 1.1.1.7
::+++ rsutils.c 14 Oct 2001 15:23:13 -
::@@ -490,7 +490,6 @@
:: */
:: Cleanup:
::
::-ACPI_MEM_FREE (ByteStream);
:: return_ACPI_STATUS (Status);
:: }
::
::
::I suspect that this should be removed in ACPICA 20010831-to-20010920
::changes.
::
::Matsuda-san, thanks. I missed the original mail in current ML written
::by brian.
::
::Thanks

Hi Iwasaki-san,

That fixed my panic problem. Thanks for the patch.

Haro

=--
Munehiro (haro) Matsuda
Business Incubation Dept., Kubota Corp.
1-3 Nihonbashi-Muromachi 3-Chome Chuo-ku Tokyo 103-8310, Japan
Tel: +81-3-3245-3318 Fax: +81-3-3245-3315
Email: [EMAIL PROTECTED]
Re: Multiple NFS server problems with Solaris 8 clients
On Sun, 2001/10/14 at 21:38:26 +0100, Ian Dowse wrote:

The last one is a known problem. There is an (unfinished) patch available to solve this problem. Thomas Moestl [EMAIL PROTECTED] is still working on some issues of the patch. Please contact him if you would like to know more. Here is the URL for the patch:

http://home.teleport.ch/freebsd/userland/nfsd-loop.diff

That patch is a bit out of date, because Peter has since removed a big chunk of kerberos code from nfsd. I was actually just looking at this problem again, so I include an updated version of Thomas's patch below. This version also removes entries from the children[] array when a slave nfsd dies, to avoid the possibility of accidentally killing unrelated processes.

The issue that remains open with the patch is that currently if a slave nfsd dies, then all nfsds will shut down. This is because nfssvc() in the master nfsd returns 0 when the master nfsd receives a SIGCHLD. This behaviour is probably reasonable enough, but the way it happens is a bit odd.

Thomas, I'll probably commit this within the next few days if you have no objections, and if you don't get there before me. The exiting behaviour can be resolved later if necessary.

Thanks! I've been meaning to update and commit this patch for quite some time, but was rather focused on sparc64 development recently when I had time. I also wanted to resolve this exiting behaviour first, but I agree that it is probably not a real issue.

- thomas
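The children[] bookkeeping described in the quoted text, where a reaped slave's pid is removed so that a later kill loop cannot hit an unrelated process that has been given the same pid, might look roughly like this. It is a sketch under assumptions and not the code from the patch: MAXNFSDCNT's value, the use of 0 as the "unused slot" marker, and the helper names forget_child, reap_children and live_children are all hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>

#define MAXNFSDCNT 20                 /* hypothetical table size */

static pid_t children[MAXNFSDCNT];    /* 0 marks an unused slot (assumption) */

/* Remove pid from the table; returns 1 if it was present, 0 otherwise. */
static int forget_child(pid_t pid)
{
    assert(pid > 0);
    for (int i = 0; i < MAXNFSDCNT; i++) {
        if (children[i] == pid) {
            children[i] = 0;
            return 1;
        }
    }
    return 0;
}

/* Collect every child that has already exited and forget its pid. */
static void reap_children(void)
{
    pid_t pid;
    int status;

    while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
        (void)forget_child(pid);
}

/* Number of pids a kill loop over children[] would still signal. */
static int live_children(void)
{
    int n = 0;

    for (int i = 0; i < MAXNFSDCNT; i++)
        if (children[i] > 0)
            n++;
    return n;
}
```

With stale entries cleared this way, a loop that sends a signal to every non-zero slot only ever reaches processes that are still the daemon's own children.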
Re: KSE settling in (smbfs broken) again
On Fri, 5 Oct 2001, Sheldon Hearn wrote:

I need to look at it again.. (I figured I just didn't have the time to try to understand it all AND do the rest of the kernel.) Of course the best would be if Mr. Popov did the conversion, but I believe he's incredibly busy at the moment.. Certainly if someone else wants to make an effort at it, they are welcome to do it.. otherwise I will eventually get to it (but I have no way to test them).

Boris goes through phases, like the rest of us. :-)

Yes, this is correct. Doing hardware stuff consumes a lot of my time :(

His last round of changes from Mac OS X sorted out my panics quite nicely.

Good to hear. I've selected the most critical bugfixes, and there are still big diffs to merge.

It sounds like the message is "I'd like to help with smbfs, but don't have time right now, and it'd make a whole lot more sense for someone closer to the code to take a look."

Obviously, I'll do the job at some point. If someone can do it before then, feel free. This may even include an import of the userland part into the /contrib hierarchy.

--
Boris Popov
http://rbp.euro.ru
Re: Why do soft interrupt coalescing?
On Thu, Oct 11, 2001 at 01:02:09 -0700, Terry Lambert wrote: Kenneth D. Merry wrote:

If the receive ring for that packet size is full, it will hold off on DMAs. If all receive rings are full, there's no reason to send more interrupts.

I think that this does nothing, in the FreeBSD case, since the data from the card will generally be drained much faster than it accrues, into the input queue. Whether it gets processed out of there before you run out of mbufs is another matter. [ ... ]

Anyway, if all three rings fill up, then yes, there won't be a reason to send receive interrupts.

I think this can't really happen, since interrupt processing has the highest priority, compared to stack processing or application level processing. 8-(.

Yep, it doesn't happen very often in the default case.

OK, assuming you meant that the copies would stall, and the data not be copied (which is technically the right thing to do, assuming a source quench style livelock avoidance, which doesn't currently exist)...

The data isn't copied, it's DMAed from the card to host memory. The card will save incoming packets to a point, but once it runs out of memory to store them it starts dropping packets altogether.

I think that the DMA will not be stalled, at least as the driver currently exists; you and I agreed on that already (see below). My concern in this case is that, if the card is using the bus to copy packets from card memory to the receive ring, then the bus isn't available for other work, which is bad. It's better to drop the packets before putting them in card memory (FIFO drop fails to avoid the case where a continuous attack pushes all good packets out).

Dropping packets before they get into card memory would only be possible with some sort of traffic shaper/dropping mechanism on the wire to drop things before they get to the card at all.
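The behaviour discussed above (the card buffers packets in its own memory, holds off on DMA while the host receive ring is full, and drops new arrivals once card memory is exhausted) can be modelled with a toy simulation. This is only a sketch under stated assumptions: the class name CardModel, a single ring instead of three, and the slot counts are all hypothetical, and real hardware accounts for bytes rather than whole packets.

```python
from collections import deque


class CardModel:
    """Toy model: on-card packet memory feeding one host receive ring via DMA."""

    def __init__(self, card_slots, ring_slots):
        self.card = deque()           # packets held in card memory
        self.card_slots = card_slots  # assumed capacity of card memory
        self.ring = deque()           # packets already DMAed to the host ring
        self.ring_slots = ring_slots  # assumed number of ring entries
        self.dropped = 0

    def dma(self):
        """Move packets from card memory to the ring while the ring has room."""
        while self.card and len(self.ring) < self.ring_slots:
            self.ring.append(self.card.popleft())

    def packet_arrives(self, pkt):
        """Store pkt on the card, or drop it if card memory is exhausted."""
        self.dma()
        if len(self.card) >= self.card_slots:
            self.dropped += 1
            return False
        self.card.append(pkt)
        self.dma()
        return True

    def host_drains(self, n):
        """Host takes up to n packets off the ring, freeing slots for DMA."""
        taken = [self.ring.popleft() for _ in range(min(n, len(self.ring)))]
        self.dma()
        return taken
```

With a 2-slot card and a 3-slot ring, seven back-to-back arrivals with no host draining fill the ring, then the card, and the last two packets are dropped; once the host drains the ring, DMA resumes and new arrivals are accepted again.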
The problem is still that you end up doing interrupt processing until you run out of mbufs, and then you have the problem of not being able to transmit responses, for lack of mbufs.

In theory you would have configured your system with enough mbufs to handle the situation, and the slowness of the system would cause the windows on the sender to fill up, so they'll stop sending data until the receiver starts responding again. That's the whole purpose of backoff and slow start -- to find a happy medium for the transmitter and receiver so that data flows at a constant rate.

In practice, mbuf memory is just as overcommitted as all other memory, and given a connection count target, you are talking about a full transmit and full receive window worth of data at 16k a pop -- 32k per connection. Even a modest maximum connection count of ~30,000 connections -- something even an unpatched 4.3 FreeBSD could handle -- means that you need 1G of RAM for the connections alone, if you disallow overcommit. In practice, that would mean ~20,000 connections, when you count page table entries, open file table entries, vnodes, inpcb's, tcpcb's, etc. And that's a generous estimate, which assumes that you tweak your kernel properly.

You could always just put 4G of RAM in the machine, since memory is so cheap now. :) At some point you'll hit a limit in the number of connections the processor can actually handle.

One approach to this is to control the window sizes based on the amount of free reserve you have available, but this will actually damage overall throughput, particularly on links with a higher latency.

Yep.

In the ti driver case, the inability to get another mbuf to replace the one that will be taken out of the ring means that the mbuf gets reused for more data -- NOT that the data flow in the form of DMA from the card ends up being halted until mbufs become available.

True.
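The "32k per connection" estimate above can be checked with a few lines of arithmetic. The sketch assumes that "16k" means 16 KiB per window and that every connection holds one full transmit window and one full receive window with no overcommit; the function names are hypothetical.

```python
WINDOW_BYTES = 16 * 1024            # one full window ("16k a pop"), assumed KiB
PER_CONNECTION = 2 * WINDOW_BYTES   # full transmit window + full receive window
GIB = 1024 ** 3


def buffer_bytes(connections, per_connection=PER_CONNECTION):
    """Socket buffer memory needed if nothing is overcommitted."""
    return connections * per_connection


def max_connections(ram_bytes, per_connection=PER_CONNECTION):
    """How many connections' worth of full windows fit in ram_bytes."""
    return ram_bytes // per_connection


total = buffer_bytes(30_000)
print(f"30,000 connections need {total / GIB:.2f} GiB of buffer space")
print(f"1 GiB holds buffers for {max_connections(GIB)} connections")
print(f"4 GiB holds buffers for {max_connections(4 * GIB)} connections")
```

Under these assumptions 30,000 connections need roughly 0.92 GiB, which matches the "1G of RAM for the connections alone" figure; the ~20,000 figure in the thread additionally accounts for per-connection kernel structures that this sketch does not model.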
This is actually very bad: you want to drop packets before you insert them into the queue, rather than after they are in the queue. This is because you want the probability of the drop (assuming the queue is not maxed out: otherwise, the probability should be 100%) to be proportional to the exponential moving average of the queue depth, after that depth exceeds a drop threshold. In other words, you want to use RED.

Which queue? The packets are dropped before they get to ether_input(). Dropping random packets would be difficult.

Please look at what happens in the case of an allocation failure, for any driver that does not allow shrinking the ring of receive mbufs (the ti is one example).

It doesn't spam things, which is what you were suggesting before, but as you pointed out, it will effectively drop packets if it can't get new mbufs.

Maybe I'm being harsh in calling it spam'ming. It does the wrong thing, by dropping the oldest unprocessed packets first. A FIFO drop is absolutely the wrong thing to do in an attack or overload case, when you want to shed load. I consider