Re: Realtek 8139 + Acer Laptop
> Dmesg says nothing. According to WinXP system information, the card is
> a RealTek RTL 8139/810x Family Fast Ethernet NIC at bus PCI 0 : 7 : 0.
> Maybe an IRQ conflict? I attach the dmesg and pciconf -lv output.

Unfortunately, what you did not do is show us the dmesg output from NetBSD or OpenBSD so that we could see what happens when the chip is probed correctly.

Looking at the dmesg and pciconf output, it seems the device was not found at all. This means it's not a networking problem, but a PCI problem. The failure to detect the device could be due to any one of the following:

- There's an option to disable the on-board NIC in the BIOS and you disabled it and forgot about it
- There's a bug in the PCI bridge code which is preventing it from enumerating all of the devices properly
- There's some magic you need to do to enable/power up the on-board NIC that we're not doing

This is something you should be asking the PCI gurus about, not the networking gurus.

-Bill

--
=============================================================================
-Bill Paul (510) 749-2329 | Senior Engineer, Master of Unix-Fu
[EMAIL PROTECTED] | Wind River Systems
=============================================================================
If stupidity were a handicap, you'd have the best parking spot.
=============================================================================
_______________________________________________
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: bge driver not recognising BCM 5705M
> I'm somewhat confused.

So am I: where were you when I sent e-mail to this list asking for people to test the 5705 changes before I committed them?

> On a recent 5.1-CURRENT, boot -v gives me:

Actually, boot -v gives you much more, like the date when the kernel image was compiled. Too bad you decided not to show everything to us.

> found-> vendor=0x14e4, dev=0x165d, revid=0x01
>     bus=2, slot=0, func=0
>     class=02-00-00, hdrtype=0x00, mfdev=0
>     cmdreg=0x0116, statreg=0x02b0, cachelnsz=8 (dwords)
>     lattimer=0x20 (960 ns), mingnt=0x40 (16000 ns), maxlat=0x00 (0 ns)
>     intpin=a, irq=11
>     powerspec 2 supports D0 D3 current D0
>
> followed by:
>
> pci2: <network, ethernet> at device 0.0 (no driver attached)
>
> This is the internal Gigabit ethernet on my Dell D800 laptop... but it's not recognised, even though...
>
> static struct bge_type bge_devs[] = {
>     ...
>     { BCOM_VENDORID, BCOM_DEVICEID_BCM5705,
>         "Broadcom BCM5705 Gigabit Ethernet" },
>     ...
> };
>
> and
>
> ...
> #define BCOM_VENDORID 0x14E4
> #define BCOM_DEVICEID_BCM5705M 0x165D
> ...
>
> so why doesn't the bge driver kick in?

You'll need to investigate this one for yourself. Make *SURE* you booted from the right kernel image (strings -a /boot/kernel/kernel | grep 5705). A good way to experiment is to compile your kernel _WITHOUT_ bge support, and then build if_bge.ko as a module:

# cd /sys/modules/bge
# make; make load

-Bill
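As a rough illustration of why the running kernel matters here, the following is a minimal sketch of how a PCI driver typically matches a device against its ID table. It is not the bge(4) source: the `ex_` names, the table contents, and the description string are hypothetical stand-ins, and only the two numeric IDs are taken from the message above. If the table compiled into the running kernel has no row for 0x14E4/0x165D, the lookup fails and the bus reports "no driver attached".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; only the numeric values come from the thread. */
#define EX_BCOM_VENDORID          0x14E4
#define EX_BCOM_DEVICEID_BCM5705M 0x165D

struct ex_dev_type {
    uint16_t    vid;   /* PCI vendor ID */
    uint16_t    did;   /* PCI device ID */
    const char *name;  /* description shown when the probe matches */
};

/* Illustrative table: a kernel built before the ID was added lacks this row. */
const struct ex_dev_type ex_devs[] = {
    { EX_BCOM_VENDORID, EX_BCOM_DEVICEID_BCM5705M,
      "Broadcom BCM5705M Gigabit Ethernet (illustrative entry)" },
    { 0, 0, NULL }     /* terminator */
};

/* Return the description for a matching device, or NULL if no row matches. */
const char *
ex_probe(uint16_t vid, uint16_t did)
{
    const struct ex_dev_type *t;

    for (t = ex_devs; t->name != NULL; t++) {
        if (t->vid == vid && t->did == did)
            return t->name;
    }
    return NULL;
}
```

Calling `ex_probe(0x14e4, 0x165d)` returns the description only when the row is present in the table that was actually compiled, which is why checking the booted kernel image is the first step.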
Re: bge driver not recognising BCM 5705M
> >>>>> "Bill" == Bill Paul [EMAIL PROTECTED] writes:
>
> >> I'm somewhat confused.
>
> Bill> So am I: where were you when I sent e-mail to this list
> Bill> asking for people to test the 5705 changes before I committed
> Bill> them?
>
> I very well might not have had this machine. When did you commit them?
>
> >> On a recent 5.1-CURRENT, boot -v gives me:
>
> Bill> Actually, boot -v gives you much more, like the date when the
> Bill> kernel image was compiled. Too bad you decided not to show
> Bill> everything to us.
>
> I didn't want to spam,

*sigh*

No. Spam is when you try to sell me viagra or bestiality porn. Providing detailed problem reports is not spam. It saves me from having to _ask_ you for more information, thereby prolonging what might otherwise be a simple one shot exchange. It also can save time and wear and tear on developers, since, in the process of collecting detailed information, you might stumble upon possible solutions to your problem on your own, to wit:

> ...but my recent current is:
>
> FreeBSD canoe.velocet.net 5.1-CURRENT FreeBSD 5.1-CURRENT #0: Tue Jul 15 17:54:29 EDT 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/CANOE i386

[/u/wpaul/xl/src/sys/dev/bge]:zim.wrs.com{58}% cvs log if_bge.c
[...]
revision 1.44
date: 2003/07/16 00:09:56; author: wpaul; state: Exp; lines: +226 -103
      ^^
Add support for the BCM5705 and its ilk. Changes:

- 5705 doesn't support jumbo frames
- Statistics must be read from registers
- RX return ring must be capped at 512 entries
- Omit initialization of certain device blocks
- Acknowledge link change interrupts by setting the 'link changed' bit in the status register (used to have no effect)
- Remember to toggle the MI completion bit too
- Set the mbuf low watermark differently (on-chip memory buffers, not BSD mbufs)
- Don't enable [EMAIL PROTECTED] feature for certain 5705 chip revs
- Add additional PCI IDs for 5705 and 5782 parts
- Add a forgotten 5704 PCI ID

Most changes ripped kicking and screaming from the Broadcom linux driver.

Thanks to Paul Saab for sanity testing. (My lack of sanity has been confirmed.)

Your kernel image was built on July 15th. The changes were committed on July 16th. You missed by one day.

-Bill
Re: Help diagnosing NIS breakage ?
> > Ugh... I'm still a moron. I just uploaded yet another diff. Can you
> > test this one for me please?
> >
> > -Bill
>
> No dice; same effect. Thanks for looking into this. Let me know what other patches you'd like for me to try.
>
> Robin

Gr. I don't know how I can keep getting this wrong.

Ok, this time I tested the change with a sample program. Try applying http://www.freebsd.org/~wpaul/getpwent.diff again. Verify that the result matches the file in the fbsd5 test account. The getpwuid() routine seems to work ok, though my test for the geteuid() == 0 case was a bit of a kludge since I don't actually have root on the test box.

-Bill
Re: Help diagnosing NIS breakage ?
> > On a client bound to this server, please do:
> >
> > % ypwhich -m
>
> Thanks for getting back to me on this. First off, apologies if I'd failed to mention the server before... Now, on a -CURRENT NIS client (with rev 1.81 getpwent.c):
>
> $ ypwhich -m
> shadow dc3
> passwd.byuid dc3
> [...]
>
> > Ok, so it does support the YPPROC_MASTER procedure. Let's try something
> > a little different. This time, do:
> >
> > % ypwhich -m master.passwd.byname
> >
> > And show the results. Might as well try ypwhich -m master.passwd.byuid
> > too, though the result will probably be the same.
>
> Ok. Using getpwent.c v1.82 with your diff:
>
> # id robin
> id: robin: no such user
> # ypwhich -m master.passwd.byname
> dc3
> # ypwhich -m master.passwd.byuid
> dc3
> [...]

AGH

Ok, first, whoever is responsible for this NIS server implementation is an idiot. It appears the YPPROC_MASTER procedure does no argument validation and always returns success, even for maps that don't exist. This is why revision 1.82 fails in your case: using yp_master() to check for the master.passwd maps succeeds, which makes the code think it should be doing master.passwd lookups (which ultimately fail when the actual lookup is performed).

Fortunately, it looks like YPPROC_ORDER works correctly:

[EMAIL PROTECTED] [~]$ ypwhich
GCDC2.gc.nat
[EMAIL PROTECTED] [~]$ ypwhich -m master.passwd.byname
dc3
[EMAIL PROTECTED] [~]$ yppoll master.passwd.byname
yppoll: no such map master.passwd.byname. Reason: No such map in server's domain
[EMAIL PROTECTED] [~]$ yppoll passwd.byname
Map passwd.byname has order number 10683. Wed Dec 31 21:58:03 1969
The master server is dc3.

Second, I'm an idiot because I made a mistake in the patch I provided: the nis_map() function should return NS_UNAVAIL if yp_order() fails, rather than falling through and returning NS_SUCCESS all the time. I uploaded a new diff, please test this instead:

http://www.freebsd.org/~wpaul/getpwent.diff

Thanks for providing me access to this machine, it helped me realize where I'd gone wrong in my patch. If this works for you, and if nobody objects, I will check it in.

-Bill
Re: Help diagnosing NIS breakage ?
> In our implementation, the NIS server is ActiveDirectory with ServicesForUnix 3.0 :)

Ok, first, shame on you for waiting until now to reveal this piece of information. (Although, I'm coming into this thread late, so if you mentioned it in a previous message and I wasn't able to find it, then I apologize. But if you didn't mention it, *THWAP*.)

On a client bound to this server, please do:

% ypwhich -m

and show us the results. The ypwhich(8) utility will obtain a list of all the NIS maps being served by your server host and try to do a YPPROC_MASTER on each one to learn the name of its master server. It's possible that the Services For Unix implementation of NIS does not implement the YPPROC_MASTER procedure because the 'master' server in this case is CaptiveDirectory rather than an actual NIS master server (which means trying to do normal NIS master server things like yppasswd updates wouldn't work). If this is the case, then ypwhich should return an error for each map. I don't remember if ypwhich prints out the whole error status, but the expected underlying RPC error would be RPC_PROCUNAVAIL (procedure not available).

This complicates matters a bit. When testing for the master.passwd maps, you can use the following logic:

- If you call yp_order() on the master.passwd.byname map and the NIS server says YPERR_MAP (no such map), then you know that the server supports the YPPROC_ORDER procedure, but it doesn't have the master.passwd.byname map. So either it's not a FreeBSD server, or it is a FreeBSD server but the administrator has chosen not to serve the master.passwd maps. In either case, you should roll over to the normal passwd map lookup.

- If you call yp_order() and get back YPERR_YPERR, that means there was an underlying RPC error (i.e. RPC_PROCUNAVAIL) which indicates the server doesn't support the YPPROC_ORDER procedure. The Sun NIS+ server in YP compat mode is one example of this. The FreeBSD ypserv does implement YPPROC_ORDER, so this error means you are not talking to a FreeBSD server, and again, you would roll over to the normal passwd map lookup.

- When you get to the passwd map lookup, you probably shouldn't be attempting to do either a yp_master() or yp_order() poll on the map. You should just attempt a lookup. The yp_order() trick was something I introduced for the sole purpose of trying to determine if the server on the other end had master.passwd maps on it (i.e. it was a FreeBSD server).

The nis_map() function in -current is analogous to the _havemaster() function in -stable, but _havemaster() was only meant to test for the master.passwd maps, whereas nis_map() checks master.passwd maps and then, if that fails, checks for passwd maps too. I don't think this is the correct approach: you should only attempt a yp_order() on the master.passwd.by* maps, and if that doesn't work, you let the nis_lookup() function do the yp_first()/yp_next()/yp_match() and return an error if necessary.

I have a diff to do this at:

http://www.freebsd.org/~wpaul/getpwent.diff

> I think this is exposing a bug in our NIS implementation, but don't know enough about it to be sure. I think backing it out just hides the bug again. As a work-around, we could try yp_order first, and if that fails, try yp_master.

It's not a bug in our implementation, it's implementation weirdness in the server. :)

-Bill
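The decision logic described in the list above can be sketched roughly as follows. This is a self-contained illustration, not the getpwent.c code or the linked diff: the `ex_` enum values stand in for the real yp_order() results (success, YPERR_MAP, YPERR_YPERR), the probe is passed in as a function pointer so the sketch runs without a NIS server, and the three stub probes are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the result of yp_order() on one map. */
enum ex_yperr {
    EX_OK = 0,        /* map exists, order number returned */
    EX_YPERR_MAP,     /* server supports YPPROC_ORDER, map is absent */
    EX_YPERR_YPERR    /* underlying RPC error, e.g. procedure unavailable */
};

enum ex_source { EX_USE_MASTER_PASSWD, EX_USE_PASSWD };

/* Signature of a probe playing the role of yp_order(). */
typedef enum ex_yperr (*ex_order_fn)(const char *domain, const char *map);

/*
 * Probe only master.passwd.byname. Any failure rolls over to the plain
 * passwd maps, which are then looked up directly without further polling.
 */
enum ex_source
ex_pick_passwd_source(ex_order_fn order, const char *domain)
{
    switch (order(domain, "master.passwd.byname")) {
    case EX_OK:
        return EX_USE_MASTER_PASSWD;
    case EX_YPERR_MAP:    /* not serving master.passwd maps */
    case EX_YPERR_YPERR:  /* server lacks YPPROC_ORDER entirely */
    default:
        return EX_USE_PASSWD;
    }
}

/* Stub probes for illustration: three kinds of server behaviour. */
enum ex_yperr
ex_stub_has_master(const char *domain, const char *map)
{
    (void)domain; (void)map;
    return EX_OK;
}

enum ex_yperr
ex_stub_no_map(const char *domain, const char *map)
{
    (void)domain; (void)map;
    return EX_YPERR_MAP;
}

enum ex_yperr
ex_stub_procunavail(const char *domain, const char *map)
{
    (void)domain; (void)map;
    return EX_YPERR_YPERR;
}
```

The point of the shape is that the order probe is used for one purpose only, detecting master.passwd maps, and is never repeated on the passwd maps themselves.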
Re: Help diagnosing NIS breakage ?
> > > In our implementation, the NIS server is ActiveDirectory with
> > > ServicesForUnix 3.0 :)
> >
> > Ok, first, shame on you for waiting until now to reveal this piece of
> > information. (Although, I'm coming into this thread late, so if you
> > mentioned it in a previous message and I wasn't able to find it, then
> > I apologize. But if you didn't mention it, *THWAP*.)
> >
> > On a client bound to this server, please do:
> >
> > % ypwhich -m
>
> Thanks for getting back to me on this. First off, apologies if I'd failed to mention the server before... Now, on a -CURRENT NIS client (with rev 1.81 getpwent.c):
>
> $ ypwhich -m
> shadow dc3
> passwd.byuid dc3
> [...]

Ok, so it does support the YPPROC_MASTER procedure. Let's try something a little different. This time, do:

% ypwhich -m master.passwd.byname

And show the results. Might as well try ypwhich -m master.passwd.byuid too, though the result will probably be the same.

> And to elaborate...
> [...]
> Shall I patch getpwent.c (rev 1.82) with your diff and see what happens?

If you could, please. Though I'm curious to know just what was causing the failure in the first place. Clearly YPPROC_MASTER on maps that exist seems to work (otherwise ypwhich would have complained like gangbusters), so maybe it's generating some sort of non-standard error on maps that don't exist. The fact that it fails for root means it must have something to do with probing for the master.passwd.by* maps, but I'm not sure what yet.

-Bill
More Broadcom BCM5705 updates
I have merged in some additional updates provided by Paul Saab:

- Support the BCM5782 chip (5705 workalike, new PCI ID)
- Increase firmware handshake timeout
- Always check for GMII PHYs at PHY address 1 (required for some chips, doesn't hurt on the others)
- Add ASIC rev numbers for 5705_A1, 5705_A2 and 5705_A3

There may be some additional performance tweaks needed, but this should get the device attached and working. As before, download the new code from:

http://www.freebsd.org/~wpaul/Broadcom/5705

- Copy if_bge.c and if_bgereg.h to /sys/dev/bge
- Copy miidevs and brgphy.c to /sys/dev/mii
- Rebuild your kernel and/or if_bge.ko and miibus.ko modules

-Bill
Call for testers: Broadcom 5705 gigabit ethernet
While I still don't have any documentation for the BCM5705, I recently obtained a Broadcom driver with 5705 support. After scrutinizing it carefully, it looks like the differences between it and its predecessors are:

- No jumbo frame support
- RX return ring is limited in size to 512 entries
- No support for DMA'ed statistics block (stats must be read from registers instead)
- Initialization of certain on-chip blocks and parameters is skipped
- 5705 rev A0 has a bug that requires a workaround: the driver has to poll the NIC's memory after setting up the RX descriptor ring to verify the chip has actually loaded it.

I have merged all these changes into a copy of the bge(4) driver from -current (should also work with 5.1-RELEASE). You can get it from:

http://www.freebsd.org/~wpaul/Broadcom/5705

To use it, just drop the supplied if_bge.c and if_bgereg.h files into /sys/dev/bge and recompile your kernel and/or if_bge.ko module.

Unfortunately, I don't have a machine with a 5705 chip in it, so I need other people to help me test these changes. If you have available right now:

- a laptop or other box with a 5705 gigE chip
- FreeBSD 5.1-RELEASE or -CURRENT
- another network interface that you can use to load this driver

Then please test this updated driver for me and report back. Information that I would like to see:

- Describe the system with the 5705 chip in it (I'm under the impression the 5705 is being used in embedded configurations only)
- A copy of dmesg output showing the ASIC revision of your chip (doesn't have to be a verbose boot, though I won't mind if it is)
- A detailed description of any problems you may observe while testing the driver

Information I don't want to see:

- Requests for help with other totally unrelated drivers
- Requests for help transferring large sums of money out of Nigeria
- Information about septic tank cleaning
- Pictures of people getting it on with barnyard animals
- Bikeshed arguments

Thanks in advance for any help anyone is able to provide.

-Bill
Broadcom 5705, addendum
Almost forgot:

The BCM5705 also has a new PHY ID, which means an update to brgphy.c and miidevs is required. So, to test the 5705 driver update, do the following:

- Download the files from http://www.freebsd.org/~wpaul/Broadcom/5705
- Put if_bge.c and if_bgereg.h into /sys/dev/bge
- Put brgphy.c and miidevs into /sys/dev/mii
- Recompile your kernel and/or if_bge.ko and miibus.ko modules.

-Bill
Call for testers: rl(4) optimized for 8139C+
Somehow I doubt I'm going to get a lot of responses to this, since I'm not sure how many people besides me actually have an 8139C+ NIC. That said, if you have one, and you're running FreeBSD 5.1 or later, please try the driver code at:

http://www.freebsd.org/~wpaul/RealTek/cplus

If you actually have a C+ card, it will show up like this:

rl0: <RealTek 8139 10/100BaseTX (C+)> port 0xc000-0xc0ff mem 0xdc001000-0xdc0010ff irq 11 at device 13.0 on pci0
rl0: Ethernet address: 00:e0:4c:00:00:1b
miibus5: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus5
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

Also, ifconfig rl0 will show this:

rl0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
        options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>

The 8139C+ uses real descriptor-based DMA and actually exhibits pretty good performance. (With my Athlon 900MHz test machine, I can achieve 143000 frames/sec raw transmit speed.) This driver supports the following features:

- RX and TX descriptor lists
- RX and TX TCP/IP checksum offload
- RX and TX hardware vlan tagging
- TX interrupt moderation (using the 8139C+'s on-board timer)

The chip supports TCP large send, but there's no driver support for that because there's no way to exploit it in FreeBSD. The chip also supports two TX DMA queues (normal and high priority), but the driver only takes advantage of one.

If you have a -current system, please compile this code with -DNEW_BUSDMA_API (this enables the use of the two new arguments to the bus_dma_tag_create() function which appeared a couple of days ago).

I'm mostly looking for performance reports and success/failure reports concerning VLANs. (I don't have an easy way to test the VLAN support at home. I think I did everything right, but I want to be sure before I commit to the tree.)

There is preliminary support for the 8169 gigE chip, but I don't have a card to test with, so don't expect RealTek gigE NICs to work yet.

Lastly, I'm also interested to see just what NICs are out there that use the 8139C+. The only way to tell if you have the chip is to check the part number on it. (It should in fact say 8139C+.) I hope RealTek has sold this chip well, because it actually seems to perform really well.

-Bill
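For readers checking their own ifconfig output against the one quoted in the message, here is a small sketch of how the options=1b value breaks down into the four capability names. The bit values are assumptions written from memory of the interface capability flags in FreeBSD's net/if.h; they are consistent with the 1b value and the names shown above, but the `ex_` names themselves are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumed capability bit values; 0x01|0x02|0x08|0x10 == 0x1b. */
#define EX_IFCAP_RXCSUM         0x0001u
#define EX_IFCAP_TXCSUM         0x0002u
#define EX_IFCAP_VLAN_MTU       0x0008u
#define EX_IFCAP_VLAN_HWTAGGING 0x0010u

static const struct {
    unsigned    bit;
    const char *name;
} ex_caps[] = {
    { EX_IFCAP_RXCSUM,         "RXCSUM" },
    { EX_IFCAP_TXCSUM,         "TXCSUM" },
    { EX_IFCAP_VLAN_MTU,       "VLAN_MTU" },
    { EX_IFCAP_VLAN_HWTAGGING, "VLAN_HWTAGGING" },
};

/*
 * Write a comma-separated list of the capability names set in caps
 * into buf (at most len bytes including the terminator). Returns buf.
 */
char *
ex_decode_caps(unsigned caps, char *buf, size_t len)
{
    size_t i;

    if (len == 0)
        return buf;
    buf[0] = '\0';
    for (i = 0; i < sizeof(ex_caps) / sizeof(ex_caps[0]); i++) {
        if ((caps & ex_caps[i].bit) == 0)
            continue;
        if (buf[0] != '\0')
            strncat(buf, ",", len - strlen(buf) - 1);
        strncat(buf, ex_caps[i].name, len - strlen(buf) - 1);
    }
    return buf;
}
```

With a 64-byte buffer, decoding 0x1b yields the same four names that ifconfig prints between the angle brackets.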
Heads up: checking in change to ata-card.c
A long time ago, circa FreeBSD 4.3, my Sony PCMCIA CD-ROM drive with brain-damaged Teac ATA controller worked, and the people were content and all was right with the land.

Then my 4.4 CD set arrived, and I was disappointed to find the install kernel would lock up after the drive was probed. Complaints were made and epithets were hurled, and lo, in 4.5-RELEASE, all was well once again.

Then came 4.6-RELEASE, and it was broken again. Then came 4.6.2, and it was still broken. And in 4.7. And 4.8. And 5.0. (Yeah yeah, shut up.)

I just got my 5.1-RELEASE CDs and guess what? My CD-ROM drive STILL doesn't work!!! Guys, there's nothing more frustrating than getting a brand new CD set in the mail and not being able to install via CD.

So I'm not leaving it up to chance anymore. I have a patch to make my drive work here:

http://www.freebsd.org/~wpaul/ata-card.c.diff-5.1

I have no idea what the altio port remapping goop in ata_pccard_probe() is supposed to accomplish, but I'm telling you all right now, it doesn't work with my drive. I have specifically added code to skip the remapping for drives with the product string of "NinjaATA-" (the problem with the Teac controller is that its vendor/product ID is 0x/0x, so there's really no better way to identify it).

This patch changes this:

ata2: <NinjaATA-> at port 0x180-0x187,0x386-0x387 irq 9 function 0 config 33 on pccard0
device_probe_and_attach: ata2 attach returned 6

Into this:

ata2: <NinjaATA-> at port 0x180-0x187,0x386-0x387 irq 9 function 0 config 33 on pccard0
acd0: CDROM <CD-224E> at ata2-master PIO4

Now, I'm sending out this notification that I _WILL_ check this patch in before I leave the office today. I'm not asking for permission. I've been waiting for half a dozen releases for this to get fixed. When 5.2-RELEASE arrives in my mailbox, it _WILL_ install from CD-ROM on my laptop or so help me I will find the responsible parties and force them to listen to RMS sing "Free The Software" -- _live_ -- until their ears bleed.

I have tested this patch. It compiles. It works. I can mount CDs successfully and transfer stuff from them to my heart's content. I've reduced the change down to the absolute minimum and ensured that it won't affect any other drives except mine. It's going in. Period. I am not leaving it up for debate, because if I do, all that will happen is that people will argue over what the more correct fix should be, and at this point I _just_ _don't_ _care_. I want my drive to work, and this does it.

Understand? Good.

This concludes my announcement. Thank you all for your attention. Don't forget to tip the waitress.

-Bill

P.S.: Be sure to join us next time when I ream out whoever it was that broke support for my 3Com 3c575C cardbus ethernet NIC.
Re: Heads up: checking in change to ata-card.c
> Here's a better patch, based on wpaul's input. Bill, can you try it and see if it works for you? If so, it would be better to commit this one. If not, I'll work with you to fix it.

Close, but this doesn't quite do it. I thought something didn't look right when I read it, and I tested it just to be certain. Detecting the card via the table is fine, but you're not skipping the right piece of code. The existing code logic says:

if (getting the altio port range with bus_get_resource() succeeds)
then
    (set the resource offset depending on port range size)
else
    (release the ioport resource and bail with ENXIO)

With my drive, it's the bus_get_resource() that fails, so we end up failing the probe with ENXIO. Your patch changes this to:

if (the ONE_REGION flag is not set?) and (getting the port range with bus_get_resource() succeeds)
then
    (set the resource offset depending on port range size)
else
    (release the ioport resource and bail with ENXIO)

So if the ONE_REGION bit is set, we still bail with ENXIO. I put the question mark up there because I think there's a typo:

-if (bus_get_resource(dev, SYS_RES_IOPORT, ATA_ALTADDR_RID, &tmp, &tmp)) {
+if ((ap == NULL || (ap->flags & ONE_REGION) != 0) &&
+    bus_get_resource(dev, SYS_RES_IOPORT, ATA_ALTADDR_RID, &tmp, &tmp)) {

I think this:

if ((ap == NULL || (ap->flags & ONE_REGION) != 0) ...

should be this:

if ((ap == NULL || (ap->flags & ONE_REGION) == 0) ...

I might be wrong. In any case, it still doesn't work. I think the logic should be:

if (the ONE_REGION flag is not set)
then
    (do all the other stuff)

This way, we skip the whole altio remapping step entirely. I just tested this and it seems to work. Here's a modified version of your patch showing what I mean:

http://www.freebsd.org/~wpaul/ata-card.c.diff2

Also, since Soren asked, here are verbose boot messages in both the failure and success cases:

http://www.freebsd.org/~wpaul/dmesg.bad
http://www.freebsd.org/~wpaul/dmesg.good

> If you are uninterested in working with us to get things in, then your patch will not last the evening, as such an approach is unacceptable.

Well, with all due respect, the fact that this is still broken is also unacceptable. All I want is to ensure that this is fixed and that it stays fixed. I don't care if it's your patch or mine, as long as it works, and doesn't regress.

And for the record, if somebody came to me and said: "Bill! Your so-and-so driver hasn't worked right on such-and-such card I don't have in months! I'm at wit's end! You leave me *NO* alternative! I'm just going to have to fix it for you! At gunpoint! No no, don't try and stop me! This is for your own good!" well, then, gosh, I'd let 'em.

But I guess that's just me.

-Bill
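The difference between the three control flows discussed in this message can be sketched as below. This is a simplified model, not the ata-card.c code: the `ex_` function names and the flag value are hypothetical, the "alternate I/O range lookup succeeded" condition is reduced to a boolean argument, and the resource handling is left as comments.

```c
#include <errno.h>

#define EX_ONE_REGION 0x0001u  /* hypothetical quirk flag for such cards */

/* Existing logic: a failed alternate-range lookup always fails the probe. */
int
ex_probe_existing(int altio_ok)
{
    if (altio_ok) {
        /* set the resource offset depending on port range size */
        return 0;
    }
    /* release the ioport resource and bail */
    return ENXIO;
}

/* Proposed patch (with the == 0 correction): flagged cards still hit else. */
int
ex_probe_proposed(unsigned flags, int altio_ok)
{
    if ((flags & EX_ONE_REGION) == 0 && altio_ok) {
        /* set the resource offset depending on port range size */
        return 0;
    }
    return ENXIO;
}

/* Suggested logic: flagged cards skip the whole remapping step. */
int
ex_probe_fixed(unsigned flags, int altio_ok)
{
    if ((flags & EX_ONE_REGION) == 0) {
        if (!altio_ok)
            return ENXIO;
        /* set the resource offset depending on port range size */
    }
    return 0;
}
```

For a flagged card whose lookup fails, `ex_probe_proposed()` still returns ENXIO while `ex_probe_fixed()` returns 0, and unflagged cards behave as before in both.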
Re: Heads up: checking in change to ata-card.c
> In message [EMAIL PROTECTED], M. Warner Losh writes:
> > Here's a better patch, based on wpaul's input. Bill, can you try it
> > and see if it works for you? If so, it would be better to commit this
> > one. If not, I'll work with you to fix it.
>
> FYI, I have a no-name (PCMCIA/CD-ROM) drive that also requires failure of the second IO range to be made non-fatal.
>
> How about just deleting the `else' clause as in the patch below? It seems that this can only affect CD-ROM drives that were otherwise not working, so it should be fairly safe.

This patch also tests good with my drive.

-Bill

> Ian
>
> Index: ata-card.c
> ===
> RCS file: /dump/FreeBSD-CVS/src/sys/dev/ata/ata-card.c,v
> retrieving revision 1.14
> diff -u -r1.14 ata-card.c
> --- ata-card.c 17 Jun 2003 12:33:53 - 1.14
> +++ ata-card.c 26 Jun 2003 23:00:01 -
> @@ -131,10 +131,6 @@
>          start + ATA_ALTOFFSET, ATA_ALTIOSIZE);
>      }
>  }
> -else {
> -    bus_release_resource(dev, SYS_RES_IOPORT, rid, io);
> -    return ENXIO;
> -}
>
>  /* allocate the altport range */
>  rid = ATA_ALTADDR_RID;
Possible EHCI bugs
usbd.
Starting local daemons:.
Updating motd.
Configuring syscons: blanktime.
Starting sshd.
Starting sendmail.
Initial i386 initialization:.
Additional ABI support:.
Starting cron.
Local package initialization:.
Additional TCP options:.
Starting background file system checks in 60 seconds.

Fri Jun 13 01:01:06 PDT 2003
Jaxe0: read PHY failed
axe0: read PHY failed
axe0: read PHY failed
axe0: read PHY failed
Memory modified after free 0xc22e8310(12)
panic: Most recently used by USB
Debugger(panic)
Stopped at Debugger+0x54: xchgl %ebx,in_Debugger.0
db> where
Debugger(c050426c,c05c6ca0,c051ae00,cbe11a48,1) at Debugger+0x54
panic(c051ae00,c04fc329,c,c083aa44,c083aa20) at panic+0xab
mtrash_ctor(c22e8310,10,0,554,0) at mtrash_ctor+0x5d
uma_zalloc_arg(c083aa20,0,1,0,0) at uma_zalloc_arg+0x17f
malloc(2,c05527a0,1,c23bd200,cbe11af8) at malloc+0xd4
ehci_allocm(c22e0400,c23bd23c,2,c0321dbc,c23bd24c) at ehci_allocm+0x27
usbd_transfer(c23bd200,cbe11b48,c02b9b7d,c23bd200,c22dec00) at usbd_transfer+0x5c
usbd_sync_transfer(c23bd200,c22dec00,0,1388,cbe11ba0) at usbd_sync_transfer+0x1c
usbd_do_request_flags_pipe(c22dec00,c22deb80,cbe11ba0,cbe11bce,0) at usbd_do_request_flags_pipe+0x7d
usbd_do_request_flags(c22dec00,cbe11ba0,cbe11bce,0,0) at usbd_do_request_flags+0x3c
usbd_do_request(c22dec00,cbe11ba0,cbe11bce,7c0,20001) at usbd_do_request+0x37
axe_cmd(c22e2000,2007,1,0,cbe11bce) at axe_cmd+0x92
axe_miibus_readreg(c22deb00,0,1,c21ad1f0,c22de780) at axe_miibus_readreg+0x9a
MIIBUS_READREG(c22deb00,0,1,cbe11c28,c024c256) at MIIBUS_READREG+0x56
miibus_readreg(c22de780,0,1,c22dcac0,c22dca40) at miibus_readreg+0x27
MIIBUS_READREG(c22de780,0,1,c22de780,c22814a0) at MIIBUS_READREG+0x56
bmtphy_status(c22dca40,0,c0de184c,c05ca1c8,c22dca40) at bmtphy_status+0x3c
bmtphy_service(c22dca40,c22dcac0,1,c22dcac0,c22e2000) at bmtphy_service+0xe8
mii_tick(c22dcac0,c050379c,7,7,c22e2000) at mii_tick+0x32
axe_tick(c22e2000,0,c050526d,d0,1) at axe_tick+0x2f
softclock(0,0,c05020e3,231,c0de1790) at softclock+0x1b8
ithread_loop(c0de0100,cbe11d48,c0501fa8,30c,65d7c) at ithread_loop+0x182
fork_exit(c02fef90,c0de0100,cbe11d48) at fork_exit+0xc0
fork_trampoline() at fork_trampoline+0x1a
--- trap 0x1, eip = 0, esp = 0xcbe11d7c, ebp = 0 ---
db>

This seems to indicate something in the USB code is re-using a free()ed memory buffer. Unfortunately, I don't have this particular hardware available to me, and I don't know how much debugging support the individual at Transmeta will be able to offer. (He has his own problems.)

Hopefully this will at least help spur some investigation.

-Bill
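The "Memory modified after free" panic in the trace above comes from a debugging check of the following general kind: freed memory is filled with a known pattern, and the pattern is verified when the memory is handed out again. The sketch below shows the technique only; it is not the kernel allocator's code, the `ex_` names are hypothetical, and the fill value is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_JUNK 0xdeadc0deU  /* assumed fill pattern for freed memory */

/* On free: overwrite every 32-bit word of the buffer with the pattern. */
void
ex_trash_on_free(uint32_t *buf, size_t words)
{
    size_t i;

    for (i = 0; i < words; i++)
        buf[i] = EX_JUNK;
}

/*
 * On the next allocation: verify the pattern is intact. Returns the
 * index of the first modified word, or -1 if nothing wrote to the
 * buffer while it was free. A real allocator would panic instead.
 */
long
ex_check_on_alloc(const uint32_t *buf, size_t words)
{
    size_t i;

    for (i = 0; i < words; i++) {
        if (buf[i] != EX_JUNK)
            return (long)i;
    }
    return -1;
}
```

A stale pointer that writes into the buffer after the free shows up as a non-negative index at the next allocation, which may be far from the code that did the write. That is consistent with the panic firing inside malloc() rather than in the offending code.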
Re: Checksum offload support for Intel 82550/82551
> I think my problem is not that hard. This bug only occurs when you are using CURRENT (or 5.0 RELEASE) from (I think) the point where BPF changed.

Actually, there's something else that changed. The ng_fec module tries to set ifp->if_output to ng_fec_output() so that it can do some output processing on frames. However, it later calls ether_ifattach(), which sets ifp->if_output to ether_output(), so the override is lost and output stops working. I just committed a fix for this. (I verified that I can successfully send packets via fec0.)

I also modified the input handling so that it no longer uses the ng_ether_input_p hook. Now that FreeBSD 5.x has an ifp->if_input vector, it's not necessary to abuse this hook anymore. This avoids a collision with the ng_ether module, which is what was supposed to be using this hook in the first place.

Let me know how this works.

-Bill
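The ordering problem described above can be shown with a small mock. This is only a sketch: struct sketch_ifnet and every sketch_* function are hypothetical stand-ins, not the kernel's struct ifnet, ether_ifattach() or ng_fec_output(), and the "fixed" ordering is an assumption about one way to solve it, not necessarily what was committed.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet: only the output vector matters here. */
struct sketch_ifnet {
    int (*if_output)(struct sketch_ifnet *);
};

/* Stand-in for the generic ethernet output routine. */
static int sketch_ether_output(struct sketch_ifnet *ifp) { (void)ifp; return 1; }

/* Stand-in for the driver's own output routine. */
static int sketch_fec_output(struct sketch_ifnet *ifp) { (void)ifp; return 2; }

/* Mimics an attach routine that unconditionally installs the generic handler. */
static void sketch_ether_ifattach(struct sketch_ifnet *ifp)
{
    ifp->if_output = sketch_ether_output;
}

/* Broken ordering: the override is installed first, then clobbered by attach. */
void sketch_attach_broken(struct sketch_ifnet *ifp)
{
    ifp->if_output = sketch_fec_output;
    sketch_ether_ifattach(ifp);
}

/* Assumed fix: attach first, then install the override so it survives. */
void sketch_attach_fixed(struct sketch_ifnet *ifp)
{
    sketch_ether_ifattach(ifp);
    ifp->if_output = sketch_fec_output;
}
```

After sketch_attach_broken() the vector points at the generic routine; after sketch_attach_fixed() it points at the driver's routine.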
Re: Checksum offload support for Intel 82550/82551
> Hello,

>> Yes, it's me. I'm still alive.

> It's great to hear that one of the most talented FreeBSD hackers is back in business :) Does this mean that you can afford some time to investigate the problems regarding your old software?

Not unless it's something I can fix using easily available resources. I can't easily drop everything and slap together a test setup with exactly the right software and hardware I need to debug everyone's particular problem. (This bug only occurs in -CURRENT as of 30 seconds ago and on an UltraSPARC 10 with 16 if_dc interfaces and I need you to fix it _NOW_ pleasepleasepleaseI'llevengiveyouahandjob.)

> I mean ng_fec primarily, because I couldn't get help in the past few months/years(?)... You may or may not know that it is now part of FreeBSD; the only problem is that it does not work.

I'm shocked. Shocked, I tell you.

> I filed a PR (kern/46720) about a month ago, but haven't gotten too much response back. On these lists I think there is a consensus (search the archives :) that the FEC implementation is a good thing.

This particular PR relates to using ng_fec with BPF (i.e. tcpdump fec0 blows up). The code has evidently rotted quite a bit since it was imported. I just fixed it.

> Another problem, which I faced years ago, is that if you want to use .1q tagged packets on a FEC interface, it just does not work.

I don't know if this is still a problem or not. At the moment, I have no easy way to test it.

> There are more verbose details on these lists too. Are there any chances to get these fixed?

Like I said, it depends on time and availability of resources.

-Bill
Re: Checksum offload support for Intel 82550/82551
> On Mon, 24 Feb 2003 18:13:42 -0800 (PST), in sentex.lists.freebsd.current you wrote:
>> be reliable, never mind pleasant to look at. I only have access to an 82550 card, so I don't know if this is fixed in the 82551 or not.
>
> Hi,
> Can you tell reliably from the dmesg which type one has ?
>
> % dmesg | egrep -i 'fxp|inphy'
> fxp0: Intel Pro 10/100B/100+ Ethernet port 0xc000-0xc03f mem 0xe880-0xe881,0xe8831000-0xe8831fff irq 11 at device 1.0 on pci1
> fxp0: Ethernet address 00:07:e9:09:69:60
> inphy0: i82555 10/100 media interface on miibus0
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> fxp1: Intel Pro/100 Ethernet port 0xc800-0xc83f mem 0xe8832000-0xe8832fff irq 10 at device 8.0 on pci1
> fxp1: Ethernet address 00:01:80:40:0e:b3
> inphy1: i82562ET 10/100 media interface on miibus1
> inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>
> Or do you have to look at the card / MB ?

This is a good question. I apologize for not providing this info right off the bat. The fxp driver seems to use the same name strings for lots of different cards, so dmesg won't help you identify it. The only way to tell you have an 82550, other than looking at the card itself and checking for the i82550 part number, is to do:

# pciconf -l | grep fxp

and check for a revision code of 0xc (12) or higher. if_fxpreg.h lists a bunch of known revision values. Anything up to and including 0x9 is an 82557/8/9, which won't gain anything from these mods, I'm sorry to say.

-Bill
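The revision rule above can be written down as a tiny classifier. This is a sketch only: the enum and function names are made up for illustration and do not come from if_fxpreg.h, and revisions 0xa and 0xb are reported as unspecified because the message does not say what they are.

```c
/* Hypothetical labels; these names are not taken from if_fxpreg.h. */
enum fxp_family {
    FXP_OLD_8255X,       /* rev <= 0x9: 82557/82558/82559 */
    FXP_UNSPECIFIED,     /* rev 0xa..0xb: not covered by the message */
    FXP_82550_OR_LATER   /* rev >= 0xc: 82550 or newer */
};

/* Classify the PCI revision code printed by pciconf -l for an fxp device. */
enum fxp_family fxp_classify_rev(unsigned int rev)
{
    if (rev <= 0x9)
        return FXP_OLD_8255X;
    if (rev >= 0xc)
        return FXP_82550_OR_LATER;
    return FXP_UNSPECIFIED;
}
```

Only the last case would benefit from the checksum offload changes discussed in this thread.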
Checksum offload support for Intel 82550/82551
Yes, it's me. I'm still alive. And thanks to Wind River, I now know more about the Intel 8255x ethernet chipset than I ever really wanted to. Recently, I even learned how to enable checksum offload support for the 82550 and 82551 chips, and I decided it would be a good idea to add support for this to the if_fxp driver. There's a modified version of the code from 5.0-CURRENT sitting at:

http://www.freebsd.org/~wpaul/Intel

The bulk of the changes are in if_fxp.c and if_fxpreg.h. I've been testing this on 5.0-RELEASE, using 82557, 82559 and 82550 cards, and so far it seems to behave as expected. I would like to commit this, but first I want to make sure I'm not stepping on anyone's toes by doing so. I don't know who (if anyone) is maintaining the fxp driver at this point. (I think jlemon was doing it, but I don't know if he still is.)

Some background:

The 82550 and 82551 chips are newer than the 82559, even though their part number is lower. The 82559 has limited RX-only checksum offload capabilities. The 82550 and 82551 have IP and TCP/UDP checksum support for both TX and RX, using extended RX and TX descriptor structures. (Computing checksums across fragmented packets is _not_ supported.)

The programming info used to enable the checksum offload support comes from the manual and driver source at:

http://www.sourceforge.net/projects/e1000

Now, you'd think that the manual alone would be enough, but it isn't. The documentation describes the operation of TX checksum offload, which is implemented using a special command block called an IPCB. This is essentially an extended TxCB, except that where a TxCB contains two fragment pointer/length pairs, an IPCB has just one, and the extra space is used to control the packet parsing and offload capabilities. The manual also mentions the existence of extended RFDs for receive checksum offload, but the stupid thing doesn't tell you what they look like or how to enable them. For that, you have to go through Intel's Linux driver.
It turns out there are extra bits in the config block that need to be set to enable extended RX mode. Also, the config block for the 82550 and 82551 is 32 bytes in size rather than 22. (The config bit to enable magic receive mode is in byte 23.)

Adding support for these features while maintaining support for older chips (without making massive code changes) was a bit tricky. I don't think I did all that great a job of it, but it works. Basically, I overlaid the new IPCB structure pieces over the existing TxCB using a union. This allows the existing structure layout to be preserved for the benefit of older chips.

There seems to be one annoying bug in TX checksum offload: the chip appears to botch IP header checksums for IP fragments containing fewer than 4 bytes. One of the tests I ran was to send a UDP packet of 1473 bytes, which ends up being fragmented across two IP datagrams, the latter containing only 1 byte of actual data. For some reason, the chip doesn't compute a proper checksum for this fragment. Consequently, TX IP header checksumming is turned off by default. If you compile the driver with -DFXP_IP_CSUM_WAR, you'll get some workaround code that attempts to hand-patch the IP checksum for datagrams of 1 to 3 bytes in size. This is not used by default because I don't consider it to be reliable, never mind pleasant to look at. I only have access to an 82550 card, so I don't know if this is fixed in the 82551 or not.

Unless anybody complains loudly, I'd like to commit this soon. I'm fairly confident that (at the very least) it doesn't break any existing functionality. Unfortunately, I'm not in a position to do in-depth performance tests right now.

-Bill
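The 1473-byte test case above works out as follows. The sketch below assumes a 1500-byte ethernet MTU, a 20-byte IPv4 header without options and the 8-byte UDP header; udp_fragment_sizes() is a made-up helper for the arithmetic, not driver code.

```c
#include <stddef.h>

#define IP_HDR_LEN  20   /* IPv4 header without options (assumed) */
#define UDP_HDR_LEN 8    /* UDP header */

/*
 * Compute the IP payload size of each fragment needed to carry a UDP
 * datagram with udp_payload bytes of data over a link with the given MTU.
 * Sizes are written to frag_len[]; the return value is the number of
 * fragments, or 0 if the MTU is unusable or frag_len[] is too small.
 */
size_t udp_fragment_sizes(size_t udp_payload, size_t mtu,
                          size_t *frag_len, size_t max_frags)
{
    size_t room, per_frag, remaining, n = 0;

    if (mtu <= IP_HDR_LEN || max_frags == 0)
        return 0;
    room = mtu - IP_HDR_LEN;
    remaining = udp_payload + UDP_HDR_LEN;

    /* Fits in one datagram: no fragmentation needed. */
    if (remaining <= room) {
        frag_len[0] = remaining;
        return 1;
    }

    /* Every fragment except the last must carry a multiple of 8 bytes. */
    per_frag = (room / 8) * 8;
    if (per_frag == 0)
        return 0;
    while (remaining > 0) {
        size_t len = remaining > per_frag ? per_frag : remaining;
        if (n == max_frags)
            return 0;
        frag_len[n++] = len;
        remaining -= len;
    }
    return n;
}
```

With 1473 bytes of data the IP payload is 1481 bytes, one more than the 1480 that fit under a 1500-byte MTU, so the second fragment carries a single byte. With 1472 bytes everything fits in one datagram.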
Re: my dc now doesn't work
> After the last cvsup (changes from 29 of september) i've got dead dc (21143 based NIC).

You have to tell us _exactly_ what card you have. Find the manufacturer and model info. Look on the box the card came in. Look at the card itself. Show us the output from pciconf -l so we can see the PCI vendor/device ID info. Yes, this information is important. Yes, I'm irritated that you didn't provide it straight off. (But then nobody ever does. Guys? You don't need me to ask you to provide this information. It's common sense. It's staring you right in the face.)

> LEDs are dead, but card is successfully probed and attached, so i have device but can't use it. What should i send to help investigate this problem?

Knowing exactly what card this is will help. You can't debug this problem: I'm going to have to figure out a way to test and debug this myself, which is going to suck, as I no longer have an easy way to do FreeBSD work now that Wind River has pulled the plug on the test lab.

If you want to be really nice, you can arrange to have this machine made accessible remotely (via an alternate network interface) and let me tinker with it via ssh. Otherwise, you'll have to wait for me to put together a test setup locally.

> Oct 11 11:57:37 ws-ilmar /boot/kernel.old/kernel: dc0: Ethernet address: 00:80:ad:90:b4:38
> Oct 11 11:57:37 ws-ilmar /boot/kernel.old/kernel: miibus0: MII bus on dc0
> Oct 11 11:57:37 ws-ilmar /boot/kernel.old/kernel: dcphy0: Intel 21143 NWAY media interface on miibus0
> Oct 11 11:57:37 ws-ilmar /boot/kernel.old/kernel: dcphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

I strongly suspect that the recent changes to the miibus code by jlemon have hosed the dcphy driver, which is very sensitive. (You don't want to know how long it took me to get it working halfway decently.)

-Bill
Re: Where to put new bus_dmamap_load_mbuf() code
>> Maybe, but bus_dmamap_load() only lets you map one buffer at a time. I want to map a bunch of little buffers, and the API doesn't let me do that. And I don't want to change the API, because that would mean modifying busdma_machdep.c on each platform, which is a hell that I would rather avoid.

> bus_dmamap_load() is only one part of the API. bus_dmamap_load_mbuf or bus_dmamap_load_uio are also part of the API. They just don't happen to be implemented yet. 8-) Perhaps there should be an MD primitive that knows how to append to a mapping? This would allow you to write an MI loop that does exactly what you want.

Any one of those ideas would be just fine. I eagerly await their realization. :)

>> It's a separate list. The driver is responsible for allocating the head of the list, then it hands it to bus_dmamap_list_alloc() along with the required dma tag. bus_dmamap_list_alloc() then calls bus_dmamap_create() to populate the list. The driver doesn't have to manipulate the list itself, until time comes to destroy it.

> Okay, but does this mean that bus_dmamap_load_mbuf no longer takes a dmamap? Drivers may want to allocate/manage the dmamaps in a different way.

Yes, bus_dmamap_load_mbuf() accepts a dma tag, the head of the dmamap list, an mbuf, a segment array and a segment count. The driver allocates the segment array with a certain number of members. It passes the array and segment count to bus_dmamap_load_mbuf(), which treats the segment count as the maximum number of segments that it can return to the caller. Once all the mappings have been done, it updates the segment count to indicate how many segments were actually needed. Then the driver transfers the info from the segment array into its DMA descriptor structures and kicks off the DMA operation.

Once the device signals the transfer is done, the driver calls bus_dmamap_unload_mbuf() and bus_dmamap_destroy_mbuf() to unload the maps and return them to the map list for later use. It isn't until the driver calls bus_dmamap_list_destroy() that the dmamaps are actually released and the list free()ed.

-Bill
Re: Where to put new bus_dmamap_load_mbuf() code
>> My understanding is that you need a dmamap for every buffer that you want to map into bus space.

> You need one dmamap for each independently manageable mapping. A single mapping may result in a long list of segments, regardless of whether you have a single KVA buffer or multiple KVA buffers that might contribute to the mapping.

Yes yes, I understand that. But that's only if you want to map a buffer that's larger than PAGE_SIZE bytes, like, say, a 64K buffer being sent to a disk controller. What I want to make sure everyone understands here is that I'm not typically dealing with buffers this large: instead I have lots of small buffers that are smaller than PAGE_SIZE bytes. A single mbuf alone is only 256 bytes, of which only a fraction is used for data. An mbuf cluster buffer is usually only 2048 bytes. Transmitted packets are typically fragmented across 2 or 3 mbufs: the first mbuf contains the header, and the other two contain data. (Or the first one contains part of the header, the second one contains additional header data, and the third contains data -- whatever.)

At most I will have 1500 bytes of data to send, which is less than PAGE_SIZE, and that 1500 bytes will be fragmented across a bunch of smaller buffers that are also smaller than PAGE_SIZE. Therefore I will not have one dmamap with multiple segments: I will have a bunch of dmamaps with one segment each.

(I can hear somebody out there saying: What about jumbo frames? Yes, with jumbo frames, I will have 9K buffers to deal with, and in that case, you could have one dmamap with several segments, and I am taking this into account with the updated code I've written.)

>> So unless I'm mistaken, for each mbuf in an mbuf list, what we have to do is this:
>>
>> - create a bus_dmamap_t for the data area in the mbuf using bus_dmamap_create()

> Creating a dmamap, depending on the architecture, could be expensive. You really want to create them in advance (or pool them), with at most one dmamap per concurrent transaction you support in your driver.

The only problem here is that I can't really predict how many transactions will be going at one time. I will have at least RX_DMA_RING maps (one for each mbuf in the RX DMA ring), and some fraction of TX_DMA_RING maps. I could have the TX DMA ring completely filled with packets waiting to be DMA'ed and transmitted, or I may have only one entry in the ring currently in use. So I guess I have to allocate RX_DMA_RING + TX_DMA_RING dmamaps in order to be safe.

>> - do the physical to bus mapping with bus_dmamap_load()

> bus_dmamap_load() only understands how to map a single buffer. You will have to pull pieces of bus_dmamap_load into a new function (or create inlines for common bits) to do this correctly. The algorithm goes something like this:
>
>     foreach mbuf in the mbuf chain to load
>         /*
>          * Parse this contiguous piece of KVA into
>          * its bus space regions.
>          */
>         foreach bus space discontiguous region
>             if (too_many_segs)
>                 return (error);
>             Add new S/G element
>
> With the added complications of deferring the mapping if we're out of space, issuing the callback, etc.

Why can't I just call bus_dmamap_load() multiple times, once for each mbuf in the mbuf list?

(Note: for the record, an mbuf list usually contains one packet fragmented across multiple mbufs. An mbuf chain contains several mbuf lists, linked together via the m_nextpkt pointer in the header of the first mbuf in each list. By the time we get to the device driver, we always have mbuf lists only.)

> Chances are you are going to use the map again soon, so destroying it on every transaction is a waste.

Ok, I spent some more time on this. I updated the code at:

http://www.freebsd.org/~wpaul/busdma

The changes are:

- Tried to account for the case where an mbuf data region is larger than a page, i.e. when we have an mbuf with a 9K external buffer attached for use as a jumbo ethernet frame.

- Added routines to allocate a chunk of maps in a singly linked list, from which the other routines can grab them as needed. The driver attach routine calls bus_dmamap_list_init() with the max number of dmamaps that it will need, then the detach routine calls bus_dmamap_list_destroy() to nuke them when the driver is unloaded. The bus_dmamap_load_mbuf() routine uses the pre-allocated dmamaps from the list and bus_dmamap_destroy_mbuf() returns them to the list when the transaction is completed.

- Updated the modified if_sf driver to use the new code.

Again, I've got this code running on the test box in the lab, so it's correct inasmuch as it compiles and runs, even though it may not be aesthetically pleasing.

-Bill
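The quoted loop (walk every buffer, cut it into regions, fail if there are too many segments) can be sketched in plain C. Everything here is a stand-in: the sketch_* types are hypothetical, addresses are treated as plain integers, and every page boundary is assumed to start a new discontiguous region. A real implementation would translate KVA to bus addresses and might coalesce adjacent regions. The segment count is used as an in/out parameter: capacity on entry, segments used on return.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */

struct sketch_buf {              /* stand-in for one mbuf's data area */
    uintptr_t addr;
    size_t len;
    struct sketch_buf *next;
};

struct sketch_seg {              /* stand-in for one S/G element */
    uintptr_t addr;
    size_t len;
};

/*
 * Walk a buffer list and emit one segment per page-bounded region.
 * On entry *nsegs is the capacity of segs[]; on success it is updated
 * to the number of segments used. Returns 0 on success, or -1 if the
 * list needs more segments than the caller provided.
 */
int sketch_load_buflist(const struct sketch_buf *head,
                        struct sketch_seg *segs, int *nsegs)
{
    const struct sketch_buf *b;
    int used = 0;

    for (b = head; b != NULL; b = b->next) {
        uintptr_t addr = b->addr;
        size_t left = b->len;

        while (left > 0) {
            /* Bytes remaining before the next page boundary. */
            size_t room = SKETCH_PAGE_SIZE - (size_t)(addr % SKETCH_PAGE_SIZE);
            size_t chunk = left < room ? left : room;

            if (used == *nsegs)
                return -1;       /* too many segments */
            segs[used].addr = addr;
            segs[used].len = chunk;
            used++;
            addr += chunk;
            left -= chunk;
        }
    }
    *nsegs = used;
    return 0;
}
```

Under these assumptions a typical packet of three small buffers yields three single-page segments, while a page-aligned 9K jumbo buffer yields three segments from one buffer (4096, 4096 and 1024 bytes).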
Re: Where to put new bus_dmamap_load_mbuf() code
> The fact that the data is less than a page in size matters little to the bus dma concept. In other words, how is this packet presented to the hardware? Does it care that all of the component pieces are less than PAGE_SIZE in length? Probably not. It just wants the list of address/length pairs that compose that packet and there is no reason that each chunk needs to have its own, and potentially expensive, dmamap.

Maybe, but bus_dmamap_load() only lets you map one buffer at a time. I want to map a bunch of little buffers, and the API doesn't let me do that. And I don't want to change the API, because that would mean modifying busdma_machdep.c on each platform, which is a hell that I would rather avoid.

>> Why can't I just call bus_dmamap_load() multiple times, once for each mbuf in the mbuf list?

> Due to the cost of the dmamaps, the cost of which is platform and bus-dma implementation dependent - e.g. could be a 1-1 mapping to a hardware resource. Consider the case of having a full TX and RX ring in your driver. Instead of #TX+#RX dmamaps, you will now have three or more times that number.
>
> There is also the issue of coalescing the discontiguous chunks if there are too many chunks for your driver to handle. Bus dma is supposed to handle that for you (the x86 implementation doesn't yet, but it should) but it can't if it doesn't understand the segment limit per transaction. You've hidden that from bus dma by using a map per segment.

Ok, a slightly different question: what happens if I call bus_dmamap_load() more than once with different buffers but with the same dmamap?

>> (Note: for the record, an mbuf list usually contains one packet fragmented across multiple mbufs. An mbuf chain contains several mbuf lists, linked together via the m_nextpkt pointer in the header of the first mbuf in each list. By the time we get to the device driver, we always have mbuf lists only.)

> Okay, so I haven't written a network driver yet, but you got the idea, right? 8-)

Just don't get 3c509 and 3c905 mixed up and we'll be fine. :)

>> - Added routines to allocate a chunk of maps in a singly linked list, from which the other routines can grab them as needed.

> Are these hung off the dma tag or something? dmamaps may hold settings that are peculiar to the device that allocated them, so they cannot be shared with other clients of bus_dmamap_load_mbuf.

It's a separate list. The driver is responsible for allocating the head of the list, then it hands it to bus_dmamap_list_alloc() along with the required dma tag. bus_dmamap_list_alloc() then calls bus_dmamap_create() to populate the list. The driver doesn't have to manipulate the list itself, until time comes to destroy it.

-Bill
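The pre-allocated map list described above behaves like a simple free list. Here is a rough user-space sketch of that idea; the sketch_* names are hypothetical, malloc() stands in for bus_dmamap_create(), and none of this is the code at the URL mentioned in the thread.

```c
#include <stdlib.h>

struct sketch_map {              /* stand-in for a bus_dmamap_t */
    int id;
    struct sketch_map *next;
};

struct sketch_maplist {          /* list head owned by the driver */
    struct sketch_map *free;
    int nfree;
};

/* Release every map currently on the list (maps still in use must be returned first). */
void sketch_maplist_destroy(struct sketch_maplist *l)
{
    while (l->free != NULL) {
        struct sketch_map *m = l->free;
        l->free = m->next;
        free(m);
    }
    l->nfree = 0;
}

/* Populate the list with count maps up front, e.g. at attach time. */
int sketch_maplist_alloc(struct sketch_maplist *l, int count)
{
    int i;

    l->free = NULL;
    l->nfree = 0;
    for (i = 0; i < count; i++) {
        struct sketch_map *m = malloc(sizeof(*m));
        if (m == NULL) {
            sketch_maplist_destroy(l);
            return -1;
        }
        m->id = i;
        m->next = l->free;
        l->free = m;
        l->nfree++;
    }
    return 0;
}

/* Take a map off the list for a new transaction; NULL if none are left. */
struct sketch_map *sketch_map_get(struct sketch_maplist *l)
{
    struct sketch_map *m = l->free;

    if (m != NULL) {
        l->free = m->next;
        m->next = NULL;
        l->nfree--;
    }
    return m;
}

/* Return a map to the list once its transaction has completed. */
void sketch_map_put(struct sketch_maplist *l, struct sketch_map *m)
{
    m->next = l->free;
    l->free = m;
    l->nfree++;
}
```

The expensive creation step happens once per map, and the per-packet path only pushes and pops list entries.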
Where to put new bus_dmamap_load_mbuf() code
Okay, I decided today to write a bus_dmamap_load_mbuf() routine to make it a little easier to convert the PCI NIC drivers to use the busdma API. It's not the same as the NetBSD code. There are four new functions:

bus_dmamap_load_mbuf()
bus_dmamap_unload_mbuf()
bus_dmamap_sync_mbuf()
bus_dmamap_destroy_mbuf()

This is more or less in keeping with the existing API, except the new routines work exclusively on mbuf lists. The thing I need to figure out now is where to put the code. The current suggestion from jhb is to create the following two new files:

sys/kern/kern_busdma.c
sys/sys/busdma.h

The functions are machine-independent, so they shouldn't be in sys/arch/arch/busdma_machdep.c. I mean, they could go there, but that would just result in code duplication.

If somebody has a better suggestion, now's the time to speak up. Please let's avoid creating another bikeshed over this.

Current code snapshot resides at:

http://www.freebsd.org/~wpaul/busdma

There's also a modified version of the Adaptec starfire driver there which uses the new routines. I'm running this version of the driver on a test box in the lab right now.

-Bill
Re: Where to put new bus_dmamap_load_mbuf() code
> Another thing- maybe I'm confused- but I still don't see why you want to require the creating of a map each time you want to load an mbuf chain. Wouldn't it be better and more efficient to let the driver decide when and where the map is created and just use the common code for loads/unloads?

Ever hear the phrase "you get what you pay for"? The API isn't all that clear, and we don't have a man page or document that describes in detail how to use it properly. Rather than whining about that, I decided to tinker with it and Use The Source, Luke (tm). This is the result.

My understanding is that you need a dmamap for every buffer that you want to map into bus space. Each mbuf has a single data buffer associated with it (either the data area in the mbuf itself, or external storage). We're not allowed to make assumptions about where these buffers are. Also, a single ethernet frame can be fragmented across multiple mbufs in a list.

So unless I'm mistaken, for each mbuf in an mbuf list, what we have to do is this:

- create a bus_dmamap_t for the data area in the mbuf using bus_dmamap_create()
- do the physical to bus mapping with bus_dmamap_load()
- call bus_dmamap_sync() as needed (might handle copying if bounce buffers are required)
- insert mysterious DMA operation here
- do post-DMA sync as needed (again, might require bounce copying)
- call bus_dmamap_unload() to un-do the bus mapping (which might free bounce buffers if some were allocated by bus_dmamap_load())
- destroy the bus_dmamap_t

One memory region, one DMA map. It seems to me that you can't use a single dmamap for multiple memory buffers, unless you make certain assumptions about where in physical memory those buffers reside, and I thought the idea of busdma was to provide a consistent, opaque API so that you would not have to make any assumptions.

Now if I've gotten any of this wrong, please tell me how I should be doing it. Remember to show all work. I don't give partial credit, nor do I grade on a curve.

> Yay! The current suggestion is fine except that each platform might have a more efficient, or even required, actual h/w mechanism for mapping mbufs.

It might, but right now, it doesn't. All I have to work with is the existing API. I'm not here to stick my fingers in it and change it all around. I just want to add a bit of code on top of it so that I don't have to go through quite so many contortions when I use the API in network adapter drivers.

> I'd also be a little concerned with the way you're overloading stuff into mbuf itself- but I'm a little shakier on this.

I thought about this. Like it says in the comments, at the device driver level, you're almost never going to be using some of the pointers in the mbuf header. On the RX side, *we* (i.e. the driver) are allocating the mbufs, so we can do whatever the heck we want with them until such time as we hand them off to ether_input(), and by then we will have put things back the way they were. For the TX side, by the time we get the mbufs off the send queue, we always know we're going to have just an mbuf list (and not an mbuf chain), and we're going to toss the mbufs once we're done with them, so we can trample on certain things that we know don't matter to the OS or network stack anymore.

The alternatives are:

- Allocate some extra space in the DMA descriptor structures for the necessary bus_dmamap_t pointers. This is tricky with this particular NIC, and a little awkward.

- Allocate my own private arrays of bus_dmamap_t that mirror the DMA rings. This is yet more memory I need to allocate and free at device attach and detach time.

I've got space in the mbuf header. It's not being used. It's right where I need it. Why not take advantage of it?

> Finally- why not make this an inline?

Er... because that idea offended my delicate sensibilities? :)

-Bill
Need reviewers for busdma changes to ethernet driver
Hi folks:

Well, after threatening to do it for a long time, I finally sat down and converted one of my ethernet drivers to use the bus_dma API so that I no longer have to do things like call contigmalloc() and/or vtophys() directly. The changes I made are to the driver in -current, and the new code is at:

http://www.freebsd.org/~wpaul/SiS/busdma

I have tested this driver on FreeBSD/x86 using a NatSemi DP83815 card (the Netgear FA312TX) and it seems to work fine for me. However, I'm not 100% certain I used the busdma API properly in all cases. If anyone with a busdma clue would care to look over the code and see if everything looks more or less legal, I would appreciate it. My main concern is that I'm using bus_dmamap_load() and bus_dmamap_unload() correctly (i.e. such that I'm not leaking any resources).

Unless anyone raises serious objections, I would like to commit this code ASAP (the last test I really need to do is make sure it works correctly on an alpha).

-Bill
Fixing ypbind with TI-RPC
Ok. Friday I sat down and tried to make the -m option to ypbind work correctly using the new TI-RPC code. Unfortunately, my test machine chose that day to eat itself. Even more unfortunately, it was an AMD 900MHz Thunderbird. Today, I started working on another box and managed to get things to work, but there are some problems that still need solving. I need some input to decide how to do this.

The problem is with the code in yp_ping.c. This module contains a special version of clntudp_call() which has been modified in two ways:

1) If the XDR encode routine is specified as NULL, it skips the transmit portion of clntudp_call() and jumps straight to receiving and decoding the reply.

2) When processing a reply, the routine omits the check of the transaction ID, so that the reply will be processed even if its XID doesn't match the XID of the request that was last sent.

This is done so that we can send a bunch of YPPROC_DOMAIN_NONACK requests to different servers, each with a different transaction ID, then wait to see who replies first. Distinguishing the servers based on the XID gets around the case where the server is multihomed and replies on an interface other than the one where it received the original RPC. This is basically an asynchronous RPC, where the request and response are handled separately rather than in the context of a single clntudp_call().

Anyway, now that we have the TI-RPC library, the magic clntudp_a_call() routine needs to be changed to a clnt_dg_a_call(). Unfortunately, when I tried to do this, I ran into a serious problem:

- The clnt_dg.c module has several module-wide lock variables which are shared between the create/call/destroy methods. Trying to set up a private call method won't work, because the lock variables are static, hence not exported from the clnt_dg.o object module. As a hack I created a separate clnt_dg.c module which I linked directly into a test ypbind binary, but this is not what I consider a proper solution.

Basically, I can't do things the way I did them with the older RPC code because of the threading/mutex locks. Even building a separate clnt_dg.o module with modifications was harder than it needed to be because the clnt_dg.c code #includes special header files within the libc source (src/lib/libc/include) which aren't available if you aren't building the world.

The solution I'm leaning towards at the moment is adding the necessary hacks to src/lib/libc/rpc/clnt_dg.c in such a way that they can be enabled when needed with a special CLSET flag using clnt_control(). Then I can rip out the custom call method code from yp_ping.c entirely. I'm a little reluctant to do this since I was under the impression that creating a custom method should still work, but it looks as if this problem is endemic even to the original Sun TI-RPC code, not just us.

Comments? Questions? Pie?

-Bill
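The idea of telling servers apart by XID instead of by source address can be sketched as a small lookup table. The names below are made up for illustration and are not taken from yp_ping.c; the sketch assumes each server simply gets base_xid plus its index.

```c
#include <stdint.h>

#define SKETCH_MAX_SERVERS 16    /* arbitrary limit for this sketch */

struct sketch_ping_set {
    uint32_t xid[SKETCH_MAX_SERVERS];   /* XID used for each server's request */
    int nservers;
};

/* Give every server its own XID, derived here from a base value (an assumption). */
int sketch_ping_init(struct sketch_ping_set *p, uint32_t base_xid, int nservers)
{
    int i;

    if (nservers < 1 || nservers > SKETCH_MAX_SERVERS)
        return -1;
    for (i = 0; i < nservers; i++)
        p->xid[i] = base_xid + (uint32_t)i;
    p->nservers = nservers;
    return 0;
}

/*
 * Map a reply's XID back to the index of the server it was sent to.
 * The reply's source address is deliberately not consulted, so a
 * multihomed server answering from another interface still matches.
 * Returns -1 if the XID belongs to none of the outstanding requests.
 */
int sketch_ping_match(const struct sketch_ping_set *p, uint32_t reply_xid)
{
    int i;

    for (i = 0; i < p->nservers; i++) {
        if (p->xid[i] == reply_xid)
            return i;
    }
    return -1;
}
```

The first reply whose XID matches any entry identifies the fastest server, whatever address it came from.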
Re: Fixing ypbind with TI-RPC
> Why can't you just enable sigio on the reply socket, send all the requests with a 0 timeout and then wait for a signal to either interrupt the sending or to notify you when you complete sending? Your solution seems awfully complex for what seems to be a simple problem; doing a directed broadcast and taking the first answer you get back. Is the whole reason you need to do this because you're using the xid to differentiate between the servers?

Once upon a time, a young coder needed to be able to send multiple RPC requests and then wait for a single reply, instead of adhering to the strict request/reply mechanism in the existing code. So he studied the problem over many long nights. He fretted. He fussed. He tried not to reinvent the wheel or cure cancer while he was at it. Eventually, he made a couple of minor changes to a piece of existing code that did exactly what he wanted. The end.

> If that's true then you have a couple of options... We could add a hack to our version of the rpc library to allow manipulation/query of the xid. Or if the reply's source doesn't match any of the destinations you can remember it and send out another ping to that address.

You're already allowed to get/set the transaction ID using CLGET_XID and CLSET_XID, and I intend to use this too. All I want to do is invoke YPPROC_DOMAIN_NONACK on a bunch of servers and see which one replies first, and I need to do it using unicasts because broadcasts won't get forwarded across routers or point-to-point dialup links. I originally created this monstrosity for NIS+, and decided it would be useful to silence the people who insisted on putting their NIS clients on separate networks from their NIS servers, and didn't want to set up NIS slaves on the remote subnets like they were supposed to.

I'm including a patch to clnt_dg.c and clnt.h that adds the functionality I need.
It's quite small, and it's the path of least resistance, which is why I prefer it in this case in spite of the brokenness it seeks to suppress.

Oh, there's also a fix for a bug in here. At least, I think it's a bug. The clnt_dg_call() routine increments the transaction ID before transmitting a request. It assumes that the XID is a 32-bit value at a certain position in the block to be sent, and simply does a cast to a u_int32_t and an in-place increment. The problem is, this value is actually in network byte order, so to increment it properly, you need to ntohl() it first, increment, then htonl() it back. Of course, if you believe Sun, all the world's a SPARC, so for them it doesn't matter.

This really only becomes a problem if you actually use the CLGET_XID and CLSET_XID control codes on a UDP client handle: the code in clnt_dg_control() does the proper byte swapping, but clnt_dg_call() doesn't. I'm not positive I'm doing the right thing here, but without this fix, my newly hacked __yp_ping() routine produces some weird results.

-Bill

*** clnt_dg.c.orig	Mon Mar 26 21:17:00 2001
--- clnt_dg.c	Mon Mar 26 21:21:08 2001
***************
*** 126,131 ****
--- 126,132 ----
  	char		*cu_outbuf;
  	u_int		cu_recvsz;	/* recv size */
  	struct pollfd	pfdp;
+ 	int		cu_async;
  	char		cu_inbuf[1];
  };
***************
*** 238,243 ****
--- 239,245 ----
  	cu->cu_total.tv_usec = -1;
  	cu->cu_sendsz = sendsz;
  	cu->cu_recvsz = recvsz;
+ 	cu->cu_async = FALSE;
  	(void) gettimeofday(&now, NULL);
  	call_msg.rm_xid = __RPC_GETXID(&now);
  	call_msg.rm_call.cb_prog = program;
***************
*** 312,317 ****
--- 314,320 ----
  	socklen_t fromlen, inlen;
  	ssize_t recvlen = 0;
  	int rpc_lock_value;
+ 	u_int32_t xid;
  	sigfillset(&newmask);
  	thr_sigsetmask(SIG_SETMASK, &newmask, &mask);
***************
*** 336,347 ****
  call_again:
  	xdrs = &(cu->cu_outxdrs);
  	xdrs->x_op = XDR_ENCODE;
  	XDR_SETPOS(xdrs, cu->cu_xdrpos);
  	/*
  	 * the transaction is the first thing in the out buffer
  	 */
! 	(*(u_int32_t *)(void *)(cu->cu_outbuf))++;
  	if ((! XDR_PUTINT32(xdrs, proc)) ||
  	    (! AUTH_MARSHALL(cl->cl_auth, xdrs)) ||
  	    (! (*xargs)(xdrs, argsp))) {
--- 339,357 ----
  call_again:
  	xdrs = &(cu->cu_outxdrs);
+ 	if (cu->cu_async == TRUE && xargs == NULL)
+ 		goto get_reply;
  	xdrs->x_op = XDR_ENCODE;
  	XDR_SETPOS(xdrs, cu->cu_xdrpos);
  	/*
  	 * the transaction is the first thing in the out buffer
+ 	 * XXX Yes, and it's in network byte order, so we should to
+ 	 * be careful when we increment it, shouldn't we.
  	 */
! 	xid = ntohl(*(u_int32_t *)(void *)(cu->cu_outbuf));
! 	xid++;
! 	*(u_int32_t *)(void *)(cu->cu_outbuf) = htonl(xid);
!
  	if ((! XDR_PUTINT32(xdrs, proc)) ||
  	    (! AUTH_MARSHALL(cl->cl_auth, xdrs)) ||
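The byte-order point in the message above can be shown in isolation. What follows is only a minimal sketch, not the libc code: xid_bump() is a hypothetical helper name, and it uses memcpy() instead of the pointer cast in the patch so that it does not depend on buffer alignment.

```c
#include <arpa/inet.h>	/* ntohl(), htonl() */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Increment the 32-bit transaction ID stored in network byte order at
 * the start of an outgoing RPC buffer, and return the new value in host
 * order.  xid_bump() is a hypothetical helper, not part of libc's RPC
 * code.
 */
static uint32_t
xid_bump(unsigned char *outbuf)
{
	uint32_t xid;

	assert(outbuf != NULL);
	memcpy(&xid, outbuf, sizeof(xid));	/* read the wire-order value */
	xid = ntohl(xid);			/* wire order -> host order */
	xid++;					/* arithmetic in host order */
	xid = htonl(xid);			/* host order -> wire order */
	memcpy(outbuf, &xid, sizeof(xid));
	return (ntohl(xid));
}
```

On a big-endian machine the two conversions are no-ops, which is why the in-place increment goes unnoticed there; on a little-endian machine, skipping them bumps the wrong byte of the on-wire value.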
Re: newcard/cardbus instabilities
That's a bit ugly. xl0: 3Com 3c575C Fast Etherlink XL port 0x3000-0x307f mem 0x4402-0x4403,0x44002480-0x440024ff,0x44002400-0x4400247f irq 10 at device 0.0 on cardbus1 xl0: chip is in D6 power mode -- setting to D0 I'm a bit worried about this; "D6" doesn't really exist, so it's possible that something is going wrong here. Bill; you might have some better ideas than I do. Suggestions? My suggestion? Chop out the power management stuff in xl_attach() and see what happens. The xl driver is using the pci_get_powerstate() and pci_set_powerstate() routines right now in order to check for PCI NICs that have been forced into the D3 state by Windoze during shutdown. However, those functions are internal to the PCI bus code, and I'm not sure what will happen when you try to use them with devices that are children of a cardbus bus. So, edit /sys/pci/if_xl.c, find the xl_attach() function, and comment out/#ifdef out/delete the section that checks the power state of the card. Like Mike says, the D6 state is bogus. Unfortunately, I can't test this myself at the moment since I find myself without a laptop. I might be able to coerce^Wconvince John Baldwin to let me test this with his though. -Bill To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Any people with 3c905CX cards out there?
3Com has yet another revision of the Tornado chipset floating around out there on newer 3c905C adapters. Supposedly, these are marked as 3c905CX and have become available within the last couple of months. I've seen some noise on the Linux mailing lists that seems to indicate that some driver mods were necessary due to reset timing differences introduced in the new chipset, however I haven't been able to get my hands on one of these cards yet so I don't know whether or not there are also problems with FreeBSD. Nobody has reported any yet, but it would be nice to confirm the issue one way or the other. If someone has one of these cards and is using it with the xl driver, I'd be interested to know how well (or how badly) it's working. -Bill To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Getting at cardbus CIS data from inside drivers
Okay. Recently, David O'Brien handed me an Intel 10/100 Cardbus NIC, which uses the 21143-PB chip. It's a non-MII card (has a Quality Semi symbol PHY). Unfortunately, it looks like Intel has taken a few shortcuts with this card: the serial EEPROM doesn't contain any useful information. Instead, the MAC address and, I presume, the GPIO programming info is stored in the CIS. When the card is inserted, the cardbus code prints out several 'Function Extension' lines, one of which contains the MAC address. The problem is, there's no way for me to obtain this info from inside the driver, unless I map the expansion ROM directly and grovel through the CIS myself, which I don't want to do.

I have the card working at the moment using a couple of ugly cheats: I programmed the MAC address in manually using ifconfig dc0 ether blah, and I brute forced the GPIO settings so that all of the pins are configured as outputs and are forced to 1's. This seems to be enough to activate the transceiver, and I can exchange traffic. (I'm composing this e-mail with it right now.) The LED programming is still off though: both LEDs are lit green, and stay on regardless of link indication or speed.

Is there any support planned for externalizing the CIS info somehow, i.e. by providing bus methods to call the CIS parsing routines? Another way to do it would be to pass the info down to the child device using ivars. I would imagine that there's similar support for this in Windows, otherwise Intel's driver wouldn't work.

-Bill
Re: vx driver patch
Someone (I can't find who in my records, please let me know if it was you so I can credit you in the commit message) sent out patches to make the vx driver not use the pci compat shims. I just found it in my home directory, applied it, tweaked things very minorly and it builds and boots. Trouble is, I don't have a vortex to test with. It also appears that there is no driver maintainer at this time, so I thought I'd send it here. Unfortunately, there are a couple of problems with this patch. Somebody tried copying the EISA attachment code too closely: there's only one I/O space that needs to be allocated (the pci_io allocation is bogus). The IRQ allocation needs the RF_SHAREABLE flag or it will blow up in the case where the IRQ is shared with another device. Also, the vx driver still uses the ugly hack of statically allocated softc structs. I was working on this in the office the other day and just got done testing it. I have patches to fix all of this, plus make it use the bus_space_*() stuff instead of inb/outb/etc, plus allow it to be compiled as a KLD. The only thing I didn't do was implement detach routines, which means the driver can be loaded as a KLD, but not unloaded. The driver should also build in the alpha. I'll commit the changes to -current shortly. -Bill To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Driver Floppy implementation (Re: make release breakage - dokern.sh patch 2)
I'm not sure whether the problem of loading secondary usb modules is a problem in 4.x but it is easy to try. Boot a machine without usb support compiled in. after login, kldload usb, then the miibus and then the if_aue modules. If that works, you should be ok. I cannot test this at the moment as I don't have a STABLE box (will have once the first RC comes out of JKH factories).

I usually do the following:

# kldload usb (probes USB controllers)
# kldload miibus
# kldload if_aue
# usbd -f /dev/usb0

If the device has already been plugged in, starting usbd will cause it to be probed/attached by the aue driver. If not, it will be detected when it's plugged in later.

-Bill
Re: if_rl.c broken ? Realtek 8139 not longer recognised.
Hi, I have a realtek ethernet card. The normal dmesg is this:

rl0: RealTek 8139 10/100BaseTX port 0xb400-0xb4ff mem 0xd900-0xd9ff irq 10 at device 11.0 on pci0
rl0: Ethernet address: 00:e0:7d:7d:cd:35
miibus0: MII bus on rl0
rlphy0: RealTek internal media interface on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

With the change to 8bit wide eeprom reads instead of the 6bit wide reads, the message is now:

rl0: RealTek 8139 10/100BaseTX port 0xb400-0xb4ff mem 0xd900-0xd9ff irq 10 at device 11.0 on pci0
rl0: Ethernet address: 00:e0:7d:7d:cd:35
rl0: unknown device ID: 4a7

I changed if_rl.c to confirm that it really is the 6/8 bit change:

Just fixed this. The device ID we compare with should be 0x8129 (hex), not 8129 (decimal). Sorry about that. Note that the cardbus hacks aren't in -stable yet so it wasn't affected.

-Bill
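The fix described above comes down to a missing 0x prefix. A minimal sketch of why the decimal constant can never match follows; both helper functions are hypothetical and written only for illustration, and the macro name is an assumption rather than a quote from if_rl.c.

```c
#include <stdint.h>

/* Assumed macro name; the value is the hex device ID from the message. */
#define RT_DEVICEID_8129	0x8129

/* Correct comparison: the ID read from the EEPROM is a hex quantity. */
static int
is_8129(uint16_t devid)
{
	return (devid == RT_DEVICEID_8129);
}

/*
 * Buggy comparison: decimal 8129 is 0x1fc1, so a real 0x8129 part
 * falls through to the "unknown device ID" case.
 */
static int
is_8129_buggy(uint16_t devid)
{
	return (devid == 8129);
}
```

With these definitions, is_8129(0x8129) is true while is_8129_buggy(0x8129) is false; the buggy form only matches the unrelated value 0x1fc1.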
Re: cvs commit: src/sys/pci if_dc.c
Hello Bill, I'm sorry about that. Here's some information that I can gather:

1. The Intel 21143 chip is integrated in an NEC VersaPro notebook PC. No LEDs to indicate network activity are available.
2. It is connected to a 10BaseT hub (HP 28688B) at half duplex.

Ok, two more things:

- Show me the output of pciconf -l.
- Is this supposed to be a 10/100 interface or just 10mbps?

-Bill
Re: cvs commit: src/sys/pci if_dc.c
Hello Bill, After the following commit, my system fail to connect to network. If I backout, seems to work again. Any comments appreciated. No no no. *You* are the one who's supposed to make the comments. Like exactly what card do you have (make/model)? Exactly what speed and duplex mode are you using? (10mbps? 100mbps? full duplex? half duplex?) What ifconfig command do you use to bring up the interface? What kind of hub/switch/whatever is the card connected to? You know, all the stuff that I can't figure out for myself because I can't see your computer from way over here. -Bill To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Looking for testers for if_dc patches
For reference the ID reported is:

de0@pci0:3:0: class=0x02 card=0x chip=0x00191011 rev=0x11 hdr=0x00

Hm, ok. First of all, I made a mistake in what I told you. The code in dcphy.c checks the subsystem ID, not the device ID. The device ID is always the same, since that identifies the 21143 chip, however the subsystem ID can vary from board to board depending on the manufacturer's whims. The odd thing is that the subsystem ID here is 0x (the "card=" value), however that doesn't rule out running our test. So, go back to dcphy.c and do this:

	case COMPAQ_PRESARIO_ID:
	case 0x:
		/* Example of how to only allow 10Mbps modes. */
		sc->mii_capabilities = BMSR_ANEG|BMSR_10TFDX|BMSR_10THDX;
		break;

Let me know if this has any effect.

-Bill
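For readers following along with pciconf -l output, here is a small sketch of how the 32-bit "chip=" and "card=" values break down. It assumes the usual packing with the vendor ID in the low 16 bits and the device ID in the high 16 bits (for "card=", the subsystem vendor and subsystem device respectively); pci_id_split() and struct pci_id are hypothetical names, not part of any driver.

```c
#include <stdint.h>

/* One half of a pciconf "chip=" or "card=" value each. */
struct pci_id {
	uint16_t vendor;	/* low 16 bits */
	uint16_t device;	/* high 16 bits */
};

/*
 * Split a 32-bit ID as printed by pciconf -l into its two 16-bit parts.
 * pci_id_split() is a hypothetical helper for illustration only.
 */
static struct pci_id
pci_id_split(uint32_t reg)
{
	struct pci_id id;

	id.vendor = (uint16_t)(reg & 0xffff);
	id.device = (uint16_t)((reg >> 16) & 0xffff);
	return (id);
}
```

Applied to the chip=0x00191011 value quoted above, this yields vendor 0x1011 and device 0x0019, which is the part that stays the same from board to board; only the "card=" value distinguishes one manufacturer's board from another.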
Re: Looking for testers for if_dc patches
Hi Bill, I applied your patches to -current without incidents. I have a testbox (Digital dual P6) that gives:

May 31 10:56:38 p6 /kernel: dc0: Intel 21143 10/100BaseTX port [...]
May 31 11:03:27 p6 /kernel: dc0: watchdog timeout

This box can also house an Alpha Miata MX5 mainboard, the Intel Alpha boards use the same PCI riser card that also contains the 21143 chip. The patches don't seem to help on this particular hardware. I will try to give the Alpha a spin too, later today. BTW: ifconfig-ing to use 10baseT/UTP does not help either. The media bulkhead is a 10baseT/10base2 one. if_de has no problems:

Alright, hold it. Stop. Just to make sure I understand:

- There's one interface involved here
- It has a 21143 chip
- It has 10baseT and AUI ports
- It's supposed to be 10Mbps only

If this is all correct, then I'd like you to try the following:

- Run pciconf -l on this machine and obtain the PCI ID for this device. The device ID is the hex number after the "chip=" section in the output. For the sake of this example, let's say it's 0x12345678.
- Bring up /sys/dev/mii/dcphy.c in your favorite editor.
- Look for the following code in the dcphy_attach() routine:

	case COMPAQ_PRESARIO_ID:
		/* Example of how to only allow 10Mbps modes. */
		sc->mii_capabilities = BMSR_ANEG|BMSR_10TFDX|BMSR_10THDX;
		break;

- Add your PCI device ID like this:

	case COMPAQ_PRESARIO_ID:
	case 0x12345678:
		/* Example of how to only allow 10Mbps modes. */
		sc->mii_capabilities = BMSR_ANEG|BMSR_10TFDX|BMSR_10THDX;
		break;

One thing I discovered is that trying to enable 100Mbps autoneg on a device that only has a 10Mbps port doesn't work. This broke the support for the 10Mbps ethernet in certain Compaq Presario machines, which is why I special-cased it. This will not make the AUI port work (I need to add extra code for that) but if this is the same problem as the Compaq, it should allow the 10baseT port to work. Let me know if this has any effect.
-Bill
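The capability mask used in the dcphy.c snippet above can be looked at on its own. The sketch below assumes the standard MII status register bit layout for the BMSR_* values and is not taken from the driver; limit_to_10mbps() and MEDIA_10_ONLY are hypothetical names. Note that the driver snippet assigns the mask directly, whereas this sketch filters an existing capability word, which is a slightly different operation.

```c
#include <stdint.h>

/* MII status register capability bits (assumed standard layout). */
#define BMSR_100TXFDX	0x4000	/* 100baseTX full duplex */
#define BMSR_100TXHDX	0x2000	/* 100baseTX half duplex */
#define BMSR_10TFDX	0x1000	/* 10baseT full duplex */
#define BMSR_10THDX	0x0800	/* 10baseT half duplex */
#define BMSR_ANEG	0x0008	/* autonegotiation capable */

/* Same combination as in the quoted case statement. */
#define MEDIA_10_ONLY	(BMSR_ANEG | BMSR_10TFDX | BMSR_10THDX)

/*
 * Drop every capability except autoneg and the two 10Mbps modes.
 * limit_to_10mbps() is a hypothetical helper for illustration.
 */
static uint16_t
limit_to_10mbps(uint16_t caps)
{
	return (caps & MEDIA_10_ONLY);
}
```

A PHY that advertises all four speeds plus autoneg (0x7808) comes out as 0x1808, so the 100Mbps modes are never offered during negotiation.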
Re: Looking for testers for if_dc patches
- There's one interface involved here Correct. - It has a 21143 chip Well, the de driver says 21142. The dc driver says 21143. It's just a difference in chip revision, really. This one does not have AUI so that is not going to be a problem. What I do wonder, though, is what will happen if a 10/100Mbit bulkhead is installed on this machine. I don't expect the PCI ID to change (right?). I can pull the 10/100 bulkhead from my Miata GL to give this a try. It would help if you could look at both of them and tell me what chips are on them. The 21143 can do 10Mbps all by itself, but for 100Mbps you'd need an extra transceiver. I've been working under the assumption that they're just using the built-in 10baseT port on the 21143, but it's possible they're using the GPIO bits to do some funny business to switch the ports. In the meantime I gave your patch a quick try and I unfortunately don't see a change in behaviour. Still watchdog timeouts and no connection. Question: I had expected dmesg and ifconfig to report 10Mbit only modes. They still show 100 as supported media in addition to the 10Mbit modes. You have to be able to tell that the chip only supports 10Mbps modes. The 21143 is a 100Mbps chip, and only in certain cases do people design 10Mbps-only NICs around it. The problem is that to know if you've got only 10Mbps, you normally have to slog through the SROM info, however a lot of card vendors get this wrong, so I don't even bother with it. There is something else that might interest you: when replacing a 10 Mbit only bulkhead with a 10/100 one you need to connect it to the PCI bulkhead with a different cable to a different connector (on the PCI bulkhead). The 10/100 one is silkscreened as MII. Then it probably has a 10/100 PHY on it. Assuming the driver can probe it without having to flip any magic GPIO bits, it should work. Could this mean the driver sees a MII interface while in this particular setup the bulkhead is connected to something non-MII ? 
Wild guess maybe.. I'm sure it is non-MII. It's still supposed to work, however it's hard to tell just what I'm supposed to do to make it happy from way over here. -Bill To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Looking for testers for if_dc patches
Several people have reported problems with if_dc botching autonegotiation on 21143 NICs with non-MII media, such as the DEC/Compaq DE500-BA and the built-in 10/100 ethernet on some alphas. As my first official act as a BSDi/WC employee, I sat down and tried to fix this. I produced some patches for if_dc.c/if_dcreg.h and dcphy.c, which are sitting at http://people.freebsd.org/~wpaul/dc_test. To apply them, do the following:

# cd /sys/pci
# patch < if_dc.patch
# cd /sys/dev/mii
# patch < dcphy.patch

These patches should work on either 4.0-STABLE or 5.0-CURRENT. (They should also work on 4.0-RELEASE.) There are also some fixes for the Macronix 98713A/98715/98715A and the LC82C115 PNIC II, which also use the 21143-style NWAY interface.

Note that I still need to add code to properly set the LEDs on 21143 boards. I went after the autoneg problem first since it was somewhat more pressing. In any event, please try these patches and report the results to [EMAIL PROTECTED]

-Bill
Re: Looking for testers for if_dc patches
On Tue, May 30, 2000 at 12:28:25AM -0700, Bill Paul wrote:

Several people have reported problems with if_dc botching autonegotiation on 21143 NICs with non-MII media, such as the DEC/Compaq DE500-BA and the built-in 10/100 ethernet on some alphas. As my first official act as a BSDi/WC employee, I sat down and tried to fix this. I produced some patches for if_dc.c/if_dcreg.h and dcphy.c, which are sitting at http://people.freebsd.org/~wpaul/dc_test. To apply them, do the following: [...]

cc -c -O -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -fformat-extensions -ansi -g -nostdinc -I- -I. -I../.. -I../../../include -D_KERNEL -include opt_global.h -elf -mno-fp-regs -ffixed-8 -Wa,-mev56 ../../pci/if_dc.c
../../pci/if_dc.c: In function `dc_init':
../../pci/if_dc.c:2697: structure has no member named `dc_flgs'
*** Error code 1
Stop in /var/d7/src-2000-05-28/src/sys/compile/CICELY9.

This is on 5.0-CURRENT as of 28th May on alpha

Grrr. Typo on my part, sorry. It should be flags, not flgs. I just fixed the patch file. You can download it again, or just correct the typo manually.

-Bill
Re: NFS, rl0 and Alpha
Of all the gin joints in all the towns in all the world, Gary Jennejohn had to walk into mine and say: [...] Yes, this patch fixes the problem. Thank you, Bill Paul ! *sigh* It figures. Ok, I applied the patch to -current and -stable. We now return you to your regularly scheduled program. Please drive through. -Bill -- = -Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu Work: [EMAIL PROTECTED] | Center for Telecommunications Research Home: [EMAIL PROTECTED] | Columbia University, New York City = "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness" = To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: NFS, rl0 and Alpha
Of all the gin joints in all the towns in all the world, Gary Jennejohn had to walk into mine and say:

OK. Unfortunately, gdb core dumps when I try to analyze a crash dump with a debugging kernel :( Even worse, gdb core dumps when I try to run a debugging gdb in gdb to find out why gdb is core dumping when I try to debug a kernel with symbols :((

Wonderful. I suspect this may have something to do with the way packets sometimes wrap from the end of the RX buffer pool to the beginning. This might result in fragmentation across multiple mbufs in some cases (I think). If I squint hard enough, I can see a way for the data to end up misaligned in one of the additional mbufs.

Try this patch. It's an untested hack (I don't have a RealTek card in a test box right this second) but should fix the problem if it's what I think it is.

-Bill

P.S.: Regardless, somebody should fix gdb.

--
=
-Bill Paul (212) 854-6020 | System Manager, Master of Unix-Fu
Work: [EMAIL PROTECTED] | Center for Telecommunications Research
Home: [EMAIL PROTECTED] | Columbia University, New York City
=
"It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=

*** if_rl.c.orig	Sat Apr 29 14:15:10 2000
--- if_rl.c	Thu May 4 22:16:31 2000
***************
*** 913,919 ****
  		goto fail;
  	}
! 	sc->rl_cdata.rl_rx_buf = contigmalloc(RL_RXBUFLEN + 32, M_DEVBUF,
  		M_NOWAIT, 0, 0x, PAGE_SIZE, 0);
  	if (sc->rl_cdata.rl_rx_buf == NULL) {
--- 911,917 ----
  		goto fail;
  	}
! 	sc->rl_cdata.rl_rx_buf = contigmalloc(RL_RXBUFLEN + 1518, M_DEVBUF,
  		M_NOWAIT, 0, 0x, PAGE_SIZE, 0);
  	if (sc->rl_cdata.rl_rx_buf == NULL) {
***************
*** 1122,1129 ****
  		wrap = (sc->rl_cdata.rl_rx_buf + RL_RXBUFLEN) - rxbufpos;
  		if (total_len > wrap) {
  			m = m_devget(rxbufpos - RL_ETHER_ALIGN,
! 			    wrap + RL_ETHER_ALIGN, 0, ifp, NULL);
  			if (m == NULL) {
  				ifp->if_ierrors++;
  				printf("rl%d: out of mbufs, tried to "
--- 1120,1132 ----
  		wrap = (sc->rl_cdata.rl_rx_buf + RL_RXBUFLEN) - rxbufpos;
  		if (total_len > wrap) {
+ 			/*
+ 			 * Fool m_devget() into thinking we want to copy
+ 			 * the whole buffer so we don't end up fragmenting
+ 			 * the data.
+ 			 */
  			m = m_devget(rxbufpos - RL_ETHER_ALIGN,
! 			    total_len + RL_ETHER_ALIGN, 0, ifp, NULL);
  			if (m == NULL) {
  				ifp->if_ierrors++;
  				printf("rl%d: out of mbufs, tried to "
***************
*** 1132,1145 ****
  				m_adj(m, RL_ETHER_ALIGN);
  				m_copyback(m, wrap, total_len - wrap,
  					sc->rl_cdata.rl_rx_buf);
- 				if (m->m_len < sizeof(struct ether_header))
- 					m = m_pullup(m,
- 					    sizeof(struct ether_header));
- 				if (m == NULL) {
- 					printf("rl%d: m_pullup failed",
- 					    sc->rl_unit);
- 					ifp->if_ierrors++;
- 				}
  			}
  			cur_rx = (total_len - wrap + ETHER_CRC_LEN);
  		} else {
--- 1135,1140 ----

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
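The wrap-around case the patch deals with can be sketched outside the kernel. This is only an illustration of copying a frame out of a circular receive buffer into one contiguous destination, under the assumption that the frame may straddle the end of the ring; ring_copy_out() is a hypothetical helper and does not use mbufs, m_devget() or m_copyback().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a frame of 'len' bytes that starts at offset 'pos' in a circular
 * receive buffer of 'ringlen' bytes into the flat buffer 'dst'.
 * ring_copy_out() is a hypothetical helper, not code from if_rl.c.
 */
static void
ring_copy_out(const unsigned char *ring, size_t ringlen, size_t pos,
    unsigned char *dst, size_t len)
{
	size_t wrap;

	assert(pos < ringlen && len <= ringlen);
	wrap = ringlen - pos;	/* bytes left before the end of the ring */
	if (len > wrap) {
		/* Frame wraps: copy the tail of the ring, then the head. */
		memcpy(dst, ring + pos, wrap);
		memcpy(dst + wrap, ring, len - wrap);
	} else {
		/* Frame is contiguous: a single copy is enough. */
		memcpy(dst, ring + pos, len);
	}
}
```

Because the destination is one contiguous buffer, the data lands in a single piece regardless of where the frame wrapped, which is the property the patch is after when it asks for the whole length up front.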
Re: PCI-CardBus bridge + PCMCIA Lucent WaveLAN IEEE troubles
Of all the gin joints in all the towns in all the world, Michael I. Vasilenko had to walk into mine and say:

pccardd[166]: Card "Lucent Technologies"("WaveLAN/IEEE") matched "Lucent Technologies" ("WaveLAN/IEEE")
pccardd[166]: Using I/O addr 0x100, size 64
pccardd[166]: Setting config reg at offs 0x3e0 to 0x41, Reset time = 50 ms
pccardd[166]: Assigning I/O window 0, start 0x100, size 0x40 flags 0x5
/kernel: wi0: WaveLAN/IEEE 802.11 at port 0x100-0x13f irq 7 slot 0 on pccard0
/kernel: wi0: Ethernet address: 00:60:1d:f6:cc:5d

and machine just hangs completely.

You did disable the parallel port on this machine so that you can safely use IRQ 7, right? And I don't mean "take the parallel port driver out of the kernel config." I mean "go into the computer's BIOS setup screen and turn the parallel port off."

-Bill

--
=
-Bill Paul (212) 854-6020 | System Manager, Master of Unix-Fu
Work: [EMAIL PROTECTED] | Center for Telecommunications Research
Home: [EMAIL PROTECTED] | Columbia University, New York City
=
"It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=
Re: ifconfig hang
Of all the gin joints in all the towns in all the world, Chuck Robey had to walk into mine and say: I'm trying to get current up on another test box, Who's exact CPU type and hardware configuration must be a state secret, since you didn't describe them here. Come _on_ people, how often do I have to keep harping on this? Don't just tell me "I have a box." Tell me about it! and this one has a CNET AX8814 equipped network card. One second after I do a ifconfig: ifconfig dc0 inet (somaddr) netmask (somemask) it hangs. It does this with a completely static kernel (shouldn't be loading any modules), even if I start up in single-user. My config has: device isa device eisa device pci device miibus # MII bus support device dc0 as far as network. My dmesg on the machine shows what I take to be a normal dc0 entry, but something I don't recognize for "amphy0" (I added cariage returns 'cause I know my mailer will do a worse job if I don't): dc0: ASIX AX88140A 10/100BaseTX port 0x6100-0x617f mem 0xf0201000-0xf020107f irq 12 at device 19.0 on pci0 dc0: Ethernet address: 00:80:ad:41:4a:95 miibus0: MII bus on dc0 amphy0: Am79C873 10/100 media interface on miibus0 amphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto Any idea why my hang might be happening? amphy is the driver for the transceiver on the card with the ASIX ethernet controller. The ASIX AX88140A doesn't have a built-in transceiver. It's actually the transceiver (PHY) that does the autonegotiation. I suspect it's really a Davicom PHY, but the Davicom parts look like they're designed to duplicate the register layout and operation of certain AMD PHYs, and they claim to have the same vendor/device ID info. Anyway. This is almost certainly a hardware problem. 
You haven't provided enough evidence for me to suspect it could be anything else (it would have helped if you had tried compiling the kernel with options DDB and attempted to break into the debugger to see where it was stuck -- if you actually did try this and it was wedged so bad that you couldn't break into the debugger, then you should have said so). The usual suspect in this sort of thing is some sort of problem with bus master DMA. Maybe you tried to overclock this system and got the timings wrong. Maybe the PCI chipset has bugs. Maybe it doesn't get along well with the ASIX part. Maybe you have an old machine that doesn't support bus master DMA on all of its slots, and you put the card in a slave-only slot without realizing it. As soon as you ifconfig an interface up, the kernel tries to send a gratuitous ARP through it, which triggers a transmission and a DMA. If there's a problem, this DMA operation could wedge the bus. Some of the other cards need to do a DMA just to program the receive filter (though the ASIX is not one of these). I have tested the dc driver with an ASIX card and I'm pretty sure I didn't do anything recently to goof it up, otherwise somebody else would have complained by now. (Right guys? Right? Bah.) I would try to scrounge up an MS-DOG boot floppy and run the diagnostics on the diskette supplied with the card. If the vendor-supplied diags also wedge the system during a transmission, then you need to check your hardware. -Bill -- ===== -Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu Work: [EMAIL PROTECTED] | Center for Telecommunications Research Home: [EMAIL PROTECTED] | Columbia University, New York City = "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness" = To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Suggestions for Gigabit cards for -CURRENT
Of all the gin joints in all the towns in all the world, Kenneth D. Merry had to walk into mine and say: Talking of the XMAC II, there's one other thing I forgot to mention earlier. The FreeBSD sk driver does jumbo frames, but the SysKonnect drivers don't. At least, not yet. The XMAC II's receive FIFO is 8K. By default, the chip operates in 'store and forward' mode in order to perform error checking on received frames (it has to get the entire frame in the FIFO in order to do a CRC on it, I think). This is fine for normal frames, but if you want to handle jumbograms larger than 8192 bytes, you have to put the chip into 'streaming' mode, otherwise any frame larger than 8192 bytes will be truncated. To get 'streaming' mode to work, you have to disable all of the RX error checking. That is unfortunate, since it means you can't do checksum offloading with jumbo frames. Uhm. I'm not sure about that. The 8K FIFO limitation is in the XMAC II, not in the GEnesis controller. And I believe it's the GEnesis that actually does the hardware checksumming stuff. Oh, and the XMAC appears to have a 4K TX FIFO, not 2K. My mistake. FWIW, of the three gigabit ethernet implementations I've seen anything of (Alteon, Intel, SysKonnect), none have implemented all of the hooks necessary for a seamless zero copy receive implementation. Alteon comes the closest, but they don't support splitting out the headers (yet), which is a requirement for us. The only way to do zero copy receive with our VM architecture (that I know of) is page flipping, i.e. receive the page in the kernel, and then trade it for the user's page. You can't do it on anything less than page-sized granularity, and things have to be page aligned. (The IO-Lite stuff from Rice is an exception to all this.) The nice thing about the Alteon boards, though, is that you can modify the firmware, and so header splitting is an option there. 
It would even be possible to split the headers off of IPv6 packets, or any other protocol that you have knowledge of.

If you can actually modify the firmware to do this then you have a lot more guru points than I do. :) I've looked at the Alteon firmware code but it's all quite opaque to me.

-Bill

--
=
-Bill Paul (212) 854-6020 | System Manager, Master of Unix-Fu
Work: [EMAIL PROTECTED] | Center for Telecommunications Research
Home: [EMAIL PROTECTED] | Columbia University, New York City
=
"It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=
Re: Suggestions for Gigabit cards for -CURRENT
Of all the gin joints in all the towns in all the world, Kenneth D. Merry had to walk into mine and say: On Wed, Feb 02, 2000 at 13:03:09 -0500, Thomas Stromberg wrote: We're currently looking at upgrading several of our FreeBSD servers (dual PIII-600's, 66MHz PCI) and some Sun Ultra's to Gigabit Ethernet. We plan to hook these machines into our Cisco Catalyst 5000 server. They will most likely move to be running FreeBSD 4.x by the time that we actually get our budget approved. What experiences do you guys have with the cards? Currently we're looking at the ~$1000 range, specifically at Alteon 512k's ($1000) for the FreeBSD servers and Sun Gigabit 2.0's ($2000) for the Sun servers. I was interested in the Myrinet cards (for obvious reasons), but they appear to require a Myrinet switch (though I found myself slightly confused so I may be wrong) rather then being able to hook into our Catalyst 5000. The Intel PRO/1000 Gigabit cards look rather nice too, but I haven't seen drivers yet for FreeBSD (Linux yes). I'm pretty much purchasing on marketing and reputation rather then any experience here, so any help would be much appreciated. I would recommend getting Alteon boards. It is likely that the Sun boards are Alteon OEM, although I'm not positive. I think the first gigabit cards Sun had on the market were OEMed from Alteon, but I've been told that their newer cards are something else entirely. I don't know exactly what, but they're not Tigon-based. One thing to keep in mind is that both Netgear and 3Com are OEMing Alteon boards, and you'll get them much cheaper that way. The boards are pretty much identical to the Alteon branded boards (which have no identifying marks on them). The performance is the same, at least for the Netgear boards. (I don't have any 3Com boards.) There are a number of companies selling OEM'ed alteon boards for various prices. 
IBM sells two cards, one for PC-based hardware and one for RS/6000s which I think are basically the same hardware with different driver kits. Of course, the RS/6000 card is $2100 while the PC-based one is probably around $600 or so. My guess is they're Alteon cards with different PCI device IDs, but I can't confirm this as I don't have one. The SGI gigabit adapter, NEC gigabit adapter, DEC EtherWORKS/1000, 3Com 3c985 and 3c985B, and the Netgear GA620 are all Tigon boards (not to mention the Alteon ACEnic) and should all work fine with the ti driver. Oh, I found another one recently: Farallon also sells a gigabit PCI NIC for the Mac which is Tigon-based. The Netgear GA620 is a 512K Tigon 2 board, and generally goes for around $300 or so. The 3Com boards have 1MB of SRAM, but I'm not sure whether they're Tigon 1 or Tigon 2. You really want a Tigon 2 board. Maybe someone who has one can comment. The original 3Com 3c985 was a Tigon 1 board (I have one) and the 3c985B is a Tigon 2. The Tigon 1 is no longer in production, though of course I try to maintain support for it for those people who still have them. The Tigon 1 had only a single R4000 CPU in it while the Tigon 2 has two. The Netgear GA620 is by far the cheapest at about $320. The various OEM cards sold for the PC are usually around $600, give or take $100. The GA620 only has 512K of SRAM compared to 1MB on most of the others, however you're not likely to notice a problem with that unless you try to push the card really hard with a really big TCP window size and jumbo frames. The Intel cards may look nice, and there is a FreeBSD driver for them, but I wouldn't get one. The first problem with the Intel boards is that there are no docs for them. Supposedly they're using a Cisco chip, and the specs for the chip are top secret. This is why I don't buy or recommend Intel NICs. But that's just my personal bias. 
The FreeBSD driver (written by Matt Jacob) is based on the Linux driver, which Intel wrote, and he hasn't yet managed to get decent throughput through the cards. (Maybe Matt will comment.) They also only have 64K of memory on board, which is insufficient for a heavily loaded server, IMO. Even with the 512K Alteon boards, you have a minimum of about 200K, and probably more like 300K of cache for transmit and receive. The Alteon cards also need a certain amount of SRAM to run the firmware. The Intel boards also don't have the features necessary to really support zero copy TCP receive. The Alteon boards, on the other hand, have most of the features necessary, and if I get some time, I may add the last feature (header splitting) to the firmware. The other alternative is SysKonnect, and that might actually be a good alternative. I haven't seen the boards, don't know how much they cost, etc. etc. You might want to ask Bill Paul about them, he wrote the driver. The SysKonnect cards aren't bad. A single port multimode fiber card is around $700, I think. The single mode cards are more expensive. However
Re: Suggestions for Gigabit cards for -CURRENT
Of all the gin joints in all the towns in all the world, Kenneth D. Merry had to walk into mine and say: [ Thanks for the info Bill! ] No problemo. [...] Both the Alteon and SysKonnect NICs are 64-bit PCI cards. (Actually, I'm pretty sure all of the PCI gigabit NICs are 64-bit.) Both kinds of cards can do jumbograms on FreeBSD. Also, both vendors have released pretty good hardware documentation, which makes them good choices for custom applications, if you're into that sort of thing. Alteon also provides firmware source, which can really come in handy. Do you know if SysKonnect has released firmware? The SysKonnect GEnesis controller and the XaQti XMAC II chips are both static devices and do not require firmware. If you go to www.syskonnect.com and search their online knowledge base for the word "manual" you should be able to find the gigabit NIC programmer's manual. Similarly, XaQti has the full datasheet for the XMAC II at www.xaqti.com somewhere. (As I recall, you have to go through a brief registration procedure to get it, but once that's done you should be able to download it right away.) Talking of the XMAC II, there's one other thing I forgot to mention earlier. The FreeBSD sk driver does jumbo frames, but the SysKonnect drivers don't. At least, not yet. The XMAC II's receive FIFO is 8K. By default, the chip operates in 'store and forward' mode in order to perform error checking on received frames (it has to get the entire frame in the FIFO in order to do a CRC on it, I think). This is fine for normal frames, but if you want to handle jumbograms larger than 8192 bytes, you have to put the chip into 'streaming' mode, otherwise any frame larger than 8192 bytes will be truncated. To get 'streaming' mode to work, you have to disable all of the RX error checking. Also, the default TX FIFO threshold on the XMAC is very small (8 bytes, I think). The FreeBSD sk driver bumps this up a bit (to 512 bytes, if I remember correctly). 
This is to deal with the case where you have a dual port card and are pumping data through both XMAC chips at once: with the default FIFO threshold, I would often see TX FIFO underruns from one of the XMACs and performance on that port would get spotty. I think the total TX FIFO memory on the XMAC II is 2K. -Bill -- ===== -Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu Work: [EMAIL PROTECTED] | Center for Telecommunications Research Home: [EMAIL PROTECTED] | Columbia University, New York City = "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness" = To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Problems with an0 and ISA Aironet Card..
Of all the gin joints in all the towns in all the world, Paul Reece had to walk into mine and say: On Thu, 20 Jan 2000, Bill Paul wrote: snip Back up. You're leaving out some info. - When did you buy these cards? (The firmware rev may be an issue. knowing when you bought the card helps me figure out if your firmware is newer than mine.) Cards were purchased in the past 6 months. Revision of the card I'm using at the moment is 3.13 - I upgraded the firmware to the latest. (Win DGS under 'status' reports 3.13). You can view the firmware rev with the if_an driver (when it works) by doing ancontrol -i an0 -I. The newest PCMCIA card that I have seems to be using revision 3.10. The ISA card that I have is using 2.06. The trouble is I don't have Windows machine set up to run the firmware update utility. What I tried to do today was swap the PCMCIA module on my existing ISA card with one of the new ones with the later firmware. I did this a while back when I got our first batch of cards. However, I can't do it now. One of the problems I had with the Aironet cards initially is that they were set up so that they would operate in two modes: if you applied +5volts to the vpp1 and vpp2 pins on the PCMCIA module, it would work in PCMCIA mode such that you could get at the CIS data and configure it like any other PCMCIA card. Without the +5volts, the module would work in a special 'dumb bus' mode that would allow it to interface with the ISA and PCI bridge adapter cards that Aironet uses for their ISA and PCI cards. Basically, this allows them to make just one PCMCIA module and use it in all three kinds of cards. However the latest PCMCIA cards that we just got are different: now they always work in PCMCIA mode regardless of how vpp1 and vpp2 are set. On the one hand, this is good because it means you don't have to frob sys/pccard/pccard.c to enable the vpp voltage when the card is inserted. (My older cards will not work with FreeBSD unless I apply this tweak to the kernel.) 
On the other hand, this means that the newer PCMCIA cards won't work in the ISA and PCI bridge adapters. This sort of stymied my attempts to duplicate your problem here in the lab. What would be nice is if you could somehow set up a scratch box with an Aironet ISA4800 card in it that I could access remotely. I'm reasonably confident I could make it work if I could just experiment with it for a while. Unfortunately, this may not be possible depending on various technical and political constraints, especially since I need to twiddle around as root in order to examine register contents and test a new driver.

pcpaul# ./testa
COMMAND: 0
PARAM0: ff11
PARAM0: 0x

(still no lights on card) if I run it again:

pcpaul# ./testa
COMMAND: 0
PARAM0: 1234
PARAM0: 0x

(and still no lights). This info help at all?

Well, yes. It tells me two things. First, it tells me that I made a typo in the program that I gave you. :) Second, it shows me that the card is at the I/O address that it's supposed to be, although it appears to not be responding to the 'read SSID list' command that the if_an driver issues during the probe phase. Unfortunately, as I said earlier, I need to be able to experiment on this thing in order to figure out the problem, and I can't do that unless you can somehow arrange remote access.

-Bill
Re: Problems with an0 and ISA Aironet Card..
Of all the gin joints in all the towns in all the world, Paul Reece had to walk into mine and say: Having a few problems trying to get an ISA Aironet 4800 card working under FreeBSD 4.0-CURRENT. I did try with 3.4-RELEASE first with the appropriate drivers, but had even less luck. What I'm seeing at boot: Back up. You're leaving out some info. - When did you buy these cards? (The firmware rev may be an issue. knowing when you bought the card helps me figure out if your firmware is newer than mine.) - What sort of machine are you using? (Show us the *whole* dmesg output. Timing may also be an issue, in which case I need to know the CPU speed.) first suspect lines: isa0: unexpected tag 14 isa0: unexpected tag 14 I'm not sure if this is related. then: an0: reset failed unknown0: Aironet ISA4500/ISA4800 at port 0x100-0x13f irq 5 on isa0 an0: reset failed unknown1: Aironet ISA4500/ISA4800 at port 0x140-0x17f irq 10 on isa0 (machine has 2 cards in it). When trying with NON PNP mode, the cards also have the same problem. Tell us what kernel config line you use when using the card in non-PnP mode. Note that the switches on the card must all be in the correct position in order to enable PnP mode: consult your user's manual for the proper settings. I believe they all need to be in the off position, however I don't have the manual here at home with me so I could be mistaken. (I do remember they all have to be set the same way.) PCI cards work fine, just not the ISA equivalents.. Anyone have any clues/hints/tips etc? Not really. My one and only ISA card works fine, or at least it did when I did my tests right before I imported the driver. It would help if you could actually look at the card when the kernel boots to see if the LEDs flash at all. If the reset is screwing up, then you should see the LEDs flicker when it tries to access the board. If it's failing to access the board at all, the LEDs won't change at all. Try commenting out the code in an_reset() (i.e. 
make it an empty function that does nothing) and see if it works then. If it *still* doesn't work, then there's something else wrong. Try to run the following program as root:

#include <sys/types.h>
#include <machine/cpufunc.h>
#include <sys/fcntl.h>
#include <stdio.h>

#define IOADDR 0x100 /* change to 0x140 for other card */

main()
{
        int f;

        f = open("/dev/io", O_RDWR);
        printf("COMMAND: %x\n", inw(IOADDR));
        printf("PARAM0: %x\n", inw(IOADDR + 0x2));
        outw(IOADDR + 0x2, 0x1234);
        printf("PARAM0: 0x\n", inw(IOADDR + 0x2));
        exit(0);
}

This will print out the command and status registers for the card at iobase 0x100. If the card has been properly activated, you should see for the COMMAND and PARAM0 registers initially, then the program will try to write 0x1234 to the PARAM0 register and read it back. If it reads back 0x1234, then the card is configured right and the reset is screwing up. If on the other hand the program prints for all of the register contents, then the card is not really configured properly for address 0x100.

Cheers. Regards, Paul. (replies to me direct please - not on list)

I'm doing both. Deal with it.

-Bill
Re: USB D-Link DSB-650 kue0: failed to load code
Of all the gin joints in all the towns in all the world, Eric J. Haug had to walk into mine and say: Hi all, I have a Toshiba 2100CDS laptop with an OHCI USB controller that gives a kue0: failed to load code segment error message Rather than clutter the list, the conf file and the dmesg boot file is available at ftp.eas.slu.edu:/pub/incoming/[usbdmesg, usbbootmsg, usbltaconf] The usbbootmsg is from yesterdays kernel sources with some of the debug variables set to 15. The changes from today did not seem to make any difference. the stripped mesg output from a boot follows: ohci0: NEC uPD 9210 USB controller mem 0xf7fff000-0xf7ff irq 11 at device 11.0 on pci0 usb0: OHCI version 1.0 usb0: NEC uPD 9210 USB controller on ohci0 usb0: USB revision 1.0 uhub0: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered kue0: D-Link Corp 10Mbps ethernet adapter, rev 1.00/0.02, addr 2 kue0: failed to load code segment: IOERROR device_probe_and_attach: kue0 attach returned 6 An important point which you neglect to mention is: how long did it take before the IOERROR message appeared? (That is, how much time passed between the first kue0 probe message and the next?) Getting the Kawasaki chip to work requires downloading firmware into it, and the code segment of the firmware is about 3800 bytes, which makes for a fairly large control transfer. I had to set things up with a longer than normal timeout to make this work on my laptop. If the IOERROR message appears after only a second or two (or maybe three), then the timeout may not be long enough for your machine. If it sits there for a long time (ten seconds or longer) then it's probably something else. To see if this in fact the problem, do the following: - Bring up /sys/dev/usb/if_kue.c in your favorite editor. - Find the kue_do_request() function. - Change the timeout from 50 to 100, i.e. 
change this:

usbd_setup_default_xfer(xfer, dev, 0, 50, req, data, UGETW(req->wLength), USBD_SHORT_XFER_OK, 0);

to this:

usbd_setup_default_xfer(xfer, dev, 0, 100, req, data, UGETW(req->wLength), USBD_SHORT_XFER_OK, 0);

Then recompile your kernel/module/whatever and try again. (And let me know what happens, of course.)

-Bill
Re: /dev/sndstat
Of all the gin joints in all the towns in all the world, George Cox had to walk into mine and say:

I cvsupped yesterday. I installed a complete snapshot today.

extremis /dev # ./MAKEDEV sndstat0
expr: non-numeric argument
bad node: mknod mixerstat0

Something's wrong :-)

No, nothing is wrong:

x-ctr# cd /dev
x-ctr# ./MAKEDEV snd0
x-ctr# ls -l /dev/sndstat
crw-rw-rw- 1 root wheel 30, 6 Jan 17 10:51 /dev/sndstat

/dev/sndstat is created as a consequence of doing MAKEDEV snd0.

-Bill
Re: USB D-Link DSB-650 kue0: failed to load code
Of all the gin joints in all the towns in all the world, Eric J. Haug had to walk into mine and say:

- Find the kue_do_request() function. - Change the timeout from 50 to 100, i.e. change this:

usbd_setup_default_xfer(xfer, dev, 0, 100, req, data, UGETW(req->wLength), USBD_SHORT_XFER_OK, 0);

After this change Again, the boot message rushes by. But later, say about 8 seconds or so about the time the system is printing out ppc0 messages i get panic: removing other than first element.

*sigh* Whatever. Fortunately, it turns out I had an OHCI controller here and I was able to duplicate this problem. (It doesn't happen with my laptop and UHCI controller, otherwise I could have spotted it before.) I just updated the driver in -current to fix this. The driver tries to check if the firmware is already running by reading the MAC address. It turns out that doing this when the firmware is not running generates a 0 length transfer. This doesn't seem to do any harm with a UHCI controller, but it makes the OHCI controller (or its driver) mad. I changed the code to test for the presence of already running firmware in a different way, and now it seems to work with my test system. Make sure to get src/sys/dev/usb/if_kue.c revision 1.16. This version should fix the problem.

I also noticed that performance with the OHCI controller is significantly better than with the UHCI controller. Just my rotten luck I'm stuck with a UHCI one in my laptop.

-Bill
Re: Current, XEON and MP performance
Of all the gin joints in all the towns in all the world, Achim Patzner had to walk into mine and say: I don't know where to ask first (or what to look at) so I'd like some creative guessing by some people closer to the sources... Running the same programs on nearly identically configured -CURRENT kernels on a HP NetServer LH4 (four 550 MHz PIII Xeon with 512MB Cache, 512MB cache? I think you mean KB. supposed to be an INTEL 450NX-based chipset) with one GB RAM and a home-grown ASUS P2-BDS based system (two 450 MHz PIII) with 512 MB RAM I find that the programs (running on the same input data) on the "smaller" machine tend to take only a third of the CPU time they need on the LH4. Can you show us the actual results from your testing (an hopefully your testing methods as well) that led you to this conclusion? Details matter. Are these programs I/O bound, CPU bound, or a little of both? FreeBSD's SMP support still depends largely on the big giant lock approach which means that while you can indeed get processes running on multiple CPUs at the same time, you end up using only one CPU once you enter the kernel. And you have to enter the kernel in order to perform any disk, network or even console I/O. If your programs suck large datasets into memory, do lots of number crunching on them, then spit the results back out to a disk file, then they should benefit from more CPUs. However if they read and write data a lot while running, you're going to be limited by the big giant lock. There may also be scalability issues (i.e. does FreeBSD perform better as you add more CPUs or does it spend so much time trying to stay out of its own way that it actually performs worse) however I don't know enough to say if you could be running into such problems as the only SMP machines I have access to have only 2 CPUs. 
[Worse: The LH4 behaves like a spoilt brat when it comes to hardware, disliking the Intel EtherExpress that came with it (generating bus mastering problems after bringing it up), Which model Intel EtherExpress? What chipset? What bus mastering problems exactly? having interrupt routing problems with two DEC TULIP based ethernet cards sharing the same IRQ Which tulip cards? What driver? What kind of problems? I find it unusual that two PCI devices would wind up with the same IRQ with the APIC enabled since it's supposed to give you a lot more IRQs than in UP mode. and being picky just which 3C906B-TX it gets plugged in. There is no such card as a 3c906B. There's a 3c905B, and there's a 3c905C. Unfortunately, 3Com did go through several different ASIC revisions with the 3c905B series, some of which work better than others, but again, I see no details here. It's a bitch and I'd like shooting it. Oh yes - HP has been very helpful, telling me that I was at least 10 years behind wanting to run a BSD and that only WinNT, HP-Sux and Linux were supported on this hardware.] If somebody at HP actually told you that HP-UX runs on anything besides the PA-RISC architecture (and, in the distant past, the m68k architecture), they were either a) jerking your chain, b) working at HP in an parallel dimension, c) misinformed, or d) not terribly bright. (I'm sure HP wouldn't mind having HP-UX/x86, but they certainly don't offer it as a product now.) Back to the topic: Are there any reasons for these observations? If someone liked taking a closer look at it I could provide them with access to the machine (and its console). I ran out of clues... Hard to tell really without more info. We don't know what your test programs do, so it's impossible to predict what their behavior should or shouldn't be. 
-Bill
Re: buildworld fails on Alpha
Of all the gin joints in all the towns in all the world, Wilko Bulte had to walk into mine and say: On a very freshly supped -current on Alpha: === sys/modules === sys/modules/aha rm -f aha.h setdef0.c setdef1.c setdefs.h setdef0.o setdef1.o aha.ko aha.o aha_isa.o @ machine symb.tmp tmp.o opt_cam.h opt_scsi.h bus_if.h device_if.h isa_if.h rm -f .depend /usr/src/sys/modules/aha/GPATH /usr/src/sys/modules/aha/GRTAGS /usr/src/sys/modules/aha/GSYMS /usr/src/sys/modules/aha/GTAGS === sys/modules/amr rm -f setdef0.c setdef1.c setdefs.h setdef0.o setdef1.o amr.ko amr.o amr_pci.o amr_disk.o @ machine symb.tmp tmp.o bus_if.h device_if.h pci_if.h rm -f .depend /usr/src/sys/modules/amr/GPATH /usr/src/sys/modules/amr/GRTAGS /usr/src/sys/modules/amr/GSYMS /usr/src/sys/modules/amr/GTAGS === sys/modules/an cd: can't cd to /usr/src/sys/modules/an *** Error code 2 Stop in /usr/src/sys/modules. *** Error code 1 Roar. I swear I checked in this module Makefile. Honest and for true. Okay, I think I've really got it this time. Please try cvsupping again: you should get a src/sys/modules/an/Makefile for compiling the Aironet driver module. -Bill -- = -Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu Work: [EMAIL PROTECTED] | Center for Telecommunications Research Home: [EMAIL PROTECTED] | Columbia University, New York City = "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness" = To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: panic in uipc_mbuf.c or if_aue.c
Of all the gin joints in all the towns in all the world, Jun Kuriyama had to walk into mine and say:

I got more panic with DEBUG=-g and INVARIANTS. I saved core dump at this time.

We need version information! How recent is your version of -current! What's the rcsid from if_aue.c! Details please!

This panic is caused when I tested heavy traffic via aue0 (USB ethernet adaptor) with "while looped" large file scp. I think that is only active process. My ipfw is set as default like as "65535 allow ip from any to any".

*sigh* No, this is not what you meant to say. What you meant to say is: "Oh, by the way, I also use ipfw. And oh, by the way, I didn't think to repeat the same test without ipfw." Try the test again with a new kernel *without* ipfw. Maybe the problem is in ipfw. Maybe it isn't, but you have to do some testing to eliminate the possibility!

Should I give some data to solve this problem?

No, you should sit there and wait for the bug fairy to come and tap you with her magic wand. Print out the contents of the mbuf!! Show us what it thinks the real length is!

-Bill
Re: USB broken?
Of all the gin joints in all the towns in all the world, Eric D. Futch had to walk into mine and say: I'm running -current that's about a week old. Erm... are you sure? I'm having trouble believing you. I configed my kernel for USB support. After turning on the USB interface in BIOS kernel panics after it probes uchi0. Below is the panic screen, I don't have much else to go on. --- uhci0: Intel 82371SB USB Host Controller rev 0x01 int d irq 10 on pci0.7.2 kernel trap 12 with interrupts disabled See this kernel probe output here? This is not from a 4.0-CURRENT kernel from a week ago. This is what the probe output from a recent -current system should look like: uhci0: Intel 82371AB/EB (PIIX4) USB controller irq 11 at device 7.2 on pci0 Notice the difference? It's been like that for a *long* time now. Therefore I can only conclude that either you're not actually running -current, or else you thought it would be okay to substitute in a really stale entry from a system log file from a 3.x system. Either way, you need to re-evaluate the situation and provide more info. Now rather than being vague, go back and show us what uname -a says on this allegedly -current system and show it to us. Show us the *entire* dmesg output too, while you're at it. Furthermore, you should be able to test USB support without recompiling the kernel. All you need to do is kldload usb. That will load the usb.ko kernel module, which should find the UHCI controller. From the panic message you showed here, you're using SMP. Have you tested it with a UP kernel? (Yes, it's supposed to work either way, but it would be nice if you would just test it to rule out some sort of SMP-related condition.) What you should do is this: - Compile a kernel with options DDB, but *WITHOUT* USB support. - Boot this kernel. - Type kldload usb - See if the system crashes. - If it does, it will drop into the debugger. - Type 'trace' - Report what it says. 
-Bill
Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)
What happens if you shut down and restart the X server? -Bill
Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

I'm trying to narrow down the area enough that I can mess with the driver myself and hopefully locate the problem, since it can't be reproduced easily. I was hoping the magic number 64 could be related to something - and you have apparently been able to do that, which gives me a place to start anyway. netstat shows the trigger to be the reception of 64 packets rather than the transmission, though. Is there anything at all about the number 64 that could be related to the receiver?

64 is also the number of descriptors/buffers in the RX ring. When you fill up the RX ring, the chip is supposed to generate a 'no RX buffer available' interrupt. The driver will check the RX ring for packets when either an 'RX OK' or 'no RX buffers available' interrupt is delivered, but you should be getting an 'RX OK' interrupt on every received packet. The datasheet for the PNIC II is at: http://www.freebsd.org/~wpaul/Macronix/PNIC_II.PDF This is the datasheet LinkSys gave me when they first came out with the LNE100TX v2.0 board. It's very similar to the Macronix 98715A datasheet.

I'm pretty sure that the box was getting receive interrupts because every time I sent a packet to it from the outside systat -vm showed a PCI interrupt for the network device. However 'netstat -in 1' did not show the statistics for the received packets until 64 had accumulated. It could be that the statistics are not being accumulated on a per-reception basis and that the receive packets are actually getting through, and that it's the transmit side which is broken. I don't know the code well enough yet to make the determination.

The dc_rxeof() routine is what increments ifp->if_ipackets, so if netstat -in doesn't show any change until after 64 packets have arrived, then it isn't getting the 'RX OK' interrupts.
But I promise you that I have never seen a condition where 'RX OK' interrupts failed to arrive even though 'no RX buffer available' interrupts did. The interrupt handler re-enables interrupts just before it exits, so there should never be a case where interrupts are turned off and never turned back on again. -Bill I'll try that next time the problem occurs but I doubt it will have any effect. Changing the duplex mode does not appear to reset the port whereas forcing the media to 'auto' does appear to reset the port. This is actually another problem (switches don't appear to pick up the duplex change if the port isn't reset), but not one I'm concerned with. In general what you want to do is a) switch modes and b) reset the link so that the guy on the other side re-senses the media. However both sides can only agree on the duplex setting as the result of an NWAY autoneg session: if you manually select 100baseTX full duplex, the link partner can only sense the link speed (100mbs as opposed to 10) but not the duplex mode. The rule is that if you don't have NWAY but can sense the link speed, you default to half duplex and let the operator manually fix things if necessary (that's what operators are for). Of course this only works if the switch has a management interface that allows you to configure things like that. Some don't, which can make your life tough. I'm pretty sure the speed and duplex setting don't really have anything to do with this particular problem though. I was just wondering why renegotiating the media would have any effect. It's possible that dc_init() may be called in there somewhere, which could be resetting all of the driver state. -Bill -- = -Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu Work: [EMAIL PROTECTED] | Center for Telecommunications Research Home: [EMAIL PROTECTED] | Columbia University, New York City = "It is not I who am crazy; it is I who am mad!" 
- Ren Hoek, "Space Madness" = To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
General ata grousing
In an earlier post on -hackers, I mentioned that attempting to kldload the usb.ko module after the kernel had booted would panic the system. So far I've managed to track this problem all the way down to sys/i386/isa/intr_machdep.c:add_intrdesc(). The system crashes when the uhci_pci module tries to set up an interrupt handler using bus_setup_intr(). I strongly suspect this is being caused by an unpleasant interaction with the ata driver: just my luck, the ATA controller, USB UHCI controller and power management happen to be implemented as subfunctions of the same PCI device. (Note that having /boot/loader pre-load the usb module along with the kernel does work.)

In my case, each function is assigned IRQ 11 by the BIOS. I would think that each driver would register a handler for this IRQ using bus_alloc_resource() and bus_setup_intr() with the RF_SHAREABLE flag. However from what I can tell, the ATA driver isn't doing this in its PCI attach routine. I'm not sure why. What it is doing is very weird: it appears that it tries to call inthand_add() directly in at least one part of the code. I'm nowhere near understanding the whys and the wherefores for all this yet, but something tells me this has to be related to the USB problem. By some special magic, everything just happens to work right when the devices are probed at boot time (and of course, nobody thought to test any other case), but things break very badly when trying to load the usb.ko module *after* the system has booted.

I don't want to sound like an ungrateful wretch, unduly criticizing someone else's code, especially at so late a date, but there are some other things that just seem like they really shouldn't be there:

- Platform dependencies. The inthand_add() thing I mentioned previously appears to be an x86-specific kludge, and there's an alpha kludge to go along with it. There should be some way to get rid of this.

- Magic numbers everywhere.
I see lots of places where I/O and PCI config registers are being manipulated using just hard coded register offsets and bitmasks. Magic numbers are bad, -kay? - Use of inb/outb instead of bus_space_read_X()/bus_space_write_X(). My understanding is that bus_space_read_X()/bus_space_write_X() are the prefered way of doing register accesses. inb/out and friends are deprecated. Anyway, I'm going to continue trying to hunt down the interrupt setup problem once I get home tonight (nice thing about having a laptop for a test box: you don't have to leave the test machine at work and frob it remotely). If anyone has any insights, please feel free to share them. -Bill -- = -Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu Work: [EMAIL PROTECTED] | Center for Telecommunications Research Home: [EMAIL PROTECTED] | Columbia University, New York City = "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness" = To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

> I'm adding Bill Paul to the list specifically.
>
> Hmm. Now this is odd! I think I may have found something! All of my 'rl' driver cards fail this test:

Oh sure. Bet the farm on the absolute worst NIC on the whole damn planet, why don't you. Why spend a few bucks on some nice 3c905B or 3c905C cards and beat up on them when you can buy ten RealTek cards for a dollar. About as reliable as a pair of tin cans and a piece of string, but gosh they sure are cheap.

You'll have to wait until at least tomorrow before I can look into this, since I won't be able to do any debugging until I throw my one and only RealTek 8139 sample adapter into a machine and run some tests with it.

> rl0: RealTek 8139 10/100BaseTX irq 11 at device 3.0 on pci0
> rl0: Ethernet address: 00:50:ba:d1:89:05
> miibus0: MII bus on rl0

pciconf -l would be nice here too (to see the PCI revision code).

> Methinks there is something going on with the 'rl' driver and/or the RealTek cards!

Gee, y'think?

I don't suppose you ran any similar tests with, say, one of those LinkSys cards you had the other day. Or maybe a 3Com card. I mean, it's just a little anti-climactic, you know? I put all that blood, sweat and tears into if_xl and if_dc, but do people do stress tests with them to help me identify weaknesses? No, they pound on the house of cards that is if_rl.

*sigh*

-Bill
Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

(taking this off -current)

>     apollo# linktest -s 51 -f1 lander      1-51 byte payload - errors
>     lander# linktest -s 51 -f1 apollo
>     apollo# linktest -s 52 -f1 lander      52+ byte payload - no errors
>     lander# linktest -s 52 -f1 apollo
>
> You know, this kinda sounds like a jabber lockup. Bill, are you following the *MINIMUM* ethernet frame size specification for ethernet?

*sigh* No, I've been living on Mars since 1975 and we don't get IEEE spec documents up here.

Yes, I know there's a minimum frame length of 60 bytes. And the rl_encap() routine has the following code:

        /* Pad frames to at least 60 bytes. */
        if (m_head->m_pkthdr.len < RL_MIN_FRAMELEN) {
                m_head->m_pkthdr.len +=
                    (RL_MIN_FRAMELEN - m_head->m_pkthdr.len);
                m_head->m_len = m_head->m_pkthdr.len;
        }

The RealTek doesn't autopad, so you have to handle it manually. You're only allowed one DMA buffer per transmission, so outbound packets are coalesced into a single mbuf cluster buffer in rl_encap(). A cluster buffer is always 2K, and frames can never be larger than 1514 bytes, so we know there'll always be plenty of room. In the case of frames shorter than 60 bytes, I just bump up m_pkthdr.len and m_len. This adjusted length gets used later in rl_start() when transmission is triggered.

Incidentally, you should be using tcpdump -n -e -i rl0 to measure the actual frame length of failing and succeeding transmissions: that's usually a much better indicator of what might be going wrong. You could calculate it from the data buffer length, but I suck at math; I find it's easier just to monitor the offending frames.

-Bill
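The padding step described in the message above can be tried outside the kernel. The following is a minimal userland sketch in C, not the rl driver's code: pad_frame() and MIN_FRAMELEN are hypothetical names, a flat byte buffer stands in for the mbuf cluster, and zero-filling the pad bytes is an assumption added here (the quoted rl_encap() fragment only adjusts the lengths).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimum Ethernet frame length without the 4-byte CRC (hypothetical name). */
#define MIN_FRAMELEN 60

/*
 * Pad the frame held in buf (capacity cap bytes, current length len)
 * up to MIN_FRAMELEN and return the length to hand to the hardware.
 * Zeroing the pad bytes is an assumption of this sketch.
 */
static size_t
pad_frame(uint8_t *buf, size_t len, size_t cap)
{
	/* The buffer must be able to hold a minimum-size frame. */
	assert(len <= cap && cap >= MIN_FRAMELEN);

	if (len < MIN_FRAMELEN) {
		memset(buf + len, 0, MIN_FRAMELEN - len);
		len = MIN_FRAMELEN;
	}
	return (len);
}
```

With a 2K buffer, a 42-byte frame comes back as 60 bytes and a 1514-byte frame is returned unchanged.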
Re: Correction to /usr/src/sys/i386/conf/LINT for D-Link DFE-530TX+
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

> The D-Link DFE-530TX+ uses the 'rl' driver, not the 'vr' driver. I don't know if there's a DFE-530TX (without the '+') so I'm leaving the entry for that in the 'vr' driver notes intact.

Both exist. The DFE-530TX is most definitely a VIA Rhine card and needs the vr driver. I have one. I only recently learned of the existence of the DFE-530TX+, which uses the RealTek 8139 and needs the rl driver. Yes, it's dumb to change the whole card design and do nothing to update the model number except stick a "+" on the end, but that's how it goes.

D-Link also has a habit of selling certain cards only in some markets. For example, there apparently also exists a DFE-540TX card that uses a Macronix chip; however, it was never sold in the U.S., only in Asia.

-Bill
HEADS UP: if_dc imported; al, ax, dm, pn and mx removed
Heads up people: the if_dc driver and all its bits and pieces are now in the tree and the al, ax, dm, pn and mx drivers have been removed. People previously using these drivers need to update their /etc/rc.conf files accordingly.

Also note that if_dc should now handle 21143-based NICs. If you were previously using if_de for a 21143 card, you may see if_dc take over support for it depending on your kernel config. If you have a 21143-based NIC which worked with if_de and *doesn't* work with if_dc, please let me know ASAP. Most 10/100 NICs should work fine: the only questionable ones are the 10Mbps-only versions. If you have a NIC that doesn't work, please show me the output of pciconf -l from your system when reporting a problem.

As usual, the place to complain is: [EMAIL PROTECTED]

-Bill
Update of if_dc driver
Okay, I've had a couple of reports so far about the if_dc driver which were mostly positive. I've also gotten some new hardware and did some more testing and bug fixing:

- Fixed support for non-MII 10/100 cards based on the 21143 chip. This includes the DEC DE500-BA and the built-in 21143 ethernet on alpha machines. The DE500-BA is now being distributed by Cabletron.

- Changed dc_attach() so that if probing for an MII-based PHY fails on 21143 cards, it will fail over to using the dcphy pseudo driver and SYM mode.

- Fixed a few minor problems with autonegotiation on Macronix and PNIC II cards.

- Simplified dc_pnic_rx_bug_war() a bit. Now we keep track of descriptor and mbuf indexes instead of pointers.

- Compiled KLD modules for both x86 and alpha platforms using gcc 2.95.2.

The driver should work correctly now with most 21143 10/100 cards. If anybody has an Adaptec, ZNYX or other multiport 21143 card, I'd be interested to know how it works with these. I've tested it with a D-Link DFE-570TX 4-port card and it seems to work well.

Again, the driver is at http://www.freebsd.org/~wpaul/dc.tar.gz. If you have FreeBSD-current and a supported card, please give it a try and let me know how it holds up. Supported cards include:

- Intel 21143 10/100 NICs (Kingston KNE100TX, DEC DE500-BA, D-Link DFE-570TX, Adaptec 6244 (I think), possibly ZNYX and others)
- Macronix 98713, 98713A, 98715A, 98725, LC82C115 PNIC II NICs (NDC SOHOware, LinkSys LNE100TX V2.0, CNet Pro120A, CNet Pro120B, SVEC PN102TX)
- ASIX AX88140A or AX88141 NICs (Alfa Inc. GFC2204, CNet Pro110B)
- ADMtek AL981 Comet or AL985 Centaur
- Davicom DM9102 NICs (Jaton Corporation XPressNet)
- Lite-On 82c168 and 82c169 NICs (LinkSys LNE100TX, Matrox FastNIC, Kingston KNE110TX, Netgear FA310-TX Rev D1, D2 or D3)

My goal is to try and get this driver into 4.0 as soon as possible so I can use it as a replacement for the al, ax, dm, pn and mx drivers. However, there's a small problem: the de driver already supports the 21143, although it does so poorly according to some people. We can't have both drivers trying to support the same chip. I want to be able to turn off 21143 support in if_de and let if_dc handle them, but I don't want to annoy people who are using if_de with 21143 cards now and not having any trouble.

What do people think? Does anybody have anything against me transferring support for the 21143 from if_de to if_dc? Does anybody have a better idea? I'm open to suggestions.

-Bill
Re: Texas Chainsaw Monday
Of all the gin joints in all the towns in all the world, Boris Popov had to walk into mine and say:

> On Wed, 20 Oct 1999, Bill Paul wrote:
>
> > install -c -s -o root -g wheel -m 555 mount_nwfs /vol2/release/sbin
> > install: mount_nwfs: No such file or directory
>
> Ok, it seems that I found why mount_nwfs failed to build: I use 'install' instead of ${INSTALL} in the libncp.

Unfortunately, this has not fixed the problem: the build report for today (Oct 22) shows the same error.

*sigh*

-Bill
Still waiting for xl driver reports
A while back I posted a message here saying that I'd changed the xl driver a bit to hopefully improve performance for 3c90xB and later adapters (i.e. the "cyclone," "hurricane" and "tornado" chipsets). I asked for people to report if the changes helped, hurt, made no difference or were totally broken. So far not one person has said so much as a word to me on this subject. I need feedback from people so that I know it's safe to merge this stuff into -stable, so let's hear it already. It's been several weeks since I made the changes. Surely there are people running -current with 3Com 3c90xB cards.

To reiterate, this only concerns people with the following adapters:

- 3c905B-TX 10/100
- 3c905B-FX/SC fiber optic
- 3c905B-COMBO 10/100 plus BNC and AUI
- 3c905C-TX 10/100
- 3c980-TX server adapter
- 3c980B-TX server adapter
- 3c980C-TX server adapter
- 3cSOHO100-TX 10/100

To a lesser extent it also concerns people with these adapters (these are 10Mbps only so the change isn't likely to be as noticeable):

- 3c900B-TPO twisted pair only
- 3c900B-TPC twisted pair and coax (BNC)
- 3c900B-COMBO twisted pair, AUI and BNC
- 3c900B-FL fiber optic

Ideally, the changes should provide slightly faster performance with less CPU usage. Performance/CPU overhead comparisons with other cards (in the same machine!) would be helpful, as well as comparisons with the same 3Com card using the older driver revision.

Things that would not be helpful include:

- Asking about an unrelated problem from 3.2-RELEASE, 3.3-RELEASE or 3.3-STABLE.

- Telling me that your card isn't detected properly and not realizing that you have "plug and play OS" set to "yes" in your BIOS config.

- Asking about a completely different card. From a completely different manufacturer.

- Saying that you'll be happy to run some tests "as soon as you find some time." If you couldn't find the time by now, you never will.

- Giving me an excuse for not sending me any feedback earlier. I don't care if your dog got run over, your house was invaded by giant ants, your entire family contracted the bubonic plague or aliens stole your computer.

- Asking me how to set up a 3Com card in Linux. (Comparing the xl driver's performance with the Linux 3Com driver is acceptable, provided you run the comparison on the same hardware. Comparing a PIII 600Mhz host running Linux to a PII 300Mhz host running FreeBSD is not a fair comparison. Unless FreeBSD ends up being faster. :)

-Bill
Re: Issues with xl0
WARNING: the following reply contains Extreme Ranting which may be too intense for young audiences. Those not wishing to experience Extreme Ranting should #define NO_EXTREME_RANTING.

Of all the gin joints in all the towns in all the world, Bryan Bursey had to walk into mine and say:

> I attempted to move from -STABLE to -CURRENT last night, but without any luck. I decided to start with a current snapshot (19990928), but was unable to install using the floppies provided on releng3.freebsd.org.

#ifndef NO_EXTREME_RANTING
But the exact reason why you were unable to install is a secret, right? Clearly the details of the failure would be of no use to anyone, so you chose not to share them, yes? How many times do I have to say it: "it didn't work," "it failed," "I couldn't make it do blah" and similar vague descriptions don't help anybody. Don't start in with a vague statement about a problem and then expect to be asked for more details later: give the details first! It saves a lot of time!
#endif

> Thinking it might have been a floppy issue,

#ifndef NO_EXTREME_RANTING
It may as well have been an arthritis issue for all we know.
#endif

> I used my 3.3-STABLE floppies and simply changed the install options so that I'd get 4.0. This worked until I restarted my machine at the end of the install. It came back up ok, but I was again unable to connect to the network.

This is too vague. You're leaving out a ton of details, like: did you even see the xl0 probe messages in the kernel? You know, basic stuff which nobody else will know since we're not able to see over your shoulder.

> Can anyone tell me if there are known issues with the xl0 driver in 4.0, or if it has been superseded by another driver which works with 3Com 3C900B.

#ifndef NO_EXTREME_RANTING
No no no. *You* tell *us* if there are any issues! *You* tell *us* if you're having any problems! And then *you* tell *us* in explicit detail what they are! How hard is it to understand that! No, there isn't any other driver. But because you didn't make even the slightest effort to explain your problem, I can't begin to even help you.
#endif

You didn't specify which 3c900B card you have: there are several of them with different media options:

- 3c900B-FL 10baseFL fiber-optic
- 3c900B-TPO 10baseT "Twisted Pair Only"
- 3c900B-TPC 10baseT and 10base2 "Twisted Pair and Coax"
- 3c900B-COMBO 10baseT, 10base2 and 10base5 (AUI)

#ifndef NO_EXTREME_RANTING
If you'd bothered to watch what happens when the kernel boots, you would have been able to tell whether or not the 3c900B card was detected (and I know it was in spite of your unwillingness to say so). You would also have been able to tell what media was selected (10baseT, 10base5 or 10base2, depending a bit on exactly which model card you have, which you also didn't tell us). Then had you bothered to rub two brain cells together, you might have been able to tell if maybe the default media selection read from the EEPROM was incorrect and possibly tried to use ifconfig to set it correctly.
#endif

If you have a TPO or FL adapter, then there's only one media choice, and the driver should have selected it properly. If you have a TPC or a COMBO adapter, then somebody may have fiddled with the 3C90XCFG.EXE utility and selected the wrong default media in the EEPROM. The driver will only use what the EEPROM says; it doesn't autoprobe. If you used the 3C90XCFG.EXE utility to select the "auto" choice, then the driver will pick a reasonable default and expect you to be clever enough to change it with ifconfig if the choice is wrong. For example, for a COMBO card, it will choose 10baseT. If you don't like 10baseT, you can do the following:

        # ifconfig xl0 media 10base2/BNC
        # ifconfig xl0 media 10base5/AUI

Or if you really want 10baseT:

        # ifconfig xl0 media 10baseT/UTP

If you want to use this setting during the install, then enter the media option command in the box that says "extra options to ifconfig" in the TCP/IP configuration screen (i.e. "media 10base2/BNC").

> Thanks for any help (or other random thoughts).

#ifndef NO_EXTREME_RANTING
You want random thoughts? Fine: I wish it would stop raining, I hope the Mets make it to the playoffs, my shoes are too tight, you're ugly and your mother dresses you funny.

I know what you're thinking: "why is he being so nasty?" Because I can't stand it when people expect me to play the "minimum information" game, and you are by no means the first. Some people may be able to read a chewing gum wrapper and divine the secrets of the universe, but I'm not one of them.
#endif

-Bill
Re: Is the wb driver broken?
Of all the gin joints in all the towns in all the world, John Polstra had to walk into mine and say:

> > cvsup-master# ifconfig wb0 inet 204.216.27.25 netmask 255.255.255.240 media 100baseTX mediaopt half-duplex
> > ifconfig: SIOCSIFMEDIA: Device not configured
> >
> > You don't need to explicitly specify mediaopt half-duplex anymore. Specifying media 100baseTX without mediaopt full-duplex implies half-duplex. Leave off the mediaopt half-duplex part and it will work.
>
> OK, I did that and it made the SIOCSIFMEDIA message go away. But now it's not showing carrier:
>
> Doing initial network setup: hostname domain.
> wb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         inet 204.216.27.25 netmask 0xfff0 broadcast 204.216.27.31
>         ether 00:00:e8:18:5b:1d
>         media: 100baseTX status: no carrier
>         supported media: autoselect 100baseTX full-duplex 100baseTX 10baseT/UTP full-duplex 10baseT/UTP none
>
> Any other ideas?

Is there any reason why you're not letting it autodetect (which is what it does by default, or with media autoselect)? Make sure it's plugged in, make sure the link light is lit. Try to ping somebody on the network (or run tcpdump on the interface). You can't just sit there and look at it: you have to experiment.

-Bill
Re: Page fault with ethernet xl0
Of all the gin joints in all the towns in all the world, Stephan van Beerschoten had to walk into mine and say:

> I have experienced something nasty. After cvs'ing my tree 3 hours ago which would be approx 16:00 CET, I did a buildworld, installed it, compiled a new kernel.

Dammit. You didn't even tell me what kind of card you have. Do you really need me to ask you for this? Go back and boot the kernel in verbose mode and show me *EXACTLY* what it says.

-Bill
Re: Problems with the latest changes to ifconfig (I guess) - Bad
Of all the gin joints in all the towns in all the world, Peter Jeremy had to walk into mine and say:

> Ian Whalley [EMAIL PROTECTED] wrote:
> > My card is identified as 3Com 3c905B-TX Fast Etherlink XL.
>
> FWIW, I'm running a kernel about 30 hours old with a 3Com 3c905-TX Fast Etherlink XL and I'm not seeing this problem. At a quick guess, something in the miibus support broke the 3C905B support.

Not quite. The original 3c905-TX NIC used an external NatSemi PHY chip which was mapped to MII address 24. The 3c905B uses an internal transceiver, which is also mapped to MII address 24 for compatibility purposes. However, there are several different 3c905B ASIC revisions and at least one of them, for some peculiar reason, maps the transceiver to *all* MII addresses (0 through 31). Technically this isn't a big problem since if you always assume that the PHY is at address 24 (which I'm sure is what 3Com's drivers do) you'll never notice the difference. But you have to watch out for it.

The old code in if_xl.c would probe for PHYs and stop the moment it encountered the first one, which would work fine: it would stop at address 0 for the broken ASIC and 24 for the working ones. But the miibus code probes at all addresses because there are some NICs that actually have more than one transceiver. With the buggy 3Com ASIC, we end up incorrectly trying to map the same PHY several times over, which the xlphy driver doesn't like, so the probe fails, the miibus attach fails, and bad things happen later.

I just committed a patch to -current to deal with this: the xl_miibus_readreg() and xl_miibus_writereg() routines will now only return values at MII address 24. This will make the buggy ASIC appear to work correctly so that only one PHY instance will be detected.

Why didn't I catch this earlier? Well, the 3c905B NIC that I tested happens to work correctly. So did the 3c905C that I tried after it. In fact, I think the only place I encountered the buggy ASIC locally is with the embedded 3c905B NIC in some of the Dell machines in the lab, which aren't currently running FreeBSD.

Don't you just love hardware programming?

-Bill
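The effect of restricting register access to MII address 24 can be shown with a small simulation. This is a hedged sketch in C, not the if_xl.c patch: buggy_asic_read(), filtered_read() and count_phys() are hypothetical names, the value 0x7849 is an arbitrary placeholder for a status register reading, and the presence check (anything other than 0 or 0xffff counts as a PHY) is a simplification of what a real bus probe does.

```c
#include <assert.h>
#include <stdint.h>

#define MII_NPHY	32	/* MII addresses run from 0 through 31 */
#define XL_PHY_ADDR	24	/* address from the text; the name is hypothetical */

/* Simulated buggy ASIC: the same transceiver answers at every address. */
static uint16_t
buggy_asic_read(int phy, int reg)
{
	(void)phy;
	(void)reg;
	return (0x7849);	/* placeholder register contents */
}

/* Filtered access: report nothing unless the caller asks for address 24. */
static uint16_t
filtered_read(int phy, int reg)
{
	if (phy != XL_PHY_ADDR)
		return (0);
	return (buggy_asic_read(phy, reg));
}

/* Probe every address and count how many appear to hold a PHY. */
static int
count_phys(uint16_t (*readreg)(int, int))
{
	int phy, found = 0;

	for (phy = 0; phy < MII_NPHY; phy++) {
		uint16_t status = readreg(phy, 1);	/* register 1: status */

		if (status != 0 && status != 0xffff)
			found++;
	}
	return (found);
}
```

Probing through buggy_asic_read() finds 32 apparent PHYs; probing through filtered_read() finds exactly one, at address 24.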
Re: kernel build fail- /pci/if_xl.c:133: miibus_if.h: No such file or directory
Of all the gin joints in all the towns in all the world, FreeBSD mailing list had to walk into mine and say:

> Subject says it all.

No, the subject does not say it all: the subject says nothing about how you forgot to update your kernel config file to include:

        controller miibus0

The subject also fails to mention that you didn't go back and read previous postings on this list, especially the one where I said that I had converted the xl driver to use miibus.

Of course, nowadays you don't even need to include the xl driver in your kernel. You can just do:

        kldload mii
        kldload xl

Or you can include the following in /boot/loader.conf and reboot:

        mii_load="YES"
        xl_load="YES"

-Bill
Monday part II: The Terror Continues
So today my ISA bus is detected properly and the kernel gets as far as trying to launch /stand/sysinstall, but then, just when I thought it was safe to try and load a new snapshot:

        rootfs is 2880 Kbyte compiled in MFS
        spec_getpages: I/O read failure: (error code=0) bp 0xc34fc3a0 vp 0xc7ed8ec0
                       size: 0, resid: 0, a_count: 49152, valid: 0x0
                       nread: 0, reqpage: 0, pindex: 0, pcount: 12
        exec /stand/sysinstall: error 5
        init: not found
        panic: no init

This has nothing to do with the 486 though; I tried it on a laptop that was handy and it blew up the same way. I tried yesterday's mfsroot image and it doesn't work with that either. The August 16th snapshot's kernel and mfsroot images seem to work. The August 16th snapshot's kernel and yesterday's mfsroot image also work.

This is the third unusable snap in a row that I've had the misfortune to encounter. I'm starting to think this is more than a coincidence. Did somebody launch a "Piss Bill Off" contest when I wasn't looking or something? If so, let me stress that you really don't want to find out what first prize is.

-Bill
Monday strikes again
Must... control... fist of death...

I just tried to boot the latest -current snapshot (Aug 23) on my little 486/66 machine. The kern.flp kernel panics right after saying "Probing for PnP devices:". Now, this machine has a PCI bus but it doesn't support ISA plug and play, so before any of you lot start theorizing about possible PnP BIOS problems, don't.

The panic message says the kernel dies because of a page fault trying to reference memory location 0x4 (which is in page 0, which isn't mapped, which means this is a NULL pointer dereference) at PC 0xc0175b20. Running "nm kernel | grep c0175b" on the install kernel yields:

        c0175b68 t cnuninit
        c0175bc8 t sysctl_kern_consmute

Running "nm kernel | grep c0175a" on the install kernel yields:

        c0175a50 T cninit
        c0175afc T cninit_finish
        c0175a50 t gcc2_compiled.
        c0175a08 t l_noclose
        c0175a14 T l_noread
        c0175a2c t l_norint
        c0175a38 t l_nostart
        c0175a20 T l_nowrite
        c0175a44 t l_nullioctl

My money says the problem is cninit_finish(). The hardware config of this machine is as follows:

        486DX2-S 66Mhz CPU
        16MB RAM
        Diamond Speedstar ISA SVGA adapter (ET4000 chipset, 1MB RAM)
        IDE disk controller
        Maxtor LTX-200A IDE disk
        3.5" floppy drive
        2 serial ports, one parallel port
        Integrated Micro Systems PCI bridge
        Compaq NetFlex 3/P PCI ethernet adapter
        D-Link DFE-550TX PCI ethernet adapter

The machine is running a -current snapshot from August 9th which works fine (or at least, it did after I fixed the PCI bridge detection breakage that screwed it up last time). It looks like the major difference is that /sys/i386/i386/cons.c was taken away and replaced with some MI console routines in /sys/kern. My gut tells me that console initialization is failing because it can't find the ISA graphics adapter for some reason.

Anybody have any bright ideas where I can start looking for the problem?

-Bill
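The lookup done by hand with nm in the message above amounts to finding the last symbol whose address is at or below the faulting PC. A small sketch of that search in C follows; struct sym and sym_for_pc() are hypothetical names, and the table holds only a subset of the symbols from the nm output (the l_* entries below cninit and the gcc2_compiled. marker are left out).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sym {
	uint32_t	 addr;	/* start address reported by nm */
	const char	*name;
};

/* Subset of the symbols listed above, sorted by ascending address. */
static const struct sym syms[] = {
	{ 0xc0175a50u, "cninit" },
	{ 0xc0175afcu, "cninit_finish" },
	{ 0xc0175b68u, "cnuninit" },
	{ 0xc0175bc8u, "sysctl_kern_consmute" },
};

/*
 * Return the name of the last symbol starting at or before pc,
 * or NULL if pc lies below every symbol in the table.
 */
static const char *
sym_for_pc(uint32_t pc)
{
	const char *best = NULL;
	size_t i;

	for (i = 0; i < sizeof(syms) / sizeof(syms[0]); i++) {
		if (syms[i].addr > pc)
			break;
		best = syms[i].name;
	}
	return (best);
}
```

For the PC from the panic, 0xc0175b20, the search stops at cnuninit (0xc0175b68) and returns cninit_finish, which starts at 0xc0175afc.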
Re: Monday strikes again
Okay, further investigation shows that configure() has the following code:

        #if NPNP > 0
                /* Activate PNP. If no drivers are found, let ISA probe them.. */
                pnp_configure();
        #endif

                /*
                 * Explicitly probe and attach ISA last. The isa bus saves
                 * it's device node at attach time for us here.
                 */
                if (isa_bus_device)
                        bus_generic_attach(isa_bus_device);

However, isa_bus_device is still NULL, so we never get any ISA devices attached. No ISA devices means no console (the VGA card and serial ports are both ISA devices), so we explode. Since the ISA bus in this machine is on-board instead of being hung off a PCI to ISA bridge, I suspect that somebody broke the handling of on-board ISA buses.

Thank you very much, may I have another.

-Bill
PCI bus on 486/66 no longer detected
Today I thought I would upgrade my 4.0 box from a July 15th snapshot to an August 9th snapshot. Only problem is I can't, because the August 9th snapshot's boot kernel refuses to locate my 486's PCI bus. Previously, the bus was detected as follows:

        pcib0: PCI host bus adapter on motherboard
        pci0: PCI bus on pcib0
        chip0: PCI to 0x80 bridge (vendor=10e0 device=8849) at device 0.0 on pci0
        tl0: Compaq NetFlex-3/P irq 9 at device 13.0 on pci0
        tl0: Ethernet address: 00:80:5f:9a:58:f1
        xl0: 3Com 3cSOHO100-TX OfficeConnect irq 12 at device 14.0 on pci0
        xl0: Ethernet address: 00:10:5a:e3:60:9c
        xl0: autoneg complete, link status good (half-duplex, 100Mbps)

pciconf -l shows the following:

        mcmillan# pciconf -l
        chip0@pci0:0:0: class=0x068000 card=0x chip=0x884910e0 rev=0x02 hdr=0x00
        tl0@pci0:13:0: class=0x028000 card=0x chip=0xf1300e11 rev=0x10 hdr=0x00
        xl0@pci0:14:0: class=0x02 card=0x764610b7 chip=0x764610b7 rev=0x30 hdr=0x00

No, I didn't change any hardware setting anywhere. The older kernel still boots and works properly (I couldn't install the new snapshot because I need to do a network install, and my network cards are PCI).

I have no idea where to look for the problem. Anybody have any bright ideas?

-Bill
Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

:Yes, we do. I've run into this problem elsewhere but a quick fix was needed
:so it just got hacked. NT NFS clients tend to trigger it too.
:
:The problem is that the sanity check is a fair way away from where the problem
:packet is generated. The bad reply is generated in the readdirplus routine,
:gets replied (without checking) and cached. The client drops the (oversize)
:packet, resends, and the nfsd replies from the cache and this time hits
:the sanity check and panics.
:
:...
:
:I will have another look shortly. Anyway, the clue is that the server
:readdirplus routine is the apparent culprit.
:
:Cheers,
:-Peter

This makes a lot of sense. A report of du causing the panic, and the good possibility that readdirplus is caching an oversized response packet. Tell me what you come up with! I'll take a crack at it if you don't find anything.

Caching doesn't enter into it. The problem is bad arithmetic. In /sys/nfs/nfs_serv.c:nfsrv_readdirplus(), we have the following code:

	/*
	 * If either the dircount or maxcount will be
	 * exceeded, get out now. Both of these lengths
	 * are calculated conservatively, including all
	 * XDR overheads.
	 */
	len += (7 * NFSX_UNSIGNED + nlen + rem + NFSX_V3FH +
	    NFSX_V3POSTOPATTR);
	dirlen += (6 * NFSX_UNSIGNED + nlen + rem);
	if (len > cnt || dirlen > fullsiz) {
		eofflag = 0;
		break;
	}

I observed that the value of "len" didn't agree with the actual amount of data being consumed in the mbuf chain. It turns out that each time through the loop, len is being incremented by 4 bytes too little. In other words, 7 * NFSX_UNSIGNED should really be 8 * NFSX_UNSIGNED. When I change 7 to 8, I no longer get oversized replies and everything adds up.

This sanity code is trying to add up the amount of data consumed by the entryplus3 that is generated for each directory entry. The entryplus3 is defined in nfs_prot.x like this:

	struct entryplus3 {
		fileid3      fileid;
		filename3    name;
		cookie3      cookie;
		post_op_attr name_attributes;
		post_op_fh3  name_handle;
		entryplus3   *nextentry;
	};

Unfortunately I haven't been able to wrap my brain around how this is being counted up for the "len" calculation. Whatever it's doing, it's off by 4 bytes. Possibly somebody forgot that "filename3" is a string, which in XDR format consists of the string bytes, plus padding to a longword boundary, *plus* a longword length value. Some comments would have been useful here. (Hint, hint.)

What I don't know is whether the calculation for dirlen is wrong too. Hopefully now that I've shown everyone the light, maybe somebody can tell me for sure.

-Bill
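The byte counting discussed above can be tallied field by field. The following is only a sketch, not the code in nfs_serv.c: the helper names are hypothetical, XDR_UNIT is assumed to equal NFSX_UNSIGNED (4 bytes), the 84-byte fattr3 size comes from the NFSv3 protocol definition, and the file handle length is a parameter because it is implementation-specific. It also assumes the attribute flag word is counted together with the attributes rather than among the fixed words, and it reads dircount as excluding attributes and the file handle, which is how the NFSv3 specification describes it.

```c
#include <assert.h>
#include <stddef.h>

#define XDR_UNIT     4u   /* one XDR word; assumed equal to NFSX_UNSIGNED */
#define FATTR3_BYTES 84u  /* size of fattr3 on the wire, per the NFSv3 definition */

/* Round a byte count up to the next XDR word boundary. */
static size_t xdr_roundup(size_t n)
{
    return (n + XDR_UNIT - 1) & ~(size_t)(XDR_UNIT - 1);
}

/* Fixed-size words in one entryplus3, excluding name bytes, attributes and handle data. */
static size_t entryplus3_fixed_words(void)
{
    size_t words = 0;
    words += 1; /* "another entry follows" flag (the *nextentry pointer) */
    words += 2; /* fileid3, 64 bits */
    words += 1; /* filename3 length word */
    words += 2; /* cookie3, 64 bits */
    words += 1; /* post_op_fh3 "handle follows" flag */
    words += 1; /* file handle length word */
    return words;
}

/* Bytes one entry adds to the whole reply (the "len" side of the check). */
static size_t entryplus3_len(size_t namelen, size_t fhlen)
{
    size_t postop_attr = XDR_UNIT + FATTR3_BYTES; /* flag word + fattr3 */
    return entryplus3_fixed_words() * XDR_UNIT + xdr_roundup(namelen) +
        xdr_roundup(fhlen) + postop_attr;
}

/* Bytes one entry adds to the directory-only count (the "dirlen" side). */
static size_t entryplus3_dirlen(size_t namelen)
{
    /* flag (1) + fileid (2) + name length (1) + cookie (2) = 6 words */
    return 6 * XDR_UNIT + xdr_roundup(namelen);
}
```

Under these assumptions the fixed part comes to eight words for len and six words for dirlen, which is consistent with changing 7 to 8 and leaving the 6 alone.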
Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

I counted it all up. It definitely needs to be 8 * NFSX_UNSIGNED.

Yes, I know that. :) But what about the check for dirlen:

:dirlen += (6 * NFSX_UNSIGNED + nlen + rem);

Should this be 7 * NFSX_UNSIGNED, or is it correct as it is? I don't know how dirlen relates to the entryplus3 structure.

-Bill
Re: readdirplus client side fix (was Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm)
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

:And here is something even scarier: readdirplus from the client side
:doesn't appear to work correctly either. This time, you don't need an
:IRIX machine to trigger the problem (though it helps :). Do the following
:
:client# mount -o nfsv3,tcp,rdirplus server:/somefs /mnt
:client# ls /mnt; du /mnt; etc...
:
:Seems okay so far, right? Ah, but now try to unmount the filesystem:
:
:# umount /mnt
:process wedges, can't be killed, can't log in, other processes wedge, etc..
:...
:-Bill

But, on the bright side, readdirplus is somewhat experimental in that it is not used by default, so very little testing of it has been done to date. Thus the bug is not unexpected :-). At least the bugs we are getting now tend to be in the 'outlying areas' of NFS and not so much with the core code.

Well, IRIX is using it by default, and option or not, it's documented and implemented, so it should work.

Another area that is probably full of bugs: nqleasing.

Well, the problem there is: what commercial UNIXes implement NQNFS? I stumbled over these problems because I was testing things with a commercial implementation of NFS.

Ok, I was able to reproduce the above bug and fix it. The problem on the FreeBSD client is in nfs_readdirplusrpc() in nfs/nfs_vnops.c. It can obtain the vnode being used to populate the additional directory info in one of two ways. When it gets the vnode via nfs_nget(), the returned vnode is locked. When it gets it via a hit against NFS_CMPFH() (which I presume is for '.'), it simply VREF()'s the vnode. In the one case the vnode is returned locked, in the other it is not. However, the internal loop vrele()'s the vnode rather than vput()'s it, so the vnodes in the directory scan are never unlocked. This leads to the lockup.

Uh, yeah. One of these days I'll be able to understand everything that you just said. But not today.

If you could test and then commit this patch (w/ me as the submitter), I would appreciate it! It seems to fix the problem for me. This patch is relative to CURRENT. The fix ought to be MFCable to STABLE.

Close, but not quite. You didn't beat up on it hard enough. The secret is to think like a kid with a new toy, or more precisely, a sysadmin with a new toy (amounts to the same thing :). The first thing any sysadmin wants to do when you hand him a new gizmo is to push the buttons, turn the knobs and flip the switches, in order to try out all those great new features he's heard about. That's how you find the bugs.

Anyway, in this case, I found another problem: with your patch applied, I mounted a filesystem from a 3.2-RELEASE server (which I fixed today with the readdirplus server side patch) which happened to have a directory containing the unpacked source code for Ghostscript 5.50, plus objects left over from a build. There are a crapload of files in the gs 5.50 distribution, plus another crapload created by compiling it. I did the following:

client# mount -o nfsv3,tcp,rdirplus server:/fs /mnt
client# cd /mnt
client# ls
client# du
lots of stuff printed, until the gs5.50 directory is reached
bang!
another panic

There seems to be another problem in nfs_readdirplusrpc(). The following diff shows the changes I made to stop the panic:

The funny thing is that the error termination code actually got it right and the loop got it wrong. Usually it's the other way around.

Presumably this will not fix the SGI client. I've no idea what the problem there is. There may be a bug in the SGI client or there may be a bug in the client server implementation of the protocol in FreeBSD.
-Matt
Matthew Dillon [EMAIL PROTECTED]

Index: nfs_vnops.c
===
RCS file: /home/ncvs/src/sys/nfs/nfs_vnops.c,v
retrieving revision 1.135
diff -u -r1.135 nfs_vnops.c
--- nfs_vnops.c 1999/07/01 13:32:54 1.135
+++ nfs_vnops.c 1999/07/29 23:57:06
@@ -2367,7 +2367,10 @@
 				nfsm_adv(nfsm_rndup(i));
 			}
 			if (newvp != NULLVP) {
-				vrele(newvp);
+				if (newvp == vp)
+					vrele(newvp);
+				else
+					vput(newvp);
 				newvp = NULLVP;
 			}
 			nfsm_dissect(tl, u_int32_t *, NFSX_UNSIGNED);
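The locking rule behind the patch above can be shown with a toy model. This is a sketch only: toy_vnode and the toy_* helpers are hypothetical stand-ins that track nothing but a reference count and a lock flag, and they are not the kernel's vnode API. The model assumes, as the message describes, that one lookup path returns the vnode referenced and locked while the other only adds a reference.

```c
#include <assert.h>

/* Toy stand-in for a vnode: just a reference count and a lock flag. */
struct toy_vnode {
    int refs;
    int locked;
};

/* Models VREF(): take a reference, leave the lock alone. */
static void toy_vref(struct toy_vnode *vp)
{
    vp->refs++;
}

/* Models a lookup that hands back the vnode referenced and locked. */
static void toy_get_locked(struct toy_vnode *vp)
{
    vp->refs++;
    vp->locked = 1;
}

/* Models vrele(): drop the reference only. */
static void toy_vrele(struct toy_vnode *vp)
{
    vp->refs--;
}

/* Models vput(): unlock, then drop the reference. */
static void toy_vput(struct toy_vnode *vp)
{
    vp->locked = 0;
    vp->refs--;
}

/* Release logic in the shape of the patch: the directory itself was only
 * referenced, every other vnode came back locked and must be unlocked. */
static void toy_release(struct toy_vnode *newvp, struct toy_vnode *dirvp)
{
    if (newvp == dirvp)
        toy_vrele(newvp);
    else
        toy_vput(newvp);
}
```

In this model, calling toy_vrele() on a vnode obtained through toy_get_locked() leaves locked set with no references outstanding, which corresponds to the never-unlocked vnodes that wedge the unmount.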
Re: readdirplus client side fix (was Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm)
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

Look up a bit in the code. If bigenough is not true, cnp does not get initialized. This could lead to the bogus length -- or rather, it would be the cnp that is bogus, not the 'len'. The question is how to fix it. I think we can safely avoid doing the cache_enter so try changing the 'if (doit)' to 'if (doit && bigenough)'. I've included the patch below. I am not 100% sure about this.

Hm. Well, it cures the panic that I was experiencing quite nicely. I'm going to commit this latest patch for now since it fixes both the vnode locking problem and a crash condition, which are pretty serious problems. If you come up with something different, I'll be happy to try it out.

Not a bad day's work. :)

-Bill
Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

Ok, so if I understand this correctly you have a FreeBSD server and an IRIX client. UDP mounts work, TCP mounts do not. You are using the AMD automounting software running on the ... client I presume? It is the server that is panicking.

Yes. But it's nothing to do with AMD. I can do it manually:

irix# mount -o vers=3,proto=tcp freebsd:/usr /mnt
irix# cd /mnt; du

FreeBSD box explodes.

First of all, if these are production machines stick with UDP so's you don't tear your hair out. Also double check that the bug still exists with the absolute latest CURRENT if you can.

I'd love to, except *somebody* hasn't gotten around to fixing current.freebsd.org yet. *Nudges jkh*

And I'm not going to stick with UDP mounts because that's hiding the problem, not fixing it. You're just going to get more grief from the next poor fool who runs afoul of this problem.

Also please run this (on the FreeBSD server running CURRENT). It will tell me whether NFS is being forced to realign packet data coming from your ethernet controller. (In the example below, my NFS server has to realign the data).

# sysctl -a | fgrep nfs
vfs.nfs.realign_test: 1583064
vfs.nfs.realign_count: 1583064

In the case of the 4.0 box, it explodes almost immediately: there's no chance to actually obtain this data there. I added some printfs to the 3.2 box briefly and it didn't look like the realign code was being triggered.

We fixed a serious data corruption bug with NFSv3 over TCP that could result in panics. This fix was made on May 2nd to current and MFC'd to stable on May 8th. This fix made it into 3.2.

FreeBSD mcmillan.ctr.columbia.edu 4.0-19990715-CURRENT FreeBSD 4.0-19990715-CURRENT #2: Tue Jul 20 17:07:35 EDT 1999 [EMAIL PROTECTED]:/usr/src/sys/compile/TEST i386

Should be in there. I don't think that's it. Note that the 'mbuf siz' value that gets printed is exactly the same every time.

-Bill
Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

:This is yet another problem that we have run into here. If you check the
:digest for -hackers it was reported awhile ago (mike smith even cc-ed it
:to security since it may have been a kernel stack overflow) . Anyway, the
:problem is that IRIX defaults to 32K packets on TCP NFSv3 mounts, and
:16K on UDP NFSv3 mounts. I recommend using UDP and setting rsize=8192,
:wsize=8192 in your amd maps (as we do now, no problems at all).
:
:--
:David Cross | email: [EMAIL PROTECTED]
:Systems Administrator/Research Programmer | Web: http://www.cs.rpi.edu/~crossd

Ah ha! Yes, 32K packets will certainly screw up NFS under FreeBSD.

Uh could you elaborate a little? No, strike that: could you elaborate a *lot*. A whole lot.

We need to fix that panic to have it simply drop the packet, I guess.

No, we need to fix the code so it handles 32K "packets" (datagrams) correctly.

-Bill
Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

:Ah ha! Yes, 32K packets will certainly screw up NFS under FreeBSD.
:
:Uh could you elaborate a little? No, strike that: could you elaborate
:a *lot*. A whole lot.

Sure. There is a constant called NFS_MAXDATA defined in ..mmm.. nfs/nfsproto.h. Set to 32768 for TCP connections, 16384 for UDP connections. The code is a mess though, so usually just the higher limit is used. The fsinfo rpc returns this maximum to the client. The client is supposed to limit NFS packets to the specified size.

Okay. Well, I experimented a bit, and found that if I increased NFS_MAXPACKET by 512 bytes, the machines no longer panic. (Yes, that's NFS_MAXPACKET, not NFS_MAXDATA.) 512 is just a number I pulled out of my ass: initially I just tried increasing it by 372 bytes (33544 - NFS_MAXPACKET == 372) which got me a little further along, but later I got another crash where mbuf siz was 33632. So I tried 512 and was able to do a complete du on /usr without any problems.

As for the trashed mbuf chain I thought I saw, I was confused by a couple of factors:

- When you do gdb -k vmunix vmcore.X, values on the stack such as automatic variables aren't reliably preserved. In this case I was trying to do a "print *m" to observe the contents of the last used mbuf and this pointed me off into space somewhere. It should have been NULL since m_next off the last mbuf in a chain is NULL.

- I was looking at m_pkthdr.rcvif and m_pkthdr.len of mreq, which were not initialized and hence were also bogus (which makes sense since this was an mbuf chain to be transmitted, not the request that was received).

Following the mbuf chain along showed that it was in fact sane.

I don't know where these extra bytes are coming from. Presumably there is some upper bound to the size of an NFS v3 RPC; either we are computing it wrong or SGI is. What I'd love to be able to do is snoop the requests coming from the SGI but that's hard since they're encapsulated in a TCP stream.

-Bill
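The numbers in the message above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the limit is derived only from the statement that 33544 - NFS_MAXPACKET == 372, the reply_fits() helper is hypothetical, and the sanity check is assumed to be a plain "size greater than limit" comparison.

```c
#include <assert.h>

#define OBSERVED_FIRST    33544  /* first oversized reply reported */
#define OBSERVED_SECOND   33632  /* later oversized reply reported */
#define FIRST_OVERSHOOT   372    /* 33544 - NFS_MAXPACKET, per the message */

/* Limit implied by the two figures above. */
#define IMPLIED_MAXPACKET (OBSERVED_FIRST - FIRST_OVERSHOOT)

/* Nonzero if a reply of siz bytes passes a limit raised by slack bytes. */
static int reply_fits(int siz, int slack)
{
    return siz <= IMPLIED_MAXPACKET + slack;
}
```

With these figures the second reply overshoots the implied limit by 460 bytes, so 372 bytes of slack is not enough for it while 512 is, which matches what the message reports. It says nothing about whether a still larger reply could turn up.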
Re: wi0 almost works with Wavelan Turbo card
Of all the gin joints in all the towns in all the world, Ernie Elu had to walk into mine and say:

I am looking for help with getting a Lucent Wavelan Turbo ISA (Bronze) card running.

What speed is the card, exactly?

Having read a few posts about the Wavelan IEEE 802.11 card not working with the wi driver I thought I would give it a go anyway with the turbo card.

"Not working with the wi driver?" I hope you meant "now working."

I installed a Wavelan Turbo PCMCIA card in my Toshiba 2520CDT notebook, and an identical card with the Wavelan ISA adapter board into an Advantech 6154 Slot PC.

I am not familiar with an "Advantech 6154 Slot PC." Please don't assume that everyone automatically knows your hardware by name. Describe it. In detail.

Both computers are running FreeBSD 4.0-CURRENT with their IRQ set to 10 in pccard.conf, all other settings are default. It sort of works, the notebook end seems fine, but the Advantech end keeps coming up with the same error on the console every few seconds when there is traffic between them:

wi0: oversized packet received (wi_dat_len=24576, wi_status=0x2000)

No such error on the laptop. When the error occurs ftp or whatever you were doing stalls for a bit then continues. Any suggestions?

No. I never obtained any real documentation from Lucent (they won't release the Hermes programming manual without NDA) and I don't have a turbo WaveLAN card so I'm unable to duplicate your problem on my own equipment. If I can't duplicate the problem and analyze it, I can't even begin to fix it.

-Bill
Re: alpha kernel build failure (w/patch)
Of all the gin joints in all the towns in all the world, Steve Price had to walk into mine and say:

[trimmed -alpha from cc: list to keep the cross posting police from coming after me :)]

On Mon, 5 Jul 1999, Parag Patel wrote:

# On Mon, 05 Jul 1999 00:33:57 CDT, Steve Price wrote:
# +#ifdef __i386__
#  	sc->wb_btag = I386_BUS_SPACE_IO;
# +#endif
# +#ifdef __alpha__
# +	sc->wb_btag = ALPHA_BUS_SPACE_IO;
# +#endif
#
# Just curious, but is there a reason that these lines aren't simply
#
#	sc->wb_btag = BUS_SPACE_IO;
#
# with this macro being set to the correct machine-specific one in some
# appropriate header file? I'm sure I'm missing something...

I wondered that as well. For both the i386 and alpha port the definitions end up in /usr/include/machine/bus.h and stripping off the arch-specific prefix shows that their value is the same. In fact they appear to be the only #define in bus.h with the arch-specific prefix besides the multiple-inclusion #defines. I think they could be combined, but defer the decision (commit) to the folks working on the new bus code as they know their way around this code much better than I do.

The reason it's not done that way is because the bus_space code is incomplete. The NetBSD code from which it was taken has a routine that sets up the bus tag for you (and I think the handle too) based on the actual bus type. In other words, you're supposed to be passed a handle to the bus on which your device resides, and you pass that to bus_space_create() or whatever, and it figures out all the right machine specific details for you.

Why don't we have this routine? Because we don't have the NetBSD bus architecture and at the time we only ran on the i386 arch, so we took a shortcut and fiddled with the bus space handle and bus space tag directly. If we're really lucky then some day this will get fixed correctly, by somebody who is not me, as I have plenty of other things to keep me busy.

-Bill
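For illustration, the single-macro arrangement Parag Patel asks about could look roughly like the sketch below. It is not what machine/bus.h contains: the numeric values are placeholders (the thread says only that the i386 and alpha values are the same), BUS_SPACE_IO and toy_softc are hypothetical names, and the #else branch exists only so the sketch compiles on other machines. It also does not address the fuller fix described above, a routine that derives the tag from the bus the device sits on.

```c
#include <assert.h>

/* Placeholder values; the thread says only that the two are equal. */
#define I386_BUS_SPACE_IO  0
#define ALPHA_BUS_SPACE_IO 0

/* Hypothetical machine-independent name, selected per architecture. */
#if defined(__i386__)
#define BUS_SPACE_IO I386_BUS_SPACE_IO
#elif defined(__alpha__)
#define BUS_SPACE_IO ALPHA_BUS_SPACE_IO
#else
#define BUS_SPACE_IO 0 /* fallback so the sketch builds elsewhere */
#endif

/* Stand-in for the driver softc; only the tag field is modeled. */
struct toy_softc {
    int wb_btag;
};

/* The driver then needs no per-architecture #ifdef of its own. */
static void toy_attach(struct toy_softc *sc)
{
    sc->wb_btag = BUS_SPACE_IO;
}
```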
Anybody actually using gigabit ethernet?
I'm wondering if anybody out there has actually done any experimentation with gigabit ethernet boards using the Alteon Tigon driver. I know that it works on my hardware, but it's nice to actually have some feedback from people so that I know if it's actually working worth a damn.

So far I have not heard a peep out of anybody, other than a couple of people who were nice enough to help out with some driver testing, and that was months ago. I usually consider this a good thing, because it means that at least nobody is complaining. But when people ask me "hey Bill, how well do these boards work with FreeBSD?" all I can tell them is that they seem to work okay in my limited test environment. This does not exactly provide a lot of motivation to go out and buy some gigabit ethernet cards.

Also, I only have access to a limited selection of cards (I have a 3Com and a Netgear board, and others have tested AceNIC boards) so I don't know for sure if some of the ones that I claim to support actually work. (I don't have any reason to believe they won't, but Murphy's Law applies.) I also don't have access to a gigabit switch, so my testing is limited to blasting traffic between two hosts through a fiber patch.

So, if anybody is actually using a Tigon-based gigabit board with STABLE or CURRENT, let me know. Is it working reliably? Is performance good? Is it bad? Inquiring minds want to know.

-Bill
Re: Anybody actually using gigabit ethernet?
Of all the gin joints in all the towns in all the world, Dennis Glatting had to walk into mine and say:

In reading your message I felt compelled to ask you a question. Are you using gb end-to-end? That probably isn't a good idea because in TCP the sequence numbers can wrap within timeout periods and the data stream becomes undetectably (from a TCP perspective) corrupt.

You didn't read what I said. I don't have a gigabit ethernet switch. I only have cards. Therefore the *only* way I can test the operation of the driver and adapters is to connect two machines with gigabit cards back to back with a patch cable. This automatically implies 'using gb end-to-end.'

As for corruption due to TCP sequence number wrapping, I don't know what to tell you. I never noticed such behavior in my tests, but that's why I'm asking for feedback from other people.

-Bill
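The sequence-number concern raised above comes down to how long it takes to send 2^32 bytes at a given line rate. The sketch below works that out; seq_wrap_seconds() is a hypothetical helper, and it assumes the link runs flat out with no framing or protocol overhead, so real wrap times would be somewhat longer. It does not show whether any corruption actually occurs.

```c
#include <assert.h>

/* Seconds needed to send 2^32 bytes (the TCP sequence space) at the
 * given line rate in bits per second, ignoring all overhead. */
static double seq_wrap_seconds(double bits_per_second)
{
    const double seq_space_bytes = 4294967296.0; /* 2^32 */
    return seq_space_bytes / (bits_per_second / 8.0);
}
```

At 1 Gbps this works out to a little over 34 seconds, against roughly 344 seconds at 100 Mbps.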
Re: if_xl.c noise
Of all the gin joints in all the towns in all the world, Bob Bishop had to walk into mine and say:

Hi, Does version 1.34 of if_xl.c really look like the following around line 1886, or is my repository hosed?

You could have answered this yourself by checking in /pub/FreeBSD/FreeBSD-current/src/sys/pci/if_xl.c on ftp.cdrom.com.

Yes, this is only your repository; you must be using some patches from Matt Dillon for NFS. The code in -current has the change that he made already in it (but with a different comment). Grab a fresh copy of if_xl.c and if_xlreg.h from -current and stick with it.

-Bill
Re: NFS Patch #8 for current available - new TCP fixes
Of all the gin joints in all the towns in all the world, Matthew Dillon had to walk into mine and say:

(fanfare!)

(Darth Vader's imperial march theme)

NFS Patch #8 for -current is now available. This patch fixes serious bugs w/ NFS/TCP. Probably not *all* the failure conditions, but hopefully most of them.

[...]

Neither the 'de' nor the 'xl' ethernet drivers align the packet. The 'xl' driver conditionally aligns it for the alpha. Part of the patch fixes the 'xl' driver to unconditionally align the packet buffer in order to improve NFS performance. I could not do the same for the 'de' driver because I am unsure if the dec chipset can handle an unaligned start address.
Pccard support still works with newbus, right?
I'm currently working on a driver for the Lucent WaveLAN/IEEE 802.11 PCMCIA adapter, as part of a project for the COMET lab people here at Columbia. Lucent has a PCMCIA and an ISA version of this adapter, however the ISA version is really just a PCMCIA card fitted into an add-in PCMCIA controller card that mounts in an ISA slot (i.e. something that lets you plug PCMCIA devices into a desktop host that doesn't have any built-in PCMCIA slots). This adapter card has a Vadem 469 chip on it which, fortunately, is supported by our current pccard code.

The system I'm testing on is running 3.0-RELEASE and I'm reasonably confident my code will work with 3.1-RELEASE too. As it stands now, I can get the card probed and attached when I start pccardd, and everything seems peachy. What I need to know is if anybody has tested the pccard support in 4.0-CURRENT now that all the newbus stuff has been rolled in. If the pccard support for the x86 still works under 'emulation' using an ISA device shim, then that's fine. If not... well, then I suppose _I'm_ the one who's going to end up testing it.

-Bill
Re: Patched RealTek driver -- please test
Of all the gin joints in all the towns in all the world, Stephen Hocking-Senior Programmer PGS Tensor Perth had to walk into mine and say:

Well, I nipped home over my lunch break and gave it a try - some progress, of a sort. My NFS problems have gone away (at least under light activity), but it now seems rather sensitive to sending lots of stuff. The symptoms observed are a hard hang of the whole machine, no response to pings or keyboard action. I can't even break into DDB. How I reproduced this is as follows - get the netpipe program off ports, then set up a receiver on the non-realtek machine as follows -

[chop]

Sorry, I did traffic generation tests. I banged on it as hard as I could. I didn't have any problems with lockups.

NPtcp -s -r

Then on the RealTek machine do this -

NPtcp -s -t -h non-realtek-hostname -P

After about 5 or so lines of throughput stats, it dies in the bum.

Don't tell me 'after about 5 lines.' Tell me in minutes. Seconds. Hours. Weeks. How _LONG_ does it run before it locks up!! And what do the stats say anyway!

Alright, now see here: I put up yet another test version. This one has code in every conceivable place where the driver might get caught in an infinite loop (which is the only thing that might cause the system to appear to hang, short of executing a halt instruction). Same place (www.freebsd.org:/~wpaul/RealTek/test/3.0). Try _THIS_ version. Tell me if it locks up. Watch the console. Tell me if you see any messages (i.e. rl0: looping in foo). If you do, report them to me VERBATIM. (No paraphrasing, no inventing new messages which you think represent what you saw. VERBATIM. Or else.)

If it still locks up and you don't see any errors, then the problem you're experiencing is either not related to the driver, or is related in some way that only manifests itself on your hardware and which I will never be able to reproduce (since your hardware is over there, and I'm way over here). Maybe it's some peculiar kind of hardware fault.
Maybe your PCI chipset blows. Maybe the RealTek blows when used in combination with your PCI chipset. Regardless, it's a condition which I can't reproduce on my own hardware, and if I can't reproduce the problem, I can't fix it. -Bill -- = -Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu Work: wp...@ctr.columbia.edu | Center for Telecommunications Research Home: wp...@skynet.ctr.columbia.edu | Columbia University, New York City = Mulder, toads just fell from the sky! I guess their parachutes didn't open. = To Unsubscribe: send mail to majord...@freebsd.org with unsubscribe freebsd-current in the body of the message
Re: Patched RealTek driver -- please test
Of all the gin joints in all the towns in all the world, Stephen Hocking-Senior Programmer PGS Tensor Perth had to walk into mine and say:

OK - I've banged on the new version with extra debug messages and it still locks up, but without any messages! Grr. I can only conclude that the 486MB BIOS is iffy. I haven't tried any other slots in the MB, but have tried various PCI settings, all to no avail. I have swapped the de0 and the rl0 between machines, and the rl0 is happy in its new home - hasn't fallen over, although its netpipe performance sucks with very small packets. I think we can write this one off as a faulty PCI implementation on the 486 motherboard. Thanks for your patience and time.

I have one more thing you can try for me (I hope it's not too much trouble to put the NIC back where it was). This latest test version has a small change to rl_start() which modifies the transmit behavior: instead of trying to fill up as many transmit 'descriptors' as possible, it should never be possible now to have more than one transmission in progress at any one time. That is, instead of trying to fill up all four TX 'descriptors' and issue four transmissions in rapid succession and then waiting to clean up the buffers later, it issues a single transmission, waits for completion, then issues another transmission, waits for completion, and so on.

This will probably worsen performance at 100Mbps, but it would be interesting to see if it fixes your problem. Please try it and let me know what happens. (I left the loop detection code in place just for giggles.)

-Bill
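The difference between the two transmit strategies described above can be modeled in a few lines. This is a toy, not the driver: toy_tx, toy_start() and toy_txeof() are hypothetical names, and the only thing taken from the message is that there are four TX 'descriptors' and that the test version allows one transmission in progress at a time.

```c
#include <assert.h>

#define TOY_TX_SLOTS 4 /* the message mentions four TX 'descriptors' */

struct toy_tx {
    int queued;    /* frames waiting in the send queue */
    int in_flight; /* transmissions issued but not yet completed */
};

/* Issue transmissions until the limit is reached or the queue is empty.
 * Returns the number of transmissions started. */
static int toy_start(struct toy_tx *tx, int max_in_flight)
{
    int started = 0;

    if (max_in_flight > TOY_TX_SLOTS)
        max_in_flight = TOY_TX_SLOTS;
    while (tx->queued > 0 && tx->in_flight < max_in_flight) {
        tx->queued--;
        tx->in_flight++;
        started++;
    }
    return started;
}

/* Model one transmit completion, then try to start more work. */
static int toy_txeof(struct toy_tx *tx, int max_in_flight)
{
    if (tx->in_flight > 0)
        tx->in_flight--;
    return toy_start(tx, max_in_flight);
}
```

With a limit of 4 the model issues four frames back to back; with a limit of 1 each new frame waits for the previous completion, which is the serialized behavior the test version is meant to have.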
Re: Patched RealTek driver -- please test
Whoops... I just noticed I made a small boo-boo in that last patch, which I just fixed. When downloading, make sure you get the version of if_rl.c with the following ID strings:

for 3.0: $Id: if_rl.c,v 1.28 1999/04/06 15:29:01 wpaul Exp $
for 2.2: $Id: if_rl.c,v 1.17 1999/04/06 15:29:26 wpaul Exp $

Sorry about that.

-Bill
Patched RealTek driver -- please test
Okay, today (and over part of the weekend) I ripped the RealTek driver apart and put it back together again, this time in a hopefully working form. The temporary patch version is at the following locations:

http://www.freebsd.org/~wpaul/RealTek/test/2.2    source for FreeBSD 2.2.x
http://www.freebsd.org/~wpaul/RealTek/test/3.0    source for FreeBSD 3.x/4.x

If you've been having problems with RealTek 8139 cards, please try this version and let me know if it makes a difference. All of the main changes are in the transmit code. I also think I know why the transmitter was getting wedged.

The short answer: I'm a twit.

The long answer: when ifinit() was changed so that it warned about ifq_maxlen not being set by the driver, I went in and set it to RL_TX_LIST_CNT - 1, which is approximately what I'd done for the other drivers. However the RealTek only has four transmit 'descriptors' which means the ifq_maxlen for the interface was being set to the ridiculously low value of 3. This causes transmissions of large packet sequences to quickly fill up the send queue. (For example, try doing a ping -s 8100 some host and see if it actually works. My bet is that it won't, because this will generate a series of six or seven frames in rapid succession, and after the first 3 or 4, the queue fills up.)

In addition to fixing this, I also re-wrote rl_start() and rl_txeof() to hopefully be a little simpler and less brain damaged. I still need to fill in rl_txeoc() correctly, but once I know for sure that I've fixed all the major problems, I can probably do that in an hour or two.

I experimented with this driver version using a FreeBSD 2.2.7 server and a FreeBSD 3.0 client (sorry, it's all I had) and I couldn't get NFS to hang. I also bombarded the server with a TCP stream from the client while the NFS test was running and it didn't lock up.

-Bill
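The "six or seven frames" figure in the ping example above can be checked with a quick fragment count. This is a sketch with stated assumptions: a 1500-byte Ethernet MTU, a 20-byte IP header with no options, and an 8-byte ICMP header. ping_fragments() is a hypothetical helper, and RL_TX_LIST_CNT is given the value 4 only because the message says the chip has four transmit 'descriptors'.

```c
#include <assert.h>

#define RL_TX_LIST_CNT 4                    /* four TX 'descriptors', per the message */
#define OLD_IFQ_MAXLEN (RL_TX_LIST_CNT - 1) /* the queue length the driver had been setting */

/* Number of IP fragments needed for an ICMP echo carrying `data` payload
 * bytes, assuming a 1500-byte MTU and a 20-byte IP header. */
static int ping_fragments(int data)
{
    const int mtu = 1500, ip_hdr = 20, icmp_hdr = 8;
    const int per_frag = (mtu - ip_hdr) & ~7; /* fragment payloads are multiples of 8 */
    const int total = data + icmp_hdr;

    return (total + per_frag - 1) / per_frag;
}
```

Under these assumptions ping -s 8100 produces six fragments, twice the old send queue length of 3.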
Re: More on rl0 woes
Of all the gin joints in all the towns in all the world, Alfred Perlstein had to walk into mine and say:

On 4 Apr 1999, Murata Shuuichirou wrote:

In message 199904040913.raa26...@ariadne.tensor.pgs.com, `shock...@prth.pgs.com' wrote:

On the off chance that my problems were chipset related, I swapped the RealTek with the de0 card in my other machine, a 233MHz k6. It being a socket 7 mboard presumably has a later PCI bios. Still the same symptoms - hangs on NFS access. These can be interrupted and other network traffic continues fine. To reproduce, take your RealTek equipped machine and place a copy of /usr/src on it. Export /usr/src so that it can be NFS mounted by other machines. From the other machines, do an ls -CFR of /usr/src. It will hang partway through.

I probably have the same problem here. NFS hangs and other network traffic is still alive. Though, my situation differs a little from yours. I have two RealTek NFS clients and the NFS server has another chip. Both RealTek NFS clients (Celeron 300MHz and MediaGX 266MHz) have the problem.

Wait wait wait! You're claiming you have the same problem, but your configurations are different! He's saying he has a problem when the RealTek card is in the server, and the client is using some other NIC. You're saying you have a problem when the RealTek is in the client, and the server is using some other NIC. These are completely different scenarios! I will not allow a bunch of you to all jump on me at the same time claiming you've all got the same problem when you CLEARLY DON'T! EVERYBODY PAY ATTENTION TO WHAT THE OTHER IS SAYING AND WAIT YOUR DAMN TURN OR I'M JUST GOING TO IGNORE THE WHOLE LOT OF YOU AND YOU CAN FIX YOUR OWN DAMN PROBLEMS!

I can't believe I'm getting so worked up because you cheap bastards insist on buying the absolute worst network adapter in the world. Go buy an ASIX card for crying out loud. They're cheap, and they actually work worth a damn.
Now, as punishment for making me mad, I'm going to address Steven's problem, and the rest of you can just lump it.

There are things you should be checking when your problem happens. What does ifconfig rl0 show you? Is the OACTIVE flag set? What does netstat -in say? What does netstat -m say? You say 'traffic continues normally.' This is very confusing: SHOW ME AN EXAMPLE OF WHAT YOU MEAN. When the NFS transfer stops, can you still ping the server host, or do you have to interrupt the transfer and wait for a while before you can communicate with the server again? Can you run tcpdump on the client and observe what happens when the transfer stops? Is the client still sending out read requests? Is the server replying or not? Are the replies garbled? Is there a lot of other activity on the network at the time? Can you initiate another (smaller) NFS transfer when the first one wedges?

You have to give me as much information as you can. I need to be able to clearly identify the symptoms of the problem without all the 'oh my god it doesn't work and I tried this and this and this' crap.

-Bill
Re: RealTek driver woes
rights reserved.
FreeBSD 4.0-CURRENT #1: Thu Mar 25 21:37:03 WST 1999
t...@bloop.craftncomp.com:/data/src/sys/compile/bleep
Timecounter i8254 frequency 1193182 Hz
CPU: AMD Enhanced Am486DX4 Write-Through (486-class CPU)
Origin = AuthenticAMD Id = 0x484 Stepping=4
Features=0x1<FPU>
real memory = 16777216 (16384K bytes)
avail memory = 13750272 (13428K bytes)
Preloaded elf kernel kernel at 0xc02c3000.
Preloaded elf module linux.ko at 0xc02c309c.
Probing for devices on PCI bus 0:
chip0: Host to PCI bridge (vendor=10b9 device=1445) rev 0x00 on pci0.0.0
rl0: RealTek 8139 10/100BaseTX rev 0x10 int a irq 9 on pci0.4.0
rl0: Ethernet address: 00:00:e8:53:a2:3e
rl0: autoneg complete, link status good (half-duplex, 10Mbps)

I'm going to go _way_ out on a limb here and suggest that you try and coerce your system BIOS to assign the RealTek interface an IRQ other than 9. I say 'try' because sometimes you aren't given the option to configure this. Usually there's some configuration menu that lets you 'reserve IRQs for legacy/ISA devices.' You should put IRQ 9 on the reserved list so that the BIOS will pick another. Hopefully it will be an IRQ that isn't shared with another device. If there aren't any free ones left, you can try disabling one of the serial ports in order to free up an IRQ (e.g. turn off COM2).

-Bill
Re: 3com Nic Problems
Of all the gin joints in all the towns in all the world, RT had to walk into mine and say:

Here are the NIC combinations I've tried in one machine:

3com 905 (at 100mbit)     3com 905b (at 10mbit)
3com 905b (at 100mbit)    3com 905 (at 10mbit)
3com 905 (at 100mbit)     realtek 8039 (at 10mbit)
3com 905b (at 100mbit)    realtek (at 10mbit)
3com 905b (at 100mbit)    ne2000 (at 10mbit)

In each case the 3com operating at 100mbit failed to do so smoothly. In some cases I receive device time-outs, on bootup 'xl0 command not completed',

Ignore the 'command failed to complete' message. It's not fatal. It's amazing how many people get all flustered when they see this.

and in some cases no warning at all..

It's timing dependent.

100mbit will peak at 4MB/s, then all of a sudden stop, then go racing off, then stop...

This is a classic 'I didn't configure my card correctly' symptom. For the Nth time, both sides of the link must use the same modes. You can't mix and match speed and duplex settings like they're tuneable parameters that you can tweak to get more performance. Both sides must match. If they can't figure out how to match on their own using autonegotiation, make them match. (You're the one with the brain and the reasoning skills after all. Well, that's the theory anyway.)

If you want two 100Mbps NICs connected back to back (via crossover cable), then do the following on _both_ sides:

# ifconfig xl0 media 100baseTX mediaopt full-duplex

The symptoms you're describing indicate a condition where one side of the link is set for full duplex and the other is set for half. This doesn't work: when the NIC is in full duplex mode, it turns off the collision detection logic (since in theory you're not supposed to have collisions on a full duplex link). If one side is doing collision detection and the other isn't, you tend to see 'stuttery' performance.
If you're connecting a NIC to a 10Mbps hub (that's hub as in repeater, not a switch), then you must set the NIC for 10Mbps half-duplex:

# ifconfig xl0 media 10baseT/UTP mediaopt half-duplex

If you're connecting a NIC to a 100Mbps hub (again, that's a hub as in repeater, not a switch), then you must set the NIC for 100Mbps half-duplex:

# ifconfig xl0 media 100baseTX mediaopt half-duplex

If you're connecting a NIC to a 100Mbps switch port, then in theory the switch port and the NIC will perform NWAY autoselection and get the modes right on their own. Sometimes this fails though, in which case you should manually configure the switch for the mode you want and configure the NIC to match.

When you connect two NICs back to back and one of them supports NWAY and the other doesn't, the NIC with NWAY autonegotiation support can sometimes get the mode wrong. In this case, you should manually configure the NIC with NWAY so that it agrees with the other NIC's modes using ifconfig as shown above.

The NE2000 and 8039 (NE2000 compatible) boards don't support full duplex. So if you connect one of these to the 3c90x NICs via crossover cable, then you must force the 3Com NIC to 10Mbps half-duplex:

# ifconfig xl0 media 10baseT/UTP mediaopt half-duplex

-Bill
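The rules in this message boil down to a small lookup, sketched here in C. The enum and the function name media_for_peer() are hypothetical and exist only for this illustration; the strings it returns are the ifconfig media arguments quoted in the message. Returning NULL for a switch port is just one way to encode 'let NWAY try first'.

```c
#include <stddef.h>

/* Hypothetical description of what is on the other end of the cable. */
enum link_peer {
	PEER_HUB_10,		/* 10Mbps repeater */
	PEER_HUB_100,		/* 100Mbps repeater */
	PEER_SWITCH_100,	/* 100Mbps switch port with NWAY */
	PEER_NIC_100_FDX,	/* another 100Mbps NIC, crossover cable */
	PEER_NIC_NE2000		/* NE2000-class NIC, crossover cable */
};

/*
 * Return the ifconfig media arguments to force on the local NIC,
 * or NULL when autonegotiation should be tried first.
 */
static const char *
media_for_peer(enum link_peer peer)
{
	switch (peer) {
	case PEER_HUB_10:
	case PEER_NIC_NE2000:
		/* Repeaters and NE2000-class boards are half duplex only. */
		return "media 10baseT/UTP mediaopt half-duplex";
	case PEER_HUB_100:
		return "media 100baseTX mediaopt half-duplex";
	case PEER_NIC_100_FDX:
		/* Must be set identically on both NICs. */
		return "media 100baseTX mediaopt full-duplex";
	case PEER_SWITCH_100:
		/* Let NWAY try; force both ends by hand only if it fails. */
		return NULL;
	}
	return NULL;
}
```

The point of writing it as a table is that there is exactly one answer per kind of link partner: the setting is dictated by the other end, not picked for performance.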
2nd request for review for vlan changes
This is a second request for review for my proposed if_vlan updates. Since I tweaked a couple of different things, I placed a tarball with the sources at http://www.freebsd.org/~wpaul/VLAN/vlan.tar.gz (or, for those of you with freebsd.org accounts, ~wpaul/public_html/VLAN/vlan.tar.gz). This contains updated sources for if_vlan.c, if_vlan_var.h and ifconfig.

The changes are as follows:

- If the IFF_LINK0 flag is set on a vlan pseudo-interface, it does not perform any header mangling to create the 802.1Q encapsulation, instead allowing the underlying parent driver to do it. (Again, this is mainly for the Tigon driver that I'm working on, which can do its own vlan tag insertion and extraction in firmware.) (I know I used LINK1 before; that's because I forgot that the flag values were zero-based and that LINK0 was really the first one. :)

  Note: vlan_start() will set rcvif on the outbound mbuf so that the parent driver can find the vlan interface where it originated and find the vlan tag. In order to avoid having the driver possibly follow an uninitialized rcvif pointer, vlan_start() will also set the M_PROTO1 flag in the mbuf to signal to the parent driver that the rcvif is valid.

- Implemented vlan_input_tag(), for use with interfaces that know how to do vlan tag extraction and de-capsulation on their own. Works like vlan_input(), except it accepts a third argument, t, which is the extracted vlan tag; given the tag, it tracks down the appropriate vlan interface and sends it the frame.

- Added support for multicast. The vlan pseudo interface adds entries to the parent's multicast filter using if_addmulti() and keeps a private list of those groups which it has added. If an update is done, the private list is removed with if_delmulti(), and the parent is programmed again (which again updates the private list). This is a little messy in principle, but the code is fairly simple.

- Implemented vlan_unconfig(), the opposite of vlan_config(). When setting up a vlan/parent association with SIOCSETVLAN, the parent's ethernet address and other info are copied to the vlan pseudo interface. This should be removed when the association is broken.

- Changed vlan_input()/vlan_input_tag() and vlan_start() to update ifp->if_ipackets and ifp->if_opackets respectively.

- If the output queue of the parent interface is full in vlan_start(), increment ifp->if_oerrors, free the mbuf, and continue, instead of just falling through and trying to queue the mbuf even though we know the output queue is full.

- Modified ifconfig(8) to allow setting the vlan tag and parent interface of a vlan interface, and to display the interface settings. Three new commands have been added: vlan, vlandev and -vlandev. To set up a vlan interface, you can do this:

  # ifconfig vlan0 vlan 12345 vlandev foo0

  To break the association, you can do this:

  # ifconfig vlan0 -vlandev foo0

  You have to set vlan and vlandev at the same time, since that's how the SIOCSETVLAN ioctl works. Also updated the ifconfig.8 man page.

- Fixed a bug in ifconfig. The setifflags() function does a SIOCGIFFLAGS on the ifreq structure that gets passed to it; however, this clobbers part of it (namely sa_family) because everything after ifr_name is just one big union. This causes later portions of ifconfig that check the sa_family value to get confused. In my case, the effect was that when I did 'ifconfig vlan0 link0,' ifconfig printed out a line of AppleTalk status information because sa_family had gotten mangled to 16 (AF_APPLETALK).

I still need to write a vlan(4) man page.

-Bill
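The ifconfig bug in the last item is easier to see with a cut-down structure. The sketch below uses a made-up struct demo_ifreq that mimics the layout described above (a name followed by one union); it is not the real struct ifreq, and all of the demo_ names are for illustration only. Storing flags through one union member overwrites the bytes that another member reads as sa_len and sa_family. The flag value used in the note below, 0x1010, is picked so that both of its bytes are 16, which makes the result the same on little-endian and big-endian machines.

```c
#include <stdint.h>
#include <string.h>

/* Made-up stand-in for a BSD-style sockaddr: length byte, family byte, data. */
struct demo_sockaddr {
	uint8_t	sa_len;
	uint8_t	sa_family;
	char	sa_data[14];
};

/* Made-up stand-in for struct ifreq: everything after the name is one union. */
struct demo_ifreq {
	char	ifr_name[16];
	union {
		struct demo_sockaddr	ifru_addr;
		int16_t			ifru_flags;
	} ifr_ifru;
};

#define DEMO_AF_INET	2

/*
 * Mimic what a "get flags" request does to the caller's structure:
 * the flags land on top of the address bytes.
 */
static void
demo_get_flags(struct demo_ifreq *ifr, int16_t flags)
{
	ifr->ifr_ifru.ifru_flags = flags;
}

/* Set up a request for an AF_INET address, fetch flags, report the family. */
static int
family_after_get_flags(int16_t flags)
{
	struct demo_ifreq ifr;

	memset(&ifr, 0, sizeof(ifr));
	strcpy(ifr.ifr_name, "vlan0");
	ifr.ifr_ifru.ifru_addr.sa_len = sizeof(struct demo_sockaddr);
	ifr.ifr_ifru.ifru_addr.sa_family = DEMO_AF_INET;

	demo_get_flags(&ifr, flags);

	/* Reads back whatever byte the flags left there. */
	return ifr.ifr_ifru.ifru_addr.sa_family;
}
```

family_after_get_flags(0x1010) returns 16 rather than 2: the family the caller set has been replaced by a byte of the flags. Using a separate request structure for the flags lookup is one way around it; that is an assumption about a reasonable fix, not a description of the actual change.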
Re: Request for review: changes to if_vlan.c
Of all the gin joints in all the towns in all the world, Garrett Wollman had to walk into mine and say:

On Sat, 27 Feb 1999 23:37:10 -0500 (EST), Bill Paul wp...@skynet.ctr.columbia.edu said:

Interested persons should review the diffs and pipe up if they have some passionate argument against them.

Patches look mostly fine.

Okay. I noticed one other thing while playing around today: when calling SIOCSETVLAN to disassociate a vlan interface from a real interface, the code should probably be removing the MAC address in addition to everything else. (The parent interface's address is added when the association is made, but not removed when the association is broken.)

Also, I should point out that while if_vlan provides the necessary kernel hackery to implement VLANs, there isn't any user space utility for configuring vlan interfaces (ifconfig doesn't seem to have any vlan-specific code that I could see, and there is no vlanconfig).

I stopped development on vlan(4) when the switch we had that spoke 802.1Q was returned to the vendor at the end of our demo period. I have 28 on order right now, and should resume work on the driver after I get those switches deployed. For interfaces like yours, I would have preferred a subinterface mechanism, but I never had the time to implement that.

Well... correct me if I'm wrong, but the current code looks like it does implement subinterfaces of a fashion. I could hack the driver to do what if_vlan.c does, but why do that if if_vlan.c already exists and does almost exactly what I need it to do?

Care to implement GVRP while you're at it?

Care to tell me what that is? :) (No, I don't really want to do it now, whatever it is.)

There also is no vlan(4) man page.

See above.

I could probably write them myself, if you like.

otherwise I'm going to take it upon myself to hack up ifconfig and write the man pages myself.

I believe ifconfig is the wrong program for the task. There should be a separate vlanconfig program.
(I wrote one, but it's on my laptop where I can't conveniently get to it right now.)

I don't know about that. It seems to me ifconfig is precisely the right program to use for this task. I already hacked up a local copy of ifconfig to support it:

router3# ./ifconfig vlan0 vlan 1234 vlandev ti0
router3# ./ifconfig vlan0
vlan0: flags=1843<UP,BROADCAST,RUNNING,SIMPLEX,LINK0> mtu 1496
        ether 00:60:08:21:53:6c
        vlan: 1234 parent interface: ti0

Took me only about an hour or so to do it (and I was watching TV at the time). Tell me why a separate program is preferable.

There are a couple of areas where vlan(4) needs to get some help from the underlying driver:

- Promiscuous mode doesn't work at all. It ought to be possible to put just a specific VLAN into promiscuous mode, without affecting all the others. This probably involves repeating all of the BPF 'does-this-packet-look-like-mine?' gluck from the physical interface drivers.

Hm. Well, it seems to me that the real problem is that to get promiscuous mode for the vlan interface, you have to put the parent in promiscuous mode too. You can do that, but then the parent interface driver gets hit with extra traffic that it doesn't want.

- Multicast is similarly broken (and a more serious weakness). There needs to be a mechanism to pass multicast group membership down to the underlying driver. It may also be necessary to do duplicate suppression, which is a real challenge.

It may not be that hard. I could probably do it, if you wanted me to. I wouldn't enjoy it, but I could do it.

!* If the LINK1 flag is set, it means the underlying interface
!* can do VLAN tag insertion itself and doesn't require us to
!* create a special header for it. In this case, we just pass

Are we certain that all drivers are now doing if_media and no longer using IFF_LINK1 for that purpose?

I think you may have missed the point (or maybe I didn't explain it well).
I want to set the LINK1 bit on the _VLAN_ interface (vlan0, etc...), _NOT_ the parent interface. The problem is that the parent doesn't want packets with the ethernet vlan header on them, and we need some way to tell the vlan interface 'Don't bother with rewriting the packet header; the parent interface will do it for you.' That's what the LINK1 flag is for (actually, I probably should have said LINK0; LINK1 was just the first thing that popped into my mind last night). The vlan interfaces don't use ifmedia so there is no conflict.

The idea is, the user (er, admin) knows that he's going to be attaching to a parent device that can handle vlan header mangling internally, so he configures the vlan interfaces attached to this particular parent with a LINK flag that tells it to skip the header mangling. Nothing gets changed on the parent interface.

Grammar fault -- core dumped. (The wording was correct as it was.)

That's why I didn't commit anything yet
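As a rough illustration of what the LINK flag on the vlan interface switches off, here is a sketch in C that works on flat byte buffers rather than mbufs. The function demo_vlan_encap() and the DEMO_ constants are hypothetical and are not taken from if_vlan.c; the frame layout (a 4-byte tag with TPID 0x8100 inserted after the two MAC addresses) is the standard 802.1Q one. With hw_tagging set the frame is handed on untouched, on the assumption that the parent inserts the tag itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN	6
#define DEMO_ENCAPLEN	4	/* TPID + TCI */
#define DEMO_TPID	0x8100	/* 802.1Q tag protocol identifier */

/*
 * Copy frame 'in' to 'out', inserting an 802.1Q tag unless the
 * parent does its own tagging. Returns the output length, or 0 if
 * the frame is too short or 'out' is too small.
 */
static size_t
demo_vlan_encap(const uint8_t *in, size_t len, uint16_t tag, int hw_tagging,
    uint8_t *out, size_t outsize)
{
	const size_t addrs = 2 * ETHER_ADDR_LEN;

	if (len < addrs + 2)
		return 0;		/* not even a full Ethernet header */

	if (hw_tagging) {
		/* Parent inserts the tag itself: hand the frame over as-is. */
		if (outsize < len)
			return 0;
		memcpy(out, in, len);
		return len;
	}

	if (outsize < len + DEMO_ENCAPLEN)
		return 0;

	/* Destination and source addresses stay where they are. */
	memcpy(out, in, addrs);
	/* 802.1Q tag: TPID then TCI, both big-endian on the wire. */
	out[addrs + 0] = DEMO_TPID >> 8;
	out[addrs + 1] = DEMO_TPID & 0xff;
	out[addrs + 2] = (tag >> 8) & 0x0f;	/* priority and CFI bits left at zero */
	out[addrs + 3] = tag & 0xff;
	/* Original EtherType and payload follow the tag. */
	memcpy(out + addrs + DEMO_ENCAPLEN, in + addrs, len - addrs);
	return len + DEMO_ENCAPLEN;
}
```

In the hw_tagging case the tag still has to reach the parent driver by some other route, since it is no longer in the frame.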
Request for review: changes to if_vlan.c
While trying to figure out a way to support VLANs with the Alteon gigabit NIC, I stumbled across /sys/net/if_vlan.c. It seems simple enough: pseudo IP interfaces are created which interact with an underlying physical interface and fix up frame headers to deal with VLAN tag encapsulation and extraction. Unfortunately, the code totally fails to take into account the possibility that the underlying interface can perform VLAN tag insertion and extraction all by itself. The Alteon gigabit NIC is just such an example: if you select 'VLAN assist' when configuring the send/receive rings, the firmware will handle all the VLAN naughty bits for you.

However, I still want to be able to use the VLAN pseudo interfaces that if_vlan.c provides, so to that end I tweaked it a little to handle interfaces that do their own VLAN tag handling. The diffs are appended to this message.

What I did was change vlan_output() so that if the IFF_LINK1 flag is set on the vlan pseudo interface, the packet will be forwarded unmolested to the interface's output routine, with a pointer to the vlan's ifnet structure in the mbuf packet header. The underlying driver code can then track down the vlan interface's structure information and find the vlan tag number (which it needs to provide in the TX descriptor).

Reception is another issue; the existing vlan_input() routine expects that it can find the 'parent' driver's ifnet structure in the mbuf packet header and tries to extract the vlan tag directly from the ethernet header (which it then uses to track down the associated vlan pseudo interface). With the Alteon NIC, the driver's receive routine already knows the tag number, but in order to get it to the vlan code it would have to mangle the header on the packet that it just received, which is pointless given that the firmware already went to the trouble of finding it to begin with.
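The reception problem comes down to where the tag is taken from. As a rough sketch in C (the structure and function names here are hypothetical, not the ones in if_vlan.c), a driver whose hardware has already extracted the tag could hand it over directly and let the vlan code search its table of pseudo interfaces by parent and tag, without touching the frame's header:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one configured vlan pseudo interface. */
struct demo_vlan {
	int		parent_id;	/* stand-in for the parent's ifnet pointer */
	uint16_t	tag;		/* 802.1Q VLAN ID */
	unsigned long	ipackets;	/* frames delivered to this vlan */
};

/*
 * Find the vlan attached to the given parent with the given tag.
 * Returns NULL if no such vlan is configured.
 */
static struct demo_vlan *
demo_vlan_lookup(struct demo_vlan *tbl, size_t n, int parent_id, uint16_t tag)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].parent_id == parent_id && tbl[i].tag == tag)
			return &tbl[i];
	}
	return NULL;
}

/*
 * Input path for hardware that supplies the tag itself: no header
 * parsing, just a lookup. Returns 0 on delivery, -1 if the frame
 * belongs to no configured vlan and should be dropped.
 */
static int
demo_input_with_tag(struct demo_vlan *tbl, size_t n, int parent_id, uint16_t tag)
{
	struct demo_vlan *v = demo_vlan_lookup(tbl, n, parent_id, tag);

	if (v == NULL)
		return -1;
	v->ipackets++;
	return 0;
}
```

A linear table is used only to keep the sketch short; how the real code stores and searches its vlan interfaces is not shown in this message.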
Consequently, I added a second input routine, vlan_input_tag(), which works almost like the original vlan_input() except that the tag can be specified explicitly as an argument.

Lastly, I found what I think is a bug in vlan_output(). If the code finds that the parent interface's output queue is full, it does an IF_QFULL(), but then goes on to try and queue the packet with IF_ENQUEUE() anyway. I think this is wrong, and I inserted a 'continue' statement rather than let the code fall through like it was before. I'm not certain this is the correct fix though... possibly vlan_output() should return with ENOBUFS.

Interested persons should review the diffs and pipe up if they have some passionate argument against them.

Also, I should point out that while if_vlan provides the necessary kernel hackery to implement VLANs, there isn't any user space utility for configuring vlan interfaces (ifconfig doesn't seem to have any vlan-specific code that I could see, and there is no vlanconfig). There also is no vlan(4) man page. If whoever originally added the vlan code has plans to fix these two things, then by all means do so; otherwise I'm going to take it upon myself to hack up ifconfig and write the man pages myself.

-Bill

*** if_vlan.c.orig	Sat Feb 27 22:52:58 1999
--- if_vlan.c	Sat Feb 27 23:04:41 1999
***************
*** 39,44 ****
--- 39,48 ----
   * ether_output() left on our output queue queue when it calls
   * if_start(), rewrite them for use by the real outgoing interface,
   * and ask it to send them.
+  *
+  * XXX It's incorrect to assume that we must always kludge up
+  * headers on the physical device's behalf: some devices support
+  * VLAN tag insersion and extraction in firmware.
   */
  
  #include "vlan.h"
***************
*** 113,124 ****
  		ifp->if_resolvemulti = 0;
  	}
  }
! PSEUDO_SET(vlaninit, if_vlan);
  
  static void
  vlan_ifinit(void *foo)
  {
! 	;
  }
  
  static void
--- 117,128 ----
  		ifp->if_resolvemulti = 0;
  	}
  }
! PSEUDO_SET(vlaninit, if_vlan)
  
  static void
  vlan_ifinit(void *foo)
  {
! 	return;
  }
  
  static void
***************
*** 142,163 ****
  		bpf_mtap(ifp, m);
  #endif /* NBPFILTER > 0 */
  
- 		M_PREPEND(m, EVL_ENCAPLEN, M_DONTWAIT);
- 		if (m == 0)
- 			continue;
- 		/* M_PREPEND takes care of m_len, m_pkthdr.len for us */
- 
  		/*
! 		 * Transform the Ethernet header into an Ethernet header
! 		 * with 802.1Q encapsulation