Re: gptboot rewrite, bootonce, etc.
On 9/17/10 4:45 PM, Pawel Jakub Dawidek wrote:
> [...]
gptboot rewrite, bootonce, etc.
Hi.

My company was in need of functionality similar to nextboot(8), but at
the boot loader level, so we can have two partitions we boot from, where
one is known to be good and the other is used for upgrades. We upgrade
by dd(1)ing an entire partition image onto the unused partition, we mark
it as try-to-boot-from-it-but-only-once, reboot, and if we fail to boot
from the new partition, we fall back to the old, good partition. If we
succeed, on the other hand, we mark the new partition as our boot
partition and mark the other one as unused.

Well, how hard can it be?

After around two weeks of work, I ended up rewriting gptboot in large
parts, reorganizing a lot of code, improving and extending gpart a bit
and implementing the desired functionality.

Here is the patch for review and testing:

	http://people.freebsd.org/~pjd/patches/gptboot.patch

The list of changes:

- Split code shared by almost any boot loader into separate files and
  clean up most layering violations:

	sys/boot/i386/common/rbx.h:

		RBX_* defines
		OPT_SET()
		OPT_CHECK()

	sys/boot/common/util.[ch]:

		memcpy()
		memset()
		memcmp()
		bcpy()
		bzero()
		bcmp()
		strcmp()
		strncmp() [new]
		strcpy()
		strcat()
		strchr()
		strlen()
		printf()

	sys/boot/i386/common/cons.[ch]:

		ioctrl
		putc()
		xputc()
		putchar()
		getc()
		xgetc()
		keyhit() [now takes number of seconds as an argument]
		getstr()

	sys/boot/i386/common/drv.[ch]:

		struct dsk
		drvread()
		drvwrite() [new]
		drvsize() [new]

	sys/boot/common/crc32.[ch] [new]

	sys/boot/common/gpt.[ch] [new]

- Teach gptboot and gptzfsboot about the new files. I haven't touched
  the rest, but there is still a lot of code duplication to be removed.

- Implement full GPT support. Currently we just read the primary header
  and partition table and don't care about checksums, etc. With the
  patch we verify the checksums of the primary header and primary
  partition table, and if there is a problem we fall back to the backup
  header and backup partition table.

- Clean up most messages to use the prefix of the boot program, so in
  case of an error we know where the error comes from, e.g.:

	gptboot: unable to read primary GPT header

- If we can't boot, print the boot prompt only once and not every five
  seconds.

- Introduce three new GPT attributes:

	bootme     - this is a bootable partition
	bootonce   - try to boot from this partition only once
	bootfailed - we failed to boot from this partition

- Extend gpart to allow manipulating the new attributes:

	gpart set -a bootme -i 3 ada0
	gpart set -a bootonce -i 4 ada0
	gpart unset -a bootfailed -i 2 ada0

  Note that setting the 'bootonce' attribute automatically sets the
  'bootme' attribute.

- Change the boot order of gptboot to the following:

	1. Try to boot from all the partitions that have both the
	   'bootme' and 'bootonce' attributes, one by one.
	2. Try to boot from all the partitions that have only the
	   'bootme' attribute, one by one.
	3. If there are no partitions with the 'bootme' attribute, boot
	   from the first UFS partition.

- The 'bootonce' functionality is implemented in the following way:

	1. Walk through all the partitions, and when the 'bootonce'
	   attribute is found without the 'bootme' attribute, remove
	   the 'bootonce' attribute and set the 'bootfailed' attribute.
	   The 'bootonce' attribute alone means that we tried to boot
	   from this partition, but the boot failed after leaving
	   gptboot and the machine was restarted.
	2. Find a partition with both the 'bootme' and 'bootonce'
	   attributes.
	3. Remove the 'bootme' attribute.
	4. Try to execute /boot/loader or /boot/kernel/kernel from that
	   partition. If this succeeds, we stop here.
	5. If execution failed, remove 'bootonce' and set 'bootfailed'.
	6. Go to 2.

  If the whole boot succeeded, there is a new /etc/rc.d/gptboot script
  that will log all partitions that we failed to boot from (the ones
  with the 'bootfailed' attribute) and will remove this attribute. It
  will also find the partition with the 'bootonce' attribute - this is
  the partition we booted from successfully. The script will log
  success and remove the attribute.

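The boot order and the 'bootonce' steps listed above can be sketched as a
small state machine. This is only an illustration and not code from the
patch: the flag values, struct part, the loadable field (a stand-in for
"/boot/loader or /boot/kernel/kernel could be executed") and all function
names are assumptions made up for the example, and the real on-disk GPT
attribute bits are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define ATTR_BOOTME     0x1u
#define ATTR_BOOTONCE   0x2u
#define ATTR_BOOTFAILED 0x4u

/* Hypothetical in-memory view of one partition entry. */
struct part {
	uint32_t attrs;
	int loadable;	/* stand-in for "loader or kernel can be executed" */
};

/* bootonce step 1: 'bootonce' without 'bootme' marks an earlier failed try. */
static void
mark_stale_bootonce(struct part *p, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if ((p[i].attrs & ATTR_BOOTONCE) && !(p[i].attrs & ATTR_BOOTME)) {
			p[i].attrs &= ~ATTR_BOOTONCE;
			p[i].attrs |= ATTR_BOOTFAILED;
		}
	}
}

/* bootonce steps 2-6: returns the chosen index, or -1 if no candidate works. */
static int
try_bootonce(struct part *p, size_t n)
{
	const uint32_t both = ATTR_BOOTME | ATTR_BOOTONCE;
	size_t i;

	for (;;) {
		for (i = 0; i < n; i++)		/* step 2 */
			if ((p[i].attrs & both) == both)
				break;
		if (i == n)
			return (-1);
		p[i].attrs &= ~ATTR_BOOTME;	/* step 3 */
		if (p[i].loadable)		/* step 4 */
			return ((int)i);
		p[i].attrs &= ~ATTR_BOOTONCE;	/* step 5 */
		p[i].attrs |= ATTR_BOOTFAILED;
	}					/* step 6: go to 2 */
}

/*
 * Boot order: 'bootme'+'bootonce' partitions first, then 'bootme' only.
 * A return of -1 means the caller falls back to the first UFS partition.
 */
static int
select_boot_partition(struct part *p, size_t n)
{
	size_t i;
	int idx;

	mark_stale_bootonce(p, n);
	if ((idx = try_bootonce(p, n)) >= 0)
		return (idx);
	for (i = 0; i < n; i++)
		if ((p[i].attrs & ATTR_BOOTME) &&
		    !(p[i].attrs & ATTR_BOOTONCE) && p[i].loadable)
			return ((int)i);
	return (-1);
}
```

In this model a partition that is booted successfully via 'bootonce' is
left with 'bootonce' alone, which is the state the rc.d script is
described as looking for after a successful boot.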
All the GPT updates we do here go to both the primary and backup GPT if
they are valid. We don't touch headers or partition tables when the
checksum doesn't match.

Any comments or suggestions? Be aware that at this point I'm so full of
boot loaders and I'm not looking for much more work in this area, so
small tweaks are fine, but bigger thi
Re: Extend ktrace/kdump output
On Fri, Sep 17, 2010 at 10:36 PM, Kostik Belousov wrote:
> On Fri, Sep 17, 2010 at 09:55:26PM +0200, Norberto Lopes wrote:
>> Hi.
>> I've been taking a look at ktrace and kdump in order to get (1) familiar
>> with the sources and (2) to finally try to give back something to the
>> community.
>>
>> So far from what I've seen, and after reading this thread
>> http://lists.freebsd.org/pipermail/freebsd-arch/2006-April/005107.html it
>> seems that most of those points got done.
>>
>> To warm up I changed the output of the stat structure in order to provide me
>> with the device name (something I actually find useful for me sometimes)
>>
>> Instead of:
>> 22596 cat STRU struct stat {dev=89, ino=3320836, mode=-r--r--r-- ,
>> nlink=1, uid=0, gid=0, atime=1284725358, stime=1284485510, ctime=1284485510,
>> birthtime=1284485509, size=1172220, blksize=16384, blocks=2336,
>> flags=0x2 }
>>
>> I get this now (including major and minor):
>> 22596 cat STRU struct stat {dev= (/dev/ad4s1a),
>> ino=3320836, mode=-r--r--r-- , nlink=1, uid=0, gid=0, atime=1284725358,
>> stime=1284485510, ctime=1284485510, birthtime=1284485509, size=1172220,
>> blksize=16384, blocks=2336, flags=0x2 }
>>
>> I wouldn't mind having someone help me whenever and if I get stuck on the
>> technical side (*wink* Alexander Leidinger *wink*) and also to give me more
>> insight on what the road to help in this should be.
>>
>> P.S.: I'm still going through "man style" hence no patch attached. If anyone
>> finds this one useful, I'll reply with the patch though.
>>
> How do you look up the device name by st_dev ? Note that the number is
> generated by devfs at the moment of cdev creation. It is only valid on
> the machine where stat(2) is done, and only due to the next reboot.
>

Through a really ugly hack...
opendir("/dev")
readdir("/dev")
go through them and find the one...

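The directory scan described above could look roughly like the following.
It is a guess at the approach, not the unposted patch: the helper name
find_dev_name() is hypothetical, and matching each node's st_rdev against
the st_dev value from the trace is an assumption about how "find the one"
is done. Only the top level of the directory is scanned.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: scan `dir` (normally "/dev") for a device node whose
 * st_rdev equals `dev` and copy its full path into `out`.
 * Returns 0 on a match, -1 if nothing matched or `dir` cannot be opened.
 */
static int
find_dev_name(const char *dir, dev_t dev, char *out, size_t outlen)
{
	struct dirent *de;
	struct stat sb;
	char path[1024];
	DIR *d;
	int n, found = -1;

	if ((d = opendir(dir)) == NULL)
		return (-1);
	while ((de = readdir(d)) != NULL) {
		n = snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(path))
			continue;	/* name too long for the buffer */
		/* lstat() so that symlinks are skipped, not followed */
		if (lstat(path, &sb) != 0)
			continue;
		if (!S_ISCHR(sb.st_mode) && !S_ISBLK(sb.st_mode))
			continue;	/* only device nodes carry st_rdev */
		if (sb.st_rdev == dev) {
			snprintf(out, outlen, "%s", path);
			found = 0;
			break;
		}
	}
	closedir(d);
	return (found);
}
```

As pointed out in the quoted question, a lookup like this is only
meaningful on the machine that produced the trace and before the next
reboot, since the numbers are assigned when the device nodes are created.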
Yes, I know, painful and ugly, but as I usually use kdump with no reboots
between analyses (I hardly ever reboot, actually), and because I find it
exhausting to keep going back to look up the device name, this kept me
happy enough. :)

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Extend ktrace/kdump output
On Fri, Sep 17, 2010 at 09:55:26PM +0200, Norberto Lopes wrote:
> Hi.
> I've been taking a look at ktrace and kdump in order to get (1) familiar with
> the sources and (2) to finally try to give back something to the community.
>
> So far from what I've seen, and after reading this thread
> http://lists.freebsd.org/pipermail/freebsd-arch/2006-April/005107.html it
> seems that most of those points got done.
>
> To warm up I changed the output of the stat structure in order to provide me
> with the device name (something I actually find useful for me sometimes)
>
> Instead of:
> 22596 cat STRU struct stat {dev=89, ino=3320836, mode=-r--r--r-- ,
> nlink=1, uid=0, gid=0, atime=1284725358, stime=1284485510, ctime=1284485510,
> birthtime=1284485509, size=1172220, blksize=16384, blocks=2336, flags=0x2
> }
>
> I get this now (including major and minor):
> 22596 cat STRU struct stat {dev= (/dev/ad4s1a),
> ino=3320836, mode=-r--r--r-- , nlink=1, uid=0, gid=0, atime=1284725358,
> stime=1284485510, ctime=1284485510, birthtime=1284485509, size=1172220,
> blksize=16384, blocks=2336, flags=0x2 }
>
> I wouldn't mind having someone help me whenever and if I get stuck on the
> technical side (*wink* Alexander Leidinger *wink*) and also to give me more
> insight on what the road to help in this should be.
>
> P.S.: I'm still going through "man style" hence no patch attached. If anyone
> finds this one useful, I'll reply with the patch though.
>

How do you look up the device name by st_dev? Note that the number is
generated by devfs at the moment of cdev creation. It is only valid on
the machine where stat(2) is done, and only until the next reboot.
Extend ktrace/kdump output
Hi.
I've been taking a look at ktrace and kdump in order to get (1) familiar
with the sources and (2) to finally try to give back something to the
community.

So far from what I've seen, and after reading this thread
http://lists.freebsd.org/pipermail/freebsd-arch/2006-April/005107.html it
seems that most of those points got done.

To warm up I changed the output of the stat structure in order to provide me
with the device name (something I actually find useful for me sometimes)

Instead of:
22596 cat STRU struct stat {dev=89, ino=3320836, mode=-r--r--r-- ,
nlink=1, uid=0, gid=0, atime=1284725358, stime=1284485510, ctime=1284485510,
birthtime=1284485509, size=1172220, blksize=16384, blocks=2336, flags=0x2 }

I get this now (including major and minor):
22596 cat STRU struct stat {dev= (/dev/ad4s1a),
ino=3320836, mode=-r--r--r-- , nlink=1, uid=0, gid=0, atime=1284725358,
stime=1284485510, ctime=1284485510, birthtime=1284485509, size=1172220,
blksize=16384, blocks=2336, flags=0x2 }

I wouldn't mind having someone help me whenever and if I get stuck on the
technical side (*wink* Alexander Leidinger *wink*) and also to give me more
insight on what the road to help in this should be.

P.S.: I'm still going through "man style" hence no patch attached. If anyone
finds this one useful, I'll reply with the patch though.

-- 
Norberto
Re: tun(4) in -CURRENT: No buffer space available - race condition patch
>> John Baldwin wrote:
> Oh, yes. I've updated the patch to remove D_NEEDGIANT.

So far (last 24 hours) my tun(4) with your patch was very stable.
I am updating it to remove D_NEEDGIANT.

Thank you!

//Marcin
Re: Multiple hpet messages during boot
on 17/09/2010 18:36 M. Warner Losh said the following:
>
> so is there support for the following:

Aye.

> Index: subr_bus.c
> ===================================================================
> --- subr_bus.c	(revision 212791)
> +++ subr_bus.c	(working copy)
> @@ -3996,9 +3996,11 @@
>  	    arg, cookiep);
>  	if (error != 0)
>  		return (error);
> +	if (bootverbose == 0)
> +		return (0);
>  	if (handler != NULL && !(flags & INTR_MPSAFE))
>  		device_printf(dev, "[GIANT-LOCKED]\n");
> -	if (bootverbose && (flags & INTR_MPSAFE))
> +	if (flags & INTR_MPSAFE)
>  		device_printf(dev, "[MPSAFE]\n");
>  	if (filter != NULL) {
>  		if (handler == NULL)

-- 
Andriy Gapon
Regarding pciids
I created hackish scripts to generate the pci_vendors file from the
Boemler and Mares (pciids.sf.net) lists. I haven't found the Hart list.

The results of the scripts are here:

http://www.alexdupre.com/pci_vendors/mares.txt
http://www.alexdupre.com/pci_vendors/boemler.txt
http://www.alexdupre.com/pci_vendors/mares-boemler.txt
http://www.alexdupre.com/pci_vendors/boemler-mares.txt

The first two are generated from single lists; the last two are combined,
with different preference order.

-- 
Alex Dupre
Re: Multiple hpet messages during boot
In message: <20100917055953.gf1...@pluto.vnode.local>
            Joel Dahl writes:
: On 16-09-2010 8:28, John Baldwin wrote:
: > On Wednesday, September 15, 2010 2:32:33 am Joel Dahl wrote:
: > > I noticed this during boot (HEAD from yesterday):
: > >
: > > hpet0: [FILTER]
: > > hpet0: [FILTER]
: > > hpet0: [FILTER]
: > > hpet0: [FILTER]
: > > hpet0: [FILTER]
: > > hpet0: [FILTER]
: > > hpet0: [FILTER]
: > > hpet0: [FILTER]
: > >
: > > Is it really necessary to print this 8 times?
: >
: > I'd actually like to remove the interrupt messages that say '[FILTER]' or
: > '[GIANT]', etc. I think in general they only add clutter.
:
: Definitely agreed. Go for it.

so is there support for the following:

Index: subr_bus.c
===================================================================
--- subr_bus.c	(revision 212791)
+++ subr_bus.c	(working copy)
@@ -3996,9 +3996,11 @@
 	    arg, cookiep);
 	if (error != 0)
 		return (error);
+	if (bootverbose == 0)
+		return (0);
 	if (handler != NULL && !(flags & INTR_MPSAFE))
 		device_printf(dev, "[GIANT-LOCKED]\n");
-	if (bootverbose && (flags & INTR_MPSAFE))
+	if (flags & INTR_MPSAFE)
 		device_printf(dev, "[MPSAFE]\n");
 	if (filter != NULL) {
 		if (handler == NULL)

Warner
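To make the effect of the hunk easier to see, here is a small model of
which of the two tags visible in the diff would be printed before and
after the change. It is only a sketch: SK_INTR_MPSAFE, the TAG_* values
and both function names are made up for this example, and it covers only
the [GIANT-LOCKED] and [MPSAFE] lines. Since the new early return sits
above the "filter != NULL" block, whatever that block prints would be
skipped as well when bootverbose is 0.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's INTR_MPSAFE flag (value made up). */
#define SK_INTR_MPSAFE 0x1

/* Bitmask of tags a call would print, limited to the two in the diff. */
enum { TAG_NONE = 0, TAG_GIANT_LOCKED = 1, TAG_MPSAFE = 2 };

/* Behaviour of the lines shown with '-' and unchanged context. */
static int
tags_before_patch(int bootverbose, int has_handler, int flags)
{
	int tags = TAG_NONE;

	if (has_handler && !(flags & SK_INTR_MPSAFE))
		tags |= TAG_GIANT_LOCKED;
	if (bootverbose && (flags & SK_INTR_MPSAFE))
		tags |= TAG_MPSAFE;
	return (tags);
}

/* Behaviour with the '+' lines applied: quiet unless booting verbose. */
static int
tags_after_patch(int bootverbose, int has_handler, int flags)
{
	int tags = TAG_NONE;

	if (bootverbose == 0)
		return (tags);
	if (has_handler && !(flags & SK_INTR_MPSAFE))
		tags |= TAG_GIANT_LOCKED;
	if (flags & SK_INTR_MPSAFE)
		tags |= TAG_MPSAFE;
	return (tags);
}
```

The only visible difference in this model is the non-verbose case with a
handler that is not MPSAFE: it used to print [GIANT-LOCKED] and would now
print nothing.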
Re: Multiple hpet messages during boot
In message: <201009160828.35520@freebsd.org>
            John Baldwin writes:
: On Wednesday, September 15, 2010 2:32:33 am Joel Dahl wrote:
: > I noticed this during boot (HEAD from yesterday):
: >
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: >
: > Is it really necessary to print this 8 times?
:
: I'd actually like to remove the interrupt messages that say '[FILTER]' or
: '[GIANT]', etc. I think in general they only add clutter.

[GIANT] is just public shaming of drivers anyway. It has worked to
get all the major ones locked. Well, except for atkbd and psm...

I'd be happy to toss them behind a bootverbose :)

Warner
Re: CFT - Xen infrastructure and block I/O improvements
On 9/17/2010 1:27 AM, Rui Paulo wrote:
> On 17 Sep 2010, at 07:55, Rui Paulo wrote:
>
>> On 17 Sep 2010, at 02:44, Justin T. Gibbs wrote:
>>
>>> At Spectra Logic, we are using FreeBSD amd64 under Xen to serve storage
>>> to other Xen domains. Over the past 9 months, we've made several changes
>>> to FreeBSD's Xen support. These include:

...

>> Justin, this is quite a big diff (16k lines). I wonder if you can create
>> separate diffs (xenstore, blkback, xenbus, etc.) for easier review and
>> commenting.
>
> '... and comment'.

The bulk of the patch is due to code reorganization. Unfortunately SVN
doesn't reflect the copies properly in a diff, but at least the diff will
apply properly even in a non-SVN backed source tree. This original patch
should be the best for testers.

The following SVN operations were made to the source tree to clean up its
organization and to import functionality from the vendor (1 file):

# Does not apply to FreeBSD's NewBus method for dealing with XenBus devices
svn delete sys/xen/xenbus/init.txt

# Linux version of backend XenBus service routines. Never ported to FreeBSD.
# OBE: See xenbusb.c, xenbusb_if.m, xenbusb_front.c xenbusb_back.c
svn delete sys/xen/xenbus/xenbus_probe_backend.c

# Split XenStore into its own tree. XenBus is a software layer built on top
# of XenStore. The old arrangement and the naming of some structures and
# functions blurred these lines making it difficult to discern what services
# are provided by which layer and what times these services are available
# (e.g. during system startup and shutdown).
mkdir sys/xen/xenstore
svn add sys/xen/xenstore
svn move sys/xen/xenbus/xenbus_dev.c sys/xen/xenstore/xenstore_dev.c
svn copy sys/xen/xenbus/xenbusvar.h sys/xen/xenstore/xenstorevar.h
svn move sys/xen/xenbus/xenbus_xs.c sys/xen/xenstore/xenstore.c

# Split up XenBus code into methods available for use by client
# drivers (xenbus.c) and code used by the XenBus "bus code" to
# enumerate, attach, detach, and service bus drivers.
svn move sys/xen/xenbus/xenbus_client.c sys/xen/xenbus/xenbus.c
svn move sys/xen/xenbus/xenbus_probe.c sys/xen/xenbus/xenbusb.c
svn copy sys/xen/xenbus/xenbusb.c sys/xen/xenbus/xenbusb.h

# Merged with xenstore.c
svn delete sys/xen/xenbus/xenbus_comms.c
svn delete sys/xen/xenbus/xenbus_comms.h

# Merged with new XenBus control driver
mkdir sys/dev/xen/control
svn add sys/dev/xen/control
svn move sys/xen/reboot.c sys/dev/xen/control/control.c

# New file from Xen vendor with macros and structures used by
# a block back driver to service requests from a VM running a
# different ABI (e.g. amd64 back with i386 front).
svn add sys/xen/blkif.h

These alone account for 6k lines of svn diff. A diff against a tree with
these operations already made may make more sense to a reviewer. You can
download this diff from here:

http://people.FreeBSD.org/~gibbs/FreeBSD-head-xen_post-svn-ops_2010-09-17_diffs.txt

It isn't much shorter since the additional context has amplified the
changes. The bulk is largely caused by refactoring and comments. It
will probably be easier to just review the files in sys/xen/xenstore and
sys/xen/xenbus in their entirety, but the comments below (boiled down
from our SCM system), when coupled with the diffs, should give you enough
information to understand the intentions behind the changes.

--
Justin

sys/conf/files:
	Adjust kernel build spec for new XenBus/XenStore layout and added
	Xen functionality.

sys/dev/xen/balloon/balloon.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/blkfront/blkfront.c:
sys/xen/xenbus/...
sys/xen/xenstore/...
	o Rename XenStore APIs and structures from xenbus_* to xs_*.
	o Adjust to use of M_XENBUS and M_XENSTORE malloc types for
	  allocation of objects returned by these APIs.
	o Adjust for changes in the bus interface for Xen drivers.

sys/dev/xen/blkback/blkback.c:
	Rewrite the Block Back driver to attach properly via newbus,
	operate correctly in both PV and HVM mode regardless of domain
	(e.g. can be in a DOM other than 0), and to deal with the latest
	metadata available in XenStore for block devices.

	Allow users to specify a file as a backend to blkback, in addition
	to character devices. Use the namei lookup to figure out whether
	it's a device node or a file, and use the appropriate interface.

	One current issue with the file interface is that we're effectively
	limited to having a single command at a time outstanding. To get
	around this, we may need to try using the vnode pager more directly,
	or perhaps coming up with a direct interface into ZFS. (i.e.
	something similar to zvols, but without the performance issues.)

	This will impact reads more than writes, because writes are cached
	but with random reads you have to go all the way down to the disk,
	so you suffer the full latency of the stack.

sys/dev/xen/blkback/blkback.c:
sys/xen/interface/io/blkif.h:
sys/xen/blkif.h:
sys/dev/xen/blkfront/blkfront
Re: DHCP server in base
In message: <4c91100c.5060...@freebsd.org>
            Doug Barton writes:
: > Most of the code is there anyway, and it isn't evolving as fast as
: > BIND.
:
: That is actually a more rational argument, even if I don't agree with
: it. FWIW, part of the reason that I don't agree with it is that at
: some point, hopefully in the near future, we will want to include the
: DHCPv6 client in the mix somewhere; and when we do the code base is
: not going to be as stable as we have enjoyed so far with the v4
: dhclient.

True, but that still won't change the dynamic that adding a dhcp
server is easy given we have most of one already in the tree. Adding
v6 support likely will mean a certain amount of code churn, I'll grant
you that. But the code/api churn that's happening is within a single
program, making it much easier to MFC as necessary to keep up.

: > This is analogous: we
: > have good opportunity to integrate into the system, and users benefit
: > from that integration.
:
: Given your perspective of wanting more of a complete system in the
: base I can certainly see how you would be supportive of this
: change. My intent was to make the argument in a general way that this
: is the wrong direction to go, and that users would benefit *more* from
: a robust modularized system. The fact that the v4 DHCPd might
: accidentally be a good candidate for including in the base today
: doesn't mean that doing so is the right strategy for the long term.

I take a more nuanced view: we have to evaluate each proposed addition
to the system on its merits. One of these criteria is long term
viability, but others include how useful is it to the users; how much
demand will there be; will including it make the project look good?;
will not including it make the project look bad?; etc.

We'd all like to see a more modular base, but until that nut is
cracked, we have a balancing act to perform.

Warner
Re: Multiple hpet messages during boot
In message: <4c911214.7060...@freebsd.org>
            Alexander Motin writes:
: Joel Dahl wrote:
: > I noticed this during boot (HEAD from yesterday):
: >
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: > hpet0: [FILTER]
: >
: > Is it really necessary to print this 8 times?
:
: HPET at present chipsets may use up to 8 IRQs. Driver registers filter
: interrupt handlers for them. Interrupt handling code prints this.
:
: If you boot with verbose, you may see that some network cards prints
: alike things for the number of supported MSI/MSI-X interrupts.

Is there any reason not to toss all FILTER messages behind
bootverbose?

Warner
Re: tun(4) in -CURRENT: No buffer space available - race condition patch
On Thursday, September 16, 2010 9:02:23 am Marcin Cieslak wrote:
> On 15.09.2010 John Baldwin wrote:
> > On Monday, September 13, 2010 9:10:01 pm Marcin Cieslak wrote:
> >> Output queue of tun(4) gets full after some time when sending lots of data.
> >> I have been observing this on -CURRENT at least since March this year.
> >>
> >> Looks like it's a race condition (same in tun(4) and tap(4)),
> >> the following patch seems to address the issue:
> >
> > This is a good find. I actually went through these drivers a bit further
> > and have a bit of a larger patch to extend the locking some. Would you
> > care to test it?
>
> Do you think those drivers could be taken out of Giant after this change?
> I think that networking code path (via if_start etc.) is not using Giant
> at all, only cdevsw routines are. Am I right?

Oh, yes. I've updated the patch to remove D_NEEDGIANT.

Index: if_tap.c
===================================================================
--- if_tap.c	(revision 212732)
+++ if_tap.c	(working copy)
@@ -132,7 +132,7 @@
 
 static struct cdevsw	tap_cdevsw = {
 	.d_version =	D_VERSION,
-	.d_flags =	D_PSEUDO | D_NEEDGIANT | D_NEEDMINOR,
+	.d_flags =	D_PSEUDO | D_NEEDMINOR,
 	.d_open =	tapopen,
 	.d_close =	tapclose,
 	.d_read =	tapread,
@@ -209,7 +209,6 @@
 tap_destroy(struct tap_softc *tp)
 {
 	struct ifnet *ifp = tp->tap_ifp;
-	int s;
 
 	/* Unlocked read. */
 	KASSERT(!(tp->tap_flags & TAP_OPEN),
@@ -217,10 +216,8 @@
 	knlist_destroy(&tp->tap_rsel.si_note);
 	destroy_dev(tp->tap_dev);
 
-	s = splimp();
 	ether_ifdetach(ifp);
 	if_free_type(ifp, IFT_ETHER);
-	splx(s);
 
 	mtx_destroy(&tp->tap_mtx);
 	free(tp, M_TAP);
@@ -398,7 +395,7 @@
 	struct tap_softc	*tp = NULL;
 	unsigned short		 macaddr_hi;
 	uint32_t		 macaddr_mid;
-	int			 unit, s;
+	int			 unit;
 	char			*name = NULL;
 	u_char			eaddr[6];
 
@@ -442,22 +439,20 @@
 	ifp->if_ioctl = tapifioctl;
 	ifp->if_mtu = ETHERMTU;
 	ifp->if_flags = (IFF_BROADCAST|IFF_SIMPLEX|IFF_MULTICAST);
-	ifp->if_snd.ifq_maxlen = ifqmaxlen;
+	IFQ_SET_MAXLEN(&ifp->if_snd, ifqmaxlen);
 	ifp->if_capabilities |= IFCAP_LINKSTATE;
 	ifp->if_capenable |= IFCAP_LINKSTATE;
 
 	dev->si_drv1 = tp;
 	tp->tap_dev = dev;
 
-	s = splimp();
 	ether_ifattach(ifp, eaddr);
-	splx(s);
 
 	mtx_lock(&tp->tap_mtx);
 	tp->tap_flags |= TAP_INITED;
 	mtx_unlock(&tp->tap_mtx);
 
-	knlist_init_mtx(&tp->tap_rsel.si_note, NULL);
+	knlist_init_mtx(&tp->tap_rsel.si_note, &tp->tap_mtx);
 
 	TAPDEBUG("interface %s is created. minor = %#x\n",
 		ifp->if_xname, dev2unit(dev));
@@ -474,7 +469,7 @@
 {
 	struct tap_softc	*tp = NULL;
 	struct ifnet		*ifp = NULL;
-	int			 error, s;
+	int			 error;
 
 	if (tapuopen == 0) {
 		error = priv_check(td, PRIV_NET_TAP);
@@ -497,15 +492,13 @@
 	tp->tap_pid = td->td_proc->p_pid;
 	tp->tap_flags |= TAP_OPEN;
 	ifp = tp->tap_ifp;
-	mtx_unlock(&tp->tap_mtx);
 
-	s = splimp();
 	ifp->if_drv_flags |= IFF_DRV_RUNNING;
 	ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
 	if (tapuponopen)
 		ifp->if_flags |= IFF_UP;
 	if_link_state_change(ifp, LINK_STATE_UP);
-	splx(s);
+	mtx_unlock(&tp->tap_mtx);
 
 	TAPDEBUG("%s is open. minor = %#x\n", ifp->if_xname, dev2unit(dev));
 
@@ -524,9 +517,9 @@
 	struct ifaddr		*ifa;
 	struct tap_softc	*tp = dev->si_drv1;
 	struct ifnet		*ifp = tp->tap_ifp;
-	int			s;
 
 	/* junk all pending output */
+	mtx_lock(&tp->tap_mtx);
 	IF_DRAIN(&ifp->if_snd);
 
 	/*
@@ -534,28 +527,26 @@
 	 * interface, if we are in VMnet mode. just close the device.
 	 */
-	mtx_lock(&tp->tap_mtx);
 	if (((tp->tap_flags & TAP_VMNET) == 0) && (ifp->if_flags & IFF_UP)) {
 		mtx_unlock(&tp->tap_mtx);
-		s = splimp();
 		if_down(ifp);
+		mtx_lock(&tp->tap_mtx);
 		if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
+			ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
+			mtx_unlock(&tp->tap_mtx);
 			TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
 				rtinit(ifa, (int)RTM_DELETE, 0);
 			}
 			if_purgeaddrs(ifp);
-			ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
+			mtx_lock(&tp->tap_mtx);
 		}
-		splx(s);
-	} else
-
amd64: VM_KMEM_SIZE_SCALE changed to 1
[re-post, my address book was polluted with cu_rrr_ent@ entry, sorry]

on 09/09/2010 11:01 Andriy Gapon said the following:
> on 26/07/2010 19:07 Andriy Gapon said the following:
>>
>> Anyone knows any reason why VM_KMEM_SIZE_SCALE on amd64 should not be set to
>> 1?
>> I mean things potentially breaking, or some unpleasant surprise for an
>> administrator/user...
>
> So, after having the discussion, what is our collective conclusion?
> a) Go for it!
> or
> b) Don't do it, fool!
> or
> c) Let's wait another year...

Nobody said (b), so:
http://svn.freebsd.org/viewvc/base?view=revision&revision=212784

This thread in Gmane for your convenience:
http://thread.gmane.org/gmane.os.freebsd.architechture/13419/focus=13551

-- 
Andriy Gapon
Re: CFT - Xen infrastructure and block I/O improvements
On 17 Sep 2010, at 07:55, Rui Paulo wrote:

> On 17 Sep 2010, at 02:44, Justin T. Gibbs wrote:
>
>> At Spectra Logic, we are using FreeBSD amd64 under Xen to serve storage
>> to other Xen domains. Over the past 9 months, we've made several changes
>> to FreeBSD's Xen support. These include:
>>
>> o Support for backend devices (e.g. blkback)
>> o Extensions to the Xen para-virtualized block API to allow for larger
>>   and more outstanding I/Os.
>> o A completely rewritten block back driver with support for fronting
>>   I/O to both raw devices and files.
>> o General cleanup and documentation of the XenBus and XenStore support code.
>> o Robustness and performance updates for the block front driver.
>> o Fixes to the netfront driver.
>>
>> Some of these changes have already been pushed back into FreeBSD, but the
>> bulk of them need additional testing, especially under i386 PV, before
>> they can be committed. If you work in the Xen area, I'd appreciate your
>> review and/or testing of these changes.
>>
>> http://people.freebsd.org/~gibbs/FreeBSD-head-xen-diffs_2010_09_16.txt
>
> Justin, this is quite a big diff (16k lines). I wonder if you can create
> separate diffs (xenstore, blkback, xenbus, etc.) for easier review and
> commenting.

'... and comment'.

Regards,
--
Rui Paulo